Highway decision-making strategy for an autonomous vehicle performing an overtaking maneuver using deep reinforcement learning

Article type: Research article

Authors

Faculty of Mechanical Engineering, K. N. Toosi University of Technology, Tehran, Iran.

Abstract

Autonomous driving is a new technology for reducing traffic accidents and improving driving efficiency. This study presents a decision-making policy based on deep reinforcement learning for autonomous vehicles in a highway overtaking scenario. To this end, a highway traffic environment is first created in which the agent's goal is to pass the surrounding vehicles with an efficient and safe maneuver. A hierarchical control framework is also presented for controlling these vehicles: high-level commands manage the driving decisions, and low-level commands supervise the vehicle's speed and acceleration. A specific deep reinforcement learning method, the deep deterministic policy gradient (DDPG) algorithm, is then used to derive the highway decision-making policy. The performance of DDPG is compared with that of the deep Q-network (DQN) algorithm, and the results obtained from the two algorithms are evaluated. The overtaking problem in the highway environment is simulated in MATLAB 2022. The simulation results show that the DDPG-based overtaking policy can carry out highway driving tasks effectively and safely.

Article title [English]

Highway decision-making strategy for autonomous vehicle for overtaking maneuver using deep reinforcement learning (DRL) method

Authors [English]

  • Ali Rizehvandi
  • Shahram Azadi
Faculty of Mechanical Engineering, K.N.Toosi University of Technology, Tehran, Iran
Abstract [English]

Automated driving is an emerging technology aimed at reducing traffic accidents and improving driving efficiency. This research introduces a deep reinforcement learning (DRL) approach for autonomous vehicles, focusing on overtaking scenarios on highways. First, a highway traffic environment is established in which the agent must pass the surrounding vehicles efficiently and safely. A hierarchical control framework is outlined in which high-level commands manage driving decisions and low-level commands handle vehicle speed and acceleration. Next, a DRL method, Deep Deterministic Policy Gradient (DDPG), is used to derive the highway decision-making policy. DDPG acts in a continuous action space, which suits tasks such as autonomous driving where actions are not discrete. Unlike the Deep Q-Network (DQN), it can handle high-dimensional action spaces effectively, which makes it better suited to complex environments such as highway driving. The performance of DDPG is compared with that of DQN, and the results of both algorithms are evaluated. Simulation results show that the DDPG-based policy handles highway driving tasks efficiently and safely. The study underscores the potential of DRL techniques, particularly DDPG, for improving the performance of autonomous vehicles in complex driving scenarios.
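The contrast between the two algorithms can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study reports using MATLAB); the one-layer linear actor, the acceleration bound `max_accel`, the Gaussian exploration noise, and all function names are assumptions made purely for illustration. It shows that a DQN-style agent picks one index from a finite list of maneuvers, whereas a DDPG-style actor maps the state directly to a bounded real-valued command, and it includes the soft target-network update that DDPG uses.

```python
import math
import random


def dqn_select_action(q_values, epsilon=0.0, rng=random):
    """DQN-style choice: index of the largest Q-value (epsilon-greedy)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: random discrete action
    return max(range(len(q_values)), key=lambda i: q_values[i])


def ddpg_select_action(state, weights, bias, noise_std=0.0,
                       max_accel=3.0, rng=random):
    """DDPG-style choice: deterministic actor output plus exploration noise.

    The actor here is a hypothetical single linear layer followed by tanh,
    scaled to [-max_accel, max_accel]; a real actor is a deep network.
    """
    pre_activation = sum(w * s for w, s in zip(weights, state)) + bias
    action = max_accel * math.tanh(pre_activation)
    if noise_std > 0.0:
        action += rng.gauss(0.0, noise_std)  # simple Gaussian noise (assumed)
    return max(-max_accel, min(max_accel, action))  # keep command in bounds


def soft_update(target_params, online_params, tau=0.005):
    """Soft target update: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    # Hypothetical discrete maneuvers for the DQN-style agent.
    maneuvers = ["keep_lane", "change_left", "change_right"]
    print(maneuvers[dqn_select_action([0.1, 0.9, 0.3])])
    # Hypothetical state: [relative gap, relative speed] (normalized).
    print(ddpg_select_action([0.5, -0.2], weights=[0.8, 1.5], bias=0.0))
```

With a discrete agent, finer control requires enumerating more actions; the continuous actor avoids that enumeration, which is the property the abstract attributes to DDPG.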

Keywords [English]

  • Autonomous Vehicles
  • Decision Making
  • DRL Method
  • Overtaking
  • DDPG Algorithm