Volume 51, Issue 10, Oct. 2025
Citation: LYU C, LI M C, OU J J. UAV hybrid path planning based on hierarchical deep reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(10): 3451-3459 (in Chinese). doi: 10.13700/j.bh.1001-5965.2023.0550

UAV hybrid path planning based on hierarchical deep reinforcement learning

doi: 10.13700/j.bh.1001-5965.2023.0550
More Information
  • Corresponding author: E-mail: oujiajun@cq5520.com
  • Received Date: 28 Aug 2023
  • Accepted Date: 20 Oct 2023
  • Available Online: 03 Nov 2023
  • Publish Date: 01 Nov 2023
Abstract: In UAV applications, autonomous and safe flight must be achieved in a non-clearance environment. When some obstacles are known and others are unknown, the UAV has to plan a global path around the known obstacles while autonomously avoiding the unknown ones. To achieve safe flight in such a semi-known obstacle environment, a hybrid path planning method based on hierarchical deep reinforcement learning is proposed. It naturally combines the two sub-tasks that a UAV must perform in autonomous flight, perception-based obstacle avoidance and global path planning (navigation), into an efficient whole. In the hierarchical deep reinforcement learning model, the obstacle avoidance and navigation sub-task models are trained independently, and the system state is abstracted through the two trained models; on this basis, a top-level model is trained to schedule the outputs of the two sub-tasks effectively. This design reduces the difficulty of model training while preserving the ability to execute each sub-task.
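The paper's implementation is not included on this page, but the scheduling idea in the abstract can be illustrated with a minimal sketch. All names here (SubPolicy, TopLevelScheduler, hybrid_action) and the network sizes are illustrative assumptions, not the authors' code; PyTorch is assumed, and the sub-policies are treated as already trained and frozen.

    # Minimal sketch of the hierarchical scheduling idea, assuming PyTorch.
    # Class and function names are hypothetical, not from the paper.
    import torch
    import torch.nn as nn

    class SubPolicy(nn.Module):
        """A sub-task policy (e.g., obstacle avoidance or navigation)."""
        def __init__(self, obs_dim: int, n_actions: int):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, 64), nn.ReLU(),
                nn.Linear(64, n_actions),
            )

        def forward(self, obs: torch.Tensor) -> torch.Tensor:
            # Q-values over the low-level action set.
            return self.net(obs)

    class TopLevelScheduler(nn.Module):
        """Top-level model: chooses which trained sub-policy acts each step."""
        def __init__(self, state_dim: int, n_subpolicies: int = 2):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 32), nn.ReLU(),
                nn.Linear(32, n_subpolicies),
            )

        def forward(self, abstract_state: torch.Tensor) -> torch.Tensor:
            # Q-values over the two sub-policies.
            return self.net(abstract_state)

    def hybrid_action(obs, avoid_policy, nav_policy, scheduler):
        """One decision step: abstract the system state via the frozen
        sub-policies, then let the scheduler pick whose action to execute."""
        with torch.no_grad():
            q_avoid = avoid_policy(obs)
            q_nav = nav_policy(obs)
            # The abstract state is built from the sub-policies' outputs.
            abstract_state = torch.cat([q_avoid, q_nav], dim=-1)
            choice = scheduler(abstract_state).argmax(dim=-1)        # (batch,)
            actions = torch.stack(
                [q_avoid.argmax(-1), q_nav.argmax(-1)], dim=-1)      # (batch, 2)
            return actions.gather(-1, choice.unsqueeze(-1)).squeeze(-1)

Consistent with the abstract, the two SubPolicy networks would be trained first on their respective sub-tasks and then frozen; the scheduler is trained only on the compact state abstracted from their outputs, which is what keeps the top-level training problem small.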
  • [1]
    支琛博, 张爱军, 杜新阳, 等. 改进A*算法的移动机器人全局路径规划研究[J]. 计算机仿真, 2023, 40(2): 486-491. doi: 10.3969/j.issn.1006-9348.2023.02.090

    ZHI C B, ZHANG A J, DU X Y, et al. Research on global path planning of mobile robot based on improved A* algorithm[J]. Computer Simulation, 2023, 40(2): 486-491(in Chinese). doi: 10.3969/j.issn.1006-9348.2023.02.090
    [2]
    栾添添, 王皓, 孙明晓, 等. 基于动态变采样区域RRT的无人车路径规划[J]. 控制与决策, 2023, 38(6): 1721-1729.

    LUAN T T, WANG H, SUN M X, et al. Path planning of unmanned vehicle based on dynamic variable sampling area RRT[J]. Control and Decision, 2023, 38(6): 1721-1729(in Chinese).
    [3]
    李樾, 韩维, 陈清阳, 等. 基于快速扩展随机树算法的多无人机编队重构方法研究[J]. 西北工业大学学报, 2019, 37(3): 601-611. doi: 10.1051/jnwpu/20193730601

    LI Y, HAN W, CHEN Q Y, et al. Research on formation reconfiguration of UAVs based on RRT algorithm[J]. Journal of Northwestern Polytechnical University, 2019, 37(3): 601-611(in Chinese). doi: 10.1051/jnwpu/20193730601
    [4]
    HARIK E H, KORSAETH A. Combining hector SLAM and artificial potential field for autonomous navigation inside a greenhouse[J]. Robotics, 2018, 7(2): 22. doi: 10.3390/robotics7020022
    [5]
    RHODES C, LIU C J, CHEN W H. Autonomous source term estimation in unknown environments: from a dual control concept to UAV deployment[J]. IEEE Robotics and Automation Letters, 2022, 7(2): 2274-2281. doi: 10.1109/LRA.2022.3143890
    [6]
    YU X Q, WANG P, ZHANG Z X. Learning-based end-to-end path planning for lunar rovers with safety constraints[J]. Sensors, 2021, 21(3): 796. doi: 10.3390/s21030796
    [7]
    WU K Y, ESFAHANI M A, YUAN S H, et al. TDPP-Net: achieving three-dimensional path planning via a deep neural network architecture[J]. Neurocomputing, 2019, 357: 151-162. doi: 10.1016/j.neucom.2019.05.001
    [8]
    黄昱洲, 王立松, 秦小麟. 一种基于深度强化学习的无人小车双层路径规划方法[J]. 计算机科学, 2023, 50(1): 194-204.

    HUANG Y Z, WANG L S, QIN X L. Bi-level path planning method for unmanned vehicle based on deep reinforcement learning[J]. Computer Science, 2023, 50(1): 194-204(in Chinese).
    [9]
    封硕, 舒红, 谢步庆. 基于改进深度强化学习的三维环境路径规划[J]. 计算机应用与软件, 2021, 38(1): 250-255.

    FENG S, SHU H, XIE B Q. 3D environment path planning based on improved deep reinforcement learning[J]. Computer Applications and Software, 2021, 38(1): 250-255(in Chinese).
    [10]
    GUPTA S, TOLANI V, DAVIDSON J, et al. Cognitive mapping and planning for visual navigation[J]. International Journal of Computer Vision, 2020, 128(5): 1311-1330. doi: 10.1007/s11263-019-01236-7
    [11]
    XIE L H, WANG S, MARKHAM A, et al. Towards monocular vision based obstacle avoidance through deep reinforcement learning[EB/OL]. (2017-06-29) [2023-08-20]. http://arxiv.org/abs/1706.09829.
    [12]
    SINGLA A, PADAKANDLA S, BHATNAGAR S. Memory-based deep reinforcement learning for obstacle avoidance in UAV with limited environment knowledge[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 22(1): 107-118.
    [13]
    ZHU Y K, MOTTAGHI R, KOLVE E, et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning[C]//Proceedings of the IEEE International Conference on Robotics and Automation. Piscataway: IEEE Press, 2017: 3357-3364.
    [14]
    WEN S H, ZHAO Y F, YUAN X, et al. Path planning for active SLAM based on deep reinforcement learning under unknown environments[J]. Intelligent Service Robotics, 2020, 13(2): 263-272. doi: 10.1007/s11370-019-00310-w
    [15]
    HAUSKNECHT M, STONE P. Deep recurrent Q-learning for partially observable MDPS[C]//Proceedings of the 2015 International Conference on Learning Representations. New York: ACM, 2015: 29-37.
    [16]
    WANG Z Y, SCHAUL T, HESSEL M, et al. Dueling network architectures for deep reinforcement learning[C]//Proceedings of the 33rd International Conference on Machine Learning. New York: ACM, 2016, 48: 1995-2003.
    [17]
    VAN H H, GUEZ A, SILVER D. Deep reinforcement learning with double Q-learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence. New York: AAAI Press, 2016, 30(1): 2094-2100.
  • 加载中
