
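Optimal Evasion and Decision-Making Strategy for a Submarine Using PPO-Based Reinforcement Learning in a Multiple Unmanned Underwater Vehicle Threat Environment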
Ⓒ 2025 Korea Society for Naval Science & Technology
Abstract
This study proposes a reinforcement learning-based evasion strategy that integrates an acoustic detection model into the reward function design, with the aim of improving submarine survivability in a threat environment containing multiple unmanned underwater vehicles (UUVs). The simulation environment consists of motion models for the UUVs and the submarine, together with an acoustic detection model. The reward function is designed so that the submarine actively evades detection and attack by hostile UUVs. In simulation, the proposed strategy substantially increased submarine survivability compared with conventional fixed-pattern evasion strategies. The agent also learned evasion maneuvers that minimize the detection signal under the acoustic detection model, allowing it to evade effectively and achieve a high survival rate.
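The abstract states that an acoustic detection model is built into the reward function but gives no equations. As an illustration only, the following minimal sketch shows one way such a reward could be shaped, using the textbook passive sonar equation SE = SL - TL - (NL - DI) - DT. It is not the authors' model: every function name, the linear speed-to-noise relation, and all numeric values are hypothetical assumptions chosen for readability.

```python
import math


def transmission_loss_db(range_m, alpha_db_per_km=0.05):
    """Spherical spreading plus linear absorption (a textbook simplification)."""
    r = max(range_m, 1.0)  # clamp to the 1 m reference distance
    return 20.0 * math.log10(r) + alpha_db_per_km * r / 1000.0


def signal_excess_db(source_level_db, range_m, noise_level_db=60.0,
                     directivity_index_db=15.0, detection_threshold_db=10.0):
    """Passive sonar equation: SE = SL - TL - (NL - DI) - DT.

    All default values are placeholders, not figures from the paper.
    """
    tl = transmission_loss_db(range_m)
    return (source_level_db - tl
            - (noise_level_db - directivity_index_db)
            - detection_threshold_db)


def radiated_noise_db(speed_mps, base_db=110.0, slope_db_per_mps=2.0):
    """Hypothetical assumption: radiated noise grows linearly with speed."""
    return base_db + slope_db_per_mps * speed_mps


def step_reward(sub_speed_mps, uuv_ranges_m, hit=False,
                survival_bonus=0.1, detection_weight=0.05, hit_penalty=100.0):
    """Per-step reward: small survival bonus minus a detection penalty.

    The penalty sums the positive signal excess seen by each UUV, so the
    agent is pushed toward slower, quieter, longer-range geometries.
    """
    if hit:
        return -hit_penalty
    sl = radiated_noise_db(sub_speed_mps)
    excess = sum(max(0.0, signal_excess_db(sl, r)) for r in uuv_ranges_m)
    return survival_bonus - detection_weight * excess


if __name__ == "__main__":
    ranges = [1000.0, 3000.0]  # hypothetical distances to two UUVs, in metres
    print("slow:", step_reward(2.0, ranges))
    print("fast:", step_reward(8.0, ranges))
```

Under these assumptions, a faster submarine radiates more noise and so receives a lower per-step reward at the same ranges. A policy-gradient method such as PPO would then trade speed against detectability when choosing maneuvers.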
Keywords:
Unmanned Underwater Vehicle, Submarine, Reinforcement Learning, Evasion Strategy, Acoustic Detection Model
Acknowledgments
This research was supported by LIG Nex1.