A reinforcement learning model for route allocation optimization: A case study of C-130H transport aircraft in the Royal Thai Air Force

Patikorn Anchuen; Phummipat Daungklang; Nuntipat Phisutthangkoon; Nattawat Tanomchad; Hatsadin Jantaboon

doi:10.60101/jarst.2026.265635

Published: Jun 24, 2026

Keywords:

Air logistics; Route allocation; Reinforcement learning; C-130H transport aircraft

Patikorn Anchuen

Office of Graduate Studies, Navaminda Kasatriyadhiraj Royal Air Force Academy, THAILAND

Phummipat Daungklang

Department of Computer Science, Navaminda Kasatriyadhiraj Royal Air Force Academy, THAILAND

Nuntipat Phisutthangkoon

Department of Computer Science, Navaminda Kasatriyadhiraj Royal Air Force Academy, THAILAND

Nattawat Tanomchad

Department of Computer Science, Navaminda Kasatriyadhiraj Royal Air Force Academy, THAILAND

Hatsadin Jantaboon

Office of Graduate Studies, Navaminda Kasatriyadhiraj Royal Air Force Academy, THAILAND

Abstract

This study develops a route-allocation model for transport aircraft in the Royal Thai Air Force's (RTAF) air logistics operations. A reinforcement learning (RL) approach is applied to optimize resource allocation by determining the most suitable routes based on the distribution of cargo capacity, thereby reducing operational flight distances and, consequently, the frequency of aircraft maintenance. The research was conducted in a simulated environment, using domestic air transport data for the C-130H transport aircraft of Squadron 601, Wing 6, RTAF, as a reference. Experimental results show that, compared with current practices, the developed model significantly improves the efficiency of air transport operations in support and service provision: it facilitates network operations by reducing flight distances and increasing process continuity under varying cargo capacity conditions. Ultimately, this contributes to the improvement and sustainability of defense operations. The proposed work schedule also adapts to dynamic operational constraints and changing demand.
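
The abstract does not state which RL algorithm, state representation, or reward the authors used. As a rough illustration of the general idea only, the sketch below assumes tabular Q-learning on a toy single-aircraft problem: the state is the current airfield plus the set of destinations still unserved, an action picks the next destination, and the reward is the negative leg distance. The distance matrix, function names, and hyperparameters are all hypothetical, and cargo capacity is deliberately left out to keep the example short.

```python
import itertools
import random

# Hypothetical symmetric distances (km) between a home base (index 0)
# and four destination airfields. Illustrative values, not RTAF data.
DIST = [
    [0, 560, 480, 690, 250],
    [560, 0, 310, 900, 420],
    [480, 310, 0, 640, 380],
    [690, 900, 640, 0, 520],
    [250, 420, 380, 520, 0],
]
BASE = 0
DESTINATIONS = frozenset(range(1, len(DIST)))


def step_cost(node, nxt, remaining_after):
    """Leg distance, plus the return leg to base once nothing is left to serve."""
    cost = DIST[node][nxt]
    if not remaining_after:
        cost += DIST[nxt][BASE]
    return cost


def train(episodes=10000, alpha=0.5, gamma=1.0, epsilon=0.3, seed=0):
    """Tabular Q-learning (assumed algorithm, not confirmed by the source)."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> value; missing entries are treated as 0.0
    for _ in range(episodes):
        node, remaining = BASE, DESTINATIONS
        while remaining:
            state = (node, remaining)
            actions = sorted(remaining)
            if rng.random() < epsilon:  # explore
                action = rng.choice(actions)
            else:  # exploit the current estimate
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            nxt_remaining = remaining - {action}
            reward = -step_cost(node, action, nxt_remaining)
            nxt_state = (action, nxt_remaining)
            best_next = max(
                (q.get((nxt_state, a), 0.0) for a in nxt_remaining), default=0.0
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            node, remaining = action, nxt_remaining
    return q


def greedy_route(q):
    """Follow the highest-valued action from the base until all stops are served."""
    node, remaining, route = BASE, DESTINATIONS, [BASE]
    while remaining:
        state = (node, remaining)
        node = max(sorted(remaining), key=lambda a: q.get((state, a), 0.0))
        remaining = remaining - {node}
        route.append(node)
    return route + [BASE]


def route_distance(route):
    """Total distance of a route given as a list of node indices."""
    return sum(DIST[a][b] for a, b in zip(route, route[1:]))


def brute_force_best():
    """Shortest round trip by exhaustive search, used only as a sanity check."""
    return min(
        route_distance([BASE, *perm, BASE])
        for perm in itertools.permutations(DESTINATIONS)
    )
```

Calling `route_distance(greedy_route(train()))` and comparing it with `brute_force_best()` gives a quick check that the learned policy is sensible on this toy instance. A realistic model would need a much richer state (cargo loads, multiple aircraft, scheduling constraints) and most likely function approximation rather than a lookup table.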

How to Cite

Anchuen P, Daungklang P, Phisutthangkoon N, Tanomchad N, Jantaboon H. A reinforcement learning model for route allocation optimization: A case study of C-130H transport aircraft in the Royal Thai Air Force. J Appl Res Sci Tech [Internet]. 2026 Jun 24 [cited 2026 Jul 19]. Available from: https://ph01.tci-thaijo.org/index.php/rmutt-journal/article/view/265635

Issue

Online First

Section

Research Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
