Model-Based Reinforcement Learning for Planar Motion Control of a Robotic Arm
Keywords:
robotic arm control, model-based reinforcement learning, machine learning regression, covariance matrix adaptation evolution strategy

Abstract
This research proposes model-based reinforcement learning (MBRL) for planar motion control of 2-DOF and 3-DOF robotic arms. Three case studies - a placing task and 2-DOF and 3-DOF reaching tasks - serve as test problems. The reaching tasks were additionally investigated with noise injected into the motion-control signal and with different training techniques. Within MBRL, three machine learning regression techniques - Gaussian process regression (GPR), artificial neural network (ANN), and support vector regression (SVR) - were used to build the environment model, each combined with an optimization algorithm, the covariance matrix adaptation evolution strategy (CMA-ES). They were also benchmarked against the standard technique, inverse kinematics (IK). The results show that MBRL with GPR and CMA-ES outperforms the other three techniques. Because GPR approximates a covariance function that accounts for noise, its success rates - 100% in the placing task, 96-100% in the 2-DOF reaching task, and 98-100% in the 3-DOF reaching task - were clearly higher than those of ANN, SVR, and IK. Although GPR required the most training time, it remained more suitable than the other techniques, whose average success rate was only about 50%.
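The IK baseline used for benchmarking has a well-known closed-form solution for a 2-DOF planar arm. The sketch below illustrates that standard geometric (elbow-down) solution, with forward kinematics for verification; link lengths and target coordinates are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fk_2dof(t1, t2, l1, l2):
    """Forward kinematics: joint angles -> end-effector position."""
    x = l1 * np.cos(t1) + l2 * np.cos(t1 + t2)
    y = l1 * np.sin(t1) + l2 * np.sin(t1 + t2)
    return x, y

def ik_2dof(x, y, l1, l2):
    """Closed-form inverse kinematics (elbow-down branch) for a
    reachable target (x, y) of a 2-link planar arm."""
    # law of cosines for the elbow angle
    c2 = (x**2 + y**2 - l1**2 - l2**2) / (2.0 * l1 * l2)
    c2 = np.clip(c2, -1.0, 1.0)        # guard against rounding error
    t2 = np.arccos(c2)
    # shoulder angle = angle to target minus offset from the bent elbow
    t1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(t2),
                                       l1 + l2 * np.cos(t2))
    return t1, t2

if __name__ == "__main__":
    # round-trip check on an assumed target with unit link lengths
    t1, t2 = ik_2dof(1.2, 0.5, 1.0, 1.0)
    print(fk_2dof(t1, t2, 1.0, 1.0))   # recovers (1.2, 0.5)
```

In the paper's MBRL pipeline this analytic mapping is replaced by a learned environment model (GPR, ANN, or SVR) whose predicted outcome is optimized by CMA-ES, which is what lets the learned controller tolerate noise in the control signal.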
License
Copyright (c) 2023 Kasem Bundit University
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
*Copyright
Articles published in Kasem Bundit Engineering Journal (KBEJ) are the copyright of Kasem Bundit University. No part of them may be reproduced or republished without permission from the university.
* Responsibility
If a published article infringes copyright or contains incorrect content, the author of the article is responsible.