Model-Based Reinforcement Learning for Planar Motion Control of a Robotic Arm
Keywords:
robotic arm control, model-based reinforcement learning, machine learning regression, covariance matrix adaptation evolution strategy

Abstract
This research proposes model-based reinforcement learning (MBRL) for planar motion control of 2-DOF and 3-DOF robotic arms. Three case studies - a placing task and 2-DOF and 3-DOF reaching tasks - serve as test problems. The reaching tasks were additionally investigated with noise injected into the motion-control signal and with different training techniques. Within MBRL, three machine learning regression techniques - Gaussian process regression (GPR), artificial neural network (ANN), and support vector regression (SVR) - were used to build the environment model, each combined with an optimization algorithm, the covariance matrix adaptation evolution strategy (CMA-ES). They were also benchmarked against the standard technique, inverse kinematics (IK). The results show that MBRL with GPR and CMA-ES outperforms the other three techniques. Because GPR approximates a covariance function that accounts for noise, its success rates - 100% in the placing task, 96-100% in the 2-DOF reaching task, and 98-100% in the 3-DOF reaching task - were clearly higher than those of ANN, SVR, and IK. Although GPR required the most training time, it remained more suitable than the other techniques, whose average success rate was only about 50%.
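The IK baseline used for benchmarking has a well-known closed-form solution for a 2-DOF planar arm. The sketch below illustrates that standard geometric (elbow-down) solution, with forward kinematics for verification; link lengths and target coordinates are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fk_2dof(t1, t2, l1, l2):
    """Forward kinematics: joint angles -> end-effector position."""
    x = l1 * np.cos(t1) + l2 * np.cos(t1 + t2)
    y = l1 * np.sin(t1) + l2 * np.sin(t1 + t2)
    return x, y

def ik_2dof(x, y, l1, l2):
    """Closed-form inverse kinematics (elbow-down branch) for a
    reachable target (x, y) of a 2-link planar arm."""
    # law of cosines for the elbow angle
    c2 = (x**2 + y**2 - l1**2 - l2**2) / (2.0 * l1 * l2)
    c2 = np.clip(c2, -1.0, 1.0)        # guard against rounding error
    t2 = np.arccos(c2)
    # shoulder angle = angle to target minus offset from the bent elbow
    t1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(t2),
                                       l1 + l2 * np.cos(t2))
    return t1, t2

if __name__ == "__main__":
    # round-trip check on an assumed target with unit link lengths
    t1, t2 = ik_2dof(1.2, 0.5, 1.0, 1.0)
    print(fk_2dof(t1, t2, 1.0, 1.0))   # recovers (1.2, 0.5)
```

In the paper's MBRL pipeline this analytic mapping is replaced by a learned environment model (GPR, ANN, or SVR) whose predicted outcome is optimized by CMA-ES, which is what lets the learned controller tolerate noise in the control signal.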
License
Copyright (c) 2023 Kasem Bundit University
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
*Copyright
Articles published in Kasem Bundit Engineering Journal (KBEJ) are the copyright of Kasem Bundit University. No part of them may be reproduced or republished without permission from the university.
* Responsibility
If a published article infringes copyright or contains incorrect content, the author of the article is responsible.