Comparison of Multiple Reinforcement Learning and Deep Reinforcement Learning Methods for the Task Aimed at Achieving the Goal
Abstract
Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL) methods are promising approaches to solving complex real-world tasks with physical robots. In this paper, we compare two reinforcement learning methods (Q-Learning, SARSA) and two deep reinforcement learning methods (Deep Q-Network, Deep SARSA) on a goal-reaching task performed with the UR3 robotic arm. The main optimization problem of the experiment is to find the best solution for each RL/DRL scenario, minimize the Euclidean distance error between the reached position and the goal, and smooth the resulting path with the Bézier spline method. The simulation and real-world applications are controlled by the Robot Operating System (ROS). The learning environment is implemented using the OpenAI Gym library, which uses the RViz simulation tool and the Gazebo 3D modeling tool for dynamics and kinematics.
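As a rough illustration of the quantities named in the abstract, the Python sketch below shows how the Q-Learning and SARSA update targets differ, how a Euclidean distance error to a goal can be computed, and how a Bézier curve can be sampled from a list of waypoints. It is a minimal sketch under stated assumptions, not the implementation used in the paper: all function names, the tabular form of the updates, and the use of the learned waypoints directly as Bézier control points are hypothetical choices made for this example.

```python
import math
from math import comb


def q_learning_target(reward, next_q_values, gamma):
    """Off-policy target: bootstrap from the greedy action in the next state."""
    return reward + gamma * max(next_q_values)


def sarsa_target(reward, next_q_values, next_action, gamma):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * next_q_values[next_action]


def td_update(q_value, target, alpha):
    """Move the current estimate a fraction alpha toward the target."""
    return q_value + alpha * (target - q_value)


def euclidean_error(position, goal):
    """Euclidean distance between the reached position and the goal."""
    return math.dist(position, goal)


def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] using the Bernstein basis."""
    n = len(control_points) - 1
    dims = len(control_points[0])
    return tuple(
        sum(
            comb(n, i) * (1 - t) ** (n - i) * t ** i * point[d]
            for i, point in enumerate(control_points)
        )
        for d in range(dims)
    )


def bezier_path(control_points, samples=20):
    """Sample the curve at evenly spaced parameter values, endpoints included."""
    return [bezier_point(control_points, k / (samples - 1)) for k in range(samples)]


if __name__ == "__main__":
    # Hypothetical waypoints (in metres) standing in for a learned path.
    waypoints = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, 0.0, 0.0)]
    smooth = bezier_path(waypoints, samples=5)
    print(smooth)
    print(euclidean_error(smooth[-1], waypoints[-1]))
```

The only difference between the two targets is the bootstrap term: Q-Learning takes the maximum over next-state action values, while SARSA uses the value of the action the policy actually selected. The Bézier curve passes through the first and last control points only, so intermediate waypoints shape the path without being visited exactly.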
Copyright (c) 2021 MENDEL
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
MENDEL open access articles are normally published under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license: https://creativecommons.org/licenses/by-nc-sa/4.0/. Under this license, third-party reuse is permitted only for non-commercial purposes. Users may share, copy, and redistribute the material in any medium or format, and adapt, remix, transform, and build upon it, provided the use is non-commercial and any adaptation is distributed under the same license. Reuse requires appropriate attribution to the source of the material, a link to the license, and an indication of any changes made to the original material.