A Cognitive Robot Collaborative Reinforcement Learning Algorithm

Authors: Amit Gil, Helman Stern, Yael Edan

Abstract:

A cognitive collaborative reinforcement learning algorithm (CCRL) that incorporates an advisor into the learning process is developed to improve supervised learning. An autonomous learner is endowed with a self-awareness cognitive skill that lets it decide when to solicit instructions from the advisor. The learner can also assess the value of advice, and accept or reject it. The method is evaluated for robotic motion planning using simulation. Tests are conducted for advisors with skill levels ranging from expert to novice. The CCRL algorithm, and a combined method integrating its logic with Clouse's Introspection Approach, outperformed a baseline fully autonomous learner and demonstrated robust performance across advisor skill levels, learning to accept advice received from an expert while rejecting that of less skilled collaborators. Although the CCRL algorithm is based on RL, it fits other machine learning methods, since the advisor's actions are only added to the outer layer.
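
For a concrete picture of the collaborative loop described in the abstract, the sketch below shows one way an advisor can be folded into the outer loop of ordinary Q-learning. It is a minimal Python illustration, not the authors' implementation: the toy chain environment, the Q-value-gap confidence heuristic, and the names solicit_threshold and accept_margin are assumptions introduced here for exposition.

import random

N_STATES, GOAL = 10, 9            # 1-D chain world: start at 0, reach state 9
ACTIONS = [-1, +1]                # move left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

def step(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else -0.01), nxt == GOAL

def expert_advisor(state):        # stand-in for the human collaborator
    return +1                     # always points toward the goal

def run_episode(Q, advisor, solicit_threshold=0.05, accept_margin=0.0):
    state, done = 0, False
    while not done:
        q = Q[state]
        # "Self-awareness": solicit advice only when the learner is uncertain,
        # i.e. its Q-values barely distinguish the two actions in this state.
        uncertain = abs(q[0] - q[1]) < solicit_threshold
        greedy = ACTIONS[q.index(max(q))]
        action = random.choice(ACTIONS) if random.random() < EPSILON else greedy
        if uncertain and advisor is not None:
            advised = advisor(state)
            # Assess the advice: accept it unless the learner's own estimate
            # says it is clearly worse than the greedy choice.
            if q[ACTIONS.index(advised)] >= q[ACTIONS.index(greedy)] - accept_margin:
                action = advised
        nxt, reward, done = step(state, action)
        a = ACTIONS.index(action)
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][a] += ALPHA * (target - Q[state][a])
        state = nxt

Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(200):
    run_episode(Q, expert_advisor)

The particular confidence measure is interchangeable; the point is only that both the decision to ask and the decision to accept are driven by the learner's own value estimates, and that the advisor touches nothing but this outer layer of the update loop.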

Keywords: Robot learning, human-robot collaboration, motion planning, reinforcement learning.

Digital Object Identifier (DOI): https://doi.org/10.5281/zenodo.1062826


References:


[1] C. J. C. H. Watkins, "Learning from Delayed Rewards," Ph.D. dissertation, Psychology Dept., Cambridge University, 1989.
[2] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, Cambridge, MA: MIT Press, 1998.
[3] T. G. Dietterich, "Hierarchical reinforcement learning with the MAXQ value function decomposition," Journal of Artificial Intelligence Research, vol. 13, pp. 227-303, 2000.
[4] V. N. Papudesi and M. Huber, "Learning from reinforcement and advice using composite reward functions," in Proc. 16th Int. FLAIRS Conf., pp. 361-365, St. Augustine, FL, 2003.
[5] L. Mihalkova and R. Mooney, "Using active relocation to aid reinforcement learning," in Proc. 19th Int. FLAIRS Conf., Florida, 2006.
[6] U. Kartoun, H. Stern, and Y. Edan, "Human-robot collaborative learning system for inspection," IEEE Int. Conf. on Systems, Man, and Cybernetics, pp. 4249-4255, Taipei, Taiwan, 2006.
[7] V. U. Cetina, "Supervised Reinforcement Learning Using Behavior Models," IEEE Computer Society 6th Int. Conf. on Machine Learning and Applications, Cincinnati, Ohio, USA, 2007.
[8] C. Breazeal and A. Thomaz, "Learning from Human Teachers with Socially Guided Exploration," IEEE Int. Conf. on Robotics and Automation, Pasadena, CA, USA, 2008.
[9] J. A. Clouse, "An introspection approach to querying a trainer," Technical Report 96-13, University of Massachusetts, Amherst, MA, 1996.
[10] M. A. Goodrich, D. R. Olsen, J. W. Crandall, and T. J. Palmer, "Experiments in adjustable autonomy," in Proc. IJCAI Workshop on Autonomy, Delegation and Control: Interacting with Intelligent Agents, 2001, pp. 1624-1629.