Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 87760
CyberSteer: Cyber-Human Approach for Safely Shaping Autonomous Robotic Behavior to Comply with Human Intention
Authors: Vinicius G. Goecks, Gregory M. Gremillion, William D. Nothwang
Abstract:
Modern approaches to train intelligent agents rely on prolonged training sessions, high amounts of input data, and multiple interactions with the environment. This restricts the application of these learning algorithms in robotics and real-world applications, in which there is low tolerance to inadequate actions, interactions are expensive, and real-time processing and action are required. This paper addresses this issue introducing CyberSteer, a novel approach to efficiently design intrinsic reward functions based on human intention to guide deep reinforcement learning agents with no environment-dependent rewards. CyberSteer uses non-expert human operators for initial demonstration of a given task or desired behavior. The trajectories collected are used to train a behavior cloning deep neural network that asynchronously runs in the background and suggests actions to the deep reinforcement learning module. An intrinsic reward is computed based on the similarity between actions suggested and taken by the deep reinforcement learning algorithm commanding the agent. This intrinsic reward can also be reshaped through additional human demonstration or critique. This approach removes the need for environment-dependent or hand-engineered rewards while still being able to safely shape the behavior of autonomous robotic agents, in this case, based on human intention. CyberSteer is tested in a high-fidelity unmanned aerial vehicle simulation environment, the Microsoft AirSim. The simulated aerial robot performs collision avoidance through a clustered forest environment using forward-looking depth sensing and roll, pitch, and yaw references angle commands to the flight controller. This approach shows that the behavior of robotic systems can be shaped in a reduced amount of time when guided by a non-expert human, who is only aware of the high-level goals of the task. Decreasing the amount of training time required and increasing safety during training maneuvers will allow for faster deployment of intelligent robotic agents in dynamic real-world applications.Keywords: human-robot interaction, intelligent robots, robot learning, semisupervised learning, unmanned aerial vehicles
Procedia PDF Downloads 260