
Assessing Transferability from Simulation to Reality for Reinforcement Learning

Fabio Muratore, Michael Gienger, Jan Peters, "Assessing Transferability from Simulation to Reality for Reinforcement Learning", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.

Abstract

Learning robot control policies from physics simulations is of great interest for the robotics community as it may render the learning process faster, cheaper, and safer by alleviating the need for expensive real-world experiments. However, the direct transfer of the learned behavior from simulation to reality is a major challenge. Optimizing a policy on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB). In this case, the optimizer exploits modeling errors of the simulator such that the resulting behavior can potentially damage the robot. We tackle this challenge by applying domain randomization, i.e., randomizing the parameters of the physics simulator during learning. We propose an algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) which uses an estimator of the SOB to formulate a stopping criterion for training. The introduced estimator quantifies the over-fitting to the set of domains experienced during training. Our experimental results in two different environments show that the new simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator, which can be applied directly to the real system without any additional training on the latter.
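To illustrate the two ideas the abstract names, domain randomization and over-fitting to the set of training domains, the following minimal Python sketch trains a one-parameter policy over sampled simulator domains and then compares its average return on the training domains with its return on freshly sampled ones. The toy dynamics, the parameter names and ranges, and the simple gap estimate are illustrative assumptions, not the SPOTA algorithm or the SOB estimator from the paper.

import numpy as np

rng = np.random.default_rng(0)

def sample_domain():
    # Domain randomization: draw simulator parameters from a fixed
    # distribution. Names and ranges here are hypothetical.
    return {"mass": rng.uniform(0.8, 1.2), "friction": rng.uniform(0.05, 0.2)}

def rollout_return(policy_gain, domain):
    # Toy stand-in for a simulated rollout: a damped point mass that the
    # policy must regulate to the origin; returns the episode return
    # (negative quadratic cost).
    x, ret = 1.0, 0.0
    for _ in range(50):
        u = -policy_gain * x
        x += (u / domain["mass"] - domain["friction"] * x) * 0.1
        ret -= x**2 + 0.01 * u**2
    return ret

# "Training": pick the gain that maximizes the average return over a
# finite set of sampled domains.
train_domains = [sample_domain() for _ in range(20)]
gains = np.linspace(0.1, 5.0, 50)
avg_returns = [np.mean([rollout_return(g, d) for d in train_domains]) for g in gains]
best_gain = gains[int(np.argmax(avg_returns))]

# Optimism check in the spirit of the SOB idea: a policy tuned on a finite
# set of domains tends to look better on those domains than on unseen ones.
test_domains = [sample_domain() for _ in range(100)]
J_train = np.mean([rollout_return(best_gain, d) for d in train_domains])
J_test = np.mean([rollout_return(best_gain, d) for d in test_domains])
print(f"estimated optimism gap: {J_train - J_test:.4f}")

A positive gap indicates the policy exploited the particular training domains; driving such an estimate toward zero is the intuition behind using a transferability measure as a stopping criterion.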



