Fabio Muratore, Michael Gienger, Jan Peters, "Assessing Transferability in Reinforcement Learning from Randomized Simulations", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
Abstract
Learning robot control policies from physics simulations is of great interest for the robotics community as it may render the learning process faster, cheaper, and safer by alleviating the need for expensive real-world experiments. However, the direct transfer of the learned behavior from simulation to reality is a major challenge. Optimizing a policy on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB). In this case, the optimizer exploits modeling errors of the simulator such that the resulting behavior can potentially damage the robot. We tackle this challenge by applying domain randomization, i.e., randomizing the parameters of the physics simulator during learning. We propose an algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) which uses an estimator of the SOB to formulate a stopping criterion for training. The introduced estimator quantifies the over-fitting to the set of domains experienced during training. Our experimental results in two different environments show that the new simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator, which can be applied directly to the real system without any additional training on the latter.