Abstract
In recent years, domain randomization has gained significant traction as a method for sim-to-real transfer of reinforcement learning policies; however, finding optimal randomization ranges can be difficult.
In this paper, we introduce DROPO, a novel method for estimating domain randomization ranges for safe sim-to-real transfer.
Unlike prior work, DROPO only requires a precollected offline dataset of trajectories, and does not converge to point estimates.
We demonstrate that DROPO is capable of recovering dynamics parameter distributions in simulation and finding a distribution capable of compensating for an unmodelled phenomenon.
We also evaluate the method on two zero-shot sim-to-real transfer scenarios, showing a successful domain transfer and improved performance over prior methods.
Authored by Gabriele Tiboni, Karol Arndt, and Ville Kyrki.
Highlights
DROPO estimates a domain randomization distribution over simulator dynamics parameters from a precollected offline dataset of trajectories, which is later used to train a policy that can be directly transferred to the real world.
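To make this pipeline concrete, the sketch below illustrates the core idea on a toy 1-D point-mass system: fit a Gaussian distribution over an unknown dynamics parameter (here, the mass) by maximizing the likelihood of real next states under randomized simulated replays of the same state-action pairs. All names here (toy_sim, log_likelihood) and the grid search are illustrative assumptions for this sketch, not the actual DROPO implementation, which operates on MuJoCo environments, uses truncated normal distributions, and optimizes the likelihood with CMA-ES.

import numpy as np

rng = np.random.default_rng(0)

def toy_sim(state, action, mass, dt=0.05):
    # One Euler step of a 1-D point mass: state = (position, velocity).
    pos, vel = state
    return np.array([pos + vel * dt, vel + (action / mass) * dt])

# "Real world" data: the mass fluctuates around 1.7 with std 0.3,
# playing the role of a ground-truth dynamics distribution to recover.
states, actions = [np.zeros(2)], rng.uniform(-1.0, 1.0, size=50)
for a in actions:
    states.append(toy_sim(states[-1], a, rng.normal(1.7, 0.3)))
dataset = list(zip(states[:-1], actions, states[1:]))

def log_likelihood(mu, sigma, n_samples=50):
    # Score a candidate mass distribution N(mu, sigma^2): replay each real
    # state-action pair under sampled masses, then evaluate the real next
    # state under a Gaussian fitted to the simulated next states.
    total = 0.0
    for s, a, s_next in dataset:
        masses = rng.normal(mu, sigma, size=n_samples).clip(1e-3)
        preds = np.stack([toy_sim(s, a, m) for m in masses])
        mean, var = preds.mean(axis=0), preds.var(axis=0) + 1e-8
        total += -0.5 * np.sum(np.log(2 * np.pi * var)
                               + (s_next - mean) ** 2 / var)
    return total

# Exhaustive search over (mu, sigma), standing in for the paper's CMA-ES.
grid = [(m, s) for m in np.linspace(1.0, 2.5, 16)
               for s in np.linspace(0.05, 0.6, 12)]
mu_hat, sigma_hat = max(grid, key=lambda p: log_likelihood(*p))
print(f"Recovered mass distribution: N({mu_hat:.2f}, {sigma_hat:.2f}^2)")

The recovered (mu, sigma) should land near the ground-truth (1.7, 0.3); in the full method, the fitted distribution then parameterizes domain randomization during policy training, with new dynamics sampled at every episode.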
Citing
@article{tiboni2023dropo,
  title    = {DROPO: Sim-to-real transfer with offline domain randomization},
  author   = {Gabriele Tiboni and Karol Arndt and Ville Kyrki},
  journal  = {Robotics and Autonomous Systems},
  pages    = {104432},
  year     = {2023},
  issn     = {0921-8890},
  doi      = {https://doi.org/10.1016/j.robot.2023.104432},
  url      = {https://www.sciencedirect.com/science/article/pii/S0921889023000714},
  keywords = {Robot learning, Transfer learning, Reinforcement learning, Domain randomization}
}