In this work, we propose an Imitation Learning strategy to efficiently compress a computationally expensive MPC into a deep neural network policy that is robust to previously unseen disturbances.
By using a robust variant of the MPC, called Robust Tube MPC, and leveraging properties from the controller, we introduce computationally-efficient data augmentation methods that enable a significant reduction of the number of MPC demonstrations and training efforts required to generate a robust policy.
Our approach opens the possibility of zero-shot transfer of a policy trained from a single MPC demonstration collected in a nominal domain, such as a simulation or a robot in a lab/controlled environment, to a new domain with previously unseen bounded model errors/perturbations.
Numerical evaluations performed using linear and nonlinear MPC for agile flight on a multirotor show that our method outperforms strategies commonly employed in IL (such as Dataset-Aggregation (DAgger) and Domain Randomization (DR)) in terms of demonstration-efficiency, training time, and robustness to perturbations unseen during training. Experimental evaluations validate the efficiency and real-world robustness.
Arxiv: arxiv.org/abs/...
Accepted to the IEEE Transactions on Robotics (T-RO) 2024.
Interested to know more? Check out our related work:
- Tube-NeRF: Efficient learning of vision-based policies: • Tube-NeRF: Efficient I...
- SAMA: Efficient learning of adaptive policies: • Experimental Results f...
Негізгі бет Efficient Deep Learning of Robust Policies from MPC via Imitation and Tube-Guided Data Augmentation
Пікірлер