Refinery: Active Fine-tuning and Deployment-time Optimization for Contact-Rich Policies

📅 2025-10-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high performance variance, insufficient success rates (~80%), and brittle sequential execution of policies for contact-rich robotic assembly under diverse initial conditions, this paper proposes Refinery, a framework for active fine-tuning and deployment-time optimization. Refinery combines Bayesian optimization-guided fine-tuning of individual policies with Gaussian mixture model-based sampling of initialization conditions at deployment, enabling robust multi-step policy chaining without explicit multi-step training. The method integrates simulation-to-reality transfer with online adaptation for contact-rich policies, and enables runtime selection of favorable execution conditions. Experiments show that Refinery raises the mean success rate to 91.51% in simulation (a 10.98% improvement over state-of-the-art simulation-based learning for assembly), achieves comparable performance on physical hardware, and successfully completes long-horizon assembly of up to eight parts.
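The deployment-time idea in the summary can be sketched in a few lines: given a Gaussian mixture model fit offline over the initial conditions of successful rollouts, score each candidate initialization by its likelihood under the mixture and start the policy from the most likely one. The mixture parameters, the 2-D pose representation, and all numbers below are hypothetical, for illustration only; the paper's actual model and state space may differ.

```python
import math

# Hypothetical GMM over successful initial gripper poses (x, y offsets),
# assumed fit offline on rollouts that succeeded. Diagonal covariances
# keep the density computation simple; the numbers are illustrative.
GMM = [
    # (weight, mean, variance) per component
    (0.6, (1.0, -0.5), (0.20, 0.20)),
    (0.4, (-0.8, 0.3), (0.35, 0.35)),
]

def log_density(pose, gmm=GMM):
    """Log-likelihood of a candidate initialization under the mixture."""
    total = 0.0
    for w, mu, var in gmm:
        quad = sum((p - m) ** 2 / v for p, m, v in zip(pose, mu, var))
        norm = math.prod(2 * math.pi * v for v in var)
        total += w * math.exp(-0.5 * quad) / math.sqrt(norm)
    return math.log(total)

def select_initialization(candidates):
    """Deployment-time step: pick the candidate start pose the GMM
    deems most likely to lead to a successful execution."""
    return max(candidates, key=log_density)

candidates = [(0.0, 0.0), (1.1, -0.4), (3.0, 3.0)]
best = select_initialization(candidates)  # near the dominant mode
```

In practice the GMM could also be sampled directly to generate start poses, rather than only ranking externally given candidates; the selection view above is the simplest to state.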

📝 Abstract
Simulation-based learning has enabled policies for precise, contact-rich tasks (e.g., robotic assembly) to reach high success rates (~80%) under high levels of observation noise and control error. Although such performance may be sufficient for research applications, it falls short of industry standards and makes policy chaining exceptionally brittle. A key limitation is the high variance in individual policy performance across diverse initial conditions. We introduce Refinery, an effective framework that bridges this performance gap, robustifying policy performance across initial conditions. We propose Bayesian Optimization-guided fine-tuning to improve individual policies, and Gaussian Mixture Model-based sampling during deployment to select initializations that maximize execution success. Using Refinery, we improve mean success rates by 10.98% over state-of-the-art methods in simulation-based learning for robotic assembly, reaching 91.51% in simulation and comparable performance in the real world. Furthermore, we demonstrate that these fine-tuned policies can be chained to accomplish long-horizon, multi-part assembly, successfully assembling up to 8 parts without requiring explicit multi-step training.
Problem

Research questions and friction points this paper is trying to address.

Improving low success rates of robotic assembly policies
Reducing performance variance across diverse initial conditions
Enabling reliable policy chaining for multi-part assemblies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian Optimization-guided fine-tuning for policies
Gaussian Mixture Model-based sampling for initialization
Active framework improving policy chaining robustness
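The first innovation, Bayesian Optimization-guided fine-tuning, amounts to an outer loop that proposes fine-tuning hyperparameters, evaluates the resulting policy by its rollout success rate, and updates the proposer. The sketch below keeps that evaluate-then-propose structure but substitutes random search for the Bayesian optimizer so it stays dependency-free; the tunable parameter, the simulated success-rate function, and all constants are hypothetical stand-ins, not the paper's actual setup.

```python
import random

def rollout_success_rate(noise_scale, n_rollouts=200, seed=0):
    """Stand-in for evaluating a fine-tuned policy in simulation:
    returns a noisy success-rate estimate as a function of one tunable
    parameter (here, an observation-noise scale used during
    fine-tuning). The quadratic shape is purely illustrative."""
    rng = random.Random(seed + hash(round(noise_scale, 6)))
    true_rate = max(0.0, 0.92 - (noise_scale - 0.3) ** 2)
    return sum(rng.random() < true_rate for _ in range(n_rollouts)) / n_rollouts

def tune(n_trials=20, seed=1):
    """Fine-tuning outer loop. Refinery uses Bayesian optimization to
    propose each trial's parameters; random search is substituted here
    so the sketch needs no dependencies, but the propose-evaluate-keep
    structure of the loop is the same."""
    rng = random.Random(seed)
    best_param, best_rate = None, -1.0
    for _ in range(n_trials):
        candidate = rng.uniform(0.0, 1.0)  # a BO acquisition step would go here
        rate = rollout_success_rate(candidate)
        if rate > best_rate:
            best_param, best_rate = candidate, rate
    return best_param, best_rate

best_param, best_rate = tune()
```

A real Bayesian optimizer would replace `rng.uniform` with a surrogate-model acquisition step (e.g., expected improvement over a Gaussian process), which matters when each evaluation is an expensive batch of simulated rollouts.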