🤖 AI Summary
In camera manufacturing, high-precision active optical alignment between lenses and image sensors is severely hindered by complex, unmodeled, and hard-to-measure manufacturing tolerances; conventional model-based or rule-based approaches suffer from poor robustness and high deployment costs. This paper proposes an end-to-end reinforcement learning framework that, for the first time, learns optical alignment policies directly in pixel space—without explicit tolerance modeling or hand-crafted rules. Our contributions are threefold: (1) Relign, an open-source physics-based simulator that faithfully models stochastic tolerances and actuation noise while enabling seamless integration with machine learning; (2) a unified architecture integrating physics-driven rendering, pixel-level observation modeling, and robotic control interfaces; and (3) state-of-the-art performance on benchmark tasks—3.2× faster alignment convergence, 47% higher accuracy, and significantly improved robustness over traditional optimization and supervised learning methods—validated in real-world production-line deployment.
📝 Abstract
Aligning a lens system relative to an imager is a critical challenge in camera manufacturing. While optimal alignment can be mathematically computed under ideal conditions, real-world deviations caused by manufacturing tolerances often render this approach impractical. Measuring these tolerances can be costly or even infeasible, and neglecting them may result in suboptimal alignments. We propose a reinforcement learning (RL) approach that learns exclusively in the pixel space of the sensor output, eliminating the need to develop expert-designed alignment concepts. We conduct an extensive benchmark study and show that our approach surpasses other methods in speed, precision, and robustness. We further introduce relign, a realistic, freely explorable, open-source simulation utilizing physically based rendering that models optical systems with non-deterministic manufacturing tolerances and noise in robotic alignment movement. It provides an interface to popular machine learning frameworks, enabling seamless experimentation and development. Our work highlights the potential of RL in a manufacturing environment to enhance efficiency of optical alignments while minimizing the need for manual intervention.