🤖 AI Summary
This work addresses the challenge of autonomous optimization in unknown and time-varying environments where prior knowledge of optimal operating conditions is unavailable. The authors propose a structure-exploiting method for dual control for exploration and exploitation (DCEE) that, for the first time, reveals the convex-over-nonlinear composite structure of the DCEE objective function. By linearizing only the nonlinear residual component while preserving the outer convex loss, the approach constructs a generalized Gauss–Newton Hessian approximation that relies solely on first-order derivatives, which guarantees positive semidefiniteness and keeps computation cheap. Evaluated on a vehicle cruise optimization task, the method achieves a maximum computation time of 83 μs on a typical vehicle embedded CPU, roughly an order of magnitude faster than existing approaches, while also improving control performance.
📝 Abstract
This paper develops a fast numerical dual control for exploration and exploitation (DCEE) method to address auto-optimization problems in unknown environments. In auto-optimization problems, the optimal operating condition is unknown a priori and may vary with the environment. As in classical dual control techniques, computational burden remains a major concern in DCEE for active learning. Existing DCEE methods provide a principled exploration-exploitation objective, but they are mainly realized through standard optimization packages or explicit gradient-type update laws, so the numerical structure of the DCEE problem has not been fully exploited. This paper shows that the reward function in DCEE has an inherent convex-over-nonlinear structure, where the exploitation and exploration terms form a unified nonlinear residual map equipped with a convex outer loss. Building on this structure, a structure-exploiting numerical method is developed by linearizing only the nonlinear residual map while preserving the convex outer loss. Thus, each subproblem is transformed into a structured convex form that can be solved reliably. The resulting generalized Gauss-Newton Hessian approximation is positive semidefinite and depends only on first-order derivatives, thereby supporting fast online computation. The proposed method is evaluated on a vehicle cruising auto-optimization problem and compared with existing methods. Simulation and hardware-in-the-loop experimental results show that the proposed method improves control performance and achieves a speedup of approximately one order of magnitude, with a maximum computation time of only 83 μs on a typical vehicle embedded CPU.
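To make the convex-over-nonlinear idea concrete, the following is a minimal sketch of a generalized Gauss-Newton (GGN) gradient and Hessian approximation for an objective of the form phi(r(u)). It is not the paper's implementation: the residual map `residual`, its Jacobian, and the pseudo-Huber outer loss are hypothetical stand-ins chosen only for illustration, since the abstract does not specify the actual DCEE residual or loss. The point it shows is that H = Jᵀ ∇²phi(r) J needs only first-order derivatives of the residual and is positive semidefinite whenever phi is convex.

```python
import math

def residual(u):
    """Hypothetical nonlinear residual map r(u): R^2 -> R^3 (an assumption, not from the paper)."""
    x, y = u
    return [math.sin(x) + y, x * y - 1.0, math.exp(-x) - y]

def jacobian(u):
    """First-order derivatives of residual(u); row i holds the partials of r_i."""
    x, y = u
    return [[math.cos(x), 1.0],
            [y, x],
            [-math.exp(-x), -1.0]]

def loss_grad_and_curvature(r):
    """Gradient and diagonal Hessian of the convex pseudo-Huber loss
    phi(r) = sum_i (sqrt(1 + r_i^2) - 1), used here as an assumed outer loss."""
    grad = [ri / math.sqrt(1.0 + ri * ri) for ri in r]
    curv = [(1.0 + ri * ri) ** -1.5 for ri in r]  # strictly positive, so phi is convex
    return grad, curv

def objective(u):
    """Composite objective phi(r(u))."""
    return sum(math.sqrt(1.0 + ri * ri) - 1.0 for ri in residual(u))

def ggn_gradient_and_hessian(u):
    """Return the exact gradient g = J^T grad_phi(r) and the GGN Hessian
    approximation H = J^T diag(curv) J; second derivatives of r are never needed."""
    r = residual(u)
    J = jacobian(u)
    grad, curv = loss_grad_and_curvature(r)
    m, n = len(r), len(u)
    g = [sum(J[i][a] * grad[i] for i in range(m)) for a in range(n)]
    H = [[sum(J[i][a] * curv[i] * J[i][b] for i in range(m)) for b in range(n)]
         for a in range(n)]
    return g, H

g, H = ggn_gradient_and_hessian([0.5, -0.3])
```

Because H is a sum of outer products weighted by nonnegative curvatures, a Newton-type step computed from it (typically with a small damping term added when H is singular) is a descent direction for the composite objective, which is what makes this kind of approximation attractive for fast online use.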