🤖 AI Summary
This work addresses the problem of sequentially extracting maximal work from $N$ identical copies of an unknown pure single-qubit state to charge a quantum battery, which requires jointly optimizing immediate energy extraction and learning of the quantum state. We propose the first framework that incorporates the exploration-exploitation trade-off from reinforcement learning into quantum work extraction, integrating adaptive quantum feedback control, dynamic measurement strategies, and optimized unitary operations to achieve nearly dissipationless energy transfer. We theoretically establish an upper bound on energy dissipation of $O((\log N)^k)$ for a constant $k$, that is, polylogarithmic scaling in $N$. This contrasts sharply with conventional approaches based on quantum state tomography, which incur a fundamental dissipation lower bound of $\Omega(N^{-1})$, and it amounts to an exponential improvement in scalability. The core contribution lies in establishing a connection between quantum work extraction and sequential decision-making, providing a scalable paradigm for quantum thermodynamic control under resource constraints.
📝 Abstract
We investigate work extraction protocols designed to transfer the maximum possible energy to a battery using sequential access to $N$ copies of an unknown pure qubit state. The core challenge is designing interactions that optimally balance two competing goals: charging the battery as much as possible with the qubit in hand, and acquiring more information about the unknown state to improve energy harvesting in subsequent rounds. Here, we leverage the exploration-exploitation trade-off from reinforcement learning to develop adaptive strategies whose energy dissipation scales only polylogarithmically in $N$. This represents an exponential improvement over current protocols based on full state tomography.
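To make the trade-off concrete, here is a minimal toy sketch in Python. It is not the protocol from the paper. It assumes a bare qubit Hamiltonian $H = (\omega/2)\sigma_z$ and an extraction unitary that maps the current guess of the state to the ground state, in which case the energy left unextracted in a round equals $\omega$ times the infidelity between the guess and the true state. The function names and the two per-round infidelity schedules ($1/t$ and $1/\sqrt{t}$) are hypothetical and were chosen only to contrast logarithmic with polynomial growth of the accumulated dissipation.

```python
import numpy as np


def qubit_from_angles(theta, phi):
    """Pure qubit state cos(theta/2)|0> + exp(i*phi) sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])


def dissipation(psi, guess, omega=1.0):
    """Energy left unextracted in one round of the toy model.

    Assumption (not from the source): H = (omega/2) * sigma_z, and the
    unitary maps `guess` to the ground state. The shortfall relative to
    perfect extraction is then omega * (1 - |<guess|psi>|^2).
    """
    fidelity = abs(np.vdot(guess, psi)) ** 2  # vdot conjugates its first argument
    return omega * (1.0 - fidelity)


def cumulative_dissipation(per_round):
    """Total dissipation accumulated over all rounds."""
    return float(np.sum(per_round))


# Hypothetical infidelity schedules, for illustration only.
N = 1000
rounds = np.arange(1, N + 1)
log_like = cumulative_dissipation(1.0 / rounds)            # grows like log N
sqrt_like = cumulative_dissipation(1.0 / np.sqrt(rounds))  # grows like sqrt(N)
print(f"1/t schedule:       {log_like:.3f}")
print(f"1/sqrt(t) schedule: {sqrt_like:.3f}")
```

In this toy picture, a strategy that keeps refining its guess while it extracts work can keep the total loss growing slowly, whereas a slower learning rate lets the loss grow polynomially in $N$.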