Evolutionary Retrofitting

📅 2024-10-15
🏛️ ACM Transactions on Evolutionary Learning and Optimization
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the challenge of optimizing model parameters post-training using non-differentiable real-world feedback (e.g., user ratings, BLEU, word error rate), this paper proposes AfterLearnER: a gradient-free framework that, post-training or at inference time, directly optimizes a critical subset of parameters via evolutionary algorithms (e.g., CMA-ES), requiring only a few dozen to a few hundred scalar feedback evaluations—enabling anytime optimization and human-feedback-driven dynamic adaptation. Theoretically, it provides a generalization-bound analysis supporting robustness against overfitting. Empirically, AfterLearnER is evaluated across diverse tasks—including depth estimation, speech resynthesis, Doom gameplay, code translation, and latent diffusion—demonstrating significant improvements over conventional fine-tuning on practical metrics such as BLEU, image quality, and game scores. The authors present it as the first method to achieve gradient-free, few-shot, high-generalization post-training refinement.

📝 Abstract
AfterLearnER (After Learning Evolutionary Retrofitting) consists in applying evolutionary optimization to refine fully trained machine learning models by optimizing a set of carefully chosen parameters or hyperparameters of the model, with respect to some actual, exact, and hence possibly non-differentiable error signal, performed on a subset of the standard validation set. The efficiency of AfterLearnER is demonstrated by tackling non-differentiable signals such as threshold-based criteria in depth sensing, the word error rate in speech re-synthesis, the number of kills per life at Doom, computational accuracy or BLEU in code translation, image quality in 3D generative adversarial networks (GANs), and user feedback in image generation via Latent Diffusion Models (LDM). This retrofitting can be done after training, or dynamically at inference time by taking into account the user feedback. The advantages of AfterLearnER are its versatility, the possibility to use non-differentiable feedback, including human evaluations (i.e., no gradient is needed), the limited overfitting supported by a theoretical study, and its anytime behavior. Last but not least, AfterLearnER requires only a small amount of feedback, i.e., a few dozen to a few hundred scalars, compared to the tens of thousands needed in most related published works.
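The retrofitting loop described above can be sketched with a minimal gradient-free optimizer. This is not the paper's implementation (which uses stronger evolutionary algorithms such as CMA-ES); it is a toy (1+1) evolution strategy with a 1/5th-style step-size rule, and `evaluate` is a hypothetical stand-in for the non-differentiable scalar feedback (user rating, BLEU, WER, ...) the paper optimizes against:

```python
import random

def evaluate(params):
    # Hypothetical black-box feedback: any non-differentiable scalar works
    # (user rating, BLEU, word error rate, ...). Here: a quantized toy
    # objective, higher is better, maximum 0 at the origin.
    return -int(10 * sum(abs(p) for p in params))

def one_plus_one_es(params, iterations=200, sigma=0.5, seed=0):
    """(1+1) evolution strategy over a small parameter subset.

    Mutate, keep the offspring if its feedback is at least as good,
    and adapt the mutation strength: widen on success, narrow on failure.
    """
    rng = random.Random(seed)
    best, best_score = list(params), evaluate(params)
    for _ in range(iterations):
        candidate = [p + rng.gauss(0.0, sigma) for p in best]
        score = evaluate(candidate)
        if score >= best_score:      # accept improving (or equal) offspring
            best, best_score = candidate, score
            sigma *= 1.5             # success: widen the search
        else:
            sigma *= 0.9             # failure: narrow the search
    return best, best_score

# Stands in for the "carefully chosen" subset of model parameters;
# each loop iteration costs exactly one scalar feedback evaluation.
params = [2.0, -1.5, 0.7]
tuned, score = one_plus_one_es(params)
```

Because the loop only ever keeps the incumbent or an offspring that scored at least as well, it can be stopped at any iteration with a valid result, which is the "anytime behavior" the abstract highlights; the feedback budget is simply the number of iterations.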
Problem

Research questions and friction points this paper is trying to address.

Optimizing trained models using evolutionary algorithms on non-differentiable error signals
Refining model parameters post-training with limited validation data feedback
Applying evolutionary retrofitting to improve performance on threshold-based metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evolutionary optimization refines trained machine learning models
Handles non-differentiable error signals like human feedback
Requires minimal feedback data compared to traditional methods
Mathurin Videau
Meta AI, France and TAU, INRIA and LISN (CNRS & Univ. Paris-Saclay), France
M. Zameshina
Univ Gustave Eiffel, CNRS, LIGM, France and Meta AI, France
A. Leite
INSA Rouen Normandy, University of Rouen Normandy, LITIS UR 4108, France
Laurent Najman
Professor, Laboratoire d'Informatique Gaspard Monge, ESIEE, Université Gustave Eiffel
Computer vision, Image processing
Marc Schoenauer
Senior Researcher Emeritus, INRIA
Evolutionary Computation, Machine Learning
Olivier Teytaud
Facebook
Computer games, Computer vision, Robust Optimization, Machine Learning