🤖 AI Summary
Existing training paradigms for assistive agents overemphasize autonomous task completion, neglect human agency, and rely heavily on costly human feedback. Method: We propose Empower, the first framework to formulate "maximizing human empowerment" as a self-supervised objective. It fine-tunes agents solely on offline text data, without explicit human feedback or verifiable reward functions. Empower models and optimizes collaborative strategies in which agents proactively cede control or offer timely suggestions at critical decision points; evaluation combines a human user study with a multi-turn code-assistance environment that uses simulated humans. Contribution/Results: In an 18-person user study, 78% of participants preferred Empower agents, with a 31% higher suggestion acceptance rate and 38% fewer suggestions; simulated-human programming success rates improved by an average of 192% over an SFT baseline. Empower establishes a scalable, low-cost, human-centered paradigm for AI alignment.
📝 Abstract
Assistive agents should not only take actions on behalf of a human, but also step out of the way and cede control when there are important decisions to be made. However, current methods for building assistive agents, whether via mimicking expert humans or via RL fine-tuning on an inferred reward, often encourage agents to complete tasks on their own rather than truly helping the human attain their objectives. Additionally, these methods often require costly explicit human feedback to provide a training signal. We propose a new approach to tuning assistive language models based on maximizing the human's empowerment: their ability to effect desired changes in the environment. Our empowerment-maximizing method, Empower, requires only offline text data, providing a self-supervised method for fine-tuning language models to better assist humans. To study the efficacy of our approach, we conducted an 18-person user study comparing our empowerment assistant with a strong baseline. Participants preferred our assistant 78% of the time (p=0.015), with a 31% higher acceptance rate and 38% fewer suggestions. Additionally, we introduce a new environment for evaluating multi-turn code assistance using simulated humans. Using this environment, we show that agents trained with Empower increase the success rate of a simulated human programmer on challenging coding questions by an average of 192% over an SFT baseline. With this empowerment objective, we provide a framework for useful, aligned AI agents at scale using only offline data, without the need for any additional human feedback or verifiable rewards.
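To give an intuition for the objective (this is not the paper's exact formulation, just a common way to operationalize empowerment): empowerment is often measured as the mutual information I(A; S') between the human's action and the resulting state, so a state where the human's choices determine outcomes is high-empowerment, while a state where outcomes ignore the human's choices is low-empowerment. A minimal sketch with an empirical plug-in estimator on a toy two-action setting:

```python
import math
import random
from collections import Counter

def mutual_information(pairs):
    """Empirical I(A; S') in nats from (action, next_state) samples."""
    n = len(pairs)
    pa = Counter(a for a, _ in pairs)       # marginal counts of actions
    ps = Counter(s for _, s in pairs)       # marginal counts of next states
    pas = Counter(pairs)                    # joint counts
    mi = 0.0
    for (a, s), c in pas.items():
        # (c/n) * log( p(a,s) / (p(a) p(s)) ), rewritten to avoid extra divisions
        mi += (c / n) * math.log(c * n / (pa[a] * ps[s]))
    return mi

random.seed(0)
actions = [0, 1]
# High empowerment: the human's action fully determines the next state.
empowered = [(a, a) for a in random.choices(actions, k=2000)]
# Low empowerment: the next state ignores the human's action entirely.
disempowered = [(random.choice(actions), random.choice(actions)) for _ in range(2000)]

print(mutual_information(empowered))     # ~0.69 nats (close to log 2)
print(mutual_information(disempowered))  # ~0 nats
```

An assistant trained to increase this quantity is rewarded for putting the human in states where their actions matter, which is why such an objective needs no explicit reward labels; turning it into a practical training signal for language models (here left abstract) is the contribution of the Empower method.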