Naturally Occurring Feedback is Common, Extractable and Useful

📅 2024-07-15

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This study addresses the high cost and limited scalability of collecting human-annotated feedback for aligning language models. It proposes mining the feedback that users naturally give while chatting with models. Manual annotation of a standard conversation corpus finds that as much as 30% of chats contain explicit user feedback, and that such feedback is more prevalent in recent conversation datasets than in older ones. The authors then propose an automatic extraction method and apply it to over 1M conversations, obtaining hundreds of thousands of feedback samples. Training on the extracted feedback improves over baseline models and enhances alignment with human preferences.

📝 Abstract

Human feedback data is a critical component in developing language models. However, collecting this feedback is costly and ultimately not scalable. Inspired by the way human interlocutors provide spontaneous unsolicited feedback to each other, we propose to extract feedback that users naturally include when interacting with chat models. We manually annotated conversations to confirm the presence of naturally occurring feedback in a standard corpus, finding that as much as 30% of the chats include explicit feedback. Comparing to older datasets, we find that naturally occurring feedback is more prevalent in recent conversation datasets, suggesting that more than ever, naturally occurring feedback can serve as a valuable resource for feedback data. We propose a method for automatically extracting this feedback, and apply it to over 1M conversations to obtain hundreds of thousands of feedback samples. The extracted feedback shows promise: training with it improves over baseline models and enhances model alignment to human preferences.
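To make the idea of extracting feedback from chats concrete, here is a minimal Python sketch. It is not the authors' extraction method, which the abstract does not describe. It assumes a conversation is a list of `{"role", "content"}` dictionaries and uses a few hypothetical cue phrases (`NEGATIVE_CUES`, `POSITIVE_CUES`) to flag user turns that react to the preceding model response. The helper names `classify_turn`, `extract_feedback`, and `feedback_rate` are illustrative only.

```python
import re
from typing import Dict, List, Optional

# Hypothetical cue phrases, chosen for illustration only (not from the paper).
NEGATIVE_CUES = [
    r"\bthat'?s (wrong|incorrect|not what i)\b",
    r"\bno,? i (meant|asked)\b",
    r"\btry again\b",
]
POSITIVE_CUES = [
    r"\bthank(s| you)\b",
    r"\b(great|perfect|exactly)\b",
]


def classify_turn(text: str) -> Optional[str]:
    """Return 'negative', 'positive', or None for a single user message."""
    lowered = text.lower()
    if any(re.search(pattern, lowered) for pattern in NEGATIVE_CUES):
        return "negative"
    if any(re.search(pattern, lowered) for pattern in POSITIVE_CUES):
        return "positive"
    return None


def extract_feedback(conversation: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Pair each flagged user turn with the model response it reacts to."""
    samples = []
    for i in range(1, len(conversation)):
        prev, turn = conversation[i - 1], conversation[i]
        if turn["role"] == "user" and prev["role"] == "assistant":
            label = classify_turn(turn["content"])
            if label is not None:
                samples.append(
                    {
                        "response": prev["content"],
                        "feedback": turn["content"],
                        "label": label,
                    }
                )
    return samples


def feedback_rate(conversations: List[List[Dict[str, str]]]) -> float:
    """Fraction of conversations with at least one flagged feedback turn."""
    if not conversations:
        return 0.0
    hits = sum(1 for conv in conversations if extract_feedback(conv))
    return hits / len(conversations)
```

A keyword heuristic like this will miss implicit feedback and misfire on polite filler, so it is best read as a baseline for estimating how often explicit feedback appears rather than as a production extractor.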

Problem

Research questions and friction points this paper is trying to address.

Extracting naturally occurring feedback from user interactions.

Reducing costs and improving scalability of feedback collection.

Enhancing language models using extracted feedback data.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracting naturally occurring feedback from chats.

Automating feedback extraction across over 1M conversations.

Using the extracted feedback to improve model alignment.
