🤖 AI Summary
Traditional supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) rely heavily on small-scale expert annotation, and therefore suffer from high costs, strong annotator bias, and poor scalability. To address these limitations, this paper introduces the first open-source crowdsourced SFT framework enabling large-scale, low-barrier, and fairly incentivized human feedback collection. Our method features: (1) a point-based reward mechanism calibrated via Shapley values, ensuring fair attribution of annotation contributions and scalable incentive alignment; and (2) a multi-model iterative selection framework that significantly accelerates convergence through dynamic optimization. Experiments demonstrate that our framework reduces the distance between the target model’s outputs and ideal responses by 55%. Moreover, the point-based rewards exhibit strong consistency with Shapley values (Spearman’s ρ > 0.92), validating the framework’s fairness, robustness, and scalability.
📝 Abstract
Large Language Models (LLMs) increasingly rely on Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) to align model responses with human preferences. While RLHF employs a reinforcement learning approach with a separate reward model, SFT uses human-curated datasets for supervised learning. Both approaches traditionally depend on small, vetted groups of annotators, making them costly, prone to bias, and limited in scalability. We propose an open, crowd-sourced fine-tuning framework that addresses these limitations by enabling broader feedback collection for SFT without extensive annotator training. Our framework promotes incentive fairness via a point-based reward system correlated with Shapley values, and guides model convergence through iterative model updates. Our multi-model selection framework demonstrates up to a 55% reduction in target distance over single-model selection, and subsequent experiments validate that our point-based reward mechanism aligns closely with Shapley values (a well-established method for attributing individual contributions), thereby supporting fair and scalable participation.
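To illustrate the kind of check the abstract describes, the sketch below computes exact Shapley values for a small group of annotators and measures their Spearman rank correlation against point payouts. This is a minimal, hypothetical sketch: the annotator names, the additive coalition-utility function, and the point amounts are stand-ins invented for illustration, not the paper's actual setup.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, utility):
    """Exact Shapley values via the subset-weighted marginal-contribution formula.
    utility maps a frozenset of players to that coalition's value."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (utility(S | {p}) - utility(S))
        phi[p] = total
    return phi

def spearman_rho(x, y):
    """Spearman rank correlation (assumes no ties, as in this toy example)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical annotators with per-annotator "quality"; coalition utility is
# additive, so each Shapley value equals that annotator's quality exactly.
quality = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 0.5}
utility = lambda S: sum(quality[p] for p in S)

phi = shapley_values(list(quality), utility)
points = {"a": 30, "b": 11, "c": 19, "d": 6}  # hypothetical point payouts
rho = spearman_rho([phi[p] for p in quality], [points[p] for p in quality])
```

With an additive utility the Shapley values reduce to the individual qualities, so any point scheme that preserves the quality ranking yields ρ = 1; the paper's reported ρ > 0.92 is the analogous statistic under its real (non-additive) annotation setting.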