Socratic Students: Teaching Language Models to Learn by Asking Questions

📅 2025-12-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing large language models (LLMs) predominantly rely on static knowledge retrieval and lack the capability to actively identify knowledge gaps and formulate high-value questions, a skill critical for dynamic knowledge acquisition in real-world scenarios such as educational tutoring and medical consultation. Method: This work introduces the first systematic "student-led interactive learning" paradigm, endowing LLMs with three core capabilities: knowledge-gap detection, strategic question generation, and dynamic feedback integration. We propose a DPO-based question-quality optimization framework that supports both cross-model distillation of questioning strategies (from stronger to smaller models) and self-distillation. The framework integrates dynamic interaction modeling, self-skepticism mechanisms, and multi-turn knowledge-consolidation prompting. Contribution/Results: Evaluated on mathematical reasoning and programming benchmarks, the approach achieves absolute Pass@k improvements of at least 0.5 over static retrieval baselines. After DPO fine-tuning, smaller models demonstrate significantly enhanced question quality and learning efficiency, validating the paradigm's scalability and effectiveness.

📝 Abstract
Large Language Models (LLMs) excel at static interactions, where they answer user queries by retrieving knowledge encoded in their parameters. However, in many real-world settings, such as educational tutoring or medical assistance, relevant information is not directly available and must be actively acquired through dynamic interactions. An interactive agent would recognize its own uncertainty, ask targeted questions, and retain new knowledge efficiently. Prior work has primarily explored effective ways for a teacher to instruct the student, where the teacher identifies student gaps and provides guidance. In this work, we shift the focus to the student and investigate effective strategies to actively query the teacher in seeking useful information. Across math and coding benchmarks, where baseline student models begin with near-zero performance, we show that student-led approaches consistently yield absolute Pass@k improvements of at least 0.5 over static baselines. To improve question quality, we train students using Direct Preference Optimization (DPO) with guidance from either self or stronger students. We find that this guided training enables smaller models to learn how to ask better questions, further enhancing learning efficiency.
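The Pass@k numbers cited above are conventionally computed with the standard unbiased estimator: from n sampled solutions per problem, of which c pass the tests, it gives the probability that at least one of k random samples is correct. A minimal sketch (the function name is illustrative, not from the paper):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator.

    n: total samples drawn per problem
    c: number of samples that pass the tests
    k: budget of samples considered

    Returns 1 - C(n-c, k) / C(n, k), the chance that at least one of
    k samples (drawn without replacement from the n) is correct.
    """
    if n - c < k:
        # fewer than k incorrect samples exist, so any k-subset
        # necessarily contains a correct one
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with n=2 samples of which c=1 is correct, Pass@1 is 0.5, matching the intuition that a single draw succeeds half the time.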
Problem

Research questions and friction points this paper is trying to address.

Teaching language models to ask questions for learning
Improving interactive knowledge acquisition in dynamic settings
Enhancing question quality through guided preference optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Student-led questioning for active knowledge acquisition
Direct Preference Optimization to improve question quality
Smaller models learning to ask better questions
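The DPO training referenced above optimizes a per-pair objective: given log-probabilities of a preferred and a dispreferred question under both the policy and a frozen reference model, the loss is -log sigmoid of the scaled reward margin. A minimal pure-Python sketch of that loss, assuming log-probabilities are already computed (the function and argument names are illustrative, not from the paper):

```python
import math

def dpo_pair_loss(pi_chosen: float, pi_rejected: float,
                  ref_chosen: float, ref_rejected: float,
                  beta: float = 0.1) -> float:
    """DPO loss for one preference pair of generated questions.

    pi_*:  sequence log-probs under the policy being trained
    ref_*: sequence log-probs under the frozen reference model
    beta:  temperature scaling the implicit reward

    Loss = -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))),
    which pushes the policy to raise the preferred question's
    likelihood relative to the reference more than the rejected one's.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

When the policy matches the reference on both questions the margin is zero and the loss is log 2; any positive margin (preferred question gains relative likelihood) drives the loss below that.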
👥 Authors
Rajeev Bhatt Ambati (UNC Chapel Hill)
Tianyi Niu (UNC Chapel Hill)
Aashu Singh (Meta)
Shlok Mishra (Meta)
Shashank Srivastava (UNC Chapel Hill)
Snigdha Chaturvedi (Associate Professor, University of North Carolina, Chapel Hill; Natural Language Processing)