TEC: A Collection of Human Trial-and-error Trajectories for Problem Solving

📅 2026-04-08
🤖 AI Summary
This study addresses the challenge that current AI systems struggle to learn effective trial-and-error strategies, primarily due to the absence of high-quality datasets capturing authentic human problem-solving behaviors involving iterative attempts and corrections. To bridge this gap, the authors developed an interactive online platform with structured tasks designed to systematically collect users' complete action trajectories, error feedback, and metacognitive reflection texts across multiple rounds of trial and error. The project gathered data from 46 participants completing 58 tasks, yielding 5,370 trial-and-error trajectories and 41,229 pages of web-based interaction logs. This work presents the first large-scale public dataset of human trial-and-error behavior and demonstrates that human accuracy in such tasks significantly outperforms that of contemporary large language models, thereby filling a critical data and research void in the field.
📝 Abstract
Trial-and-error is a fundamental strategy for humans to solve complex problems and a necessary capability for Artificial Intelligence (AI) systems operating in real-world environments. Although several trial-and-error AI techniques have recently been proposed, most of them rely on simple heuristics designed by researchers and achieve limited performance gains. The core issue is the absence of appropriate data: current models cannot learn from detailed records of how humans actually conduct trial-and-error in practice. To address this gap, we introduce a data annotation platform and a corresponding dataset, termed Trial-and-Error Collection (TEC). The platform records users' complete trajectories across multiple trials and collects their reflections after receiving error feedback. Using this platform, we record the problem-solving processes of 46 participants on 58 tasks, resulting in 5,370 trial trajectories along with error reflections across 41,229 webpages. With this dataset, we observe that humans achieve substantially higher accuracy compared to LLMs, which demonstrates that humans are more effective in trial-and-error than LLMs. We believe that the TEC platform and dataset provide a valuable foundation for understanding human trial-and-error behavior and for developing more capable AI systems. Platform and dataset are publicly available.
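To make the abstract's description concrete, the following is a minimal sketch of how one participant's multi-trial record on a task might be represented, together with two ratios derived from the reported totals. The class names, field names, and helper methods are hypothetical illustrations and are not taken from the TEC platform or its released data format; only the four counts come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Trial:
    """One attempt at a task (hypothetical schema, not TEC's actual format)."""
    visited_pages: List[str]          # webpages opened during this attempt
    answer: str                       # answer submitted at the end of the attempt
    correct: bool                     # outcome returned as feedback
    reflection: Optional[str] = None  # text written after error feedback, if any


@dataclass
class TaskRecord:
    """All attempts by one participant on one task (hypothetical schema)."""
    participant_id: str
    task_id: str
    trials: List[Trial] = field(default_factory=list)

    def solved(self) -> bool:
        """True if any attempt was marked correct."""
        return any(t.correct for t in self.trials)

    def trials_until_success(self) -> Optional[int]:
        """1-based index of the first correct attempt, or None if never solved."""
        for i, t in enumerate(self.trials, start=1):
            if t.correct:
                return i
        return None


# Totals reported in the abstract.
PARTICIPANTS, TASKS, TRAJECTORIES, WEBPAGES = 46, 58, 5370, 41229

# Simple averages derived from those totals.
pages_per_trajectory = WEBPAGES / TRAJECTORIES              # about 7.68
trajectories_per_participant = TRAJECTORIES / PARTICIPANTS  # about 116.7
```

These averages are plain ratios of the reported totals. The abstract does not say how trials or pages are distributed across participants or tasks, so they should not be read as per-task statistics.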
Problem

Research questions and friction points this paper is trying to address.

trial-and-error
human problem solving
AI learning
behavioral data
error feedback
Innovation

Methods, ideas, or system contributions that make the work stand out.

trial-and-error
human problem solving
trajectory dataset
error reflection
AI learning from human behavior
🔎 Similar Papers
Xinkai Zhang
Gaoling School of Artificial Intelligence, Renmin University of China; Quancheng Laboratory, Beijing 100872, China
Jingtao Zhan
Tsinghua University
Information Retrieval, Natural Language Processing, AI
Yiqun Liu
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Qingyao Ai
Associate Professor, Dept. of CS&T, Tsinghua University
Information Retrieval, Machine Learning