🤖 AI Summary
Bias detection in structured data suffers from low automation, poor generalizability, and heavy reliance on manual effort. To address this, we propose the first end-to-end multi-agent collaborative framework for user-driven bias identification, in which LLM-based agents with specialized roles jointly handle multi-stage task planning, dynamic tool invocation, interpretable analysis, and interactive visualization. We also introduce the first benchmark designed specifically for structured data bias detection, featuring multidimensional evaluation metrics and a large-scale suite of real-world and synthetic test cases. Experimental results show that the framework achieves state-of-the-art performance in coverage, accuracy, and interpretability, significantly outperforming both single-agent and conventional approaches. This work establishes a reproducible, scalable technical paradigm and a standardized evaluation methodology for fair data science.
📝 Abstract
Detecting bias in structured data is complex and time-consuming. Existing automated techniques cover only a limited range of data types and rely heavily on case-by-case human handling, so they generalize poorly. Large language model (LLM)-based agents have recently made significant progress in data science, but their ability to detect data bias remains underexplored. To address this gap, we introduce BIASINSPECTOR, the first end-to-end multi-agent synergy framework for automatic bias detection in structured data based on specific user requirements. It first develops a multi-stage plan for the user-specified bias detection task, then executes that plan with a diverse, well-suited set of tools, and finally delivers detailed results with explanations and visualizations. Because no standardized framework exists for evaluating how well LLM agents detect bias in data, we further propose a comprehensive benchmark with multiple evaluation metrics and a large set of test cases. Extensive experiments demonstrate that our framework achieves exceptional overall performance in structured data bias detection, setting a new milestone for fairer data applications.