Multi-Agent Debate: A Unified Agentic Framework for Tabular Anomaly Detection

📅 2026-02-15
🤖 AI Summary
This work addresses the instability of tabular anomaly detectors under distribution shift, missing data, and rare anomalies, as well as the lack of interpretability around model disagreement. We propose MAD (Multi-Agent Debate), a framework in which multiple ML-based detectors each emit a normalized anomaly score, a confidence estimate, and structured evidence, augmented by a large language model (LLM)-based critic. A coordinator converts these messages into bounded per-agent losses and reweights the agents with an exponentiated-gradient rule, producing a consolidated anomaly score and an auditable debate trace. The framework recovers mixture-of-experts gating and learning-with-expert-advice aggregation as special cases, comes with regret guarantees for the synthesized losses, and can be wrapped in conformal calibration to control false positives under exchangeability. Experiments on diverse tabular anomaly benchmarks show improved robustness over baselines and clearer traces of model disagreement.

📝 Abstract
Tabular anomaly detection is often handled by single detectors or static ensembles, even though strong performance on tabular data typically comes from heterogeneous model families (e.g., tree ensembles, deep tabular networks, and tabular foundation models) that frequently disagree under distribution shift, missingness, and rare-anomaly regimes. We propose MAD, a Multi-Agent Debating framework that treats this disagreement as a first-class signal and resolves it through a mathematically grounded coordination layer. Each agent is a machine learning (ML)-based detector that produces a normalized anomaly score, confidence, and structured evidence, augmented by a large language model (LLM)-based critic. A coordinator converts these messages into bounded per-agent losses and updates agent influence via an exponentiated-gradient rule, yielding both a final debated anomaly score and an auditable debate trace. MAD is a unified agentic framework that can recover existing approaches, such as mixture-of-experts gating and learning-with-expert-advice aggregation, by restricting the message space and synthesis operator. We establish regret guarantees for the synthesized losses and show how conformal calibration can wrap the debated score to control false positives under exchangeability. Experiments on diverse tabular anomaly benchmarks show improved robustness over baselines and clearer traces of model disagreement.
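
The coordination step described in the abstract can be illustrated with a minimal sketch of an exponentiated-gradient (Hedge-style) weight update. This is not the paper's implementation: the function names `eg_update` and `debate`, the learning rate `eta`, and in particular the use of absolute error against a feedback label as the bounded per-agent loss are assumptions made for illustration. The abstract says only that losses are synthesized from agent messages and does not specify how.

```python
import numpy as np

def eg_update(weights, losses, eta=0.5):
    """One exponentiated-gradient step: w_i is scaled by exp(-eta * loss_i), then renormalized."""
    weights = np.asarray(weights, dtype=float)
    # Losses are clipped to [0, 1] so they stay bounded, as the abstract requires.
    losses = np.clip(np.asarray(losses, dtype=float), 0.0, 1.0)
    new = weights * np.exp(-eta * losses)
    return new / new.sum()

def debate(scores, labels, eta=0.5):
    """Fuse K agents' normalized anomaly scores over T rounds.

    scores: array of shape (T, K) with entries in [0, 1].
    labels: array of shape (T,) with feedback in {0, 1} (hypothetical loss signal).
    Returns the fused scores, the final agent weights, and a per-round trace.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n_rounds, n_agents = scores.shape
    w = np.full(n_agents, 1.0 / n_agents)  # start from uniform influence
    fused = np.empty(n_rounds)
    trace = []
    for t in range(n_rounds):
        fused[t] = w @ scores[t]  # weighted score before seeing feedback
        # Assumption: absolute error stands in for the paper's synthesized loss.
        losses = np.abs(scores[t] - labels[t])
        trace.append({"round": t, "weights": w.copy(), "losses": losses})
        w = eg_update(w, losses, eta)
    return fused, w, trace

# Toy usage: agent 0 always agrees with the feedback, agent 1 never does.
toy_scores = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
toy_labels = [1, 0, 1]
fused, final_w, trace = debate(toy_scores, toy_labels)
```

In this toy run the weight on agent 0 grows each round, and the stored `trace` records the weights and losses per round, which is one simple way to make the aggregation auditable.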
Problem

Research questions and friction points this paper is trying to address.

tabular anomaly detection
model disagreement
distribution shift
missingness
rare anomalies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent Debate
Tabular Anomaly Detection
Heterogeneous Model Coordination
Conformal Calibration
Exponentiated-Gradient Coordination
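
The conformal calibration idea listed above can be sketched with standard split-conformal p-values. This is a generic illustration rather than the paper's procedure: the function names, the `alpha` threshold, and the assumption that a held-out calibration set of debated scores from normal (non-anomalous) rows is available are all hypothetical.

```python
import numpy as np

def conformal_p_values(calib_scores, test_scores):
    """Split-conformal p-values for anomaly scores (higher score = more anomalous).

    calib_scores: debated scores on held-out rows assumed to be normal.
    test_scores: debated scores on new rows.
    """
    calib = np.sort(np.asarray(calib_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = calib.size
    # Count calibration scores that are >= each test score.
    n_ge = n - np.searchsorted(calib, test, side="left")
    return (n_ge + 1) / (n + 1)

def flag_anomalies(calib_scores, test_scores, alpha=0.1):
    """Flag a row when its conformal p-value is at most alpha."""
    return conformal_p_values(calib_scores, test_scores) <= alpha

# Toy usage with nine calibration scores and two test rows.
calib = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
p_vals = conformal_p_values(calib, [0.95, 0.05])
flags = flag_anomalies(calib, [0.95, 0.05], alpha=0.1)
```

If calibration and test rows are exchangeable, a normal test row is flagged with probability at most `alpha`, which is the sense in which the false positive rate is controlled. The guarantee is marginal and depends on the calibration rows actually being free of anomalies.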