🤖 AI Summary
This work addresses a critical limitation in existing large language model (LLM)-based tabular anomaly detection methods, which often randomize column order during table serialization, thereby ignoring inter-column causal relationships and constraining performance. To overcome this, the study introduces causal discovery into LLM-driven anomaly detection for the first time, proposing a causality-aware column reordering strategy coupled with a dynamic column importance weighting mechanism. Causal knowledge is effectively injected into the model through linear ordering optimization and fine-tuning. Extensive experiments across more than 30 benchmark datasets demonstrate that the proposed approach significantly outperforms current state-of-the-art methods, confirming the efficacy of causal modeling in enhancing anomaly detection accuracy.
📝 Abstract
Detecting anomalies in tabular data is critical for many real-world applications, such as credit card fraud detection. With the rapid advancements in large language models (LLMs), state-of-the-art performance in tabular anomaly detection has been achieved by converting tabular data into text and fine-tuning LLMs. However, these methods order columns randomly during conversion, without considering the causal relationships between them, even though these relationships are crucial for accurately detecting anomalies. In this paper, we present CausalTAD, a method that injects causal knowledge into LLMs for tabular anomaly detection. We first identify the causal relationships between columns and reorder the columns to align with them; this reordering can be modeled as a linear ordering problem. Since each column contributes differently to the causal relationships, we further propose a reweighting strategy that assigns different weights to different columns to strengthen this effect. Experiments across more than 30 datasets demonstrate that our method consistently outperforms the current state-of-the-art methods. The code for CausalTAD is available at https://github.com/350234/CausalTAD.
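To make the reordering idea concrete, here is a minimal sketch of causality-aligned serialization. The abstract does not specify the causal-discovery algorithm or the serialization template, so the hand-written `causal_parents` graph, the `serialize_row` helper, and the "column is value" format below are all illustrative assumptions; a simple topological sort stands in for the paper's linear ordering optimization.

```python
from graphlib import TopologicalSorter

# Hypothetical causal graph over columns, written as child: {parents}.
# In CausalTAD a causal-discovery step would produce such a graph;
# here it is hand-written for a toy fraud-detection table.
causal_parents = {
    "merchant": {"location"},
    "amount": {"merchant"},
    "is_fraud": {"amount", "merchant"},
}

# A causality-aligned column order: every column appears after its causes.
column_order = list(TopologicalSorter(causal_parents).static_order())

def serialize_row(row: dict, order: list) -> str:
    """Turn one tabular row into text using the causal column order."""
    return ", ".join(f"{col} is {row[col]}" for col in order if col in row)

row = {"location": "NYC", "merchant": "acme", "amount": 250, "is_fraud": 0}
print(serialize_row(row, column_order))
```

The serialized text (rather than a randomly ordered one) would then be fed to the LLM for fine-tuning, so that causes precede effects in the token sequence.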