🤖 AI Summary
This work addresses the scarcity of high-quality, publicly available multimodal datasets for microservice systems, which has hindered research on anomaly detection and root cause analysis. To bridge this gap, the authors introduce AnoMod, a dataset constructed by injecting four types of anomalies (performance-level, service-level, database-level, and code-level) into two open-source microservice benchmarks, SocialNetwork and TrainTicket. For each injected scenario, five modalities of system telemetry are collected: logs, metrics, distributed traces, API responses, and code coverage reports, giving an end-to-end view of system state. AnoMod supports the evaluation of cross-modal anomaly detection approaches and fine-grained root cause localization at both the service and code levels, thereby advancing end-to-end fault diagnosis in microservice architectures.
📝 Abstract
Microservice systems (MSS) have become a predominant architectural style for cloud services. Yet the community still lacks high-quality, publicly available datasets for anomaly detection (AD) and root cause analysis (RCA) in MSS. Most existing benchmarks emphasize performance-related faults and provide only one or two monitoring modalities, which limits research on broader failure modes and cross-modal methods. To address these gaps, we introduce a new multimodal anomaly dataset built on two open-source microservice systems: SocialNetwork and TrainTicket. We design and inject four categories of anomalies (Ano), namely performance-level, service-level, database-level, and code-level, to emulate realistic anomaly modes. For each scenario, we collect five modalities (Mod): logs, metrics, distributed traces, API responses, and code coverage reports, offering a richer, end-to-end view of system state and inter-service interactions. We name the dataset AnoMod to reflect these two properties. It enables (1) evaluation of cross-modal anomaly detection and fusion/ablation strategies, and (2) fine-grained RCA studies across service and code regions, supporting end-to-end troubleshooting pipelines that jointly consider detection and localization.
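As a rough illustration of the structure the abstract describes, the sketch below models one injected scenario and enumerates the modality subsets that an ablation study could evaluate. It is not AnoMod's actual schema or loader: the `Scenario` class, its field names, the lowercase labels, and the `ablation_subsets` helper are all assumptions made for this example. Only the two system names, the four anomaly levels, and the five modalities come from the abstract.

```python
from dataclasses import dataclass, field
from itertools import combinations

# Taken from the abstract: two systems, four anomaly levels, five modalities.
# The lowercase string labels themselves are hypothetical.
SYSTEMS = ("SocialNetwork", "TrainTicket")
ANOMALY_LEVELS = ("performance", "service", "database", "code")
MODALITIES = ("logs", "metrics", "traces", "api_responses", "coverage")


@dataclass
class Scenario:
    """One anomaly-injection run (hypothetical layout, not the dataset's own)."""
    system: str
    anomaly_level: str
    # Assumed mapping from modality name to the location of its collected data.
    artifacts: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject values outside the categories named in the abstract.
        if self.system not in SYSTEMS:
            raise ValueError(f"unknown system: {self.system}")
        if self.anomaly_level not in ANOMALY_LEVELS:
            raise ValueError(f"unknown anomaly level: {self.anomaly_level}")
        unknown = set(self.artifacts) - set(MODALITIES)
        if unknown:
            raise ValueError(f"unknown modalities: {sorted(unknown)}")


def ablation_subsets(modalities=MODALITIES):
    """Yield every non-empty subset of modalities, smallest subsets first."""
    for size in range(1, len(modalities) + 1):
        yield from combinations(modalities, size)
```

With five modalities there are 2^5 - 1 = 31 non-empty subsets, so a full fusion/ablation sweep under this sketch would train or evaluate a detector 31 times per scenario.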