RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems

📅 2025-03-06

🤖 AI Summary

To address the challenges of diagnosing root causes of slow queries and prioritizing optimization efforts in cloud databases, this paper formulates root-cause identification as a multimodal learning task. It proposes a root-cause-adaptive cross-modal Transformer architecture coupled with an impact-aware training objective. The method jointly encodes four heterogeneous modalities (SQL queries, execution plans, runtime logs, and performance metrics) and strengthens cross-modal alignment through self-supervised pretraining. A root-cause-adaptive cross-attention mechanism fuses modality-specific features according to their diagnostic relevance. Evaluated on both real-world and synthetic datasets, the approach outperforms state-of-the-art methods on root-cause identification accuracy and ranking quality (NDCG and MRR), which supports prioritizing the highest-impact query optimizations in production cloud database environments.
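For readers unfamiliar with the ranking metrics named in the summary, the following is a minimal sketch of NDCG and MRR using their standard textbook definitions. It is not taken from the paper; the function names and the choice of graded relevance values are assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores given in ranked order."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(ranked_relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0


def mrr(ranked_hits_per_query):
    """Mean reciprocal rank of the first relevant item in each ranked list."""
    if not ranked_hits_per_query:
        return 0.0
    total = 0.0
    for hits in ranked_hits_per_query:
        # 1-based position of the first relevant item, or None if there is none.
        rank = next((i + 1 for i, hit in enumerate(hits) if hit), None)
        total += 1.0 / rank if rank else 0.0
    return total / len(ranked_hits_per_query)
```

Here `ranked_relevances` would hold, for one slow query, the true impact of each root cause in the order the model ranked them, so a ranking that places the highest-impact cause first scores closest to 1.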

📝 Abstract

With the continued migration of storage to cloud database systems, the impact of slow queries in such systems on services and user experience is increasing. Root-cause diagnosis plays an indispensable role in facilitating slow-query detection and revision. This paper proposes a method capable of both identifying possible root cause types for slow queries and ranking them according to their potential for accelerating those queries. This enables prioritizing the root causes with the highest impact, in turn improving slow-query revision effectiveness. To enable more accurate and detailed diagnoses, we propose the multimodal Ranking for the Root Causes of slow queries (RCRank) framework, which formulates root cause analysis as a multimodal machine learning problem and leverages multimodal information from query statements, execution plans, execution logs, and key performance indicators. To obtain expressive embeddings from its heterogeneous multimodal input, RCRank integrates self-supervised pre-training that enhances cross-modal alignment and task relevance. Next, the framework integrates root-cause-adaptive cross Transformers that enable adaptive fusion of multimodal features with varying characteristics. Finally, the framework offers a unified model that features an impact-aware training objective for identifying and ranking root causes. We report on experiments on real and synthetic datasets, finding that RCRank consistently outperforms state-of-the-art methods at root cause identification and ranking according to a range of metrics.
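The abstract describes a single output that both identifies root cause types and ranks them by how much fixing each could speed up the query. A minimal sketch of that final step is shown below, assuming the model has already produced one estimated impact score per root cause type. The function name, the threshold, and the root cause labels are hypothetical and are not taken from the paper.

```python
def rank_root_causes(impact_by_cause, threshold=0.0):
    """Keep causes whose estimated impact exceeds the threshold, highest first."""
    kept = [(cause, score) for cause, score in impact_by_cause.items()
            if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical model output for one slow query: estimated fraction of
# execution time that could be saved by addressing each root cause type.
predicted = {
    "missing_index": 0.62,
    "outdated_statistics": 0.05,
    "poor_join_order": 0.31,
    "io_contention": 0.0,
}

ranking = rank_root_causes(predicted, threshold=0.1)
```

Thresholding handles identification (which causes apply at all) and sorting handles prioritization, which is why one impact estimate per cause is enough to serve both purposes in this sketch.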

Problem

Research questions and friction points this paper is trying to address.

Identifies and ranks root causes of slow queries

Uses multimodal data for accurate root cause analysis

Improves slow-query revision effectiveness through prioritization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal machine learning for root cause analysis

Self-supervised pre-training for cross-modal alignment

Impact-aware training objective for root cause ranking
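To make the idea of an impact-aware objective concrete, here is one possible sketch, assuming per-cause predictions of both a probability (is this a root cause?) and an impact value. It combines binary cross-entropy for identification with a squared error that is weighted by the true impact, so that mistakes on high-impact causes cost more. This formulation, the `1 + impact` weighting, and all names are assumptions for illustration; the paper's actual objective is not given in this summary.

```python
import math


def impact_aware_loss(pred_probs, pred_impacts, true_impacts,
                      threshold=0.0, eps=1e-7):
    """Hypothetical loss: identification BCE plus impact-weighted squared error."""
    if not true_impacts:
        return 0.0
    total = 0.0
    for p, y_hat, y in zip(pred_probs, pred_impacts, true_impacts):
        # A cause counts as a true root cause if its real impact exceeds the threshold.
        label = 1.0 if y > threshold else 0.0
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        bce = -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
        weight = 1.0 + y  # larger true impact gives a larger regression penalty
        total += bce + weight * (y_hat - y) ** 2
    return total / len(true_impacts)
```

With this weighting, an impact error of 0.2 on a cause whose true impact is 0.8 is penalized more than the same error on a cause whose true impact is 0.1, which pushes the model toward getting the top of the ranking right.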
