TorchTraceAP: A New Benchmark Dataset for Detecting Performance Anti-Patterns in Computer Vision Models

📅 2025-12-16
🤖 AI Summary
Problem: Performance anti-patterns are prevalent in the training and inference of computer vision (CV) models, yet existing approaches struggle to automatically and precisely localize problematic segments within long execution traces. Method: We introduce the first benchmark dataset for CV performance anti-pattern detection, comprising 600+ PyTorch execution traces that span diverse hardware platforms and CV tasks. We also propose an iterative detection paradigm that combines lightweight temporal modeling for coarse screening with large language models (LLMs) for fine-grained classification and diagnostic feedback, thereby working around LLM context and reasoning limitations. Our method integrates PyTorch Profiler analysis, cross-platform support (CUDA/ROCm), and a standardized annotation protocol. Contribution/Results: The framework achieves significantly higher detection accuracy than unsupervised clustering and rule-based statistical methods, generalizes across classification, detection, segmentation, and generation tasks, and supports end-to-end localization alongside actionable optimization recommendations, making anti-pattern detection for CV systems evaluable and reproducible against a benchmark for the first time.

📝 Abstract
Identifying and addressing performance anti-patterns in machine learning (ML) models is critical for efficient training and inference, but it typically demands deep expertise spanning system infrastructure, ML models, and kernel development. While large tech companies rely on dedicated ML infrastructure engineers to analyze torch traces and benchmarks, such resource-intensive workflows are largely inaccessible to most computer vision researchers. Among the challenges, pinpointing problematic segments within lengthy execution traces remains the most time-consuming task, and it is difficult to automate with current ML models, including LLMs. In this work, we present the first benchmark dataset specifically designed to evaluate and improve ML models' ability to detect anti-patterns in traces. Our dataset contains over 600 PyTorch traces from diverse computer vision models (classification, detection, segmentation, and generation), collected across multiple hardware platforms. We also propose a novel iterative approach: a lightweight ML model first detects trace segments with anti-patterns, and a large language model (LLM) then performs fine-grained classification and provides targeted feedback. Experimental results demonstrate that our method significantly outperforms unsupervised clustering and rule-based statistical techniques for detecting anti-pattern regions. Our method also effectively compensates for LLMs' limited context length and reasoning inefficiencies.
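
The detect-then-classify idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the event fields (`ts`, `dur`, `cat`, in microseconds) mirror the Chrome trace format that PyTorch Profiler can export, the `cat == "kernel"` filter is an assumed labeling of GPU kernels, the fixed-size windows and the idle-fraction heuristic are stand-ins for the lightweight model, and `classify` is a hypothetical callback standing in for the LLM.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]  # one trace event with "name", "cat", "ts", "dur" (microseconds)


def split_windows(events: List[Event], window_us: int) -> List[Tuple[int, List[Event]]]:
    """Group events into fixed-length time windows, keyed by window start time."""
    if not events:
        return []
    start = min(int(e["ts"]) for e in events)
    buckets: Dict[int, List[Event]] = {}
    for e in events:
        idx = (int(e["ts"]) - start) // window_us
        buckets.setdefault(start + idx * window_us, []).append(e)
    # Windows that contain no events at all are simply absent from the result.
    return sorted(buckets.items())


def gpu_idle_fraction(window: List[Event], window_us: int) -> float:
    """Stage-1 stand-in (assumption): share of the window not covered by GPU kernels.

    Kernel durations are summed naively, so overlapping kernels are double-counted;
    the result is clamped at 0.0.
    """
    busy = sum(int(e["dur"]) for e in window if e.get("cat") == "kernel")
    return max(0.0, 1.0 - busy / window_us)


def detect_then_classify(
    events: List[Event],
    classify: Callable[[List[Event]], str],  # hypothetical LLM wrapper
    window_us: int = 1000,
    threshold: float = 0.5,
) -> List[Tuple[int, float, str]]:
    """Run the cheap screen on every window; call the expensive classifier only on flagged ones."""
    findings = []
    for t0, window in split_windows(events, window_us):
        score = gpu_idle_fraction(window, window_us)
        if score >= threshold:
            findings.append((t0, score, classify(window)))
    return findings
```

The point of the split is that `classify` is called only on the windows the cheap screen flags, so the expensive model never sees the full trace.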

Problem

Research questions and friction points this paper is trying to address.

Detect performance anti-patterns in computer vision model traces
Automate identification of problematic segments in lengthy execution traces
Improve ML models' ability to classify anti-patterns and provide feedback on them

Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmark dataset for detecting performance anti-patterns
Iterative approach pairing a lightweight ML model with an LLM
Compensates for LLM context limits and reasoning inefficiencies
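
To make the context-limit point concrete, one way a flagged trace segment might be condensed before it is passed to an LLM is sketched below. This is an assumption for illustration only and is not described in the source: the function name `summarize_segment`, the per-name aggregation, and the character budget `max_chars` are all hypothetical choices.

```python
from collections import defaultdict
from typing import Dict, List


def summarize_segment(events: List[Dict[str, object]], max_chars: int = 400) -> str:
    """Aggregate events by name and emit the heaviest ones until the budget is reached."""
    totals = defaultdict(lambda: [0, 0])  # name -> [call count, total duration in us]
    for e in events:
        entry = totals[str(e["name"])]
        entry[0] += 1
        entry[1] += int(e["dur"])

    # Heaviest operations first, so truncation drops the least costly entries.
    ranked = sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)

    lines = []
    used = 0
    for name, (count, dur) in ranked:
        line = f"{name} x{count} total={dur}us"
        if used + len(line) + 1 > max_chars:  # +1 accounts for the newline
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)
```

A character budget is only a rough proxy for a token budget; a real pipeline would likely count tokens with the target model's tokenizer.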