Scaling Mobile Chaos Testing with AI-Driven Test Execution

📅 2026-02-05
🤖 AI Summary
This work addresses the challenge of validating system resilience in mobile applications, where traditional chaos engineering struggles with the combinatorial explosion of user journeys, geographic variability, and backend failure modes. The authors present an AI-driven, large-scale mobile chaos testing framework that integrates DragonCrawl, an LLM-powered automated traversal tool, with uHavoc, a service-level fault injection system. The approach enables adaptive exploration and automated testing of critical user flows under realistic backend degradation, eliminating the need for manually authored test cases. It uncovers mobile-specific crashes and dependency violations that surface only on-device. Deployed across Uber’s Rider, Driver, and Eats applications since Q1 2024, the framework has executed over 180,000 tests covering 47 critical user flows and identified 23 resilience risks, including 12 severe enough to block trip requests or food orders. Automated root cause analysis attributes mobile failures to specific backend services with 88% precision@5, and test reliability under fault injection is 99%.
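To make the precision@5 figure concrete, here is a minimal sketch of one common reading of the metric: the fraction of failures whose true root-cause service appears among the top five ranked suspects. The page does not define the metric, so this interpretation, the function name `hit_rate_at_k`, and the service names in the example are all assumptions made for illustration.

```python
def hit_rate_at_k(ranked_suspects, true_causes, k=5):
    """Fraction of failures whose true root cause is in the top-k suspects.

    ranked_suspects: one list per failure, ordered most to least suspicious.
    true_causes: the confirmed root-cause service for each failure.
    This is an assumed reading of "precision@5", not the paper's definition.
    """
    if len(ranked_suspects) != len(true_causes):
        raise ValueError("need one ranked list per failure")
    if not true_causes:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(ranked_suspects, true_causes)
        if truth in ranked[:k]
    )
    return hits / len(true_causes)


# Hypothetical service names, for illustration only.
rankings = [
    ["pricing", "maps", "payments"],
    ["promos", "auth", "maps"],
]
truths = ["maps", "payments"]
score = hit_rate_at_k(rankings, truths, k=5)  # first failure hits, second misses
```

Under this reading, a score of 0.88 would mean that for 88% of failures an engineer finds the responsible service within the first five suggestions.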

📝 Abstract
Mobile applications in large-scale distributed systems are susceptible to backend service failures, yet traditional chaos engineering approaches cannot scale mobile testing due to the combinatorial explosion of flows, locations, and failure scenarios that need validation. We present an automated mobile chaos testing system that integrates DragonCrawl, an LLM-based mobile testing platform, with uHavoc, a service-level fault injection system. The key insight is that adaptive AI-driven test execution can navigate mobile applications under degraded backend conditions, eliminating the need to manually write test cases for each combination of user flow, city, and failure type. Since Q1 2024, our system has executed over 180,000 automated chaos tests across 47 critical flows in Uber's Rider, Driver, and Eats applications, representing approximately 39,000 hours of manual testing effort that would be impractical at this scale. We identified 23 resilience risks, with 70% being architectural dependency violations where non-critical service failures degraded core user flows. Twelve issues were severe enough to prevent trip requests or food orders. Two caused application crashes detectable only through mobile chaos testing, not backend testing alone. Automated root cause analysis reduced debugging time from hours to minutes, achieving 88% precision@5 in attributing mobile failures to specific backend services. This paper presents the system design, evaluates its performance under fault injection (maintaining 99% test reliability), and reports operational experience demonstrating that continuous mobile resilience validation is achievable at production scale.
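The scale argument in the abstract can be illustrated with a short sketch. The flow count, test count, and manual-hours figure come from the abstract; the city and failure-type counts are hypothetical placeholders, since the abstract does not state them.

```python
from itertools import product

# Figures stated in the abstract.
CRITICAL_FLOWS = 47
TESTS_EXECUTED = 180_000
MANUAL_HOURS_EQUIVALENT = 39_000


def scenario_count(n_flows, n_cities, n_failure_types):
    """Number of (flow, city, failure type) combinations to validate."""
    return n_flows * n_cities * n_failure_types


def enumerate_scenarios(flows, cities, failure_types):
    """List every combination; each one would need a hand-written test
    case if tests were authored manually."""
    return list(product(flows, cities, failure_types))


# Implied manual effort per test, derived from the abstract's own numbers.
minutes_per_test = MANUAL_HOURS_EQUIVALENT * 60 / TESTS_EXECUTED

# Hypothetical counts: 10 cities and 5 failure types are assumptions.
example_total = scenario_count(CRITICAL_FLOWS, n_cities=10, n_failure_types=5)
```

With the abstract's numbers, the 39,000-hour estimate works out to about 13 minutes of manual effort per test. Even the small hypothetical matrix above already yields 2,350 combinations, which shows why hand-authoring one test per combination does not scale.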
Problem

Research questions and friction points this paper is trying to address.

mobile chaos testing
resilience validation
backend service failures
combinatorial explosion
distributed systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-driven testing
mobile chaos engineering
LLM-based test generation
automated fault injection
resilience validation
Authors
Juan Marcano
Uber Technologies, Inc., Sunnyvale, CA, USA
Ashish Samant
Uber Technologies, Inc., Sunnyvale, CA, USA
Kai Song
TikTok Inc.
NLP & LLM
Lingchao Chen
Uber Technologies, Inc., Sunnyvale, CA, USA
Kaelan Mikowicz
Uber Technologies, Inc., Sunnyvale, CA, USA
Tim Smyth
Uber Technologies, Inc., Sunnyvale, CA, USA
Mengdie Zhang
Uber Technologies, Inc., Sunnyvale, CA, USA
Ali Zamani
Honorary Research Fellow, School of ITEE, The University of Queensland
Computational Electromagnetics, Microwave Imaging, Medical Imaging, Signal Processing
Arturo Bravo Rovirosa
Uber Technologies, Inc., Sunnyvale, CA, USA
Sowjanya Puligadda
Uber Technologies, Inc., Sunnyvale, CA, USA
Srikanth Prodduturi
Uber Technologies, Inc., Sunnyvale, CA, USA
Mayank Bansal
Gen AI Leader @ AWS AI Labs, Ex-AmazonGo/JWO, Ex-Waymo/Google-Self-Driving, Ex-Sarnoff/SRI
Agents, AGI, GenAI, Computer Vision, Robotics