TaskAudit: Detecting Functiona11ity Errors in Mobile Apps via Agentic Task Execution

📅 2025-10-14

🤖 AI Summary

Existing mobile accessibility checkers rely on static analysis or mechanically generated contexts, which limits their ability to detect functional interaction errors such as label-functionality mismatch, cluttered navigation, and inappropriate feedback. This paper introduces TaskAudit, an accessibility evaluation system that detects these errors through simulated interactions. It generates interactive tasks from app screens via large language models, executes them with agents that operate through a screen reader proxy, and analyzes the resulting interaction traces to report accessibility errors. In an evaluation on real-world apps, TaskAudit detected 48 functional accessibility errors across 54 app screens, compared with between 4 and 20 detected by existing checkers. The work positions task-oriented interaction simulation as a complement to conventional accessibility checking.

📝 Abstract

Accessibility checkers are tools that support accessible app development, and their use is encouraged by accessibility best practices. However, most current checkers evaluate static or mechanically generated contexts, failing to capture common accessibility errors that impact mobile app functionality. We present TaskAudit, an accessibility evaluation system that focuses on detecting functiona11ity errors through simulated interactions. TaskAudit comprises three components: a Task Generator that constructs interactive tasks from app screens, a Task Executor that uses agents with a screen reader proxy to perform these tasks, and an Accessibility Analyzer that detects and reports accessibility errors by examining interaction traces. Evaluation on real-world apps shows that our strategy detects 48 functiona11ity errors from 54 app screens, compared to between 4 and 20 with existing checkers. Our analysis demonstrates common error patterns that TaskAudit can detect beyond those covered by prior work, including label-functionality mismatch, cluttered navigation, and inappropriate feedback.
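The three-component structure described in the abstract can be pictured as a small pipeline. The sketch below is illustrative only: the class names, field names, and the toy generator, executor, and analyzer are hypothetical stand-ins and are not taken from the TaskAudit implementation. It shows only how tasks, interaction traces, and findings might flow between the three stages.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TraceStep:
    announced: str  # text a screen reader proxy spoke for the focused element
    action: str     # gesture the agent issued, e.g. "double_tap"


@dataclass
class Trace:
    task: str
    steps: List[TraceStep] = field(default_factory=list)
    completed: bool = False


@dataclass
class Finding:
    task: str
    detail: str


def run_audit(
    screen: dict,
    generate_tasks: Callable[[dict], List[str]],
    execute_task: Callable[[dict, str], Trace],
    analyze_trace: Callable[[Trace], List[Finding]],
) -> List[Finding]:
    """Chain the three stages: generate tasks, execute each, analyze its trace."""
    findings: List[Finding] = []
    for task in generate_tasks(screen):
        trace = execute_task(screen, task)
        findings.extend(analyze_trace(trace))
    return findings


# Toy stand-ins (assumptions, not the paper's components).
def toy_generator(screen: dict) -> List[str]:
    return [f"Activate '{label}'" for label in screen["labels"]]


def toy_executor(screen: dict, task: str) -> Trace:
    target = task.split("'")[1]
    reachable = target in screen["focusable"]
    steps = [TraceStep(announced=target, action="double_tap")] if reachable else []
    return Trace(task=task, steps=steps, completed=reachable)


def toy_analyzer(trace: Trace) -> List[Finding]:
    if trace.completed:
        return []
    return [Finding(task=trace.task, detail="task could not be completed via screen reader")]


# Hypothetical screen: "Checkout" is visible but not reachable by screen reader focus.
screen = {"labels": ["Search", "Checkout"], "focusable": ["Search"]}
findings = run_audit(screen, toy_generator, toy_executor, toy_analyzer)
for finding in findings:
    print(finding.task, "->", finding.detail)
```

Passing the stages in as callables keeps the pipeline independent of how tasks are generated or executed, so a real agent or screen reader bridge could replace the toy functions without changing `run_audit`.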

Problem

Research questions and friction points this paper is trying to address.

Detecting mobile app accessibility errors through simulated interactive tasks

Identifying functionality errors missed by static accessibility checkers

Uncovering label-functionality mismatches and navigation issues in apps

Innovation

Methods, ideas, or system contributions that make the work stand out.

Simulates interactive tasks via agent execution

Uses screen reader proxy for accessibility testing

Detects errors through interaction trace analysis
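To make the idea of interaction trace analysis concrete, here is a minimal sketch of two rule-based checks over a recorded trace: one for missing feedback after an activation and one for an unusually long swipe sequence. The step format, the `max_swipes` threshold, and the rules themselves are assumptions chosen for illustration; the source does not say how TaskAudit's Accessibility Analyzer actually decides.

```python
from typing import Dict, List


def analyze_steps(steps: List[Dict[str, str]], max_swipes: int = 5) -> List[str]:
    """Flag two illustrative patterns in a trace of screen reader interactions."""
    issues: List[str] = []
    swipes = 0
    for index, step in enumerate(steps):
        if step["action"] == "swipe":
            swipes += 1
        elif step["action"] == "activate":
            # An activation with no announced feedback leaves the user unsure
            # whether anything happened.
            if not step.get("feedback", "").strip():
                issues.append(
                    f"step {index}: no feedback after activating '{step['label']}'"
                )
    # Many swipes before reaching the target may indicate cluttered navigation.
    if swipes > max_swipes:
        issues.append(
            f"{swipes} swipes needed (limit {max_swipes}): possibly cluttered navigation"
        )
    return issues


# Hypothetical trace: three swipes past unrelated elements, then a silent activation.
example = [
    {"action": "swipe", "label": "Banner"},
    {"action": "swipe", "label": "Promo"},
    {"action": "swipe", "label": "Submit"},
    {"action": "activate", "label": "Submit", "feedback": ""},
]
issues = analyze_steps(example, max_swipes=2)
for issue in issues:
    print(issue)
```

Fixed rules like these are only a starting point; judging whether a label matches what an element actually does generally requires comparing the announced label with the observed outcome, which simple thresholds cannot capture.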
