TraceLens: Question-Driven Debugging for Taint Flow Understanding

📅 2025-08-10

🤖 AI Summary

Existing taint analysis tools lack interactive, end-user-oriented debugging capabilities. Developers cannot readily ask causal and counterfactual questions ("Why?", "Why not?", "What if?") about dataflows, and the tree or list views these tools rely on make it hard to see global connectivity across multiple sources and sinks. This paper introduces TraceLens, which the authors describe as the first end-user question-answer style debugging interface for taint analysis. It lets users investigate suspicious flows, missing expected flows, and the effect of source and sink configuration, and it performs speculative what-if analysis to show how third-party library models change overall results. In a user study with 12 participants, those using TraceLens achieved 21% higher accuracy on average than with CodeQL, reported a 45% reduction in mental demand (NASA-TLX), and rated higher confidence in identifying relevant flows.

📝 Abstract

Taint analysis is a security analysis technique used to track the flow of potentially dangerous data through an application and its dependent libraries. Investigating why certain unexpected flows appear and why expected flows are missing is an important sensemaking process during end-user taint analysis. Existing taint analysis tools often do not provide this end-user debugging capability, in which developers can ask why, why-not, and what-if questions about dataflows and reason about the impact of configuring sources and sinks and of third-party library models that abstract permissible and impermissible data flows. Furthermore, the tree view or list view used in existing taint analyzers' visualizations makes it difficult to reason about the global impact on connectivity between multiple sources and sinks. Inspired by the insight that sensemaking of tool-generated results can be significantly improved by a QA inquiry process, we propose TraceLens, the first end-user question-answer style debugging interface for taint analysis. It enables a user to ask why, why-not, and what-if questions to investigate the existence of suspicious flows, the non-existence of expected flows, and the global impact of third-party library models. TraceLens performs speculative what-if analysis to help a user debug how different connectivity assumptions affect overall results. A user study with 12 participants shows that participants using TraceLens achieved 21% higher accuracy on average compared to CodeQL. They also reported a 45% reduction in mental demand (NASA-TLX) and rated higher confidence in identifying relevant flows using TraceLens.
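To make the why and why-not questions concrete, here is a minimal sketch that treats dataflow as a directed graph and answers both questions with plain reachability. It is not TraceLens's implementation, which the abstract does not describe at this level. The function names (`why`, `why_not`), the node names, and the unmodeled library call `lib.transform` are all hypothetical.

```python
from collections import deque

def reachable(edges, start):
    """Return every node reachable from `start` along directed edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def why(edges, source, sink):
    """Why does a flow exist? Return one shortest path, or None if there is none."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def why_not(edges, source, sink):
    """Why is a flow missing? Return (forward, backward) node sets, or None if it exists.

    `forward` is everything tainted by the source; `backward` is everything
    that can reach the sink. The gap between the two sets is where a missing
    edge (for example, an unmodeled library call) would have to sit.
    """
    forward = reachable(edges, source)
    if sink in forward:
        return None
    reversed_edges = [(dst, src) for src, dst in edges]
    backward = reachable(reversed_edges, sink)
    return forward, backward

# Hypothetical program: the third-party call lib.transform has no model,
# so there is no edge from its argument to its return value.
EDGES = [
    ("userInput", "parse"),
    ("parse", "lib.transform:arg"),
    ("lib.transform:ret", "buildQuery"),
    ("buildQuery", "db.execute"),
]

forward, backward = why_not(EDGES, "userInput", "db.execute")
print(sorted(forward))   # nodes the taint reaches
print(sorted(backward))  # nodes that lead to the sink
```

In this toy graph the forward set stops at `lib.transform:arg` and the backward set starts at `lib.transform:ret`, which points at the library model as the place to investigate.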

Problem

Research questions and friction points this paper is trying to address.

Investigates unexpected and missing data flows in taint analysis

Enables why, why-not, and what-if questions for dataflow debugging

Improves understanding of global impact on source-sink connectivity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Question-answer style debugging interface

Speculative what-if analysis for connectivity

Visualization for global impact reasoning
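
The idea of speculative what-if analysis over global connectivity can be sketched as a before-and-after comparison of which source-sink pairs are connected. This is an illustrative assumption about how such a comparison could work, not the paper's algorithm; `connected_pairs`, `what_if`, and every node name below are hypothetical.

```python
def connected_pairs(edges, sources, sinks):
    """Return the set of (source, sink) pairs joined by some directed path."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    pairs = set()
    for source in sources:
        seen, stack = {source}, [source]
        while stack:  # iterative depth-first search from this source
            node = stack.pop()
            for nxt in graph.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        pairs |= {(source, sink) for sink in sinks if sink in seen}
    return pairs

def what_if(edges, assumed_edges, sources, sinks):
    """Return the source-sink pairs that become connected under the assumed edges."""
    before = connected_pairs(edges, sources, sinks)
    after = connected_pairs(edges + assumed_edges, sources, sinks)
    return after - before

# Hypothetical graph: sanitizeLib is a third-party library with no flow model.
EDGES = [
    ("httpParam", "sanitizeLib.in"),
    ("cookie", "sanitizeLib.in"),
    ("sanitizeLib.out", "sqlQuery"),
    ("sanitizeLib.out", "logFile"),
    ("envVar", "logFile"),
]
SOURCES = ["httpParam", "cookie", "envVar"]
SINKS = ["sqlQuery", "logFile"]

# What if the library propagates taint from its input to its output?
gained = what_if(EDGES, [("sanitizeLib.in", "sanitizeLib.out")], SOURCES, SINKS)
for pair in sorted(gained):
    print(pair)
```

A single assumed library edge here connects four new source-sink pairs at once, which is the kind of global effect that is hard to see when flows are listed one at a time.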
