ViBR: Automated Bug Replay from Video-based Reports using Vision-Language Models

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of automatically reproducing software defects from GUI screen recordings, a task where existing approaches often rely on brittle image heuristics, explicit touch annotations, or complex pre-constructed UI state graphs. The authors propose a lightweight, fully automated method that requires neither application instrumentation nor prior knowledge of the UI structure, operating solely on raw screen recordings to achieve high-fidelity defect replay. Their approach leverages CLIP embeddings for precise action boundary segmentation and integrates a vision-language model to enable region-aware GUI state comparison and guided reproduction. Experimental results demonstrate that the method successfully replays 72% of recorded defects, significantly outperforming current baselines and ablated variants, thereby eliminating the need for auxiliary metadata or customized recording setups.

📝 Abstract

Bug reports play a critical role in software maintenance by helping users convey encountered issues to developers. Recently, GUI screen capture videos have gained popularity as a bug reporting artifact due to their ease of use and ability to retain rich contextual information. However, automatically reproducing bugs from such recordings remains a significant challenge. Existing methods often rely on fragile image-processing heuristics, explicit touch indicators, or pre-constructed UI transition graphs, which require non-trivial instrumentation and app-specific setup. This paper presents ViBR, a lightweight and fully automated approach that reproduces bugs directly from GUI recordings. Specifically, ViBR combines CLIP-based embedding similarity for action boundary segmentation with Vision-Language Models (VLMs) for region-aware GUI state comparison and guided bug replay. Experimental results show that ViBR successfully reproduces 72% of bug recordings, significantly outperforming state-of-the-art baselines and ablation variants.
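The abstract names embedding similarity as the signal for action boundary segmentation but gives no algorithmic detail. The following is a minimal sketch of one way such a step could work, not the authors' implementation. It assumes each frame has already been mapped to an embedding vector (for example by a CLIP image encoder) and marks a boundary wherever the cosine similarity between consecutive frames drops below a cutoff. The function names and the `threshold` value of 0.9 are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_action_boundaries(
    frame_embeddings: Sequence[Sequence[float]],
    threshold: float = 0.9,  # hypothetical cutoff, not taken from the paper
) -> List[int]:
    """Return indices i where frame i differs sharply from frame i - 1."""
    boundaries = []
    for i in range(1, len(frame_embeddings)):
        sim = cosine_similarity(frame_embeddings[i - 1], frame_embeddings[i])
        if sim < threshold:
            boundaries.append(i)
    return boundaries


def split_segments(num_frames: int, boundaries: Sequence[int]) -> List[Tuple[int, int]]:
    """Turn boundary indices into half-open (start, end) frame ranges."""
    segments = []
    start = 0
    for b in boundaries:
        segments.append((start, b))
        start = b
    if start < num_frames:
        segments.append((start, num_frames))
    return segments


# Toy 2-D vectors standing in for real frame embeddings (assumption).
embeddings = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0], [1.0, 1.0]]
boundaries = find_action_boundaries(embeddings)
segments = split_segments(len(embeddings), boundaries)
```

Each resulting segment would then correspond to one stable GUI state between user actions. In a pipeline like the one the abstract describes, representative frames from adjacent segments could be passed to a VLM for region-aware comparison. A fixed threshold is the simplest choice here; an adaptive cutoff may be more robust for recordings with animations or scrolling.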

Problem

Research questions and friction points this paper is trying to address.

bug replay

video-based bug reports

GUI recording

automated bug reproduction

software maintenance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language Models

Bug Replay

GUI Recording