Visual Embedding of Screen Sequences for User-Flow Search in Example-driven Communication

📅 2025-03-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
UX practitioners often struggle to find user flows (screen sequences that represent a user task) that semantically match what they want to convey, which hinders example-based communication with design and development teams. To address this, we propose a vision-language cross-modal retrieval method tailored to user flows. Our approach uses contrastive learning to jointly embed screen images (with visual features extracted via a CNN or ViT) and natural-language task descriptions, so that semantically consistent screen sequences can be retrieved from a natural-language query. In a survey-based relevance evaluation, the user flows retrieved by our method aligned better with human perceptions of relevance than those of baseline approaches, supporting both the effectiveness of visual embeddings for modeling user flows and their practical utility in UX workflows.

📝 Abstract
Effective communication of UX considerations to stakeholders (e.g., designers and developers) is a critical challenge for UX practitioners. To explore this problem, we interviewed four UX practitioners about their communication challenges and strategies. Our study identifies that providing an example user flow (a screen sequence representing a semantic task) as evidence reinforces communication, yet finding relevant examples remains challenging. To address this, we propose a method to systematically retrieve user flows using semantic embedding. Specifically, we design a model that learns to associate screens' visual features with user flow descriptions through contrastive learning. A survey confirms that our approach retrieves user flows better aligned with human perceptions of relevance. We analyze the results and discuss implications for the computational representation of user flows.
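The contrastive objective mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss, so this assumes a symmetric InfoNCE-style loss of the kind commonly used for image-text alignment; the function names, the temperature value, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cross_entropy_row(logits, target_index):
    """Cross-entropy of one row of logits against the index of the matching item."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target_index]

def contrastive_loss(flow_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss (an assumption, not the paper's stated loss).

    flow_embs[i] and text_embs[i] are assumed to describe the same user flow;
    every other pairing in the batch is treated as a negative.
    """
    flows = [l2_normalize(v) for v in flow_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(flows)
    # sims[i][j] = scaled cosine similarity between flow i and description j
    sims = [[sum(a * b for a, b in zip(f, t)) / temperature for t in texts]
            for f in flows]
    # Flow-to-text direction: each flow should pick its own description.
    flow_to_text = sum(cross_entropy_row(sims[i], i) for i in range(n)) / n
    # Text-to-flow direction: each description should pick its own flow.
    text_to_flow = sum(
        cross_entropy_row([sims[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (flow_to_text + text_to_flow)

# Toy 2-D embeddings: matched pairs give a low loss, swapped pairs a high one.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

Minimizing a loss of this shape pulls each user flow toward its own description and pushes it away from the other descriptions in the batch, which is what makes nearest-neighbor search in the shared space meaningful.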
Problem

Research questions and friction points this paper is trying to address.

Challenges in communicating UX considerations effectively to stakeholders.
Difficulty in finding relevant example user flows for communication.
Need for a systematic method to retrieve user flows using semantic embedding.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic embedding for user-flow retrieval
Contrastive learning that associates screens' visual features with user flow descriptions
Retrieved user flows that align better with human perceptions of relevance
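Retrieval over such embeddings could look like the sketch below. It assumes that a user flow is represented by the mean of its screen embeddings and that candidates are ranked by cosine similarity to the query embedding; the source does not state how screens are aggregated or scored, so both choices, along with the helper names and the toy vectors, are illustrative assumptions.

```python
import math

def mean_pool(screen_embs):
    """Average per-screen vectors into one flow vector (assumed aggregation)."""
    dim = len(screen_embs[0])
    count = len(screen_embs)
    return [sum(screen[d] for screen in screen_embs) / count for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_flows(query_emb, flows):
    """Return (flow name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine(query_emb, mean_pool(screens)))
              for name, screens in flows.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical 3-D screen embeddings for three user flows, two screens each.
flows = {
    "checkout": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "signup": [[0.1, 0.9, 0.0], [0.0, 0.8, 0.2]],
    "search": [[0.0, 0.1, 0.9], [0.1, 0.0, 0.8]],
}
# Hypothetical embedding of a query describing a checkout task.
query = [1.0, 0.1, 0.0]

ranking = rank_flows(query, flows)
for name, score in ranking:
    print(name, round(score, 3))
```

Mean pooling ignores screen order, so a sequence-aware aggregator may be a better fit where the order of screens carries meaning.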
Daeheon Jeong
School of Computing, KAIST
Hyehyun Chu
Master's student @ School of Computing, KAIST
HCI, HAI