Trust in Vision-Language Models: Insights from a Participatory User Workshop

📅 2025-11-17

📈 Citations: 0

✨ Influential: 0

career value

222K/year

🤖 AI Summary

To address the lack of clarity regarding user trust mechanisms in vision-language models (VLMs), this study pioneers a participatory workshop methodology, systematically collecting contextualized trust perception data through iterative, real-user interactions and co-design activities. Departing from conventional AI-performance–centric evaluation paradigms, it adopts an empirically grounded, user-centered perspective to develop a trust assessment framework structured along four dimensions: task context, explainability, controllability, and error recovery. The research identifies key dynamic factors shaping trust evolution—particularly explanation consistency and feedback timeliness—and proposes a scalable, qualitative–design–integrated methodology. This work provides both theoretical foundations and actionable design guidelines for trust-oriented VLM interfaces, advancing trustworthy AI research from a technology-centric to a human-centered paradigm.

Technology Category

Application Category

📝 Abstract

With the growing deployment of Vision-Language Models (VLMs), pre-trained on large image-text and video-text datasets, it is critical to equip users with the tools to discern when to trust these systems. However, examining how user trust in VLMs builds and evolves remains an open problem. This problem is exacerbated by the increasing reliance on AI models as judges for experimental validation, to bypass the cost and implications of running participatory design studies directly with users. Following a user-centred approach, this paper presents preliminary results from a workshop with prospective VLM users. Insights from this pilot workshop inform future studies aimed at contextualising trust metrics and strategies for participants' engagement to fit the case of user-VLM interaction.

Problem

Research questions and friction points this paper is trying to address.

Understanding how user trust in Vision-Language Models builds and evolves

Addressing reliance on AI models instead of participatory user studies

Developing contextualized trust metrics for user-VLM interaction strategies

Innovation

Methods, ideas, or system contributions that make the work stand out.

User-centered participatory workshop for trust evaluation

Contextualizing trust metrics for VLM interaction

Engaging participants to inform future trust studies

🔎 Similar Papers

Pre-trained Vision-Language Models Learn Discoverable Visual Concepts