🤖 AI Summary
This study investigates how the perceived expertise of conversational AI assistants influences UX evaluators' analytic strategies, trust dynamics, and task efficiency during multi-turn collaborative sessions. In a five-session experiment, twelve professional UX evaluators collaborated with an AI assistant styled as either "novice" or "expert." A mixed-methods analysis integrating behavioral logs, subjective rating scales, and semi-structured interviews showed that evaluators ultimately rated the expert-styled assistant as significantly more efficient, trustworthy, and comprehensive. Trust followed a non-linear trajectory — an initial novelty boost, a subsequent dip, and a gradual recovery — and evaluators shifted from two-pass video review in early sessions to single-pass completion in later ones. These findings suggest that AI expertise should be dynamically calibrated across task phases to support effective human-AI collaboration.
📝 Abstract
AI-assisted usability analysis can potentially reduce the time and effort of finding usability problems, yet little is known about how an AI's perceived expertise influences evaluators' analytic strategies and perceptions over time. We ran a within-subjects, five-session study (six hours per participant) with 12 professional UX evaluators who worked with two conversational assistants (CAs) designed to appear novice- or expert-like (differing in suggestion quantity and response accuracy). We logged behavioral measures (number of passes, suggestion acceptance rate), collected subjective ratings (trust, perceived efficiency), and conducted semi-structured interviews. Participants experienced an initial novelty effect and a subsequent dip in trust that recovered over time. Their efficiency improved as they shifted from a two-pass to a one-pass video inspection approach. Evaluators ultimately rated the expert-like CA as significantly more efficient, trustworthy, and comprehensive, despite not perceiving expertise differences early on. We conclude with design implications for adapting AI expertise to enable calibrated human-AI collaboration.