🤖 AI Summary
Traditional UI usability evaluation is resource-intensive, expert-dependent, and thus inaccessible to small organizations.
Method: This paper proposes an automated evaluation framework based on multimodal large language models (MLLMs), formalizing usability assessment as a three-stage recommendation task—problem identification, severity ranking, and improvement suggestion generation—and jointly modeling interface text, visual screenshots, and DOM structure without manual annotation.
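The three-stage formulation can be illustrated with a minimal, hypothetical sketch. Everything here is an illustrative assumption, not the paper's actual API: `ask_mllm` stands in for any multimodal-LLM call that accepts interface text, a screenshot, and DOM structure.

```python
# Hypothetical sketch of the three-stage recommendation pipeline.
# `ask_mllm` is a placeholder for a multimodal LLM call; all names
# are illustrative assumptions, not the paper's implementation.

def evaluate_ui(ui_text, screenshot, dom, ask_mllm):
    # Stage 1: problem identification over text, visual, and DOM modalities
    issues = ask_mllm("identify_issues", ui_text, screenshot, dom)
    # Stage 2: severity ranking of the identified issues
    ranked = ask_mllm("rank_by_severity", issues)
    # Stage 3: improvement-suggestion generation, one per ranked issue
    suggestions = [ask_mllm("suggest_fix", issue) for issue in ranked]
    return list(zip(ranked, suggestions))
```

Because the model is passed in as a callable, each stage can be swapped or evaluated independently, which mirrors how the paper compares each stage against expert assessments.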
Contribution/Results: The end-to-end method achieves human-expert-level performance: high problem-identification accuracy, strong agreement with expert severity rankings (Kendall's τ = 0.78), and high-quality suggestions (expert-rated 4.2/5.0). By eliminating reliance on domain experts and labeled data, it significantly lowers the barrier to usability evaluation, improving the accessibility and scalability of usability engineering.
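The Kendall's τ figure reported above measures rank agreement between two severity orderings of the same issues. As a self-contained illustration (with made-up ranks, not the paper's data), τ can be computed from concordant and discordant pairs:

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a: rank agreement between two orderings (1 = identical)."""
    n = len(rank_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if s > 0:
            concordant += 1   # both raters order this pair the same way
        elif s < 0:
            discordant += 1   # the raters disagree on this pair
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical severity ranks for five usability issues (1 = most severe):
llm_ranks    = [1, 2, 3, 4, 5]
expert_ranks = [1, 3, 2, 4, 5]
print(kendall_tau(llm_ranks, expert_ranks))  # → 0.8
```

A single swapped pair among five issues already drops τ to 0.8, so the reported 0.78 indicates the LLM and expert orderings disagree on only a small fraction of issue pairs.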
📝 Abstract
Usability describes a set of essential quality attributes of user interfaces (UIs) that influence human-computer interaction. Common evaluation methods, such as usability testing and inspection, are effective but resource-intensive and require expert involvement. This makes them less accessible for smaller organizations. Recent advances in multimodal LLMs offer promising opportunities to partially automate usability evaluation by analyzing textual, visual, and structural aspects of software interfaces. To investigate this possibility, we formulate usability evaluation as a recommendation task, where multimodal LLMs rank usability issues by severity. We conducted an initial proof-of-concept study to compare LLM-generated usability improvement recommendations with usability expert assessments. Our findings indicate the potential of LLMs to enable faster and more cost-effective usability evaluation, making them a practical alternative in contexts with limited expert resources.