Measuring and identifying factors of individuals' trust in Large Language Models

📅 2025-02-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the lack of validated instruments for measuring human trust in large language models (LLMs) in conversational contexts. To bridge this gap, we propose the Trust-In-LLMs Index (TILLMI), a two-dimensional cognitive-affective trust measurement framework that extends McAllister’s trust theory to LLM interactions. The scale was prototyped with a novel protocol called LLM-simulated validity and then validated in a large-scale survey (N = 1,000), yielding a parsimonious six-item scale with two interpretable factors: “closeness” and “reliance.” Psychometric evaluation, including exploratory and confirmatory factor analysis (EFA/CFA) alongside assessments of personality (BFI) and cognitive flexibility, supports TILLMI’s validity, with strong model fit (CFI = .995, RMSEA = .046). Trust levels also differ by age, gender, and LLM usage experience. Trust correlates positively with openness, extraversion, and cognitive flexibility, and negatively with neuroticism.

📝 Abstract

Large Language Models (LLMs) can engage in human-like conversational exchanges. Although conversations can elicit trust between users and LLMs, little empirical research has examined trust formation in human-LLM contexts, beyond LLMs' trustworthiness or human trust in AI in general. Here, we introduce the Trust-In-LLMs Index (TILLMI) as a new framework to measure individuals' trust in LLMs, extending McAllister's cognitive and affective trust dimensions to LLM-human interactions. We developed TILLMI as a psychometric scale, prototyped with a novel protocol we called LLM-simulated validity. The LLM-based scale was then validated in a sample of 1,000 US respondents. Exploratory Factor Analysis identified a two-factor structure. Two items were then removed due to redundancy, yielding a final 6-item scale with a 2-factor structure. Confirmatory Factor Analysis on a separate subsample showed strong model fit ($CFI = .995$, $TLI = .991$, $RMSEA = .046$, $p_{\chi^2} > .05$). Convergent validity analysis revealed that trust in LLMs correlated positively with openness to experience, extraversion, and cognitive flexibility, but negatively with neuroticism. Based on these findings, we interpreted TILLMI's factors as "closeness with LLMs" (affective dimension) and "reliance on LLMs" (cognitive dimension). Younger men exhibited higher closeness with and reliance on LLMs than older women. Individuals with no direct experience with LLMs exhibited lower levels of trust than LLM users. These findings offer a novel empirical foundation for measuring trust in AI-driven verbal communication, informing responsible design, and fostering balanced human-AI collaboration.
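
For readers unfamiliar with the fit indices quoted above, the following is a minimal sketch of how CFI, TLI, and RMSEA are commonly computed from chi-square statistics. It uses the standard textbook formulas, not code from the paper. The function names and all numeric inputs are hypothetical and chosen only for illustration; the source does not report the underlying chi-square values, degrees of freedom, or CFA subsample size. Some software divides by N rather than N - 1 in RMSEA, so small discrepancies between tools are expected.

```python
import math


def rmsea(chi2_model: float, df_model: int, n: int) -> float:
    """Point estimate of RMSEA, using the (N - 1) convention."""
    return math.sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))


def cfi(chi2_model: float, df_model: int, chi2_base: float, df_base: int) -> float:
    """Comparative Fit Index: 1 minus the ratio of model to baseline noncentrality."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    return 1.0 if denom == 0.0 else 1.0 - d_model / denom


def tli(chi2_model: float, df_model: int, chi2_base: float, df_base: int) -> float:
    """Tucker-Lewis Index, based on chi-square/df ratios of baseline and model."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (not taken from the paper).
    chi2_m, df_m = 12.0, 8      # fitted model
    chi2_b, df_b = 900.0, 15    # baseline (independence) model
    n = 500                     # assumed sample size

    print(f"RMSEA = {rmsea(chi2_m, df_m, n):.3f}")
    print(f"CFI   = {cfi(chi2_m, df_m, chi2_b, df_b):.3f}")
    print(f"TLI   = {tli(chi2_m, df_m, chi2_b, df_b):.3f}")
```

By common rules of thumb, CFI and TLI values near or above .95 and RMSEA values below roughly .06 are read as good fit, which is consistent with the abstract's description of the reported values as strong model fit.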

Problem

Research questions and friction points this paper is trying to address.

Developed Trust-In-LLMs Index (TILLMI) to measure trust in Large Language Models.

Identified two trust dimensions: closeness with and reliance on LLMs.

Validated TILLMI with 1,000 US respondents, showing strong model fit.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed the Trust-In-LLMs Index (TILLMI) framework.

Used an LLM-simulated validity protocol to prototype the psychometric scale.

Validated the scale with 1,000 US respondents.
