🤖 AI Summary
Quantifying and establishing trust in machine learning systems remains challenging for non-expert users. Method: This paper introduces the "Leap of Faith" (LoF) theoretical framework, which defines trust as the degree of cognitive alignment between a user's mental model and the system's observable behavior, thereby enabling an objective, measurable intrinsic trust metric grounded in behavioral evidence rather than subjective self-reports. We propose the LoF matrix and a neuro-symbolic hybrid architecture that uses expert-validated rule-based agents as cognitive anchors to jointly assess mental-model alignment and behavioral trust indicators. Contribution/Results: Evaluated in a three-month field study of a multi-agent sleep-improvement system, the approach supports dynamic visualization of trust states and quantification of trust "leap distances," improving the interpretability and practical deployability of trustworthy ML in high-stakes settings.
📝 Abstract
Human trust is a prerequisite for trustworthy AI adoption, yet trust remains poorly understood. Trust is often described as an attitude, but attitudes cannot be reliably measured or managed. Additionally, humans frequently conflate trust in an AI system, in its machine learning (ML) technology, and in its other component parts. Without fully understanding the 'leap of faith' involved in trusting ML, users cannot develop intrinsic trust in these systems. A common approach to building trust is to explain an ML model's reasoning process. However, such explanations often fail to resonate with non-experts, both because of the inherent complexity of ML systems and because the explanations are disconnected from users' own (unarticulated) mental models. This work puts forward an innovative way of directly building intrinsic trust in ML by discerning and measuring the Leap of Faith (LoF) taken when a user decides to rely on ML. The LoF matrix captures the alignment between an ML model and a human expert's mental model. This alignment is rigorously and practically identified by feeding the user's data and objective function into both an ML agent and an expert-validated rules-based agent: a verified point of reference that can be tested a priori against a user's own mental model. This represents a new class of neuro-symbolic architecture. The LoF matrix reveals to the user the distance that constitutes the leap of faith between the rules-based and ML agents. For the first time, we propose trust metrics that evaluate whether users demonstrate trust through their actions rather than self-reported intent, and whether such trust is deserved based on outcomes. The significance of the contribution is that it enables empirical assessment and management of ML trust drivers, supporting trustworthy ML adoption. The approach is illustrated through a long-term, high-stakes field study: a 3-month pilot of a multi-agent sleep-improvement system.
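To make the comparison concrete, the core idea of running the same inputs through both a rules-based reference agent and an ML agent, cross-tabulating their recommendations, and reporting a "leap distance" can be sketched as below. This is a minimal illustrative sketch only: the paper's actual LoF matrix construction and distance metric are not specified here, and all names, the toy agents, and the disagreement-rate proxy are hypothetical.

```python
# Hypothetical sketch of the LoF comparison; not the paper's implementation.
from collections import Counter

def lof_matrix(rule_agent, ml_agent, cases):
    """Cross-tabulate the two agents' recommendations over shared inputs."""
    matrix = Counter()
    for case in cases:
        matrix[(rule_agent(case), ml_agent(case))] += 1
    return matrix

def leap_distance(matrix):
    """Fraction of cases where the ML agent departs from the rules-based
    reference: a crude proxy for the 'leap of faith' asked of the user."""
    total = sum(matrix.values())
    disagreements = sum(n for (r, m), n in matrix.items() if r != m)
    return disagreements / total if total else 0.0

# Toy agents keyed on hours awake: a fixed expert rule vs. a stand-in
# "learned" policy with a slightly different threshold.
rule_agent = lambda hours_awake: "advance bedtime" if hours_awake > 16 else "keep schedule"
ml_agent = lambda hours_awake: "advance bedtime" if hours_awake > 14 else "keep schedule"

cases = [12, 13, 15, 17, 18]
print(leap_distance(lof_matrix(rule_agent, ml_agent, cases)))  # prints 0.2
```

Off-diagonal cells of the matrix are exactly the cases where relying on the ML agent requires departing from the expert-validated reference, which is the gap the LoF framework aims to surface to the user.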