🤖 AI Summary
Fine-grained, real-time trust estimation remains challenging in human-robot collaboration due to the coarse, binary nature of conventional task-outcome feedback. Method: This paper proposes a continuous-reward-driven Beta reputation model that performs online Bayesian inference using per-step scalar reward signals, enabling millisecond-level dynamic trust modeling. It uniquely integrates maximum-entropy reinforcement learning to automatically synthesize task-adaptive reward functions, eliminating reliance on handcrafted metrics. Contribution/Results: Experiments demonstrate significant improvements in both estimation accuracy and temporal responsiveness over baseline methods. The framework is validated across multiple benchmark human-robot collaboration tasks, confirming its generalizability and effectiveness. The implementation is publicly available.
📝 Abstract
When interacting with each other, humans adjust their behavior based on perceived trust. To achieve similar adaptability, robots must accurately estimate human trust at sufficiently granular timescales during human-robot collaboration. A beta reputation model is a popular way to formalize a mathematical estimation of human trust. However, it relies on binary performance outcomes, which update trust estimations only after each task concludes. Additionally, a performance indicator is usually built by manually crafting a reward function, which is labor-intensive and time-consuming. These limitations prevent the efficient capture of continuous changes in trust at granular timescales throughout a collaboration task. Therefore, this paper presents a new framework for estimating human trust using a beta reputation at fine-grained timescales. To achieve this granularity, we utilize continuous reward values to update trust estimations at each timestep of a task. We construct the continuous reward function using maximum entropy optimization, eliminating the need for laborious specification of a performance indicator. The proposed framework improves trust estimation accuracy, removes the need for manually crafting a reward function, and advances the development of more intelligent robots. The source code is publicly available at https://github.com/resuldagdanov/robot-learning-human-trust
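To make the core idea concrete, the per-timestep update described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it assumes a scalar reward in [0, 1] at each timestep, split into fractional positive and negative evidence that updates a Beta(α, β) trust belief, so trust changes continuously within a task rather than only at task completion. The class and method names are hypothetical.

```python
class BetaTrustEstimator:
    """Sketch of a continuous-reward beta reputation trust update.

    Hypothetical illustration; the paper's actual update rule and the
    learned maximum-entropy reward function are not reproduced here.
    """

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        self.alpha = alpha  # accumulated positive evidence
        self.beta = beta    # accumulated negative evidence

    def update(self, reward: float) -> float:
        # Instead of waiting for a binary task outcome, fractional
        # evidence is added at every timestep, so the trust estimate
        # evolves continuously during the collaboration task.
        r = min(max(reward, 0.0), 1.0)  # clamp reward to [0, 1]
        self.alpha += r
        self.beta += 1.0 - r
        return self.trust()

    def trust(self) -> float:
        # Mean of the Beta posterior serves as the trust estimate.
        return self.alpha / (self.alpha + self.beta)


# Usage: feed per-timestep rewards and read out the evolving estimate.
estimator = BetaTrustEstimator()
for step_reward in [0.9, 0.8, 0.2, 0.95]:
    current_trust = estimator.update(step_reward)
```

In a binary-outcome beta reputation, α or β would increment by 1 only once per task; here each timestep contributes fractional evidence, which is what enables the fine-grained trust trajectory the abstract describes.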