🤖 AI Summary
Traditional evaluation metrics (e.g., accuracy) fail to capture the relationship between predictive uncertainty and actual reliability in safety-critical applications. To address this, the authors propose PaTAS, a Parallel Trust Assessment System that establishes an independent trust propagation channel, based on Subjective Logic, alongside the main neural network. PaTAS introduces Trust Nodes and Trust Functions to quantify the trustworthiness of inputs, parameters, and activations in parallel with forward inference. It further proposes (i) a Parameter Trust Update mechanism that refines parameter trust during training and (ii) an instance-level Inference-Path Trust Assessment (IPTA) method, enabling fine-grained, interpretable trust estimates for individual predictions. Experiments show that PaTAS yields trust estimates that are symmetric, convergent, and interpretable. It distinguishes benign from adversarial examples and flags unreliable predictions under poisoned, biased, or uncertain data, exposing cases where confidence scores diverge from true reliability and thereby supporting AI transparency, robustness, and safety.
📝 Abstract
Trustworthiness has become a key requirement for the deployment of artificial intelligence systems in safety-critical applications. Conventional evaluation metrics such as accuracy and precision fail to capture uncertainty or the reliability of model predictions, particularly under adversarial or degraded conditions. This paper introduces the Parallel Trust Assessment System (PaTAS), a framework for modeling and propagating trust in neural networks using Subjective Logic (SL). PaTAS operates in parallel with standard neural computation through Trust Nodes and Trust Functions that propagate input, parameter, and activation trust across the network. The framework defines a Parameter Trust Update mechanism to refine parameter reliability during training and an Inference-Path Trust Assessment (IPTA) method to compute instance-specific trust at inference. Experiments on real-world and adversarial datasets demonstrate that PaTAS produces interpretable, symmetric, and convergent trust estimates that complement accuracy and expose reliability gaps in poisoned, biased, or uncertain data scenarios. The results show that PaTAS effectively distinguishes between benign and adversarial inputs and identifies cases where model confidence diverges from actual reliability. By enabling transparent and quantifiable trust reasoning within neural architectures, PaTAS provides a principled foundation for evaluating model reliability across the AI lifecycle.
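To make the Subjective Logic vocabulary concrete, the sketch below shows a binomial SL opinion (belief, disbelief, uncertainty, base rate) and the classic SL trust-discounting operator. It is only an illustration of the kind of quantity a trust channel could carry: the abstract does not specify PaTAS's Trust Functions, so `Opinion`, `discount`, and the example values are hypothetical, and discounting is used here as a stand-in rather than as the paper's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Binomial Subjective Logic opinion (hypothetical helper, not from the paper)."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def __post_init__(self):
        # SL requires belief + disbelief + uncertainty = 1.
        total = self.belief + self.disbelief + self.uncertainty
        if abs(total - 1.0) > 1e-9:
            raise ValueError("belief + disbelief + uncertainty must equal 1")

    def projected_probability(self) -> float:
        # Projected probability: belief plus the base-rate share of uncertainty.
        return self.belief + self.base_rate * self.uncertainty


def discount(trust_in_source: Opinion, source_opinion: Opinion) -> Opinion:
    """Classic SL trust discounting: weaken source_opinion by trust in the source.

    Assumption: used here only to illustrate how trust in one element
    (e.g. a parameter) could scale trust passed through it.
    """
    b = trust_in_source.belief * source_opinion.belief
    d = trust_in_source.belief * source_opinion.disbelief
    u = (trust_in_source.disbelief
         + trust_in_source.uncertainty
         + trust_in_source.belief * source_opinion.uncertainty)
    return Opinion(b, d, u, source_opinion.base_rate)


# Illustrative values only: trust in a parameter and trust in an input.
parameter_trust = Opinion(belief=0.8, disbelief=0.1, uncertainty=0.1)
input_trust = Opinion(belief=0.6, disbelief=0.2, uncertainty=0.2)
propagated = discount(parameter_trust, input_trust)
```

In this sketch, lower trust in the parameter pushes mass from belief and disbelief into uncertainty, so a prediction that depends on poorly trusted parameters ends up with a more uncertain opinion even when the input itself is well trusted.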