Assessing Trustworthiness of AI Training Dataset using Subjective Logic -- A Use Case on Bias

📅 2025-08-19

🤖 AI Summary

This work addresses the challenge of reliably assessing the trustworthiness of AI training datasets under uncertainty, particularly for properties such as fairness and bias that emerge only at the level of the dataset as a whole. It proposes a formal dataset-level trustworthiness evaluation framework grounded in Subjective Logic, a formalism that prior work had applied only to individual data items. The framework supports trust propositions and quantifies uncertainty in bias estimation when evidence is incomplete, distributed, or conflicting. Instantiated for bias and evaluated on a traffic sign recognition dataset, it captures class imbalance and remains interpretable and robust in both centralized and federated learning settings.

📝 Abstract

As AI systems increasingly rely on training data, assessing dataset trustworthiness has become critical, particularly for properties like fairness or bias that emerge at the dataset level. Prior work has used Subjective Logic to assess trustworthiness of individual data, but not to evaluate trustworthiness properties that emerge only at the level of the dataset as a whole. This paper introduces the first formal framework for assessing the trustworthiness of AI training datasets, enabling uncertainty-aware evaluations of global properties such as bias. Built on Subjective Logic, our approach supports trust propositions and quantifies uncertainty in scenarios where evidence is incomplete, distributed, and/or conflicting. We instantiate this framework on the trustworthiness property of bias, and we experimentally evaluate it based on a traffic sign recognition dataset. The results demonstrate that our method captures class imbalance and remains interpretable and robust in both centralized and federated contexts.
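To make the Subjective Logic vocabulary above concrete, the following is a minimal sketch of a binomial opinion built from evidence counts and of cumulative fusion, the standard Subjective Logic operator for combining independent evidence sources. It is not the paper's implementation: the names `Opinion`, `opinion_from_evidence`, and `cumulative_fusion` are hypothetical, and how the paper derives positive and negative evidence for a bias proposition is not stated in the abstract, so the counts here are purely illustrative.

```python
from dataclasses import dataclass

# Non-informative prior weight; W = 2 is the usual Subjective Logic default.
W = 2.0


@dataclass(frozen=True)
class Opinion:
    """Binomial opinion: belief + disbelief + uncertainty = 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def projected_probability(self) -> float:
        # Uncertainty mass is distributed according to the base rate.
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(positive: float, negative: float,
                          base_rate: float = 0.5) -> Opinion:
    """Map evidence counts to an opinion (illustrative counts, an assumption)."""
    total = positive + negative + W
    return Opinion(positive / total, negative / total, W / total, base_rate)


def cumulative_fusion(a: Opinion, b: Opinion) -> Opinion:
    """Fuse two opinions from independent sources about the same proposition.

    Assumes both opinions share a base rate and have non-zero uncertainty.
    """
    denom = a.uncertainty + b.uncertainty - a.uncertainty * b.uncertainty
    if denom == 0:
        raise ValueError("fusion of two dogmatic opinions is not handled here")
    return Opinion(
        (a.belief * b.uncertainty + b.belief * a.uncertainty) / denom,
        (a.disbelief * b.uncertainty + b.disbelief * a.uncertainty) / denom,
        (a.uncertainty * b.uncertainty) / denom,
        a.base_rate,
    )


# Two hypothetical evidence sources (e.g. two clients in a federated setting).
source_a = opinion_from_evidence(positive=8, negative=2)
source_b = opinion_from_evidence(positive=1, negative=3)
fused = cumulative_fusion(source_a, source_b)
```

Under these assumptions, fusing the two opinions gives the same result as pooling the evidence into a single opinion, and the fused uncertainty is lower than that of either source. Conflicting sources shift belief and disbelief toward each other rather than cancelling silently.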

Problem

Research questions and friction points this paper is trying to address.

Assessing AI training dataset trustworthiness for bias

Evaluating global fairness properties with uncertainty

Extending Subjective Logic to dataset-level trust propositions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Subjective Logic framework for dataset trustworthiness

Uncertainty-aware evaluation of global bias properties

Works in centralized and federated learning contexts
