🤖 AI Summary
Large language models (LLMs) often produce opaque and poorly calibrated uncertainty explanations in high-stakes domains such as clinical decision-making and legal reasoning, undermining trust and accountability.
Method: This paper proposes a virtue-ethics- and moral-psychology-informed, rule-based explainability framework. It formalizes moral principles—including *precaution*, *deference*, and *responsibility*—as executable logic rules, implemented via a lightweight Prolog engine. The framework dynamically aligns uncertainty levels (low/medium/high) with context-appropriate response strategies and generates semantically clear, justification-rich natural-language explanations using templated generation.
Contribution/Results: Experiments demonstrate significant improvements in rule coverage, fairness, and user trust calibration. In real-world high-risk scenarios, the approach enhances LLMs’ social responsibility and interpretability without compromising efficiency, offering a principled, human-centered mechanism for uncertainty-aware reasoning.
📝 Abstract
Large language models (LLMs) are increasingly used in high-stakes settings, where explaining uncertainty is both a technical and an ethical challenge. Probabilistic methods are often opaque and misaligned with expectations of transparency. We propose a framework based on rule-based moral principles for handling uncertainty in LLM-generated text. Drawing on moral psychology and virtue ethics, we define rules such as precaution, deference, and responsibility to guide responses under epistemic or aleatoric uncertainty. These rules are encoded in a lightweight Prolog engine, where uncertainty levels (low, medium, high) trigger aligned system actions with plain-language rationales. Scenario-based simulations benchmark rule coverage, fairness, and trust calibration. Use cases in clinical and legal domains illustrate how moral reasoning can improve trust and interpretability. Our approach offers a transparent, lightweight alternative to probabilistic models for socially responsible natural language generation.
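The level-to-action mapping described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the paper's Prolog engine; the rule names (*precaution*, *deference*, *responsibility*) come from the abstract, but the specific actions, their assignment to uncertainty levels, and the explanation templates are assumptions for illustration, not the authors' actual rules.

```python
# Illustrative sketch of a rule-based uncertainty-to-action mapping with
# templated rationales. The actions and templates below are hypothetical;
# only the principle names and the low/medium/high levels come from the paper.
RULES = {
    "high": (
        "withhold_and_escalate", "precaution",
        "Uncertainty is high, so the system withholds a direct answer "
        "and escalates to a human expert as a precaution.",
    ),
    "medium": (
        "recommend_expert_review", "deference",
        "Uncertainty is moderate, so the system answers tentatively "
        "and defers to expert review.",
    ),
    "low": (
        "answer_with_justification", "responsibility",
        "Uncertainty is low, so the system answers and takes "
        "responsibility by citing its reasoning.",
    ),
}

def explain(uncertainty_level: str) -> dict:
    """Map an uncertainty level to an aligned action, the moral principle
    that licenses it, and a plain-language templated rationale."""
    action, principle, rationale = RULES[uncertainty_level]
    return {"action": action, "principle": principle, "rationale": rationale}

print(explain("high")["action"])  # withhold_and_escalate
```

In the paper's actual system this lookup would be a Prolog clause (so rules stay declarative and auditable), but the shape of the mapping is the same: a symbolic uncertainty level keys a (action, principle, rationale) triple rather than a probability.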