SafeChat: A Framework for Building Trustworthy Collaborative Assistants and a Case Study of its Usefulness

📅 2025-04-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current LLM-based chatbots face critical challenges in high-stakes domains (e.g., elections, healthcare): uninterpretable responses, poor content controllability, no standardized trustworthiness evaluation, and high development barriers. To address these, the authors propose SafeChat, a framework for building safe, trustworthy collaborative assistants for high-trust scenarios. Its dual safety mechanism—"response provenance + do-not-respond"—makes responses traceable to approved sources, supports real-time trust quantification, and enforces strict content control in a domain-agnostic way. The framework integrates retrieval-augmented generation, extractive summarization, rule-based validation, and LLM-assisted verification, complemented by a CSV-driven low-code development workflow and an automated trustworthiness-evaluation pipeline. Built on the open-source Rasa platform, it has been deployed in applications such as ElectionBot-SC across multiple sensitive domains, demonstrably improving safety and user trust. The implementation is open source and in use across many domains.
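The "provenance + do-not-respond" idea can be sketched as a simple retrieval gate: answer only when the query matches approved content well enough, and attach the source; otherwise refuse. The names, the word-overlap scoring heuristic, and the threshold below are illustrative assumptions for this summary, not SafeChat's actual API.

```python
# Minimal sketch of a dual-safety response gate: grounded answers with
# provenance, and a do-not-respond fallback for out-of-scope queries.

APPROVED_SOURCES = {
    "When is the registration deadline?": (
        "Registration closes 30 days before election day.",
        "https://example.gov/voter-guide",  # provenance: approved source URL (hypothetical)
    ),
}

def word_overlap(query: str, candidate: str) -> float:
    """Fraction of query words that also appear in a candidate question."""
    qw, cw = set(query.lower().split()), set(candidate.lower().split())
    return len(qw & cw) / max(len(qw), 1)

def answer(query: str, threshold: float = 0.5):
    """Return (answer, source) from approved content, or a refusal with no source."""
    best = max(APPROVED_SOURCES, key=lambda q: word_overlap(query, q))
    if word_overlap(query, best) < threshold:
        # Do-not-respond strategy: refuse rather than risk an ungrounded answer.
        return ("I can only answer from approved sources; please rephrase.", None)
    text, source = APPROVED_SOURCES[best]
    return (text, source)
```

A real deployment would replace the overlap heuristic with the paper's retrieval-augmented pipeline, but the control flow—ground, trace, or refuse—is the same.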

📝 Abstract
Collaborative assistants, or chatbots, are data-driven decision support systems that enable natural interaction for task completion. While they can meet critical needs in modern society, concerns about their reliability and trustworthiness persist. In particular, Large Language Model (LLM)-based chatbots like ChatGPT, Gemini, and DeepSeek are becoming more accessible. However, such chatbots have limitations, including their inability to explain response generation, the risk of generating problematic content, the lack of standardized testing for reliability, and the need for deep AI expertise and extended development times. These issues make chatbots unsuitable for trust-sensitive applications like elections or healthcare. To address these concerns, we introduce SafeChat, a general architecture for building safe and trustworthy chatbots, with a focus on information retrieval use cases. Key features of SafeChat include: (a) safety, with a domain-agnostic design where responses are grounded and traceable to approved sources (provenance), and 'do-not-respond' strategies to prevent harmful answers; (b) usability, with automatic extractive summarization of long responses, traceable to their sources, and automated trust assessments to communicate expected chatbot behavior, such as sentiment; and (c) fast, scalable development, including a CSV-driven workflow, automated testing, and integration with various devices. We implemented SafeChat in an executable framework using the open-source chatbot platform Rasa. A case study demonstrates its application in building ElectionBot-SC, a chatbot designed to safely disseminate official election information. SafeChat is being used in many domains, validating its potential, and is available at: https://github.com/ai4society/trustworthy-chatbot.
Problem

Research questions and friction points this paper is trying to address.

Addressing reliability and trustworthiness concerns in LLM-based chatbots
Preventing harmful content generation and ensuring traceable responses
Enabling fast, scalable development for trust-sensitive applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain-agnostic design with traceable responses
Automated trust assessments and summarization
CSV-driven workflow for scalable development
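The CSV-driven workflow above can be sketched as a build step that turns question/answer rows into intent and response entries like those a Rasa project consumes. The column names and output shape here are assumptions for illustration, not SafeChat's actual schema.

```python
# Illustrative sketch: compile a Q/A CSV into intent/response records.
import csv
import io

CSV_DATA = """question,answer,source
When do polls open?,Polls open at 7 AM.,https://example.gov/hours
Where do I vote?,Find your polling place on the county site.,https://example.gov/where
"""

def build_domain(csv_text: str) -> list[dict]:
    """Map each CSV row to an intent name, a training example, and a sourced response."""
    domain = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        domain.append({
            "intent": f"faq_{i}",
            "example": row["question"],
            # Provenance travels with the response text.
            "response": f'{row["answer"]} (source: {row["source"]})',
        })
    return domain
```

The point of the low-code design is that domain experts edit only the CSV; intents, responses, and provenance links are regenerated automatically.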