Taxonomy of Comprehensive Safety for Clinical Agents

📅 2025-09-26
🤖 AI Summary
Clinical chatbots pose serious safety risks: an inaccurate or unsafe response can contribute to misdiagnosis or other harm. To address this, we propose TACOS, a fine-grained, 21-class safety taxonomy designed specifically for clinical conversational agents. TACOS jointly models safety control and tool invocation within user intent recognition, which allows different safety thresholds for clinical and non-clinical queries and explicit modeling of external tool dependencies. Using a TACOS-annotated dataset, we train a combined safety intent classifier and tool router built on pretrained language models. Experiments support the value of a taxonomy specialized for clinical agents, and our analysis shows that both the training data distribution and the base model's pretrained knowledge influence safety performance.

📝 Abstract
Safety is a paramount concern in clinical chatbot applications, where inaccurate or harmful responses can lead to serious consequences. Existing methods, such as guardrails and tool calling, often fall short of the nuanced demands of the clinical domain. In this paper, we introduce TACOS (TAxonomy of COmprehensive Safety for Clinical Agents), a fine-grained, 21-class taxonomy that integrates safety filtering and tool selection into a single user intent classification step. TACOS covers a wide spectrum of clinical and non-clinical queries and explicitly models varying safety thresholds and external tool dependencies. To validate our framework, we curate a TACOS-annotated dataset and perform extensive experiments. Our results demonstrate the value of a taxonomy specialized for clinical agent settings and reveal useful insights about the training data distribution and the pretrained knowledge of base models.
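The idea of folding safety filtering and tool selection into a single user intent classification step can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the three intent labels, the tool name `patient_record_search`, the `Policy` fields, and the keyword-based `toy_classifier` are hypothetical stand-ins, not the paper's actual 21 classes or its model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Policy:
    """What to do once a query has been assigned an intent class."""
    respond: bool          # False means the agent declines or deflects
    tool: Optional[str]    # external tool to call, if the class needs one


# Hypothetical labels and tool names; the real TACOS classes are not listed here.
POLICIES: Dict[str, Policy] = {
    "clinical_lookup": Policy(respond=True, tool="patient_record_search"),
    "clinical_advice_high_risk": Policy(respond=False, tool=None),
    "general_chitchat": Policy(respond=True, tool=None),
}

# Unknown labels fail closed: no response, no tool call.
FALLBACK = Policy(respond=False, tool=None)


def route(query: str, classify: Callable[[str], str]) -> Policy:
    """Run one classification step and return the matching policy."""
    return POLICIES.get(classify(query), FALLBACK)


def toy_classifier(query: str) -> str:
    """Keyword stand-in for a fine-tuned language model classifier."""
    text = query.lower()
    if "dose" in text or "dosage" in text:
        return "clinical_advice_high_risk"
    if "lab result" in text:
        return "clinical_lookup"
    return "general_chitchat"


if __name__ == "__main__":
    for q in ("What dose of warfarin should I take?",
              "Show the latest lab result",
              "Good morning!"):
        print(q, "->", route(q, toy_classifier))
```

Because one label determines both the safety decision and the tool, the safety check and the router cannot disagree about the same query. Per-class safety thresholds could be added as a further `Policy` field under the same assumptions.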
Problem

Research questions and friction points this paper is trying to address.

Develops safety taxonomy for clinical chatbot applications
Integrates safety filtering with tool selection process
Addresses nuanced safety demands in clinical domain
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-grained taxonomy for clinical safety classification
Integrates safety filtering with tool selection
Explicitly models safety thresholds and dependencies
Jean Seo, AITRICS/KAITRCS
Hyunkyung Lee, AITRICS/KAITRCS
Gibaeg Kim, AITRICS (NLP, LLM)
Wooseok Han, AITRICS/KAITRCS
Jaehyo Yoo, AITRICS/KAITRCS
Seungseop Lim, AITRICS/KAITRCS
Kihun Shin, Department of Rehabilitation Medicine, Severance Hospital, Yonsei University
Eunho Yang, KAIST (Machine Learning, Statistics)