LLM Safety for Children

📅 2025-02-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the critical gap in content safety for large language models (LLMs) used by children under 18. Methodologically, we propose the first developmentally grounded, child-specific safety evaluation framework: (1) constructing an interpretable and generalizable child user model; (2) designing a multi-dimensional harm taxonomy and role-driven adversarial prompt generation strategy; and (3) conducting systematic benchmarking across six state-of-the-art LLMs. Our key contribution is the empirical identification of child-specific safety vulnerabilities—such as self-harm induction, privacy disclosure, and cognitive misdirection—that remain undetected by conventional adult-oriented safety evaluations. Detection rates for these risks fall below 30%—substantially lower than corresponding benchmarks in adult contexts. By exposing these previously overlooked failure modes, our work bridges a foundational theoretical and practical gap in AI safety assessment for minors, providing both scientific grounding and a scalable methodology for evidence-based governance of LLMs targeting children.

Technology Category

Application Category

📝 Abstract
This paper analyzes the safety of Large Language Models (LLMs) in interactions with children below age of 18 years. Despite the transformative applications of LLMs in various aspects of children's lives such as education and therapy, there remains a significant gap in understanding and mitigating potential content harms specific to this demographic. The study acknowledges the diverse nature of children often overlooked by standard safety evaluations and proposes a comprehensive approach to evaluating LLM safety specifically for children. We list down potential risks that children may encounter when using LLM powered applications. Additionally we develop Child User Models that reflect the varied personalities and interests of children informed by literature in child care and psychology. These user models aim to bridge the existing gap in child safety literature across various fields. We utilize Child User Models to evaluate the safety of six state of the art LLMs. Our observations reveal significant safety gaps in LLMs particularly in categories harmful to children but not adults
Problem

Research questions and friction points this paper is trying to address.

Assessing LLM safety for children
Identifying risks in child interactions
Developing Child User Models for evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Child User Models development
Safety evaluation for LLMs
Mitigating content harms for children
🔎 Similar Papers
No similar papers found.
P
Prasanjit Rath
H
Hari Shrawgi
P
Parag Agrawal
Sandipan Dandapat
Sandipan Dandapat
Microsoft, India
Machine TranslationNatural Language Processing