Anonpsy: A Graph-Based Framework for Structure-Preserving De-identification of Psychiatric Narratives

📅 2026-01-20
🤖 AI Summary
This work addresses the problem that psychiatric narratives disclose patient identity not only through explicit identifiers but also implicitly, through personalized life events embedded in their clinical structure, so that conventional text-level de-identification offers limited control over what is preserved and what is altered. The authors propose a graph-guided semantic rewriting framework with three steps: first, a semantic graph is constructed to represent clinical entities, temporal anchors, and typed relations; then, graph-constrained perturbations modify identifying context while preserving clinically essential structure; finally, a graph-conditioned large language model regenerates the de-identified text. Evaluated on 90 clinician-authored psychiatric case narratives, the approach preserves diagnostic fidelity while keeping re-identification risk consistently low under expert, semantic, and GPT-5-based evaluations. Compared with a strong LLM-only rewriting baseline, it yields substantially lower semantic similarity and identifiability, and it gives explicit control over which content is retained and which is modified.

📝 Abstract
Psychiatric narratives encode patient identity not only through explicit identifiers but also through idiosyncratic life events embedded in their clinical structure. Existing de-identification approaches, including PHI masking and LLM-based synthetic rewriting, operate at the text level and offer limited control over which semantic elements are preserved or altered. We introduce Anonpsy, a de-identification framework that reformulates the task as graph-guided semantic rewriting. Anonpsy (1) converts each narrative into a semantic graph encoding clinical entities, temporal anchors, and typed relations; (2) applies graph-constrained perturbations that modify identifying context while preserving clinically essential structure; and (3) regenerates text via graph-conditioned LLM generation. Evaluated on 90 clinician-authored psychiatric case narratives, Anonpsy preserves diagnostic fidelity while achieving consistently low re-identification risk under expert, semantic, and GPT-5-based evaluations. Compared with a strong LLM-only rewriting baseline, Anonpsy yields substantially lower semantic similarity and identifiability. These results demonstrate that explicit structural representations combined with constrained generation provide an effective approach to de-identification for psychiatric narratives.
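
The three steps in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the node kinds, the `perturb` and `to_prompt` helpers, the month-based temporal anchors, and the example narrative are all hypothetical, chosen only to show how a perturbation can be constrained by a graph. Step (1), extracting the graph from text, is replaced by a hand-built graph, and step (3) stops at building a prompt rather than calling any LLM.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical node kinds treated as clinically essential (assumption, not from the paper).
CLINICAL_KINDS = {"symptom", "diagnosis", "medication"}


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str                    # e.g. "symptom", "life_event", "occupation"
    text: str
    month: Optional[int] = None  # temporal anchor, in months from first contact


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str                # typed relation, e.g. "precedes", "supports"
    target: str


@dataclass(frozen=True)
class SemanticGraph:
    nodes: tuple
    edges: tuple


def perturb(graph, substitutions, month_shift):
    """Rewrite identifying context nodes; keep clinical nodes and all edges.

    Every temporal anchor is shifted by the same offset, so the intervals
    between events are unchanged.
    """
    new_nodes = []
    for node in graph.nodes:
        text = node.text
        if node.kind not in CLINICAL_KINDS:
            # Only non-clinical nodes may be substituted.
            text = substitutions.get(node.node_id, node.text)
        month = None if node.month is None else node.month + month_shift
        new_nodes.append(replace(node, text=text, month=month))
    return SemanticGraph(tuple(new_nodes), graph.edges)


def to_prompt(graph):
    """Serialize the graph into a plain-text prompt for a text generator."""
    by_id = {n.node_id: n for n in graph.nodes}
    lines = ["Write a psychiatric case narrative stating exactly these facts:"]
    for n in graph.nodes:
        when = "" if n.month is None else f" (month {n.month})"
        lines.append(f"- [{n.kind}] {n.text}{when}")
    for e in graph.edges:
        src, dst = by_id[e.source].text, by_id[e.target].text
        lines.append(f"- {src} --{e.relation}--> {dst}")
    return "\n".join(lines)


# Fictional example graph, standing in for the output of step (1).
graph = SemanticGraph(
    nodes=(
        Node("n1", "life_event", "lost job at a shipyard", 0),
        Node("n2", "symptom", "insomnia", 2),
        Node("n3", "diagnosis", "major depressive disorder", 3),
    ),
    edges=(Edge("n1", "precedes", "n2"), Edge("n2", "supports", "n3")),
)

safe = perturb(graph, {"n1": "closure of a family business"}, month_shift=5)
prompt = to_prompt(safe)
```

In this sketch the constraint lives in `perturb`: substitutions aimed at clinical nodes are ignored and the edge set is passed through untouched, so the diagnostic structure survives whatever replacement text is supplied. A text-level rewriter has no comparable place to enforce such a rule.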
Problem

Research questions and friction points this paper is trying to address.

de-identification
psychiatric narratives
semantic preservation
patient privacy
clinical structure
Innovation

Methods, ideas, or system contributions that make the work stand out.

graph-based de-identification
semantic graph
structure-preserving rewriting
clinical narrative anonymization
graph-conditioned LLM
Authors
Kyungho Lim
Department of Psychiatry, Yonsei University College of Medicine; Institute of Behavioral Sciences in Medicine, Yonsei University College of Medicine
Byung-Hoon Kim
Yonsei University, College of Medicine
Psychiatry, Neuroimaging, Large Multimodal Models