Frame Semantic Patterns for Identifying Underreporting of Notifiable Events in Healthcare: The Case of Gender-Based Violence

📅 2025-10-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the underreporting of gender-based violence (GBV) in primary care electronic medical records. It proposes a transparent, efficient, and language-agnostic NLP methodology driven by frame semantics. The authors define eight fine-grained GBV retrieval patterns grounded in semantic frames and search for them in a corpus of 21 million Brazilian Portuguese sentences drawn from open-text fields of e-SUS APS records; linguists then manually evaluate the matches. The stated contributions are threefold: (1) an interpretable semantic framework applied to public health surveillance, supporting traceability, ethical use, and low-carbon computation; (2) no reliance on large-scale annotated data; and (3) adaptability to other languages. The methodology identifies reports of violence with a precision of 0.726, which the authors take as evidence of its robustness and of its suitability for other health surveillance contexts.

📝 Abstract
We introduce a methodology for the identification of notifiable events in the domain of healthcare. The methodology harnesses semantic frames to define fine-grained patterns and search for them in unstructured data, namely, open-text fields in e-medical records. We apply the methodology to the problem of underreporting of gender-based violence (GBV) in e-medical records produced during patients' visits to primary care units. A total of eight patterns are defined and searched for in a corpus of 21 million sentences in Brazilian Portuguese extracted from e-SUS APS. The results are manually evaluated by linguists, and the precision of each pattern is measured. Our findings reveal that the methodology effectively identifies reports of violence with a precision of 0.726, confirming its robustness. Designed as a transparent, efficient, low-carbon, and language-agnostic pipeline, the approach can be easily adapted to other health surveillance contexts, contributing to the broader, ethical, and explainable use of NLP in public health systems.
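The evaluation step described in the abstract (linguists judge each retrieved sentence, and precision is measured per pattern) can be illustrated with a minimal sketch. The pattern identifiers and judgments below are invented toy data, not results from the paper, and the function name `precision_by_pattern` is hypothetical. The abstract does not say whether the reported 0.726 is a pooled or a per-pattern average, so the pooled figure computed here is only one possible reading.

```python
from collections import defaultdict


def precision_by_pattern(judgments):
    """Compute precision per pattern and pooled over all patterns.

    judgments: iterable of (pattern_id, is_correct) pairs, one per
    retrieved sentence, where is_correct is the reviewer's verdict.
    """
    hits = defaultdict(int)    # correct retrievals per pattern
    totals = defaultdict(int)  # all retrievals per pattern
    for pattern_id, is_correct in judgments:
        totals[pattern_id] += 1
        if is_correct:
            hits[pattern_id] += 1
    per_pattern = {p: hits[p] / totals[p] for p in totals}
    # Pooled (micro-averaged) precision: an assumption, see the lead-in.
    overall = sum(hits.values()) / sum(totals.values()) if totals else 0.0
    return per_pattern, overall


# Toy data only: hypothetical pattern IDs and reviewer verdicts.
toy_judgments = [
    ("P1", True), ("P1", True), ("P1", False),
    ("P2", True), ("P2", False), ("P2", False), ("P2", True),
]
per_pattern, overall = precision_by_pattern(toy_judgments)
print(per_pattern, overall)
```

Reporting precision per pattern, as the abstract describes, makes it possible to see which patterns retrieve mostly true reports and which ones need refinement.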
Problem

Research questions and friction points this paper is trying to address.

Identifying underreported gender-based violence in medical records
Developing semantic frame patterns to detect notifiable healthcare events
Applying NLP methodology to improve public health surveillance systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses semantic frames for fine-grained pattern definition
Searches patterns in unstructured e-medical record text
Creates a transparent, language-agnostic health surveillance pipeline
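The idea of defining patterns over semantic frames and searching for them in free text can be sketched as follows. This is a toy illustration under strong assumptions, not the authors' pipeline: the lexicon, the pattern `harm_by_partner`, and the English example sentences are all hypothetical, frame evocation is reduced to a word lookup, and a pattern is treated simply as a set of frames that must co-occur in one sentence. The actual eight patterns and their Brazilian Portuguese lexical units are not given in this summary.

```python
import re

# Hypothetical lexicon mapping a word to the frame it evokes.
# Frame names are illustrative labels, not the paper's inventory.
TOY_LEXICON = {
    "hit": "Cause_harm",
    "beaten": "Cause_harm",
    "husband": "Personal_relationship",
    "partner": "Personal_relationship",
}

# Hypothetical pattern: every listed frame must be evoked in the sentence.
TOY_PATTERNS = {
    "harm_by_partner": {"Cause_harm", "Personal_relationship"},
}


def evoked_frames(sentence):
    """Return the set of frames evoked by words found in the toy lexicon."""
    tokens = re.findall(r"\w+", sentence.lower())
    return {TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON}


def match_patterns(sentence):
    """Return the names of all patterns whose frames all occur in the sentence."""
    frames = evoked_frames(sentence)
    return [name for name, required in TOY_PATTERNS.items() if required <= frames]


print(match_patterns("Patient reports being beaten by her husband."))
print(match_patterns("Patient hit her head on a shelf."))
```

Because every match can be traced back to the specific frames and words that triggered it, a rule-based design of this kind stays inspectable, which is the transparency property the highlights emphasize.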
Authors

Lívia Dutra
Federal University of Juiz de Fora | FrameNet Brasil

Arthur Lorenzi
Vital Strategies Brasil

Laís Berno
Federal University of Juiz de Fora | FrameNet Brasil

Franciany Campos
Federal University of Juiz de Fora | FrameNet Brasil

Karoline Biscardi
Federal University of Minas Gerais

Kenneth Brown
Federal University of Juiz de Fora | FrameNet Brasil

Marcelo Viridiano
Visiting researcher at Case Western Reserve University
Frame Semantics, Multimodality

Frederico Belcavello
Federal University of Juiz de Fora | FrameNet Brasil Computational Linguistics Lab
linguistics, communications, frame semantics, TV

Ely Matos
Universidade Federal de Juiz de Fora
Computational Cognitive Linguistics, Web Development

Olívia Guaranha
Vital Strategies Brasil

Erik Santos
Vital Strategies Brasil

Sofia Reinach
Vital Strategies Brasil

Tiago Timponi Torrent
Professor of Linguistics, Universidade Federal de Juiz de Fora
Computational Linguistics, Cognitive Linguistics, Frame Semantics, Construction Grammar, Natural Language Understanding