EndoExtract: Co-Designing Structured Text Extraction from Endometriosis Ultrasound Reports

📅 2026-01-26
🤖 AI Summary
This study addresses the inefficiency that unstructured free text in endometriosis ultrasound reports causes for downstream analysis and modeling, work that has traditionally required manual abstraction. The authors propose a locally deployed, large language model (LLM)-driven system that automatically extracts structured clinical data, with an interaction design informed by pain points identified in the abstraction workflow. Key features include mandatory review limited to interpretive fields, automatic highlighting of source evidence, and decoupling batch extraction from human-paced verification. This approach shifts users from field-by-field data entry to supervisory validation. The work also surfaces emerging challenges, such as the risk of over-skimming and the handling of missing data, pointing to open directions for human-in-the-loop clinical NLP systems.

📝 Abstract
Endometriosis ultrasound reports are often unstructured free-text documents that require manual abstraction for downstream tasks such as analytics, machine learning model training, and clinical auditing. We present EndoExtract, an on-premise LLM-powered system that extracts structured data from these reports and surfaces interpretive fields for human review. Through contextual inquiry with research assistants, we identified key workflow pain points: asymmetric trust between numerical and interpretive fields, repetitive manual highlighting, fatigue from sustained comparison, and terminology inconsistency across radiologists. These findings informed an interface that surfaces only interpretive fields for mandatory review, automatically highlights source evidence within PDFs, and separates batch extraction from human-paced verification. A formative workshop revealed that EndoExtract supports a shift from field-by-field data entry to supervisory validation, though participants noted risks of over-skimming and challenges in managing missing data.
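The review flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `ExtractedField`, the functions `review_queue` and `locate_evidence`, the `"numerical"`/`"interpretive"` labels, and the sample report sentence are all hypothetical and invented here for illustration. The sketch assumes extraction has already produced field values together with verbatim evidence snippets, and it shows only two ideas: queuing interpretive fields for mandatory review, and locating evidence offsets that an interface could highlight.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ExtractedField:
    """One extracted value (hypothetical schema, not taken from the paper)."""
    name: str
    value: Optional[str]     # None models a value missing from the report
    kind: str                # "numerical" or "interpretive" (assumed labels)
    evidence: Optional[str]  # verbatim snippet the value was drawn from


def locate_evidence(report_text: str, snippet: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of the snippet, or None if not found."""
    if not snippet:
        return None
    start = report_text.find(snippet)
    if start == -1:
        return None
    return (start, start + len(snippet))


def review_queue(fields: List[ExtractedField]) -> List[ExtractedField]:
    """Keep only the fields that require human confirmation: interpretive ones."""
    return [f for f in fields if f.kind == "interpretive"]


# Invented example text and fields, for illustration only.
report = "Left ovary measures 32 mm. Findings are suggestive of deep endometriosis."
fields = [
    ExtractedField("left_ovary_mm", "32", "numerical", "32 mm"),
    ExtractedField("deep_endometriosis", "suspected", "interpretive",
                   "suggestive of deep endometriosis"),
    ExtractedField("bowel_nodule", None, "interpretive", None),  # missing data
]

for f in review_queue(fields):
    # A span of None signals that there is nothing to highlight for this field.
    print(f.name, f.value, locate_evidence(report, f.evidence))
```

In this sketch a missing value still enters the review queue but has no highlight span, which is one simple way to make missing data visible to the reviewer rather than silently skipping it.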
Problem

Research questions and friction points this paper is trying to address.

endometriosis
ultrasound reports
structured text extraction
manual abstraction
clinical documentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

structured text extraction
large language model (LLM)
human-in-the-loop
clinical NLP
co-design
Authors
Haiyi Li
Univ. of Adelaide, Adelaide, SA, Australia
Yiyang Zhao
Ingdan Labs
Internet of Things, Mobile Computing
Yutong Li
Univ. of Adelaide, Adelaide, SA, Australia
A. Deslandes
Robinson Inst., Univ. of Adelaide, Adelaide, Australia
Jodie Avery
Robinson Inst., Univ. of Adelaide, Adelaide, Australia
M. Hull
Robinson Inst., Univ. of Adelaide, Adelaide, Australia
Hsiang-Ting Chen
Univ. of Adelaide, Adelaide, SA, Australia