🤖 AI Summary
In low-resource settings like Pakistan, language barriers and limited digital literacy hinder electronic medical record (EMR) adoption and timely identification of high-risk obstetric conditions.
Method: This study introduces the first lightweight, Urdu-language mobile voice AI system for maternal care. It integrates multilingual automatic speech recognition (ASR) fine-tuning, clinical terminology–enhanced prompt engineering for large language models (LLMs), structured voice input design, and a clinician feedback loop, enabling an end-to-end pipeline of “speech recognition → clinical understanding → structured EMR generation → risk alerting.” The system requires no reading or writing ability from its users.
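To make the pipeline concrete, the following is a minimal sketch of the four stages named above, assuming a Whisper-family ASR checkpoint served via Hugging Face `transformers` and an OpenAI-compatible chat API; the model names, EMR schema, and risk thresholds are illustrative assumptions, not the deployed system described in the paper.

```python
# Hypothetical sketch: speech -> Urdu transcript -> LLM structuring -> EMR -> risk alert.
# Model choices, EMR fields, and thresholds below are assumptions for illustration.
import json
from transformers import pipeline
from openai import OpenAI

# 1) Multilingual ASR (a fine-tuned Whisper-family checkpoint is assumed here).
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

# 2) Prompt-engineered LLM behind an OpenAI-compatible chat endpoint (assumed).
llm = OpenAI()

EMR_FIELDS = ["patient_name", "gestational_age_weeks",
              "blood_pressure", "hemoglobin_g_dl", "chief_complaint"]

def transcribe(audio_path: str) -> str:
    """Return an Urdu transcript for one structured voice prompt."""
    result = asr(audio_path, generate_kwargs={"language": "urdu"})
    return result["text"]

def extract_emr(transcript: str) -> dict:
    """Ask the LLM to map the transcript onto a fixed EMR schema, returned as JSON."""
    prompt = (
        "You are a maternal-health documentation assistant. "
        f"Fill these EMR fields from the Urdu transcript: {EMR_FIELDS}. "
        "Return JSON only; use null for fields not mentioned.\n\n"
        f"Transcript: {transcript}"
    )
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

def risk_flags(emr: dict) -> list[str]:
    """Minimal rule-based red-flag checks; thresholds and formats are illustrative."""
    flags = []
    bp = emr.get("blood_pressure") or ""
    if bp and bp.split("/")[0].isdigit() and int(bp.split("/")[0]) >= 140:
        flags.append("possible pre-eclampsia: systolic BP >= 140")
    hb = emr.get("hemoglobin_g_dl")
    if hb is not None and float(hb) < 7.0:
        flags.append("severe anaemia: Hb < 7 g/dL")
    return flags

if __name__ == "__main__":
    transcript = transcribe("visit_recording.wav")
    emr = extract_emr(transcript)
    print(json.dumps(emr, ensure_ascii=False, indent=2))
    for flag in risk_flags(emr):
        print("RED FLAG:", flag)
```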
Contribution/Results: Deployed over seven months, it generated >500 structured EMRs and accurately flagged >300 high-risk events. Urdu ASR word error rate, field-filling accuracy, and clinical relevance of alerts all met practical deployment thresholds, demonstrating the feasibility and effectiveness of literacy-agnostic, voice-first clinical AI interaction.
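The two quantitative metrics named above (Urdu ASR word error rate and field-filling accuracy) follow standard definitions; a small self-contained sketch of how they could be computed is shown below. The scoring rules are the textbook formulations, not the paper's exact evaluation protocol.

```python
# Sketch of the two core metrics: word error rate (WER) for Urdu ASR
# and field-level accuracy for generated EMRs. Standard definitions only.

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

def field_accuracy(gold: dict, predicted: dict) -> float:
    """Fraction of EMR fields whose predicted value exactly matches the gold value."""
    keys = gold.keys()
    correct = sum(1 for k in keys if predicted.get(k) == gold[k])
    return correct / max(len(keys), 1)
```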
📝 Abstract
We present the design, implementation, and in-situ deployment of a smartphone-based voice-enabled AI system for generating electronic medical records (EMRs) and clinical risk alerts in maternal healthcare settings. Targeted at low-resource environments such as Pakistan, the system integrates a fine-tuned, multilingual automatic speech recognition (ASR) model and a prompt-engineered large language model (LLM) to enable healthcare workers to engage naturally in Urdu, their native language, regardless of literacy or technical background. Through speech-based input and localized understanding, the system generates structured EMRs and flags critical maternal health risks. Over a seven-month deployment in a not-for-profit hospital, the system supported the creation of over 500 EMRs and flagged over 300 potential clinical risks. We evaluate the system's performance across speech recognition accuracy, EMR field-level correctness, and clinical relevance of AI-generated red flags. Our results demonstrate that speech-based AI interfaces can be effectively adapted to real-world healthcare settings, particularly resource-constrained ones, when combined with structured input design, contextual medical dictionaries, and clinician-in-the-loop feedback. We discuss generalizable design principles for deploying voice-based mobile healthcare AI support systems in linguistically and infrastructurally constrained settings.
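The abstract does not reproduce the actual prompts or dictionary used in deployment; the sketch below is one plausible way a "contextual medical dictionary" could be injected into the LLM prompt so that colloquial Urdu phrasing is normalised to clinical terms before EMR fields are filled. The glossary entries, romanisation, and prompt wording are illustrative assumptions.

```python
# Illustrative sketch of clinical-terminology-enhanced prompting:
# colloquial Urdu phrases are mapped to clinical terms inside the prompt.
# Glossary entries and wording are assumptions, not the deployed dictionary.
URDU_CLINICAL_GLOSSARY = {
    "khoon ki kami": "anaemia",
    "daurey parna": "seizures (possible eclampsia)",
    "sar dard aur dhundla nazar aana": "headache with blurred vision (pre-eclampsia warning sign)",
    "haath pair soojna": "oedema of hands and feet",
}

def build_prompt(transcript: str, fields: list[str]) -> str:
    """Embed the glossary so colloquial phrases resolve to clinical terms."""
    glossary_lines = "\n".join(
        f"- '{urdu}' means {clinical}"
        for urdu, clinical in URDU_CLINICAL_GLOSSARY.items()
    )
    return (
        "You are documenting a maternal-health visit.\n"
        "Colloquial Urdu phrases and their clinical meanings:\n"
        f"{glossary_lines}\n\n"
        f"Fill these EMR fields as JSON (null if absent): {fields}\n"
        "Also list any red-flag symptoms under a 'red_flags' key.\n\n"
        f"Urdu transcript: {transcript}"
    )

if __name__ == "__main__":
    demo = "Mariz ko sar dard aur dhundla nazar aana hai, haath pair soojay huay hain."
    print(build_prompt(demo, ["chief_complaint", "blood_pressure", "red_flags"]))
```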