🤖 AI Summary
This work addresses the privacy risks of cloud-based inference in mental health AI systems used in high-sensitivity settings such as military operations, correctional facilities, and remote healthcare. To mitigate these risks, the authors propose a fully on-device mobile AI platform built on a zero-egress architecture, in which no patient data leaves the device. The system integrates a consortium of three lightweight open-source large language models (Gemma, Phi-3.5-mini, and Qwen2) and combines quantization, domain-specific fine-tuning, and an on-device multi-model consensus reasoning mechanism to provide DSM-5-aligned screening and diagnostic support directly on commodity mobile hardware. Initial evaluation indicates that the platform achieves diagnostic accuracy comparable to the authors' earlier server-side system while sustaining real-time inference latency and keeping all sensitive data on-device.
📝 Abstract
Privacy represents one of the most critical yet underaddressed barriers to AI adoption in mental healthcare -- particularly in high-sensitivity operational environments such as military, correctional, and remote healthcare settings, where the risk of patient data exposure can deter help-seeking behavior entirely. Existing AI-enabled psychiatric decision support systems predominantly rely on cloud-based inference pipelines, which require sensitive patient data to leave the device and pass through external servers, creating unacceptable privacy and security risks in these contexts. In this paper, we propose a zero-egress, on-device AI platform for privacy-preserving psychiatric decision support, deployed as a cross-platform mobile application. The proposed system extends our prior work on fine-tuned LLM consortiums for psychiatric diagnosis standardization by fundamentally re-architecting the inference pipeline for fully local execution -- ensuring that no patient data is transmitted to, processed by, or stored on any external server at any stage. The platform integrates a consortium of three lightweight, fine-tuned, and quantized open-source LLMs -- Gemma, Phi-3.5-mini, and Qwen2 -- selected for their compact architectures and proven efficiency on resource-constrained mobile hardware. An on-device orchestration layer coordinates ensemble inference and consensus-based diagnostic reasoning, producing DSM-5-aligned assessments for psychiatric conditions. The platform is designed to assist clinicians with differential diagnosis and evidence-linked symptom mapping, as well as to support patient-facing self-screening with appropriate clinical safeguards. Initial evaluation demonstrates that the proposed zero-egress deployment achieves diagnostic accuracy comparable to its server-side predecessor while sustaining real-time inference latency on commodity mobile hardware.
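The abstract does not specify how the orchestration layer forms a consensus, so the following is only a minimal sketch of one plausible rule: a simple majority vote over the labels returned by the three local models. The `LocalModel` interface, the `consensus_label` function, the `min_agree` threshold, and the stub models are all hypothetical names introduced for illustration; they are not taken from the paper, and the stub outputs are placeholders rather than results.

```python
from collections import Counter
from typing import Callable, Dict, Optional

# Hypothetical interface (an assumption, not the paper's API):
# each on-device model maps a clinical note to a single diagnostic label.
LocalModel = Callable[[str], str]


def consensus_label(
    note: str,
    models: Dict[str, LocalModel],
    min_agree: int = 2,
) -> Optional[str]:
    """Return the label that at least `min_agree` models agree on.

    Returns None when no label reaches the threshold, which a caller
    could treat as "no consensus, defer to the clinician".
    """
    if not models:
        raise ValueError("at least one model is required")
    # Every call runs locally; nothing in this function touches the network.
    votes = Counter(model(note) for model in models.values())
    label, count = votes.most_common(1)[0]
    return label if count >= min_agree else None


# Stub models standing in for the quantized Gemma, Phi-3.5-mini, and Qwen2
# instances. Their fixed outputs are placeholders, not evaluation data.
stub_models: Dict[str, LocalModel] = {
    "gemma": lambda note: "label_a",
    "phi-3.5-mini": lambda note: "label_a",
    "qwen2": lambda note: "label_b",
}

if __name__ == "__main__":
    print(consensus_label("example intake note", stub_models))
```

With the stubs above, two of three models agree, so the function returns `label_a`; raising `min_agree` to 3 would instead return `None`. A real orchestration layer would likely also weigh model confidence and the supporting evidence for each symptom, which this sketch leaves out.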