🤖 AI Summary
Medical AI deployment faces critical bottlenecks: fragmented preprocessing pipelines, poor model interoperability, and stringent privacy compliance requirements. To address these challenges, we propose the first multi-agent framework for end-to-end medical inference. It integrates automated data ingestion, differential privacy–driven anonymization, embedding-based tabular feature extraction, and multi-stage MedGemma-based medical image modeling. Task-specific agents autonomously perform feature selection, model matching, and preprocessing recommendation. Our framework introduces a novel multimodal interpretability mechanism by fusing SHAP/LIME attributions with DETR attention maps. Evaluated on public datasets spanning geriatric medicine, palliative care, and colonoscopy imaging, the system operates fully autonomously without expert intervention. It significantly reduces domain-expert dependency while enhancing privacy compliance, cross-system compatibility, and deployment efficiency. The architecture demonstrates high scalability and cost-effectiveness, enabling robust, production-ready medical AI inference.
📝 Abstract
Building and deploying machine learning solutions in healthcare remains expensive and labor-intensive due to fragmented preprocessing workflows, model compatibility issues, and stringent data privacy constraints. In this work, we introduce an Agentic AI framework that automates the entire clinical data pipeline, from ingestion to inference, through a system of modular, task-specific agents. These agents handle both structured and unstructured data, enabling automatic feature selection, model selection, and preprocessing recommendation without manual intervention. We evaluate the system on publicly available datasets from geriatrics, palliative care, and colonoscopy imaging. For example, for structured data (anxiety records) and unstructured data (colonoscopy polyp images), the pipeline begins with file-type detection by the Ingestion Identifier Agent, after which the Data Anonymizer Agent identifies sensitive fields and anonymizes them to ensure privacy compliance. The Feature Extraction Agent then identifies features using an embedding-based approach for tabular data, extracting all column names, and a multi-stage MedGemma-based approach for image data, which infers the imaging modality and disease name. These features guide the Model-Data Feature Matcher Agent in selecting the best-fit model from a curated repository. The Preprocessing Recommender Agent and Preprocessing Implementor Agent then apply tailored preprocessing based on the data type and model requirements. Finally, the Model Inference Agent runs the selected model on the uploaded data and generates interpretable outputs using tools like SHAP, LIME, and DETR attention maps. By automating these high-friction stages of the ML lifecycle, the proposed framework reduces the need for repeated expert intervention, offering a scalable, cost-efficient pathway for operationalizing AI in clinical environments.
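The sequential hand-offs described in the abstract can be sketched as a toy orchestration. This is a minimal illustration only: the function names, the file-extension heuristic, the identifier-column list, and the overlap-based model matcher are all assumptions for the sketch, not the framework's actual implementation.

```python
def ingestion_identifier(filename: str) -> str:
    """Ingestion Identifier Agent (sketch): detect coarse data type
    from the file extension alone."""
    tabular_exts = {".csv", ".tsv", ".parquet"}
    image_exts = {".png", ".jpg", ".jpeg", ".dcm"}
    ext = filename[filename.rfind("."):].lower()
    if ext in tabular_exts:
        return "tabular"
    if ext in image_exts:
        return "image"
    return "unknown"


def data_anonymizer(columns: list[str]) -> list[str]:
    """Data Anonymizer Agent (sketch): drop columns that look like
    direct identifiers; a toy stand-in for real anonymization."""
    identifier_like = {"name", "ssn", "address", "phone"}
    return [c for c in columns if c.lower() not in identifier_like]


def model_matcher(features: list[str], repository: dict[str, set[str]]) -> str:
    """Model-Data Feature Matcher Agent (sketch): pick the repository
    model whose expected features overlap most with the extracted ones."""
    return max(repository, key=lambda m: len(repository[m] & set(features)))


# Illustrative run on a hypothetical anxiety dataset.
columns = ["name", "age", "gad7_score", "sleep_quality"]
safe_features = data_anonymizer(columns)
repository = {
    "anxiety_classifier": {"age", "gad7_score", "sleep_quality"},
    "polyp_detector": {"image"},
}
print(ingestion_identifier("anxiety.csv"))      # tabular
print(model_matcher(safe_features, repository))  # anxiety_classifier
```

In the real framework these stages are agents around learned components (embeddings, MedGemma), with preprocessing and inference agents completing the chain; the sketch only conveys the sequential, hand-off-based control flow.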