🤖 AI Summary
Existing prompt optimization methods for medical large language models (LLMs) struggle to integrate deep domain knowledge while also ensuring clinical safety, which limits model reliability and practical utility. To address this, we propose an evolutionary prompt optimization framework designed specifically for clinical scenarios. The approach combines a medical terminology-aware attention mechanism, a component-level evolutionary algorithm, and a semantic validation module, supported by a multi-dimensional evaluation system, so that prompts are generated safely and precisely while clinical reasoning integrity is preserved. The framework further incorporates reinforcement learning for end-to-end automated optimization. Experimental results across diverse medical tasks show a 24.7% reduction in factual errors, a 19.6% improvement in domain specificity, and a 15.3% increase in clinician preference scores, supporting its effectiveness and clinical applicability.
📝 Abstract
Prompt engineering significantly influences the reliability and clinical utility of Large Language Models (LLMs) in medical applications. Current optimization approaches do not adequately address domain-specific medical knowledge and safety requirements. This paper introduces EMPOWER, a novel evolutionary framework that enhances medical prompt quality through specialized representation learning, multi-dimensional evaluation, and structure-preserving algorithms. Our methodology incorporates: (1) a medical terminology attention mechanism, (2) a comprehensive assessment architecture evaluating clarity, specificity, clinical relevance, and factual accuracy, (3) a component-level evolutionary algorithm that preserves clinical reasoning integrity, and (4) a semantic verification module that ensures adherence to medical knowledge. Evaluation across diagnostic, therapeutic, and educational tasks demonstrates significant improvements: a 24.7% reduction in factually incorrect content, a 19.6% enhancement in domain specificity, and a 15.3% higher clinician preference in blinded evaluations. The framework addresses critical challenges in developing clinically appropriate prompts, facilitating more responsible integration of LLMs into healthcare settings.
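To make the idea of component-level evolution with multi-dimensional scoring more concrete, here is a minimal Python sketch. It is not EMPOWER's implementation, which the abstract does not specify. The prompt representation (`role`, `task`, `safety` components), the component pools, the equal weights over the four evaluation dimensions, and the `toy_score` function are all hypothetical assumptions chosen for illustration. A real system would replace `toy_score` with actual evaluators and `is_valid` with a check against medical knowledge.

```python
import random
from typing import Callable, Dict, List

Prompt = Dict[str, str]  # component name -> text (hypothetical representation)

# Hypothetical component pools; a real system would derive these from clinical sources.
POOLS: Dict[str, List[str]] = {
    "role": ["You are a clinician.", "You are a board-certified cardiologist."],
    "task": ["Summarize the case.", "List differential diagnoses with supporting findings."],
    "safety": ["Flag uncertainty explicitly.", "Cite guideline sources and flag uncertainty."],
}

# Assumed equal weights over the four dimensions named in the abstract.
WEIGHTS = {"clarity": 0.25, "specificity": 0.25, "relevance": 0.25, "accuracy": 0.25}


def fitness(scores: Dict[str, float]) -> float:
    """Collapse per-dimension scores in [0, 1] into one weighted value."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)


def is_valid(prompt: Prompt) -> bool:
    """Structural stand-in for semantic verification: every component is present."""
    return all(prompt.get(name, "").strip() for name in POOLS)


def mutate(prompt: Prompt, rng: random.Random) -> Prompt:
    """Replace exactly one component, leaving the others untouched."""
    child = dict(prompt)
    name = rng.choice(sorted(POOLS))
    child[name] = rng.choice(POOLS[name])
    return child


def crossover(a: Prompt, b: Prompt, rng: random.Random) -> Prompt:
    """Pick each component whole from one parent, so no component is split."""
    return {name: (a if rng.random() < 0.5 else b)[name] for name in POOLS}


def evolve(score_fn: Callable[[Prompt], Dict[str, float]],
           generations: int = 10, pop_size: int = 6, seed: int = 0) -> Prompt:
    rng = random.Random(seed)
    population = [{n: rng.choice(v) for n, v in POOLS.items()} for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda p: fitness(score_fn(p)), reverse=True)
        parents = ranked[: pop_size // 2]  # elitism: the top half survives unchanged
        children: List[Prompt] = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = mutate(crossover(a, b, rng), rng)
            if is_valid(child):  # discard candidates that fail verification
                children.append(child)
        population = parents + children
    return max(population, key=lambda p: fitness(score_fn(p)))


def toy_score(prompt: Prompt) -> Dict[str, float]:
    """Placeholder scorer: longer prompts score higher on every dimension."""
    s = min(sum(len(t) for t in prompt.values()) / 200.0, 1.0)
    return {k: s for k in WEIGHTS}


if __name__ == "__main__":
    best = evolve(toy_score)
    for name, text in best.items():
        print(f"{name}: {text}")
```

Because mutation and crossover operate on whole components, a candidate never ends up with a half-rewritten safety instruction or a task statement spliced from two parents. That is one plausible reading of "structure-preserving" here, stated as an assumption and not as the paper's definition.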