CPEMH: An Agentic Framework for Prompt-Driven Behavior Evaluation and Assurance in Foundation-Model Systems for Mental Health Screening

📅 2026-05-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses inconsistent large language model (LLM) behavior in mental health screening, which stems from prompt variation and the absence of systematic safeguards, by proposing a prompt-driven behavioral assurance framework for mental health applications. The framework uses a modular multi-agent architecture in which an orchestrator coordinates inference and evaluation agents to automatically design, assess, and select prompting strategies. It also tracks prompts across their lifecycle to support traceability and reproducibility. In a case study on depression screening from interview transcripts, with F1 score, bias, and robustness as core acceptance criteria, the framework is shown to stabilize and audit LLM behavior in conversational, clinically sensitive settings.
📝 Abstract
This paper presents CPEMH, an agentic framework designed to evaluate prompt-driven behavior in foundation-model systems operating on transcript-based datasets for mental-health screening. CPEMH serves as an engineering methodology for behavioral assurance in large-scale language systems, introducing an orchestrated architecture that autonomously performs the design, evaluation, and selection of prompt strategies, enabling systematic control of behavioral variability across contexts. Its modular agentic design, combining orchestrator, inference, and evaluation agents, ensures traceability, reproducibility, and robustness throughout the prompting lifecycle. A case study on automated depression screening from interview transcripts demonstrates the framework's capacity to stabilize and audit foundation-model behavior in conversational and clinically sensitive domains. Lessons learned emphasize the role of modular orchestration in behavioral assurance, the prioritization of stability over architectural complexity, and the integration of F1, bias, and robustness as core acceptance criteria.
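The orchestrator / inference / evaluation split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Example`, `Report`, `evaluate`, `orchestrate`, `toy_infer`), the metric definitions for bias and robustness, and the threshold values are assumptions chosen for illustration, and the inference agent is a keyword stub standing in for a real foundation-model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical inference-agent signature: (prompt, transcript) -> label (1 = positive).
InferFn = Callable[[str, str], int]


@dataclass(frozen=True)
class Example:
    transcript: str
    label: int
    group: str  # subgroup tag used for the (assumed) bias check


@dataclass(frozen=True)
class Report:
    prompt: str
    f1: float
    bias_gap: float    # spread of positive-prediction rates across groups
    robustness: float  # agreement of predictions under paraphrased prompts


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def evaluate(prompt: str, variants: List[str], infer: InferFn,
             data: List[Example]) -> Report:
    """Evaluation agent: score one prompt on F1, bias, and robustness."""
    preds = [infer(prompt, ex.transcript) for ex in data]
    f1 = f1_score([ex.label for ex in data], preds)
    rates = []
    for group in sorted({ex.group for ex in data}):
        g = [p for ex, p in zip(data, preds) if ex.group == group]
        rates.append(sum(g) / len(g))
    bias_gap = max(rates) - min(rates) if rates else 0.0
    agree = total = 0
    for variant in variants:
        for ex, p in zip(data, preds):
            agree += int(infer(variant, ex.transcript) == p)
            total += 1
    robustness = agree / total if total else 1.0
    return Report(prompt, f1, bias_gap, robustness)


def orchestrate(candidates: Dict[str, List[str]], infer: InferFn,
                data: List[Example], min_f1: float = 0.7,
                max_bias_gap: float = 0.1, min_robustness: float = 0.9
                ) -> Tuple[Optional[Report], List[Report]]:
    """Orchestrator: evaluate every candidate, keep an audit log, and select
    the best prompt among those that pass all acceptance thresholds."""
    audit_log = [evaluate(p, v, infer, data) for p, v in candidates.items()]
    accepted = [r for r in audit_log
                if r.f1 >= min_f1 and r.bias_gap <= max_bias_gap
                and r.robustness >= min_robustness]
    return max(accepted, key=lambda r: r.f1, default=None), audit_log


def toy_infer(prompt: str, transcript: str) -> int:
    # Stub in place of a model call: flags "hopeless"; a prompt containing
    # "strict" additionally requires "sleep" to appear in the transcript.
    text = transcript.lower()
    if "strict" in prompt:
        return int("hopeless" in text and "sleep" in text)
    return int("hopeless" in text)


DATA = [
    Example("I feel hopeless and cannot sleep", 1, "a"),
    Example("I feel hopeless most days", 1, "b"),
    Example("Work is busy but I am fine", 0, "a"),
    Example("I sleep well and feel okay", 0, "b"),
]
# Each candidate prompt maps to paraphrases used for the robustness check.
CANDIDATES = {
    "Screen the transcript for depression.": ["Assess the transcript for depression."],
    "Be strict: screen for depression.": ["Screen for depression."],
}

if __name__ == "__main__":
    best, log = orchestrate(CANDIDATES, toy_infer, DATA)
    for report in log:
        print(report)
    print("selected:", best.prompt if best else None)
```

Returning the full `audit_log` alongside the selected prompt mirrors the traceability goal: rejected candidates and their scores remain inspectable. Treating F1, bias, and robustness as hard thresholds before ranking is one possible reading of "core acceptance criteria"; a weighted score would be an equally plausible alternative.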
Problem

Research questions and friction points this paper is trying to address.

prompt-driven behavior
behavioral assurance
foundation models
mental health screening
behavioral variability
Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic framework
prompt-driven behavior
behavioral assurance
modular orchestration
foundation-model evaluation