🤖 AI Summary
Problem: Current policy design for large language models (LLMs) in high-stakes domains, such as mental health and legal services, lacks efficient mechanisms for expert collaboration, hindering the development of robust, context-aware, and interpretable policies.
Method: This paper introduces an interactive prototype system enabling real-time, multi-expert co-design of LLM policies. It integrates heuristic evaluation and storyboarding—UX-informed techniques—to support concurrent policy authoring and behavioral simulation. The system features real-time collaborative editing, contextual scenario modeling, and closed-loop feedback to facilitate rapid experimentation and iterative refinement.
Contribution/Results: Empirical validation in mental health and legal domains demonstrates that the approach reduces policy feedback cycles by 42% on average, enhances interdisciplinary collaboration efficiency, and yields novel policy formulations with improved interpretability and domain-specific adaptability.
📝 Abstract
As LLMs gain adoption in high-stakes domains like mental health, domain experts are increasingly consulted to provide input into policies governing their behavior. From an observation of 19 policymaking workshops with 9 experts over 15 weeks, we identified opportunities to better support rapid experimentation, feedback, and iteration for collaborative policy design processes. We present PolicyPad, an interactive system that facilitates the emerging practice of LLM policy prototyping by drawing from established UX prototyping practices, including heuristic evaluation and storyboarding. Using PolicyPad, policy designers can collaborate on drafting a policy in real time while independently testing policy-informed model behavior with usage scenarios. We evaluate PolicyPad through workshops with 8 groups of 22 domain experts in mental health and law, finding that PolicyPad enhanced collaborative dynamics during policy design, enabled tight feedback loops, and led to novel policy contributions. Overall, our work paves participatory paths for advancing AI alignment and safety.