🤖 AI Summary
This study addresses accessibility barriers in biomedical research that arise from complex workflows and uneven resource distribution by proposing a human-AI collaborative paradigm termed “Vibe Medicine.” Researchers and clinicians use natural language to direct AI agents, which draw on more than 1,000 curated open-source medical skills, through multi-step biomedical workflows over heterogeneous data, while keeping the role of research director: specifying objectives, reviewing intermediate results, and making domain-informed decisions. The enabling stack combines large language models, agent frameworks such as OpenClaw and Hermes Agent, and the OpenClaw medical skills collection. Case studies in rare disease diagnosis, drug repurposing, and clinical trial design demonstrate end-to-end workflows in practice. The study also identifies key risks, including hallucination, data privacy, and over-reliance, and outlines directions toward more reliable, trustworthy, and equitable AI-assisted research.
📝 Abstract
With the emergence of large language models (LLMs) and AI agent frameworks, the human-AI co-work paradigm known as Vibe Coding is changing how people code, making it more accessible and productive. In scientific research, where workflows are more complex and the burden of specialized labor limits independent researchers and those in low-resource settings, the potential impact is even greater, particularly in biomedicine, which involves heterogeneous data modalities and multi-step analytical pipelines. In this paper, we introduce Vibe Medicine, a co-work paradigm in which clinicians and researchers direct skill-augmented AI agents through natural language to execute complex, multi-step biomedical workflows, while retaining the role of research director who specifies objectives, reviews intermediate results, and makes domain-informed decisions. The enabling infrastructure consists of three layers: capable LLMs, agent frameworks such as OpenClaw and Hermes Agent, and the OpenClaw medical skills collection, which includes more than 1,000 curated skills from multiple open-source repositories. We analyze the architecture and skill categories of this collection across ten biomedical domains, and present case studies covering rare disease diagnosis, drug repurposing, and clinical trial design that demonstrate end-to-end workflows in practice. We also identify the principal risks, such as hallucination, data privacy, and over-reliance, and outline directions toward more reliable, trustworthy, and clinically integrated agent-assisted research that advances equity in research and technology and reduces disparities in health care resources.