🤖 AI Summary
This work proposes CoLabScience, a proactive LLM assistant for biomedical research collaboration, designed to overcome the reactive behavior of conventional large language models, which respond only when prompted. At its core is PULI (Positive-Unlabeled Learning-to-Intervene), a framework that combines positive-unlabeled learning with a reinforcement learning objective to decide when and how to intervene in streaming research dialogues. The approach incorporates dialogue state tracking and draws on the team's project proposal together with long- and short-term conversational memory to capture contextual dynamics. For evaluation, the authors construct BSDD (Biomedical Streaming Dialogue Dataset), a new benchmark of simulated research discussions with intervention points derived from PubMed articles. Experiments show that PULI significantly outperforms existing baselines in both intervention precision and collaborative task utility, highlighting the potential of proactive LLMs as intelligent scientific assistants.
📝 Abstract
The integration of Large Language Models (LLMs) into scientific workflows presents exciting opportunities to accelerate biomedical discovery. However, the reactive nature of LLMs, which respond only when prompted, limits their effectiveness in collaborative settings that demand foresight and autonomous engagement. In this study, we introduce CoLabScience, a proactive LLM assistant designed to enhance biomedical collaboration between AI systems and human experts through timely, context-aware interventions. At the core of our method is PULI (Positive-Unlabeled Learning-to-Intervene), a novel framework trained with a reinforcement learning objective to determine when and how to intervene in streaming scientific discussions, by leveraging the team's project proposal and long- and short-term conversational memory. To support this work, we introduce BSDD (Biomedical Streaming Dialogue Dataset), a new benchmark of simulated research discussion dialogues with intervention points derived from PubMed articles. Experimental results show that PULI significantly outperforms existing baselines in both intervention precision and collaborative task utility, highlighting the potential of proactive LLMs as intelligent scientific assistants.