BloClaw: An Omniscient, Multi-Modal Agentic Workspace for Next-Generation Scientific Discovery

📅 2026-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses critical limitations in current AI-for-Science frameworks—namely, brittle JSON-based tool invocation, fragile execution environments prone to interruption, and static user interfaces ill-suited for high-dimensional scientific data. To overcome these challenges, the authors propose a multimodal agent operating system featuring three core innovations: an XML-Regex dual-path routing protocol, a runtime state-interception sandbox, and a state-driven dynamic viewport UI that fundamentally redefines agent-computer interaction. The system integrates RDKit, ESMFold, retrieval-augmented generation (RAG), and automated visualization capture to support complex scientific workflows in domains such as cheminformatics and protein folding. Experimental results demonstrate a dramatic reduction in tool-calling error rates from 17.6% to 0.2%, alongside seamless execution of multimodal research pipelines and automatic capture of dynamic visual outputs.
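The dual-track routing idea described in the summary can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `parse_tool_call` and `TOOL_CALL_RE`, and the exact tag format, are assumptions. The key point is that a strict XML-style match is tried first, with a lenient regex fallback that tolerates malformed output which would crash a strict JSON parser outright:

```python
import re

# Hypothetical sketch of a dual-track tool-call parser; tag format,
# function and pattern names are illustrative, not from the paper.
TOOL_CALL_RE = re.compile(
    r"<tool\s+name=\"(?P<name>[\w\-]+)\">\s*(?P<body>.*?)\s*</tool>",
    re.DOTALL,
)

def parse_tool_call(text: str):
    """Track 1: strict XML-style match; Track 2: lenient regex fallback.

    Returns (tool_name, argument_body), or None if no call is present.
    """
    m = TOOL_CALL_RE.search(text)
    if m:
        return m.group("name"), m.group("body")
    # Lenient fallback: tolerate single quotes or a missing closing tag,
    # which a strict serialization-based parser would reject entirely.
    loose = re.search(r"<tool\s+name=['\"]?([\w\-]+)['\"]?>\s*(.*)",
                      text, re.DOTALL)
    if loose:
        return loose.group(1), loose.group(2).replace("</tool>", "").strip()
    return None
```

A degraded model output such as `<tool name='fold'>MKV` (single quotes, no closing tag) still yields a usable call on the fallback track, which is one plausible way a protocol like this could push serialization failures toward the reported 0.2%.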
📝 Abstract
The integration of Large Language Models (LLMs) into life sciences has catalyzed the development of "AI Scientists." However, translating these theoretical capabilities into deployment-ready research environments exposes profound infrastructural vulnerabilities. Current frameworks are bottlenecked by fragile JSON-based tool-calling protocols, easily disrupted execution sandboxes that lose graphical outputs, and rigid conversational interfaces inherently ill-suited for high-dimensional scientific data. We introduce BloClaw, a unified, multi-modal operating system designed for Artificial Intelligence for Science (AI4S). BloClaw reconstructs the Agent-Computer Interaction (ACI) paradigm through three architectural innovations: (1) An XML-Regex Dual-Track Routing Protocol that statistically eliminates serialization failures (0.2% error rate vs. 17.6% in JSON); (2) A Runtime State Interception Sandbox that utilizes Python monkey-patching to autonomously capture and compile dynamic data visualizations (Plotly/Matplotlib), circumventing browser CORS policies; and (3) A State-Driven Dynamic Viewport UI that morphs seamlessly between a minimalist command deck and an interactive spatial rendering engine. We comprehensively benchmark BloClaw across cheminformatics (RDKit), de novo 3D protein folding via ESMFold, molecular docking, and autonomous Retrieval-Augmented Generation (RAG), establishing a highly robust, self-evolving paradigm for computational research assistants. The open-source repository is available at https://github.com/qinheming/BloClaw.
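The runtime state-interception idea from the abstract can be sketched in miniature. In the paper's setting the patched target would presumably be a real plotting entry point such as `matplotlib.pyplot.show`; here a stand-in `show` function and the `captured_figures` store are hypothetical names used only to demonstrate the monkey-patching pattern:

```python
import functools

# Illustrative stand-in for a plotting backend's display entry point.
captured_figures = []

def show(figure):
    """Pretend display call that would normally open a GUI window."""
    return f"displayed {figure}"

def intercept_show(original_show):
    """Wrap `show` so every rendered figure is also persisted,
    rather than being lost when the sandbox or window is torn down."""
    @functools.wraps(original_show)
    def wrapper(figure):
        captured_figures.append(figure)   # state interception: keep a copy
        return original_show(figure)      # then defer to the real behavior
    return wrapper

# Runtime monkey-patch: swap the module-level binding in place.
show = intercept_show(show)
```

Because the wrapper replaces the binding at runtime, calling code needs no changes; every call to `show` transparently deposits its figure where the agent's UI (or a compilation step) can retrieve it later.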
Problem

Research questions and friction points this paper is trying to address.

AI Scientists · tool-calling protocols · execution sandboxes · conversational interfaces · scientific data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Modal Agentic Workspace · XML-Regex Dual-Track Routing · Runtime State Interception Sandbox · State-Driven Dynamic Viewport · AI for Science (AI4S)
Yao Qin
UCSB & Google DeepMind
Machine Learning · Computer Vision · Natural Language Processing
Yangyang Yan
AI Innovation Department, Beijing 1st Biotech Group Co., Ltd.
Jinhua Pang
Diplomatic Negotiation Simulation and Data Lab
Xiaoming Zhang
First Medical Center, Chinese PLA General Hospital, No. 28 Fuxing Road, Haidian District, Beijing, China