🤖 AI Summary
Existing automated scientific research systems mostly rely on a single agent or a fixed serial pipeline, which struggles to accommodate complex, changing research demands and offers little interactivity, observability, or reproducibility. To address these limitations, this work presents Claw AI Lab, a lab-native multi-agent research platform in which users instantiate a research team from one prompt, with customizable agent roles, real-time monitoring, artifact inspection, and rollback/resume control through a unified dashboard. Its Claw-Code Harness connects local codebases, datasets, and checkpoints to runnable experiments and feeds execution artifacts back into the research loop. The platform supports three research modes: exploration, multi-agent discussion, and reproduction. In an internal evaluation on five AI research case studies, AI expert judges consistently preferred Claw AI Lab over the AutoResearchClaw baseline on idea novelty, experiment completeness, and paper presentation quality.
📝 Abstract
We present Claw AI Lab, a lab-native autonomous research platform that advances automated research from a hidden prompt-to-paper pipeline into an interactive AI laboratory. Rather than centering the system around a single agent or a fixed serial workflow, we allow users to instantiate a full research team from one prompt, with customizable roles, collaborative workflows, real-time monitoring, artifact inspection, and rollback/resume control through a unified dashboard. The platform also supports distinct research modes for exploration, multi-agent discussion, and reproduction, making autonomous research substantially more steerable and laboratory-like in practice. A key practical contribution of Claw AI Lab lies in its Claw-Code Harness, which connects local codebases, datasets, and checkpoints to runnable experiments and feeds execution artifacts back into the research loop. As a result, the harness improves not only execution integration, but also experimental completion and result integrity: experiments are easier to inspect, iterate on, and faithfully transfer into final papers, reducing common failure modes such as partial runs and malformed result reporting. In our internal evaluation on five AI research case studies, using AutoResearchClaw as the baseline, Claw AI Lab is consistently preferred by AI expert judges on idea novelty, experiment completeness, and paper presentation quality. We view Claw AI Lab as an early step toward a new paradigm: autonomous research as usable, interactive, and reliability-aware scientific infrastructure.