🤖 AI Summary
Current LLM-driven research automation systems struggle to accommodate researchers' individual requirements, such as resource constraints, methodological preferences, and desired output formats, which limits the practical utility of their generic outputs. This work proposes NanoResearch, a multi-agent framework built on tri-level co-evolution of a skill bank, a user memory module, and label-free policy learning. The skill bank distills recurring operations into procedural rules that can be reused across projects, the memory module retains user- and project-specific experience, and label-free policy learning converts free-form user feedback into persistent updates to the planner. The authors report substantial gains over state-of-the-art AI research systems, with research quality improving and cost falling over successive cycles.
📝 Abstract
LLM-powered multi-agent systems can now automate the full research pipeline from ideation to paper writing, but a fundamental question remains: automation for whom? Researchers operate under different resource configurations, hold different methodological preferences, and target different output formats. A system that produces uniform outputs regardless of these differences will systematically under-serve every individual user, making personalization a precondition for research automation to be genuinely usable. However, achieving it requires three capabilities that current systems lack: accumulating reusable procedural knowledge across projects, retaining user-specific experience across sessions, and internalizing implicit preferences that resist explicit formalization. We propose NanoResearch, a multi-agent framework that addresses these gaps through tri-level co-evolution. A skill bank distills recurring operations into compact procedural rules reusable across projects. A memory module maintains user- and project-specific experience that grounds planning decisions in each user's research history. A label-free policy learning component converts free-form feedback into persistent parameter updates of the planner, reshaping subsequent coordination. These three layers co-evolve: reliable skills produce richer memory, richer memory informs better planning, and preference internalization continuously realigns the loop to each user. Extensive experiments demonstrate that NanoResearch delivers substantial gains over state-of-the-art AI research systems, and that it progressively refines itself to produce better research at lower cost over successive cycles.
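The three layers described in the abstract can be pictured with a toy sketch. This is only an illustration under stated assumptions, not NanoResearch's implementation: the class names `SkillBank`, `UserMemory`, and `Planner`, the single `low_compute` preference weight, and the keyword heuristic that stands in for label-free policy learning are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SkillBank:
    """Hypothetical store of compact procedural rules shared across projects."""
    rules: dict[str, str] = field(default_factory=dict)

    def distill(self, operation: str, rule: str) -> None:
        # Keep one compact rule per recurring operation.
        self.rules[operation] = rule


@dataclass
class UserMemory:
    """Hypothetical per-user log of past project experience."""
    episodes: list[str] = field(default_factory=list)

    def record(self, note: str) -> None:
        self.episodes.append(note)


@dataclass
class Planner:
    """Toy planner whose only 'parameter' is a preference weight in [0, 1]."""
    weights: dict[str, float] = field(default_factory=lambda: {"low_compute": 0.5})
    lr: float = 0.5  # step size for the toy update (assumed value)

    def update_from_feedback(self, feedback: str) -> None:
        # Stand-in for label-free policy learning: a keyword heuristic that
        # nudges the weight toward a target. The paper's method is not shown here.
        text = feedback.lower()
        if "too expensive" in text or "less compute" in text:
            target = 1.0
        elif "more compute" in text:
            target = 0.0
        else:
            return  # feedback carries no signal this toy rule understands
        w = self.weights["low_compute"]
        self.weights["low_compute"] = w + self.lr * (target - w)

    def plan(self, task: str, skills: SkillBank, memory: UserMemory) -> dict:
        # Planning reads all three layers: skills, memory, learned preference.
        budget = "small" if self.weights["low_compute"] > 0.6 else "default"
        return {
            "task": task,
            "budget": budget,
            "skills": sorted(skills.rules),
            "context": memory.episodes[-3:],
        }


skills, memory, planner = SkillBank(), UserMemory(), Planner()
skills.distill("hyperparameter_sweep", "start with a coarse grid, then refine")
memory.record("project A: ran on a single GPU")

before = planner.plan("ablation study", skills, memory)
planner.update_from_feedback("This run was too expensive for my lab.")
after = planner.plan("ablation study", skills, memory)
print(before["budget"], "->", after["budget"])
```

In this sketch the feedback persists as a change to the planner's state, so later plans differ without the user restating the preference. That mirrors, in miniature, the loop the abstract describes.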