🤖 AI Summary
Existing autonomous research agents are largely domain-agnostic: they lack the specialized reasoning, method selection, and data acquisition capabilities required for spatial data science, which limits their ability to support rigorous GIScience research. This work proposes NORA (Night Owl Research Agent), an end-to-end autonomous research system tailored for spatial data science that uses a skills-first architecture to orchestrate the full research lifecycle. The system integrates 21 domain-specialized workflow skills, nine specialist sub-agents, and custom MCP servers, including two novel skill units: one for spatial analysis and one for spatial data download. It also formalizes "harness engineering" for research agents, combining multi-agent coordination with lifecycle hooks, safety gates, generator-evaluator separation, human-in-the-loop oversight, and state persistence. In case studies assessed by six domain specialists and three LLM reviewers across seven dimensions, the domain-specialized system substantially improves research efficiency and output quality relative to general-purpose agent configurations.
📝 Abstract
The automation of scientific research workflows has emerged as a transformative frontier in artificial intelligence, yet existing autonomous research agents remain largely domain-agnostic, lacking the specialized reasoning, method selection, and data acquisition capabilities required for rigorous spatial data science. This paper introduces NORA (Night Owl Research Agent), a harness-engineered, multi-agent autonomous research system purpose-built for GIScience and spatial data science. NORA orchestrates the complete research lifecycle through a skills-first architecture comprising 21 domain-specialized workflow skills, 9 specialist sub-agents, and custom Model Context Protocol (MCP) servers. Central to the system's design are two novel domain-specialized skills: a spatial analysis skill unit that encodes decision frameworks for exploratory spatial data analysis, spatial regression, and diagnostics; and a spatial data download skill that supports reproducible acquisition from authoritative geospatial data sources. We formalize the concept of harness engineering for scientific research agents, demonstrating how lifecycle hooks, safety gates, generator-evaluator separation, human-in-the-loop oversight, and state persistence ensure reliable and reproducible autonomous research. We evaluate NORA through case studies assessed by 6 domain specialists and 3 LLM reviewers across seven dimensions (including novelty, quality, and rigor). Results demonstrate that domain-specialized harness engineering substantially improves the efficiency and quality of research output compared to general-purpose agent configurations.
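The harness mechanisms named in the abstract (lifecycle hooks, safety gates, generator-evaluator separation, and state persistence) can be pictured with a small sketch. This is not NORA's implementation: the names `Harness`, `RunState`, and `run_stage`, the JSON state file, and the callable-based generator, evaluator, and gate are all hypothetical, chosen only to show how such pieces might fit together around a single research stage.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class RunState:
    """Hypothetical persisted progress, so an interrupted run can resume."""
    completed: list = field(default_factory=list)
    outputs: dict = field(default_factory=dict)

    def save(self, path: Path) -> None:
        path.write_text(json.dumps(asdict(self)))

    @classmethod
    def load(cls, path: Path) -> "RunState":
        # Start fresh when no state file exists yet.
        return cls(**json.loads(path.read_text())) if path.exists() else cls()


class Harness:
    """Illustrative wrapper around one stage of a research workflow."""

    def __init__(
        self,
        state_path: Path,
        generator: Callable[[str], str],    # produces a draft for a task
        evaluator: Callable[[str], bool],   # separate component that judges the draft
        safety_gate: Callable[[str], bool], # returns False to block a task
    ) -> None:
        self.state_path = state_path
        self.generator = generator
        self.evaluator = evaluator
        self.safety_gate = safety_gate
        self.pre_hooks: list = []   # called as hook(stage_name) before a stage
        self.post_hooks: list = []  # called as hook(stage_name, draft) after a stage
        self.state = RunState.load(state_path)

    def run_stage(self, name: str, task: str) -> str:
        # State persistence: reuse the stored output of a finished stage.
        if name in self.state.completed:
            return self.state.outputs[name]
        for hook in self.pre_hooks:
            hook(name)
        # Safety gate: refuse the task before any generation happens.
        if not self.safety_gate(task):
            raise PermissionError(f"stage {name!r} blocked by safety gate")
        # Generator-evaluator separation: the producer never grades itself.
        draft = self.generator(task)
        if not self.evaluator(draft):
            raise ValueError(f"stage {name!r} rejected by evaluator")
        self.state.completed.append(name)
        self.state.outputs[name] = draft
        self.state.save(self.state_path)
        for hook in self.post_hooks:
            hook(name, draft)
        return draft


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        harness = Harness(
            Path(tmp) / "state.json",
            generator=lambda task: f"draft: {task}",
            evaluator=lambda draft: draft.startswith("draft"),
            safety_gate=lambda task: "delete" not in task,
        )
        harness.pre_hooks.append(lambda name: print(f"starting {name}"))
        print(harness.run_stage("esda", "map spatial clusters"))
```

In this sketch a second `Harness` pointed at the same state file would return the stored output for `esda` without calling the generator again, which is one simple way to read "state persistence"; a real system would presumably persist far richer state and route rejected drafts back for revision or to a human reviewer rather than raising an error.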