🤖 AI Summary
This work addresses two limitations of large language models in scientific workflows, constrained context length and limited reasoning capability, both of which hinder autonomous execution of complex research tasks. It proposes a “Local Body, Remote Brain” hybrid architecture in which Python-based local orchestrators running in Google Colab invoke cloud-hosted LLM backends. Two scientific agents are built on this architecture: DeepTS/DeepCollector, which automates large-scale curation, extraction, and deduplication of time-series datasets, and DeepScribe, which converts visually dense, mathematically complex physics lectures into structured scientific reports. The systems rely on granular attribute extraction (Cellular RAG), remote data inspection, and distributed concurrency controls. The paper also outlines a generalization of DeepTS to deep knowledge graphs and discusses applying the same approach to high-energy physics (DeepQCD).
📝 Abstract
This paper details two novel frameworks for developing autonomous, agentic AI in scientific workflows. Both systems use a hybrid "Local Body, Remote Brain" architecture hosted in Google Colab, in which Python-based local orchestrators invoke large language model (LLM) cloud backends. The first agent, DeepTS/DeepCollector, automates the large-scale curation, extraction, and deduplication of time-series datasets. The second, DeepScribe, is an autonomous presentation analyzer that converts visually dense, mathematically complex physics lectures into structured scientific reports. Through practical systems engineering, including granular attribute extraction (Cellular RAG), remote data inspection, and distributed concurrency controls, we demonstrate how agentic AI can overcome the context and reasoning limitations of current state-of-the-art systems to rigorously support scientific workflows. Finally, we outline a generalization of DeepTS to support deep knowledge graphs and discuss the application of this conceptual approach to high-energy physics (DeepQCD).
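As a rough illustration of the "Local Body, Remote Brain" pattern described above, the following minimal Python sketch shows a local orchestrator that sends one small request per record to a remote backend, limits how many requests run at once, and deduplicates the results. It is not the authors' implementation: `remote_brain`, `extract_attribute`, `curate`, and `MAX_REMOTE_CALLS` are hypothetical names, and the remote call is replaced by a local stub so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Semaphore

# Assumed cap on simultaneous backend requests (hypothetical value).
MAX_REMOTE_CALLS = 2
_remote_slots = Semaphore(MAX_REMOTE_CALLS)


def remote_brain(prompt: str) -> str:
    """Stand-in for a cloud LLM call; here it only normalizes the text."""
    return prompt.strip().lower()


def extract_attribute(record: str) -> str:
    """Local body: send one small, focused request to the remote brain."""
    with _remote_slots:  # at most MAX_REMOTE_CALLS requests in flight
        return remote_brain(record)


def curate(records: list[str]) -> list[str]:
    """Extract one attribute per record concurrently, then deduplicate."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        names = list(pool.map(extract_attribute, records))  # input order kept
    # dict.fromkeys drops repeats while preserving first-seen order.
    return list(dict.fromkeys(names))


print(curate(["Dataset-A ", "dataset-a", "Dataset-B"]))
# prints ['dataset-a', 'dataset-b']
```

Keeping each request limited to a single attribute is one plausible reading of "granular attribute extraction"; the semaphore stands in for whatever concurrency controls the real system applies to its backend.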