🤖 AI Summary
This work addresses the critical risk of hallucinated outputs from large language models (LLMs) in environmental data management, particularly when irreversible operations, such as DOI assignment, are involved. To mitigate this, the authors propose EnviSmart, a system that treats reliability as an architectural property rather than an afterthought. EnviSmart employs a tripartite knowledge framework that separates externalized behaviors, domain knowledge, and skills, together with a role-isolated multi-agent architecture. At key trust boundaries, it integrates deterministic verification and auditable handoff mechanisms to enable error isolation and rapid response. In the SF2Bench deployment, the system processed data from 2,452 sites within two days and intercepted a coordinate transformation error affecting all sites before publication. In a representative incident, the fault was detected within 10 minutes with zero user exposure and fully remediated within 80 minutes.
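The tripartite knowledge framework above can be pictured as three independently persisted tracks that agents consult rather than improvise. The sketch below is purely illustrative: the class and field names are assumptions for exposition, not the paper's actual schema.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the three-track separation described in the
# summary. All names here are hypothetical, not from EnviSmart itself.
@dataclass(frozen=True)
class KnowledgeBase:
    """Three separated tracks, persisted as distinct artifacts so each
    can be versioned, audited, and reused across deployments."""
    behaviors: dict = field(default_factory=dict)  # governance constraints
    domain: dict = field(default_factory=dict)     # retrievable context
    skills: dict = field(default_factory=dict)     # tool-using procedures

    def constraint(self, name: str):
        # Governance rules are looked up deterministically at run time,
        # never inferred by the language model.
        return self.behaviors[name]

kb = KnowledgeBase(
    behaviors={"require_validation_before_doi": True},
    domain={"target_crs": "EPSG:4326"},
    skills={"transform_coords": "scripts/transform.py"},
)
```

Keeping the tracks separate means a governance change (behaviors) does not require retraining or re-prompting the skills track, which is one plausible reading of why the artifacts are described as "interlocking" yet reusable.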
📝 Abstract
Embedding LLM-driven agents into environmental FAIR data management is compelling: they can externalize operational knowledge and scale curation across heterogeneous data and evolving conventions. However, replacing deterministic components with probabilistic workflows changes the failure mode: LLM pipelines may generate plausible but incorrect outputs that pass superficial checks and propagate into irreversible actions such as DOI minting and public release. We introduce EnviSmart, a production data management system deployed on campus-wide storage infrastructure for environmental research. EnviSmart treats reliability as an architectural property through two mechanisms: a three-track knowledge architecture that externalizes behaviors (governance constraints), domain knowledge (retrievable context), and skills (tool-using procedures) as persistent, interlocking artifacts; and a role-separated multi-agent design in which deterministic validators and audited handoffs restore fail-stop semantics at trust boundaries before irreversible steps. We compare two production deployments. The University's GIS Center Ecological Archive (849 curated datasets) serves as a single-agent baseline; SF2Bench, a compound flooding benchmark comprising 2,452 monitoring stations and 8,557 published files spanning 39 years, validates the multi-agent workflow. The multi-agent approach improved both efficiency (a single operator completed the deployment in two days, with repeated artifact reuse across deployments) and reliability (audited handoffs detected and blocked a coordinate transformation error affecting all 2,452 stations before publication). A representative incident (ISS-004) demonstrated boundary-based containment with 10-minute detection latency, zero user exposure, and 80-minute resolution. This paper has been accepted at PEARC 2026.
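The fail-stop behavior at trust boundaries can be sketched as a deterministic gate that runs before any irreversible action. The following is a minimal sketch under stated assumptions: the record type, function names, and the coordinate-range check are hypothetical stand-ins, not the paper's actual validators.

```python
from dataclasses import dataclass

# Hypothetical record type for illustration; field names are assumptions.
@dataclass
class StationRecord:
    station_id: str
    lat: float
    lon: float

class ValidationError(Exception):
    """Raised at the trust boundary; halts the pipeline (fail-stop)."""

def validate_coordinates(records):
    """Deterministic gate: every record must carry plausible WGS84
    coordinates before any irreversible step (e.g., DOI minting)."""
    bad = [r.station_id for r in records
           if not (-90.0 <= r.lat <= 90.0 and -180.0 <= r.lon <= 180.0)]
    if bad:
        # Fail-stop: block the handoff with an auditable error instead of
        # letting a plausible-but-wrong output propagate downstream.
        raise ValidationError(
            f"{len(bad)} record(s) failed coordinate check: {bad[:5]}")
    return records

def publish_with_gate(records, mint_doi):
    """Audited handoff: the irreversible action runs only on output
    that has passed the deterministic validator."""
    validated = validate_coordinates(records)
    return [mint_doi(r) for r in validated]
```

A lat/lon swap produced by a faulty coordinate transformation, for example, would put latitude outside [-90, 90] and trip the gate before publication, which mirrors the kind of error the abstract reports being blocked across all 2,452 stations.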