🤖 AI Summary
Existing agentic search systems face three key challenges: unstable natural-language reasoning trajectories, context explosion caused by accumulating raw intermediate traces, and sharp performance degradation on complex multi-hop queries. This paper introduces Laser, a framework that couples a structured symbolic action protocol with a compact context register. The protocol organizes agent actions into three spaces (planning, task-solving, and retrospection), making the reasoning process parseable, traceable, and controllable. The lightweight context register stores a compressed representation of the reasoning state, which prevents unbounded context growth. The method combines symbolic action modeling, protocol-driven action design, and context state compression with LLM/LRM-guided multi-step tool invocation and reflective refinement. On Qwen2.5/3-series models, Laser consistently outperforms existing agentic search baselines on multi-hop question answering under both prompting-only and fine-tuning regimes.
📝 Abstract
Recent advances in Large Language Models (LLMs) and Large Reasoning Models (LRMs) have enabled agentic search systems that interleave multi-step reasoning with external tool use. However, existing frameworks largely rely on unstructured natural-language reasoning and accumulate raw intermediate traces in the context, which often leads to unstable reasoning trajectories, context overflow, and degraded performance on complex multi-hop queries. In this study, we introduce Laser, a general framework for stabilizing and scaling agentic search. Laser defines a symbolic action protocol that organizes agent behaviors into three spaces: planning, task-solving, and retrospection. Each action is specified with explicit semantics and a deterministic execution format, enabling structured and logical reasoning processes and reliable action parsing. This design makes intermediate decisions interpretable and traceable, enhancing explicit retrospection and fine-grained control over reasoning trajectories. In coordination with parsable actions, Laser further maintains a compact context register that stores only essential states of the reasoning process, allowing the agent to reason over long horizons without uncontrolled context expansion. Experiments on Qwen2.5/3-series models across challenging multi-hop QA datasets show that Laser consistently outperforms existing agentic search baselines under both prompting-only and fine-tuning settings, demonstrating that Laser provides a principled and effective foundation for robust, scalable agentic search.
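To make the two mechanisms described above more concrete, the following is a minimal sketch of how a symbolic action protocol and a compact context register might fit together. It is not Laser's actual implementation: the abstract does not specify action names or formats, so the action names (PLAN, SEARCH, BACKTRACK, and so on), the `NAME[argument]` wire format, the register fields, and the 200-character truncation used in place of real state compression are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical action names, grouped into the three spaces named in the abstract.
ACTION_SPACES = {
    "planning": {"PLAN", "DECOMPOSE"},
    "task_solving": {"SEARCH", "ANSWER"},
    "retrospection": {"VERIFY", "BACKTRACK"},
}
ACTION_TO_SPACE = {
    name: space for space, names in ACTION_SPACES.items() for name in names
}

# Assumed wire format: NAME[argument], e.g. SEARCH[capital of France]
ACTION_PATTERN = re.compile(r"([A-Z_]+)\[(.*)\]", re.DOTALL)


@dataclass(frozen=True)
class Action:
    space: str
    name: str
    argument: str


def parse_action(text: str) -> Action:
    """Parse one model output into an Action, or raise ValueError."""
    match = ACTION_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a well-formed action: {text!r}")
    name, argument = match.groups()
    if name not in ACTION_TO_SPACE:
        raise ValueError(f"unknown action: {name}")
    return Action(ACTION_TO_SPACE[name], name, argument.strip())


@dataclass
class ContextRegister:
    """Holds only a small reasoning state instead of the raw trace."""

    question: str
    plan: List[str] = field(default_factory=list)
    facts: List[str] = field(default_factory=list)
    max_facts: int = 8  # hypothetical bound on stored facts

    def update(self, action: Action, observation: str = "") -> None:
        if action.name == "PLAN":
            # Replace the plan with the ';'-separated steps in the argument.
            self.plan = [s.strip() for s in action.argument.split(";") if s.strip()]
        elif action.name == "SEARCH" and observation:
            # Keep a short note per search (truncation stands in for compression).
            self.facts.append(f"{action.argument}: {observation.strip()[:200]}")
            self.facts = self.facts[-self.max_facts:]
        elif action.name == "BACKTRACK" and self.facts:
            # Discard the most recent fact so the agent can retry that hop.
            self.facts.pop()

    def render(self) -> str:
        """Build the bounded context that would be shown to the model."""
        lines = [f"Question: {self.question}"]
        lines += [f"Plan {i}: {step}" for i, step in enumerate(self.plan, 1)]
        lines += [f"Fact: {fact}" for fact in self.facts]
        return "\n".join(lines)


if __name__ == "__main__":
    register = ContextRegister(question="Who directed the film that won Best Picture in 1995?")
    register.update(parse_action("PLAN[find the winning film; find its director]"))
    register.update(parse_action("SEARCH[Best Picture 1995]"), "raw search result text")
    print(register.render())
```

Because every action must match a fixed pattern and belong to a known space, malformed or free-form output is rejected at parse time rather than silently entering the trajectory. The register's rendered size depends on the plan length and the `max_facts` bound, not on the number of steps taken, which is the property the abstract attributes to the compact context register.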