JAF: Judge Agent Forest

📅 2026-01-29
🤖 AI Summary
Traditional judge agents evaluate each query-response pair in isolation, so they miss cross-instance inconsistencies, and the main agent loses feedback it could use to improve its reasoning. The Judge Agent Forest (JAF) framework addresses this by having the judge reason jointly over related query-response pairs, turning it from a local evaluator into a global learner. JAF combines ideas from belief propagation and ensemble learning: overlapping in-context neighborhoods form a knowledge graph over which critique propagates, and repeated, randomized evaluations yield an ensemble of judgments. For exemplar selection it introduces an interpretable, relation-aware mechanism that selects diverse exemplars and goes beyond conventional kNN search in embedding space. The mechanism is a locality-sensitive hashing (LSH) scheme that integrates semantic embeddings, LLM-driven hash predicates, and label supervision, and it is used within in-context learning (ICL). The collective feedback helps the main agent refine its reasoning paths; the framework is evaluated on cloud misconfiguration triage in large-scale cloud environments.
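
The ensemble side of this idea, repeated judgments over randomized peer neighborhoods, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `judge` callable stands in for an LLM call, `ensemble_judgment`, `rounds`, and `k` are hypothetical names, and plain majority voting is swapped in for whatever aggregation the paper actually uses.

```python
import random
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical judge signature: (target item, peer exemplars) -> verdict label.
# In practice this would wrap an LLM prompt; here it is any Python callable.
Judge = Callable[[str, Sequence[str]], str]


def ensemble_judgment(
    target: str,
    peers: Sequence[str],
    judge: Judge,
    rounds: int = 5,
    k: int = 3,
    seed: int = 0,
) -> Tuple[str, float]:
    """Judge `target` several times, each time with a random peer subset.

    Returns the majority verdict and the fraction of rounds that agreed
    with it. Majority voting is an illustrative assumption.
    """
    rng = random.Random(seed)
    votes: Counter = Counter()
    for _ in range(rounds):
        # Draw a fresh, randomized neighborhood of at most k peers.
        exemplars = rng.sample(list(peers), min(k, len(peers)))
        votes[judge(target, exemplars)] += 1
    verdict, count = votes.most_common(1)[0]
    return verdict, count / rounds


if __name__ == "__main__":
    # Toy judge: flags the target when no sampled peer shares its first word.
    def toy_judge(target: str, exemplars: Sequence[str]) -> str:
        head = target.split()[0]
        return "consistent" if any(e.startswith(head) for e in exemplars) else "outlier"

    peers = ["open port 22", "open port 3389", "public bucket", "open port 80"]
    print(ensemble_judgment("open port 8080", peers, toy_judge))
```

The agreement fraction gives a rough sense of how sensitive a verdict is to the choice of neighborhood, which is the kind of context sensitivity the ensemble view is meant to smooth out.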

📝 Abstract
Judge agents are fundamental to agentic AI frameworks: they provide automated evaluation and enable iterative self-refinement of reasoning processes. We introduce JAF: Judge Agent Forest, a framework in which the judge agent conducts joint inference across a cohort of query-response pairs generated by a primary agent, rather than evaluating each in isolation. This paradigm elevates the judge from a local evaluator to a holistic learner: by simultaneously assessing related responses, the judge discerns cross-instance patterns and inconsistencies, and its aggregate feedback enables the primary agent to improve by viewing its own outputs through the judge's collective perspective. Conceptually, JAF bridges belief propagation and ensemble-learning principles: overlapping in-context neighborhoods induce a knowledge-graph structure that facilitates propagation of critique, and repeated, randomized evaluations yield a robust ensemble of context-sensitive judgments. JAF can be instantiated entirely via in-context learning (ICL), with the judge prompted for each query using its associated primary-agent response plus a small, possibly noisy set of peer exemplars. While kNN in embedding space is a natural starting point for selecting exemplars, this approach overlooks categorical structure, domain metadata, and nuanced distinctions accessible to modern LLMs. To overcome these limitations, we develop a flexible locality-sensitive hashing (LSH) algorithm that learns informative binary codes by integrating semantic embeddings, LLM-driven hash predicates, supervision from categorical labels, and relevant side information. These hash codes support efficient, interpretable, and relation-aware selection of diverse exemplars, and further optimize exploration of chain-of-thought (CoT) reasoning paths. We validate JAF with an empirical study on the demanding task of cloud misconfiguration triage in large-scale cloud environments.
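
To make the hash-based exemplar selection concrete, here is a minimal sketch under stated assumptions. It is not the paper's algorithm: the paper learns its binary codes with label supervision, whereas this sketch uses fixed random hyperplanes (SimHash-style) for the embedding bits and plain Python callables as stand-ins for LLM-driven hash predicates. All function names (`make_hyperplanes`, `hash_code`, `select_exemplars`) are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Predicate = Callable[[str], bool]  # stand-in for an LLM-answered yes/no question
Code = Tuple[int, ...]


def make_hyperplanes(dim: int, n_bits: int, seed: int = 0) -> List[List[float]]:
    """Random Gaussian hyperplanes; unlearned, unlike the codes in the paper."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(
    text: str,
    embedding: Vector,
    hyperplanes: Sequence[Vector],
    predicates: Sequence[Predicate],
) -> Code:
    """Concatenate sign bits from the embedding with predicate bits from the text."""
    embed_bits = tuple(
        int(sum(w * x for w, x in zip(plane, embedding)) >= 0.0)
        for plane in hyperplanes
    )
    pred_bits = tuple(int(p(text)) for p in predicates)
    return embed_bits + pred_bits


def hamming(a: Code, b: Code) -> int:
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def select_exemplars(query_code: Code, pool_codes: Sequence[Code], k: int) -> List[int]:
    """Pick up to k pool indices, nearest in Hamming distance first.

    Items whose code duplicates an already chosen one are skipped, a crude
    way to keep the exemplar set diverse (an illustrative assumption).
    """
    order = sorted(
        range(len(pool_codes)),
        key=lambda i: (hamming(query_code, pool_codes[i]), i),
    )
    chosen: List[int] = []
    seen = set()
    for i in order:
        if pool_codes[i] in seen:
            continue
        seen.add(pool_codes[i])
        chosen.append(i)
        if len(chosen) == k:
            break
    return chosen


if __name__ == "__main__":
    planes = make_hyperplanes(dim=2, n_bits=3)
    preds = [lambda t: "bucket" in t, lambda t: "port" in t]
    pool = [("public bucket", [0.9, 0.1]), ("open port 22", [-0.8, 0.3])]
    codes = [hash_code(t, e, planes, preds) for t, e in pool]
    query = hash_code("world-readable bucket", [0.8, 0.2], planes, preds)
    print(select_exemplars(query, codes, k=1))
```

Because each predicate bit corresponds to a readable question about the item, a selected exemplar can be explained by listing the bits it shares with the query, which is one way to read the abstract's claim of interpretable selection.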
Problem

Research questions and friction points this paper is trying to address.

judge agent
in-context learning
exemplar selection
cloud misconfiguration triage
relation-aware reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Judge Agent Forest
joint inference
locality-sensitive hashing
ensemble learning
chain-of-thought reasoning
Sahil Garg
Averlon
Artificial Intelligence, Deep Learning, Natural Language Processing
Brad Cheezum
AI Research at Averlon, Redmond, WA
Sridhar Dutta
AI Research at Averlon, Redmond, WA
Vishal Agarwal
IIT(BHU), Varanasi
Software Defined Networks