🤖 AI Summary
This work addresses the poor performance of in-context example (ICE) selection for multi-step reasoning tasks, such as mathematical reasoning, logical deduction, and code generation. We propose a training-free ICE retrieval framework that integrates graph-structured representations with Bayesian networks, eliminating the need for fine-tuning. Unlike conventional text-embedding approaches that capture only shallow semantic similarity, our method explicitly models the reasoning process as a directed graph and employs Bayesian networks to encode causal dependencies among reasoning steps, thereby aligning with human hierarchical reasoning. Graph representation learning enables efficient and semantically grounded ICE retrieval. Empirical evaluation across mathematical reasoning, logical reasoning, and code generation demonstrates substantial improvements over state-of-the-art ICE selection methods, achieving superior few-shot reasoning accuracy and higher retrieval efficiency.
📄 Abstract
In-context learning (ICL) enables large language models (LLMs) to generalize to new tasks by incorporating a few in-context examples (ICEs) directly in the input, without updating parameters. However, the effectiveness of ICL heavily relies on the selection of ICEs, and conventional text-based embedding methods are often inadequate for tasks that require multi-step reasoning, such as mathematical and logical problem solving. This is due to the bias introduced by shallow semantic similarities that fail to capture the deeper reasoning structures required for these tasks. We present GraphIC, a novel approach that leverages graph-based representations of reasoning processes, coupled with Bayesian Networks (BNs), to select ICEs. Graph structures inherently filter out shallow semantics while preserving the core reasoning structure. Importantly, BNs capture the dependency of a node's attributes on its parent nodes, closely mirroring the hierarchical nature of human cognition, where each thought is shaped by preceding ones. This makes BNs particularly well-suited for multi-step reasoning tasks, aligning the process more closely with human-like reasoning. Extensive experiments across three types of reasoning tasks (mathematical reasoning, code generation, and logical reasoning) demonstrate that GraphIC outperforms both training-free and training-based models in selecting ICEs, excelling in terms of both effectiveness and efficiency. We show that GraphIC enhances ICL's performance and interpretability, significantly advancing ICE selection for multi-step reasoning tasks.
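To make the idea of graph-based ICE retrieval concrete, here is a minimal, hypothetical sketch. It is not the GraphIC algorithm itself: we assume each example's reasoning is given as a small directed graph of step embeddings, use one round of parent-to-child averaging as a stand-in for the BN-style dependence of a step on its parents, and rank candidates by a simple pairwise similarity between propagated graphs. All names and the similarity measure are illustrative assumptions.

```python
# Hypothetical sketch of graph-based in-context example (ICE) retrieval.
# NOT the GraphIC method: the message passing and similarity below are
# simplified stand-ins for its BN-based modeling of reasoning steps.
from dataclasses import dataclass, field


@dataclass
class ReasoningGraph:
    # node_vecs[i]: toy embedding for reasoning step i
    node_vecs: list
    # edges (parent, child): step `child` depends on step `parent`
    edges: list = field(default_factory=list)


def propagate(g):
    """One round of parent-to-child averaging: each step's vector is
    mixed with its parents' vectors, mimicking the idea that a node's
    attributes depend on its parent nodes."""
    parents = {i: [] for i in range(len(g.node_vecs))}
    for p, c in g.edges:
        parents[c].append(p)
    out = []
    for i, vec in enumerate(g.node_vecs):
        acc = list(vec)
        for p in parents[i]:
            acc = [a + b for a, b in zip(acc, g.node_vecs[p])]
        n = 1 + len(parents[i])
        out.append([a / n for a in acc])
    return out


def graph_similarity(a, b):
    """Mean pairwise dot product between propagated node vectors."""
    va, vb = propagate(a), propagate(b)
    total = sum(sum(x * y for x, y in zip(u, w)) for u in va for w in vb)
    return total / (len(va) * len(vb))


def select_ices(query, candidates, k=1):
    """Return indices of the k candidates whose reasoning graphs are
    most similar to the query's graph."""
    ranked = sorted(range(len(candidates)),
                    key=lambda i: graph_similarity(query, candidates[i]),
                    reverse=True)
    return ranked[:k]
```

For example, a query graph whose steps point in the same embedding direction as candidate A but are orthogonal to candidate B would rank A first, illustrating how structural-plus-attribute similarity, rather than surface text similarity, drives the selection.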