📝 Abstract
We present Team asdfo123's submission to the LLMSR@XLLM25 shared task, which evaluates large language models on producing fine-grained, controllable, and interpretable reasoning processes. Systems must extract all problem conditions, decompose a chain of thought into statement–evidence pairs, and verify the logical validity of each pair. Leveraging only the off-the-shelf Meta-Llama-3-8B-Instruct, we craft a concise few-shot, multi-turn prompt that first enumerates all conditions and then guides the model to label, cite, and adjudicate every reasoning step. A lightweight post-processor based on regular expressions normalises spans and enforces the official JSON schema. Without fine-tuning, external retrieval, or ensembling, our method ranks 5th overall, achieving macro F1 scores on par with substantially more complex and resource-intensive pipelines. We conclude by analysing the strengths and limitations of our approach and outlining directions for future research in structural reasoning with LLMs. Our code is available at https://github.com/asdfo123/LLMSR-asdfo123.
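To make the post-processing step concrete, here is a minimal sketch of a regex-based span normaliser feeding a strict schema check. This is an illustrative reconstruction, not the team's released code: the field names (`question_parsing`, `cot_parsing`, `statement`, `evidence`, `Verification`) and the normalisation rules are assumptions about the official output format, and the real schema may differ.

```python
import json
import re

def normalise_span(span: str) -> str:
    """Regex-based span normalisation: collapse whitespace, strip
    leading bullets/numbering/quotes, and strip trailing quotes."""
    span = re.sub(r"\s+", " ", span).strip()
    span = re.sub(r"^[\-\*\d\.\)\s\"']+", "", span)  # leading list markers / quotes
    span = re.sub(r"[\"']+$", "", span)              # trailing quotes
    return span.strip()

def enforce_schema(raw: str) -> dict:
    """Parse model output and coerce it into a minimal assumed schema:
    a list of extracted conditions plus statement-evidence-verdict triples."""
    record = json.loads(raw)
    steps = []
    for step in record.get("cot_parsing", []):
        steps.append({
            "statement": normalise_span(str(step.get("statement", ""))),
            "evidence": normalise_span(str(step.get("evidence", ""))),
            # coerce the verdict to the literal strings "true"/"false"
            "Verification": "true"
            if str(step.get("Verification", "")).lower() == "true"
            else "false",
        })
    return {
        "question_parsing": [
            normalise_span(str(c)) for c in record.get("question_parsing", [])
        ],
        "cot_parsing": steps,
    }
```

A quick usage example: feeding in model output whose spans carry stray numbering and quotes yields cleaned, schema-conformant JSON regardless of how the model formatted each span.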