Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information

📅 2025-03-07
📈 Citations: 0
Influential: 0

🤖 AI Summary
Problem: Large language models (LLMs) suffer from hallucination and erroneous theorem selection in geometry problem solving because textual problem descriptions are often ambiguous. Method: This paper proposes Pi-GPS, a diagram-driven multimodal reasoning framework featuring a novel diagram-guided textual disambiguation micro-module. Pi-GPS pairs a multimodal large language model (MLLM) rectifier with a geometric rule verifier to enable neural-symbolic collaborative reasoning. Contribution/Results: The paper systematically demonstrates, for the first time, the critical role of diagrammatic information in mitigating hallucination and improving theorem prediction accuracy. On the Geometry3K benchmark, Pi-GPS achieves nearly 10% higher accuracy than prior state-of-the-art neural-symbolic methods, improving both the robustness and the interpretability of geometric reasoning.

📝 Abstract
Geometry problem solving has garnered increasing attention due to its potential applications in the field of intelligent education. Inspired by the observation that text often introduces ambiguities that diagrams can clarify, this paper presents Pi-GPS, a novel framework that unleashes the power of diagrammatic information to resolve textual ambiguities, an aspect largely overlooked in prior research. Specifically, we design a micro module comprising a rectifier and a verifier: the rectifier employs MLLMs to disambiguate text based on the diagrammatic context, while the verifier ensures that the rectified output adheres to geometric rules, mitigating model hallucinations. Additionally, we explore the impact of LLMs in the theorem predictor, which operates on the disambiguated formal language. Empirical results demonstrate that Pi-GPS surpasses state-of-the-art models, achieving a nearly 10% improvement on Geometry3K over prior neural-symbolic approaches. We hope this work highlights the significance of resolving textual ambiguity in multimodal mathematical reasoning, a crucial factor limiting performance.
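The rectifier and verifier interplay described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the literal syntax (`Line(A,B)`, `Perpendicular(...)`), the three checks in `verify`, the retry loop, and the names `verify`, `rectify_with_verification`, and `toy_rectifier` are all assumptions made for illustration, and the MLLM call is replaced by a stub function.

```python
from __future__ import annotations

import re
from typing import Callable

# Assumed arity of a few shape predicates (hypothetical subset, illustration only).
SHAPE_ARITY = {"Line": 2, "Angle": 3, "Triangle": 3}
# \b keeps "Line(" inside names such as "PointLiesOnLine(" from matching.
SHAPE_RE = re.compile(r"\b(Line|Angle|Triangle)\(([^()]*)\)")


def verify(literal: str, diagram_points: set[str]) -> list[str]:
    """Return rule violations for one formal literal; an empty list means it passes."""
    errors = []
    if literal.count("(") != literal.count(")"):
        errors.append("unbalanced parentheses")
    for shape, args in SHAPE_RE.findall(literal):
        points = [p.strip() for p in args.split(",")]
        if len(points) != SHAPE_ARITY[shape]:
            errors.append(f"{shape} expects {SHAPE_ARITY[shape]} points, got {len(points)}")
        if len(set(points)) != len(points):
            errors.append(f"{shape} repeats a point: {points}")
        unknown = [p for p in points if p not in diagram_points]
        if unknown:
            errors.append(f"{shape} uses points not in the diagram: {unknown}")
    return errors


def rectify_with_verification(
    literal: str,
    diagram_points: set[str],
    rectifier: Callable[[str, list[str]], str],
    max_attempts: int = 3,
) -> str:
    """Ask the rectifier for a rewrite and accept it only if the verifier passes."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        candidate = rectifier(literal, feedback)
        feedback = verify(candidate, diagram_points)
        if not feedback:
            return candidate
    # No candidate passed verification: keep the original literal unchanged.
    return literal


def toy_rectifier(literal: str, feedback: list[str]) -> str:
    """Stand-in for an MLLM call: its first answer hallucinates a point Z."""
    if not feedback:
        return "Perpendicular(Line(A,C),Line(B,Z))"
    return "Perpendicular(Line(A,C),Line(B,D))"


if __name__ == "__main__":
    points = {"A", "B", "C", "D"}
    print(rectify_with_verification("Perpendicular(Line(A,C),Line(B,X))", points, toy_rectifier))
```

In this sketch the verifier's error messages are fed back to the rectifier on the next attempt, and the original literal is kept if no candidate passes. Both choices are design assumptions here, not details confirmed by the abstract.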
Problem

Research questions and friction points this paper is trying to address.

Resolves textual ambiguities in geometry using diagrammatic information.
Enhances geometry problem solving with a rectifier and verifier module.
Improves performance in multimodal mathematical reasoning tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

The Pi-GPS framework resolves textual ambiguities using diagrams.
A micro module combines a rectifier and a verifier for geometric accuracy.
LLMs enhance theorem prediction with the disambiguated formal language.