🤖 AI Summary
This work addresses the underexplored problem of mathematical reasoning from spoken input. We introduce Spoken-MQA, the first benchmark for spoken mathematical question answering, covering pure arithmetic, single-step and multi-step contextual reasoning, and knowledge-intensive reasoning. Methodologically, we adopt a dual evaluation framework, end-to-end and cascaded, spanning automatic speech recognition (ASR), large language models (LLMs), and multimodal LLMs (MLLMs), together with custom speech-to-math data synthesis. Our key contributions are threefold: (1) we systematically identify three critical deficiencies of speech-enabled LLMs in spoken math understanding: weak direct arithmetic computation, excessive reliance on LaTeX notation, and severe degradation in knowledge-intensive reasoning; (2) we empirically verify that while contextual reasoning remains moderately robust (>60% accuracy), both arithmetic and knowledge-intensive reasoning fall below 40%; (3) we establish a new benchmark and analytical paradigm for evaluating and advancing logical reasoning capabilities in speech-driven AI systems.
📝 Abstract
Recent advances in large language models (LLMs) and multimodal LLMs (MLLMs) have led to strong reasoning abilities across a wide range of tasks. However, their ability to perform mathematical reasoning from spoken input remains underexplored. Prior studies of the speech modality have mostly focused on factual speech understanding or simple audio reasoning tasks, providing limited insight into logical step-by-step reasoning, such as that required for mathematical problem solving. To address this gap, we introduce Spoken Math Question Answering (Spoken-MQA), a new benchmark designed to evaluate the mathematical reasoning capabilities of speech-based models, including both cascade models (ASR + LLMs) and end-to-end speech LLMs. Spoken-MQA covers a diverse set of math problems, including pure arithmetic, single-step and multi-step contextual reasoning, and knowledge-oriented reasoning problems, all presented in unambiguous natural spoken language. Through extensive experiments, we find that: (1) while some speech LLMs perform competitively on contextual reasoning tasks involving basic arithmetic, they still struggle with direct arithmetic problems; (2) current LLMs exhibit a strong bias toward symbolic mathematical expressions written in LaTeX and have difficulty interpreting verbalized mathematical expressions; and (3) mathematical knowledge reasoning abilities are significantly degraded in current speech LLMs.
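To make the cascade setting concrete, the sketch below shows the shape of an evaluation loop for such a pipeline: speech is first transcribed, the transcript is passed to a text model, and the prediction is scored against the gold answer. This is a minimal illustration, not the paper's code; `asr_transcribe` and `llm_answer` are hypothetical stand-ins (the toy "LLM" only handles verbalized single-operator arithmetic, the kind of expression the paper finds models struggle with).

```python
# Illustrative cascade (ASR -> LLM) evaluation loop for spoken math QA.
# All model functions are hypothetical stand-ins, not real APIs.

def asr_transcribe(sample):
    # Stand-in ASR: a real system would decode the audio waveform.
    return sample["reference_transcript"]

def llm_answer(question):
    # Stand-in "LLM": maps verbalized arithmetic to an expression and evaluates it.
    ops = {"plus": "+", "minus": "-", "times": "*"}
    digits = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}
    tokens = [ops.get(t, digits.get(t, t)) for t in question.lower().split()]
    expr = "".join(t for t in tokens if t in "0123456789+-*")
    return str(eval(expr)) if expr else ""

def evaluate(samples):
    # Exact-match accuracy over the cascade's predictions.
    correct = sum(llm_answer(asr_transcribe(s)) == s["answer"] for s in samples)
    return correct / len(samples)

samples = [
    {"reference_transcript": "What is two plus three", "answer": "5"},
    {"reference_transcript": "What is four times two", "answer": "8"},
]
print(evaluate(samples))  # 1.0
```

An end-to-end speech LLM would collapse the first two stages into one model that consumes audio directly, which is exactly the axis of comparison the benchmark enables.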