🤖 AI Summary
Existing prompting methods do not explicitly generate or recall governing equations when solving applied mathematics problems in domains such as finance and physics, which limits the reasoning of large language models on these problems. This work proposes Formula-One Prompting (F-1), which introduces mathematical equations as an intermediate representation in prompt engineering for the first time. F-1 is a two-phase approach: it first formulates governing equations from the problem description, then adaptively selects among Chain-of-Thought (CoT), Program-of-Thought (PoT), and direct computation to solve the problem. Both phases, equation formulation and strategy selection, are completed within a single model call. Evaluated across five models and four benchmarks, F-1 outperforms CoT by 5.76% and PoT by 8.42% on average, with a 13.30% improvement over CoT on FinanceMath and, within OlympiadBench, larger gains on physics (+2.55%) than on pure math (+0.44%).
📝 Abstract
Prompting techniques such as Chain-of-Thought (CoT) and Program-of-Thought (PoT) improve LLM mathematical reasoning by structuring intermediate steps in natural language or code. However, applied mathematics problems in domains like finance, physics, and cryptography often require recalling or deriving governing equations, a step that current approaches do not explicitly leverage. We propose Formula-One Prompting (F-1), a two-phase approach that uses mathematical equations as an intermediate representation before adaptive solving. F-1 first formulates governing equations from problem descriptions, then selects a solving strategy among CoT, PoT, and direct computation based on the generated equations, all within a single LLM call. Results across five models and four benchmarks show F-1 outperforms CoT by +5.76% and PoT by +8.42% on average. Crucially, gains are largest in applied domains: +13.30% on FinanceMath over CoT, and within OlympiadBench, larger gains on physics (+2.55%) than on pure math (+0.44%). This demonstrates that F-1 is more effective than CoT on applied mathematics problems.
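The two-phase, single-call structure described in the abstract can be pictured with a rough sketch. The abstract does not give the paper's actual prompt, so everything below is an assumption made for illustration: the template wording, the `Strategy:` and `Answer:` line format, and the helper names `build_f1_prompt` and `parse_f1_response` are all hypothetical.

```python
import re

# Hypothetical prompt wording: the abstract does not give the authors' template.
# Only the structure (equations first, then strategy choice, one call) follows it.
F1_TEMPLATE = """Solve the problem in two phases within this single response.

Phase 1 - Equations: write the governing equations for the problem.
Phase 2 - Solve: choose ONE strategy based on those equations, state it on a
line of the form 'Strategy: <CoT|PoT|Direct>', then solve:
  CoT    = step-by-step reasoning in natural language
  PoT    = a Python program that computes the answer
  Direct = direct computation from the equations
Finish with a line of the form 'Answer: <value>'.

Problem: {problem}
"""


def build_f1_prompt(problem: str) -> str:
    """Build one prompt that covers both phases, so a single LLM call suffices."""
    return F1_TEMPLATE.format(problem=problem.strip())


def parse_f1_response(response: str) -> dict:
    """Extract the chosen strategy and final answer from a model response.

    Assumes the model followed the (hypothetical) line format requested above;
    fields that cannot be found are returned as None.
    """
    strategy = re.search(r"^Strategy:\s*(CoT|PoT|Direct)\s*$", response, re.MULTILINE)
    answer = re.search(r"^Answer:\s*(.+?)\s*$", response, re.MULTILINE)
    return {
        "strategy": strategy.group(1) if strategy else None,
        "answer": answer.group(1) if answer else None,
    }
```

In use, the string from `build_f1_prompt` would be sent to a model once, and `parse_f1_response` would be applied to the reply. The model call itself is left out because the abstract does not tie the method to a specific API.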