Learning Randomized Reductions and Program Properties

📅 2024-12-24
🏛️ arXiv.org
🤖 AI Summary
This work addresses the manual derivation of randomized self-reductions (RSRs), which self-testing and self-correcting programs use to check and repair their own outputs; deriving them by hand is labor-intensive and does not scale. We present Bitween, a method and tool that automatically learns RSRs and program properties for numerical programs by combining symbolic analysis with machine learning. A central finding is that polynomial-time linear regression, a basic optimization method, is sufficient for inferring RSRs and program invariants and often outperforms mixed-integer linear programming solvers. We also establish a theoretical framework for learning these reductions and introduce RSR-Bench, a benchmark suite of scientific and machine learning functions. On nonlinear invariant benchmarks such as NLA-DigBench, Bitween surpasses state-of-the-art tools in scalability, stability, and sample efficiency. Bitween is released as an open-source Python package with a web interface that supports C programs.
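
To make concrete how an RSR lets a program correct its own output, here is a minimal sketch. It assumes a toy linear function f(x) = 3x with the hand-written reduction f(x) = f(x + r) - f(r); the names `buggy_f` and `self_correct` are hypothetical and are not part of Bitween.

```python
import random
from collections import Counter


def buggy_f(x: int) -> int:
    """Hypothetical faulty implementation of f(x) = 3*x: wrong on multiples of 10."""
    return 3 * x + 1 if x % 10 == 0 else 3 * x


def self_correct(program, x: int, trials: int = 31, seed: int = 0) -> int:
    """Recover f(x) via the assumed RSR f(x) = f(x + r) - f(r), using a majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(trials):
        # A fresh random shift r moves both queries away from the input x itself.
        r = rng.randrange(-10**6, 10**6)
        votes[program(x + r) - program(r)] += 1
    # The most frequent answer wins; most random queries avoid the faulty inputs.
    return votes.most_common(1)[0][0]
```

Here `buggy_f(10)` returns 31, while `self_correct(buggy_f, 10)` returns 30, because the program is only ever queried at randomly shifted points.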

📝 Abstract
The correctness of computations remains a significant challenge in computer science, with traditional approaches relying on automated testing or formal verification. Self-testing/correcting programs introduce an alternative paradigm, allowing a program to verify and correct its own outputs via randomized reductions, a concept that previously required manual derivation. In this paper, we present Bitween, a method and tool for automated learning of randomized (self)-reductions and program properties in numerical programs. Bitween combines symbolic analysis and machine learning, with a surprising finding: polynomial-time linear regression, a basic optimization method, is not only sufficient but also highly effective for deriving complex randomized self-reductions and program invariants, often outperforming sophisticated mixed-integer linear programming solvers. We establish a theoretical framework for learning these reductions and introduce RSR-Bench, a benchmark suite for evaluating Bitween's capabilities on scientific and machine learning functions. Our empirical results show that Bitween surpasses state-of-the-art tools in scalability, stability, and sample efficiency when evaluated on nonlinear invariant benchmarks like NLA-DigBench. Bitween is open-source as a Python package and accessible via a web interface that supports C language programs.
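
The idea of recovering a randomized self-reduction by linear regression can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not Bitween's actual pipeline: the function `learn_rsr`, the choice of query `f(x + r)`, and the degree-2 monomial basis are all hypothetical. It samples random inputs, evaluates the target function, and fits the query value as a linear combination of monomials in f(x) and f(r) with ordinary least squares.

```python
import numpy as np


def learn_rsr(f, n_samples: int = 200, seed: int = 0) -> dict:
    """Fit f(x + r) as a linear combination of monomials in f(x) and f(r)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n_samples)
    r = rng.uniform(-1.0, 1.0, n_samples)
    u, v = f(x), f(r)

    # Candidate terms (an assumed basis): monomials of degree <= 2 in f(x), f(r).
    names = ["1", "f(x)", "f(r)", "f(x)*f(r)", "f(x)^2", "f(r)^2"]
    A = np.column_stack([np.ones(n_samples), u, v, u * v, u**2, v**2])

    # Ordinary least squares: find coefficients c with A @ c ~= f(x + r).
    coef, *_ = np.linalg.lstsq(A, f(x + r), rcond=None)

    # Keep only terms whose coefficient is not numerically zero.
    return {n: float(c) for n, c in zip(names, coef) if abs(c) > 1e-6}


relation = learn_rsr(np.exp)
```

For `np.exp`, the only surviving term is `f(x)*f(r)` with a coefficient of approximately 1, which corresponds to the identity exp(x + r) = exp(x) · exp(r), or equivalently the reduction f(x) = f(x + r) / f(r).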
Problem

Research questions and friction points this paper is trying to address.

Automating the discovery of randomized self-reductions for mathematical functions
Developing a neuro-symbolic approach that uses LLMs to find novel query functions
Creating a benchmark suite for evaluating self-reduction discovery methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated learning of randomized self-reductions for mathematical functions
A learning framework based on linear regression that outperforms existing symbolic methods
A neuro-symbolic approach in which LLMs dynamically discover novel query functions