AI Summary
Deep learning models are sensitive to numerical variability in floating-point computations, which can undermine their robustness and reliability. To address this, the work proposes Fuzzy PyTorch, an efficient and scalable evaluation methodology that integrates stochastic arithmetic into PyTorch through Probabilistic Rounding with Instruction Set Management, a library that interfaces with the Verificarlo compiler and provides stochastic rounding plus a novel up-down rounding mode. The approach integrates into existing codebases with minimal code modifications and scales across models ranging from 1 to 341 million parameters. Compared to the Verrou tool, it reduces runtime by 5× to 60× while maintaining model performance.
Abstract
We introduce Fuzzy PyTorch, a framework for rapid evaluation of numerical variability in deep learning (DL) models. As DL is increasingly applied to diverse tasks, understanding variability from floating-point arithmetic is essential to ensure robust and reliable performance. Tools assessing such variability must be scalable, efficient, and integrate seamlessly with existing frameworks while minimizing code modifications. Fuzzy PyTorch enables this by integrating stochastic arithmetic into PyTorch through Probabilistic Rounding with Instruction Set Management, a novel library interfacing with Verificarlo, a numerical analysis compiler. The library offers a stochastic rounding mode and a novel up-down rounding mode. Comparative evaluations show Fuzzy PyTorch maintains model performance and achieves runtime reductions of 5x to 60x versus Verrou, a state-of-the-art tool. We further demonstrate scalability by running models from 1 to 341 million parameters, confirming applicability across small and large DL architectures. Overall, Fuzzy PyTorch provides an efficient, scalable, and practical solution for assessing numerical variability in deep learning, enabling researchers and practitioners to quantify and manage floating-point uncertainty without compromising performance or computational efficiency.
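To make the idea of stochastic rounding concrete, here is a minimal pure-Python sketch. It is not the Fuzzy PyTorch, Verificarlo, or Verrou implementation. The function names `stochastic_round` and `sample_variability`, and the use of a fixed-point grid in place of a floating-point format, are assumptions chosen for illustration. A value between two representable neighbours is rounded up with probability equal to its fractional distance from the lower neighbour, so the rounding is unbiased on average, and repeating a computation yields a spread of results that can serve as a variability estimate.

```python
import math
import random


def stochastic_round(x: float, frac_bits: int, rng: random.Random) -> float:
    """Round x to a grid of spacing 2**-frac_bits (illustrative assumption).

    The upper neighbour is chosen with probability equal to the fractional
    distance of x from the lower neighbour, so the expected result is x.
    """
    scale = 2.0 ** frac_bits
    scaled = x * scale
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1); 0 means x already lies on the grid
    if rng.random() < frac:
        lower += 1
    return lower / scale


def sample_variability(x: float, frac_bits: int, n_samples: int, seed: int = 0):
    """Repeat the rounding n_samples times; return (mean, standard deviation)."""
    rng = random.Random(seed)
    samples = [stochastic_round(x, frac_bits, rng) for _ in range(n_samples)]
    mean = sum(samples) / n_samples
    variance = sum((s - mean) ** 2 for s in samples) / n_samples
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # 0.3 lies between the grid points 0.25 and 0.5 when frac_bits=2.
    mean, std = sample_variability(0.3, frac_bits=2, n_samples=20000)
    print(f"mean={mean:.4f} std={std:.4f}")
```

In this sketch, the standard deviation across repeated runs plays the role of the variability estimate. A real tool applies the same kind of perturbation at the level of individual floating-point instructions rather than to a single value.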