TorchFX: A modern approach to Audio DSP with PyTorch and GPU acceleration

📅 2025-04-11

🤖 AI Summary

Existing DSP libraries often lack the efficiency and flexibility needed for demanding audio workloads, particularly when signal processing has to be combined with AI models. This paper introduces TorchFX, a GPU-accelerated, PyTorch-based Python library for audio DSP. Its main contributions are: (1) an object-oriented interface modeled on torchaudio, extended with a pipe operator for chaining filters; (2) a suite of FIR and IIR filters designed for multichannel audio; and (3) a PyTorch foundation that lets DSP stages sit alongside AI models in the same pipeline. Benchmarks show significant efficiency gains over SciPy, most notably for multichannel filtering. GPU compatibility is currently limited, with broader support and real-time processing planned. The library is open source and available on GitHub.

📝 Abstract

The burgeoning complexity and real-time processing demands of audio signals necessitate optimized algorithms that harness the computational prowess of Graphics Processing Units (GPUs). Existing Digital Signal Processing (DSP) libraries often fall short in delivering the requisite efficiency and flexibility, particularly in integrating Artificial Intelligence (AI) models. In response, we introduce TorchFX: a GPU-accelerated Python library for DSP, specifically engineered to facilitate sophisticated audio signal processing. Built atop the PyTorch framework, TorchFX offers an Object-Oriented interface that emulates the usability of torchaudio, enhancing functionality with a novel pipe operator for intuitive filter chaining. This library provides a comprehensive suite of Finite Impulse Response (FIR) and Infinite Impulse Response (IIR) filters, with a focus on multichannel audio files, thus facilitating the integration of DSP and AI-based approaches. Our benchmarking results demonstrate significant efficiency gains over traditional libraries like SciPy, particularly in multichannel contexts. Despite current limitations in GPU compatibility, ongoing developments promise broader support and real-time processing capabilities. TorchFX aims to become a useful tool for the community, contributing to innovation and progress in DSP with GPU acceleration. TorchFX is publicly available on GitHub at https://github.com/matteospanio/torchfx.
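The pipe-operator chaining described in the abstract can be illustrated with a minimal sketch. This is not TorchFX's actual API: the `Signal` and `FIR` classes below are hypothetical names introduced only for illustration, and the sketch assumes a causal FIR filter implemented with PyTorch's grouped `conv1d`, so that every channel is filtered in a single call.

```python
import torch
import torch.nn.functional as F


class FIR(torch.nn.Module):
    """Hypothetical FIR filter that applies the same taps to every channel."""

    def __init__(self, taps):
        super().__init__()
        # A buffer moves with the module, e.g. when calling .to("cuda").
        self.register_buffer("taps", torch.as_tensor(taps, dtype=torch.float32))

    def forward(self, x):
        # x has shape (channels, samples).
        channels = x.shape[0]
        k = self.taps.numel()
        # conv1d computes cross-correlation, so flip the taps for convolution.
        weight = self.taps.flip(0).repeat(channels, 1, 1)  # (channels, 1, k)
        # Left-pad with zeros so the filter is causal and length is preserved.
        padded = F.pad(x.unsqueeze(0), (k - 1, 0))
        # groups=channels filters each channel independently in one call.
        return F.conv1d(padded, weight, groups=channels).squeeze(0)


class Signal:
    """Hypothetical container pairing samples with a sample rate."""

    def __init__(self, samples, sample_rate):
        self.samples = samples
        self.sample_rate = sample_rate

    def __or__(self, effect):
        # `signal | effect` applies the effect and returns a new Signal,
        # which is what allows several effects to be chained left to right.
        return Signal(effect(self.samples), self.sample_rate)


# Two-channel step input passed through two 2-tap moving averages.
signal = Signal(torch.ones(2, 8), sample_rate=44100)
moving_average = FIR([0.5, 0.5])
out = signal | moving_average | moving_average
print(out.samples[0, :3])
```

Because `__or__` returns a new `Signal`, the chain reads in the order the audio flows, and moving both the samples and the filter to a GPU device would leave the chaining code unchanged.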

Problem

Research questions and friction points this paper is trying to address.

Optimizing audio signal processing with GPU acceleration

Bridging DSP and AI model integration gaps

Enhancing efficiency for multichannel audio processing

Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU-accelerated Python library for DSP

Object-oriented interface built on PyTorch, with a pipe operator for filter chaining

Comprehensive suite of FIR and IIR filters for multichannel audio
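To make the FIR/IIR distinction concrete: an FIR filter's output depends only on current and past inputs, while an IIR filter also feeds back its own past outputs. The sketch below shows a first-order IIR low-pass written with plain PyTorch operations. It is an illustrative assumption, not TorchFX's implementation; the function name `one_pole_lowpass` and the parameter `alpha` are hypothetical. Each time step updates all channels in one vectorized operation.

```python
import torch


def one_pole_lowpass(x: torch.Tensor, alpha: float) -> torch.Tensor:
    """Apply y[n] = (1 - alpha) * x[n] + alpha * y[n - 1] to every channel.

    x has shape (channels, samples); alpha in [0, 1) sets the smoothing.
    """
    # One state value per channel, starting from silence.
    state = torch.zeros(x.shape[0], dtype=x.dtype, device=x.device)
    outputs = []
    for n in range(x.shape[1]):
        # The recursion is sequential in time but vectorized across channels.
        state = (1.0 - alpha) * x[:, n] + alpha * state
        outputs.append(state)
    return torch.stack(outputs, dim=1)


# Step response on two channels: the output rises toward 1.
step = torch.ones(2, 4)
y = one_pole_lowpass(step, alpha=0.5)
print(y[0])
```

The feedback term is what makes the recursion sequential along the time axis, which is one reason multichannel batching matters for IIR filters on a GPU.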
