🤖 AI Summary
To address bandwidth limitations and thermal power constraints in wireless transmission of raw neural data from high-density intracortical brain–computer interfaces, this work proposes a hardware-aware lightweight compression framework. We design a depthwise separable convolutional autoencoder (DS-CAE) and integrate a novel balanced random pruning strategy with the RAMAN tinyML accelerator's zero-skipping and weight-reconstruction mechanisms, enabling index-free, low-power edge encoding on an Efinix Ti60 FPGA. Experiments demonstrate a 150× compression ratio for local field potentials (LFPs), achieving SNDRs of 22.6 dB and 27.4 dB and R² scores of 0.81 and 0.94 on two monkey recordings. Model parameter storage is reduced by 32.4%, and logic element and register utilization are significantly reduced. This is the first hardware–neuro co-designed sparse coding paradigm tailored to neural data streams, jointly achieving a high compression ratio, high fidelity, and ultra-low edge computational overhead.
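To make the DS-CAE idea concrete, below is a minimal NumPy sketch of the depthwise separable convolution that such an encoder is built from. This is an illustration of the general technique, not the paper's actual architecture: the channel counts, kernel size, and stride are arbitrary. A depthwise stage filters each electrode channel independently, then a 1×1 pointwise stage mixes channels, which cuts parameters from `out·in·k` (standard conv) to `in·k + out·in`.

```python
import numpy as np

def depthwise_separable_conv1d(x, dw_kernels, pw_weights, stride=1):
    """Depthwise separable 1D convolution (illustrative sketch).

    x:          (channels, time) input segment
    dw_kernels: (channels, k)   one filter per channel (depthwise stage)
    pw_weights: (out_channels, channels) 1x1 pointwise mixing stage
    """
    c, t = x.shape
    k = dw_kernels.shape[1]
    t_out = (t - k) // stride + 1
    # Depthwise stage: each channel is filtered only by its own kernel.
    dw = np.empty((c, t_out))
    for ch in range(c):
        for i in range(t_out):
            dw[ch, i] = x[ch, i * stride : i * stride + k] @ dw_kernels[ch]
    # Pointwise stage: a 1x1 convolution mixes channels at each time step.
    return pw_weights @ dw

# Parameter-count comparison for, say, 64->128 channels with k=7:
# standard conv: 128*64*7 = 57344 weights
# depthwise separable: 64*7 + 128*64 = 8640 weights
```

A strided stack of such layers shrinks the time axis at every stage, which is how an encoder of this kind reaches large compression ratios with a small parameter budget.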
📝 Abstract
High-quality, multi-channel neural recording is indispensable for neuroscience research and clinical applications. Large-scale brain recordings produce vast amounts of data that must be wirelessly transmitted for subsequent offline analysis and decoding, especially in brain–computer interfaces (BCIs) utilizing high-density intracortical recordings with hundreds or thousands of electrodes. However, transmitting raw neural data presents significant challenges due to limited communication bandwidth and the resulting excessive heating. To address this challenge, we propose a neural signal compression scheme utilizing convolutional autoencoders (CAEs), which achieves a compression ratio of up to 150× for local field potentials (LFPs). The CAE encoder is implemented on RAMAN, an energy-efficient tinyML accelerator designed for edge computing, and deployed on an Efinix Ti60 FPGA utilizing 37.3k LUTs and 8.6k registers. RAMAN exploits sparsity in activations and weights through zero skipping, gating, and weight compression. Additionally, we employ hardware–software co-optimization by pruning the CAE encoder parameters with a hardware-aware balanced stochastic pruning strategy, which resolves workload imbalance and eliminates indexing overhead, reducing parameter storage requirements by up to 32.4%. Using the proposed compact depthwise separable convolutional autoencoder (DS-CAE) model, the compressed neural data from RAMAN is reconstructed offline with signal-to-noise and distortion ratios (SNDR) of 22.6 dB and 27.4 dB, along with R² scores of 0.81 and 0.94, respectively, evaluated on two monkey neural recordings.
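The balanced stochastic pruning idea can be sketched as follows. This is a hedged illustration under assumptions, not the paper's exact algorithm: the block size and the seeded RNG are placeholders. The key property is that every fixed-size block of every weight row is pruned to the same nonzero count, so parallel processing elements that each consume one block see equal workloads, and because the zero pattern is regenerated from a known seed rather than looked up, no per-weight index storage is needed.

```python
import numpy as np

def balanced_random_prune(weights, sparsity, block_size, seed=0):
    """Balanced stochastic pruning (illustrative sketch).

    Zeros the same number of weights in every `block_size`-wide block of
    each row, so (a) per-block workloads stay balanced across processing
    elements and (b) the mask can be regenerated from `seed` at runtime,
    eliminating stored indices.
    """
    rows, cols = weights.shape
    assert cols % block_size == 0, "rows must split into whole blocks"
    drop = int(block_size * sparsity)        # weights removed per block
    pruned = weights.copy()
    rng = np.random.default_rng(seed)        # fixed seed -> reproducible mask
    for r in range(rows):
        for b in range(0, cols, block_size):
            idx = rng.choice(block_size, size=drop, replace=False)
            pruned[r, b + idx] = 0.0
    return pruned
```

In contrast, unstructured magnitude pruning concentrates zeros unevenly, so some processing elements finish early and idle while others straggle, and the surviving weights need explicit index metadata; the balanced, seed-reconstructable mask avoids both costs at the price of pruning some weights regardless of magnitude.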