🤖 AI Summary
Existing DNN accelerators on heterogeneous platforms struggle to simultaneously achieve high memory and computational efficiency across diverse workloads, suffering from a mismatch between hardware capabilities and workload demands. This work proposes FILCO, a flexible compositional architecture that supports real-time fine-grained reconfiguration, enabling it to dynamically operate as either a unified accelerator or multiple independent ones to achieve workload-adaptive optimal resource allocation. By integrating reconfigurable hardware, runtime dataflow scheduling, and a two-stage design space exploration framework, FILCO overcomes the traditional trade-off between flexibility and efficiency. Experimental results on a 7nm AMD Versal VCK190 platform demonstrate that FILCO achieves 1.3× to 5× higher throughput and hardware efficiency across a range of workloads compared to state-of-the-art approaches.
📝 Abstract
With the proliferation of deep neural network (DNN) enabled applications, achieving high hardware resource efficiency across diverse workloads is non-trivial on heterogeneous computing platforms. Prior works propose dedicated architectures to maximize resource efficiency, but a mismatch between hardware and workloads persists when the workloads are diverse. Other works propose overlay architectures that can dynamically switch dataflows for different workloads; however, these designs are still limited in the granularity of their flexibility and incur significant resource inefficiency.
To solve this problem, we propose FILCO, a flexible composable architecture that efficiently matches diverse workloads to achieve optimal storage and computation resource efficiency. FILCO can be reconfigured in real time and flexibly composed into either a unified accelerator or multiple independent ones. We also propose the FILCO framework, which includes an analytical model with a two-stage design space exploration (DSE) that identifies the optimal design point. We evaluate the FILCO framework on the 7nm AMD Versal VCK190 board. Compared with prior works, our design achieves 1.3× to 5× higher throughput and hardware efficiency across diverse workloads.
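To illustrate the general shape of a two-stage DSE as described above, the sketch below runs a coarse sweep over a sparse grid of candidate designs under an analytical model, then refines locally around the coarse winner. The cost model and the parameter names (`pe_rows`, `pe_cols`, `buf_kb`) are illustrative assumptions, not FILCO's actual model or search space.

```python
# Hypothetical two-stage DSE sketch: coarse grid sweep, then local
# refinement around the best coarse design. All numbers are placeholders.
from itertools import product

def analytical_throughput(pe_rows, pe_cols, buf_kb, reuse):
    """Toy roofline-style model: min of compute peak and memory-limited rate."""
    compute = pe_rows * pe_cols       # MACs per cycle (compute bound)
    memory = (buf_kb / 8.0) * reuse   # stand-in for buffer-limited reuse
    return min(compute, memory)

def two_stage_dse(reuse):
    # Stage 1: coarse sweep over a sparse grid of candidate designs.
    coarse = list(product((4, 8, 16), (4, 8, 16), (64, 256, 1024)))
    r0, c0, b0 = max(coarse, key=lambda d: analytical_throughput(*d, reuse))

    # Stage 2: fine-grained local search around the coarse winner.
    fine = [(r, c, b)
            for r in (r0 // 2, r0, r0 * 2)
            for c in (c0 // 2, c0, c0 * 2)
            for b in (b0 // 2, b0, b0 * 2)]
    return max(fine, key=lambda d: analytical_throughput(*d, reuse))
```

The two stages trade breadth for depth: the coarse pass keeps the evaluation count low over the full space, and the fine pass only pays for dense evaluation near a promising region.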