RETENTION: Resource-Efficient Tree-Based Ensemble Model Acceleration with Content-Addressable Memory

📅 2025-06-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high memory overhead and low utilization of tree models on content-addressable memory (CAM) hardware accelerators, this paper proposes RETENTION, an end-to-end framework. Methodologically, it introduces (1) the first iterative pruning criterion tailored for Bagging ensemble models, jointly optimizing accuracy and structural sparsity; and (2) a customized tree-mapping scheme integrating rule-matching and data-aware strategies to mitigate storage redundancy caused by “don’t-care” entries in CAM. Experimentally, the mapping scheme alone improves spatial efficiency by 1.46×–21.30×; the full framework achieves 4.35×–207.12× CAM capacity compression with <3% accuracy degradation. RETENTION thus establishes an efficient, scalable, software-hardware co-optimization pathway for accelerating tree-based models on CAM.

📝 Abstract
Although deep learning has demonstrated remarkable capabilities in learning from unstructured data, modern tree-based ensemble models remain superior in extracting relevant information and learning from structured datasets. While several efforts have been made to accelerate tree-based models, the inherent characteristics of the models pose significant challenges for conventional accelerators. Recent research leveraging content-addressable memory (CAM) offers a promising solution for accelerating tree-based models, yet existing designs suffer from excessive memory consumption and low utilization. This work addresses these challenges by introducing RETENTION, an end-to-end framework that significantly reduces CAM capacity requirement for tree-based model inference. We propose an iterative pruning algorithm with a novel pruning criterion tailored for bagging-based models (e.g., Random Forest), which minimizes model complexity while ensuring controlled accuracy degradation. Additionally, we present a tree mapping scheme that incorporates two innovative data placement strategies to alleviate the memory redundancy caused by the widespread use of don't care states in CAM. Experimental results show that implementing the tree mapping scheme alone achieves 1.46× to 21.30× better space efficiency, while the full RETENTION framework yields 4.35× to 207.12× improvement with less than 3% accuracy loss. These results demonstrate that RETENTION is highly effective in reducing CAM capacity requirement, providing a resource-efficient direction for tree-based model acceleration.
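To see where the don't-care redundancy in the abstract comes from, consider the standard way decision trees are mapped onto ternary CAM: each root-to-leaf path becomes one entry, and every feature the path never tests is stored as a wildcard. The sketch below is only an illustration of that baseline mapping, not the paper's placement strategies; the toy tree, feature count, and helper names are invented for the example.

```python
# A minimal sketch (not RETENTION's mapping scheme) of the baseline mapping of
# a decision tree onto ternary CAM: one entry per root-to-leaf path, with
# untested features stored as don't-care ('*') cells.

# Toy tree over 4 binary features: internal nodes are (feature, left, right),
# leaves are class labels.
tree = (0,
        (1, "A", "B"),         # feature 0 == 0 subtree
        (2, (3, "C", "D"),     # feature 0 == 1 subtree
            "E"))

NUM_FEATURES = 4

def paths(node, entry):
    """Enumerate root-to-leaf paths as ternary CAM entries."""
    if isinstance(node, str):             # leaf: emit (pattern, label)
        yield "".join(entry), node
        return
    feat, left, right = node
    for bit, child in (("0", left), ("1", right)):
        e = entry.copy()                  # each branch fixes one feature bit
        e[feat] = bit
        yield from paths(child, e)

entries = list(paths(tree, ["*"] * NUM_FEATURES))
for pattern, label in entries:
    print(pattern, "->", label)

# Don't-care density: the fraction of CAM cells spent on wildcards, i.e. the
# storage redundancy that better data placement aims to shrink.
dont_cares = sum(p.count("*") for p, _ in entries)
density = dont_cares / (len(entries) * NUM_FEATURES)
print(f"don't-care density: {density:.0%}")
```

Even this tiny tree wastes 40% of its CAM cells on wildcards; deeper, wider ensembles make the ratio far worse, which is the gap the paper's 1.46×–21.30× mapping improvement targets.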
Problem

Research questions and friction points this paper is trying to address.

Reduces CAM capacity for tree-based model inference
Improves memory efficiency in ensemble model acceleration
Minimizes accuracy loss while pruning model complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses content-addressable memory for acceleration
Introduces iterative pruning for Random Forest
Optimizes tree mapping to reduce redundancy
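The pruning idea above can be sketched with a much simpler proxy than the paper's criterion: the actual method jointly scores accuracy and structural sparsity, while the toy below just greedily drops whole trees from a bagging ensemble as long as validation accuracy stays within a 3% budget. The trees, data, and helper names are all synthetic inventions for illustration.

```python
# A hedged sketch of accuracy-constrained ensemble pruning (a stand-in for,
# not a reproduction of, RETENTION's pruning criterion). Greedily remove the
# tree whose absence hurts validation accuracy least, stopping before the
# accuracy drop exceeds a 3% budget.

from collections import Counter

# Toy "trees": each maps a feature vector to a class label.
trees = [
    lambda x: 1 if x[0] > 0.5 else 0,
    lambda x: 1 if x[1] > 0.5 else 0,
    lambda x: 1 if x[0] + x[1] > 1.0 else 0,
    lambda x: 0,                        # a useless tree we expect to prune
    lambda x: 1 if x[0] > 0.9 else 0,   # mostly redundant with the first
]

val_X = [(0.2, 0.1), (0.8, 0.7), (0.6, 0.9), (0.1, 0.8), (0.9, 0.2)]
val_y = [0, 1, 1, 0, 1]

def vote(ensemble, x):
    """Majority vote over the ensemble's per-tree predictions."""
    return Counter(t(x) for t in ensemble).most_common(1)[0][0]

def accuracy(ensemble):
    return sum(vote(ensemble, x) == y for x, y in zip(val_X, val_y)) / len(val_y)

budget = accuracy(trees) - 0.03          # tolerate <3% accuracy degradation
pruned = list(trees)
while len(pruned) > 1:
    # Candidate removal that costs the least validation accuracy.
    best = max(range(len(pruned)),
               key=lambda i: accuracy(pruned[:i] + pruned[i + 1:]))
    candidate = pruned[:best] + pruned[best + 1:]
    if accuracy(candidate) < budget:
        break                            # next removal would blow the budget
    pruned = candidate

print(f"kept {len(pruned)}/{len(trees)} trees, accuracy {accuracy(pruned):.2f}")
```

Fewer (and smaller) trees translate directly into fewer CAM entries, which is why pruning compounds with the mapping scheme to reach the reported 4.35×–207.12× capacity compression.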
Yi-Chun Liao
Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan
Chieh-Lin Tsai
Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan
Yuan-Hao Chang
Professor, Dept. of CSIE, National Taiwan University; IEEE Fellow
Computer System, Computer Architecture, Embedded System, Operating System, Non-volatile Memory
Camélia Slimani
IRIT, Université de Toulouse, Toulouse INP–UT3, CNRS, 31062 Toulouse, France
Jalil Boukhobza
Full Professor in computer science, ENSTA-Bretagne
storage systems, flash memory, embedded systems, Cloud storage, Non-volatile memory
Tei-Wei Kuo
Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan. He is also with Delta Electronics, Taoyuan 33378, Taiwan.