Beyond one-hot encoding? Journey into compact encoding for large multi-class segmentation

📅 2025-10-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: In whole-brain MRI segmentation with a large number of classes (108), the computational and memory cost of one-hot encoding grows linearly with the class count. Method: This work replaces one-hot labels with compact binary encodings whose cost grows logarithmically with the number of classes. Beyond plain binary encoding, it studies error-correcting output codes (ECOCs), class weighting, hard/soft decoding, class-to-codeword assignment, and label embedding trees. Contribution/Results: All binary-encoded models reach lower Dice scores (DSC = 39.3 to 73.8) than the one-hot baseline (DSC = 82.4 (2.8)). The authors report this as an informative negative result and hope it motivates further research on compact encodings for large multi-class segmentation.

📝 Abstract

This work presents novel methods to reduce computational and memory requirements for medical image segmentation with a large number of classes. We curiously observe challenges in maintaining state-of-the-art segmentation performance with all of the explored options. Standard learning-based methods typically employ one-hot encoding of class labels. The computational complexity and memory requirements thus increase linearly with the number of classes. We propose a family of binary encoding approaches instead of one-hot encoding to reduce the computational complexity and memory requirements to logarithmic in the number of classes. In addition to vanilla binary encoding, we investigate the effects of error-correcting output codes (ECOCs), class weighting, hard/soft decoding, class-to-codeword assignment, and label embedding trees. We apply the methods to the use case of whole-brain parcellation with 108 classes based on 3D MR images. While binary encodings have proven efficient in so-called extreme classification problems in computer vision, we faced challenges in reaching state-of-the-art segmentation quality with binary encodings. Compared to one-hot encoding (Dice Similarity Coefficient (DSC) = 82.4 (2.8)), we report reduced segmentation performance with the binary encoding approaches, achieving DSCs in the range from 39.3 to 73.8. Informative negative results all too often go unpublished. We hope that this work inspires future research on compact encoding strategies for large multi-class segmentation tasks.
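
To make the linear-versus-logarithmic contrast concrete, the following is a minimal sketch of plain ("vanilla") binary label encoding with hard decoding. It is an illustration under stated assumptions, not the paper's implementation: the function names `num_bits`, `encode_binary`, and `decode_hard` are hypothetical, and thresholding at 0.5 is assumed as the hard-decoding rule. For 108 classes, the network would predict 7 output channels instead of 108.

```python
import numpy as np

def num_bits(num_classes: int) -> int:
    """Smallest number of bits that can index num_classes distinct labels."""
    return max(1, (num_classes - 1).bit_length())

def encode_binary(labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Map integer labels to plain binary codewords on a new last axis.

    labels must have an integer dtype; the most significant bit comes first.
    """
    bits = num_bits(num_classes)
    shifts = np.arange(bits - 1, -1, -1)
    return ((labels[..., None] >> shifts) & 1).astype(np.uint8)

def decode_hard(probs: np.ndarray) -> np.ndarray:
    """Threshold per-bit probabilities at 0.5 (assumed rule) and read the
    resulting bits back as an integer label."""
    bits = probs.shape[-1]
    weights = 1 << np.arange(bits - 1, -1, -1)
    return ((probs > 0.5).astype(np.int64) * weights).sum(axis=-1)

labels = np.array([0, 1, 107])          # class indices for 108 classes
codes = encode_binary(labels, 108)      # shape (3, 7) instead of (3, 108)
recovered = decode_hard(codes.astype(np.float64))
```

Note that 7 bits span 128 codewords, so 20 of them correspond to no class here. A single wrongly predicted bit can therefore decode to a different or even nonexistent label, which is one motivation for the error-correcting variants the abstract mentions.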

Problem

Research questions and friction points this paper is trying to address.

Reducing computational complexity for large multi-class medical image segmentation

Replacing one-hot encoding with binary encoding to decrease memory requirements

Addressing performance challenges in compact encoding for brain parcellation tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Binary encoding replaces one-hot for efficiency

Error-correcting codes enhance binary encoding robustness

Label embedding trees optimize class representation structure
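
As a rough illustration of how error-correcting codes and hard/soft decoding interact, the sketch below decodes per-bit predictions to the nearest codeword in a codebook. This is a generic nearest-codeword scheme, assumed here for illustration; the paper's actual codes and decoding rules are not specified in this summary. The function `decode_nearest` and the toy 3-class, 5-bit codebook are hypothetical.

```python
import numpy as np

def decode_nearest(probs: np.ndarray, codebook: np.ndarray,
                   soft: bool = True) -> np.ndarray:
    """Return, per position, the index of the closest codeword.

    probs:    (..., n_bits) per-bit probabilities in [0, 1]
    codebook: (n_classes, n_bits) binary codewords, one row per class
    soft:     True compares raw probabilities (L1 distance);
              False thresholds at 0.5 first (Hamming distance)
    """
    bits = probs if soft else (probs > 0.5).astype(probs.dtype)
    # Distance from every position to every codeword: (..., n_classes)
    dist = np.abs(bits[..., None, :] - codebook.astype(probs.dtype)).sum(axis=-1)
    return dist.argmin(axis=-1)

# Toy codebook (hypothetical): pairwise Hamming distance is at least 3,
# so any single flipped bit still decodes to the intended class.
codebook = np.array([[0, 0, 0, 0, 0],
                     [1, 1, 1, 0, 0],
                     [0, 0, 1, 1, 1]])

# Prediction for class 1 whose third bit came out wrong.
probs = np.array([[0.9, 0.8, 0.1, 0.2, 0.1]])
hard_label = decode_nearest(probs, codebook, soft=False)
soft_label = decode_nearest(probs, codebook, soft=True)
```

One caveat of this sketch: the intermediate distance array has one entry per position and class, so for full 3D volumes the decoding step would need to be chunked or restructured to keep the memory savings of the compact encoding.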

🔎 Similar Papers

Latent Point Collapse on a Low Dimensional Embedding in Deep Neural Network Classifiers