Rare-Aware Autoencoding: Reconstructing Spatially Imbalanced Data

📅 2026-04-02

🤖 AI Summary

This work addresses a failure mode of autoencoders on images with spatially imbalanced content: reconstruction is dominated by background regions, so rare yet critical spatial details come out blurred. The study introduces spatial rarity modeling into unsupervised reconstruction for the first time and proposes an autoencoding strategy with two parts. A self-entropy-weighted loss increases the model’s focus on statistically rare locations, and a sample propagation mechanism replays hard-to-reconstruct samples to improve reconstruction consistency. Conventional class-balancing techniques are ill-suited to reconstruction tasks; the proposed method outperforms existing baselines on a simulated dataset and three real-world cross-domain datasets, improving reconstruction quality and generalization under spatial imbalance.
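To make the idea of a rarity-weighted reconstruction loss concrete, here is a minimal NumPy sketch. It is not the paper's formulation, which this page does not give. It assumes that rarity is measured as the self-information `-log p` of each pixel's binarized state (foreground or background) at its location, estimated over the dataset. The function names, the `threshold` parameter, and the normalization are all hypothetical choices made for illustration.

```python
import numpy as np

def self_information_weights(images, threshold=0.5, eps=1e-6):
    """Per-pixel weights -log p (an assumed rarity measure, not the paper's).

    p is the empirical frequency, at each spatial location, of the pixel's
    binarized state (foreground if value > threshold, else background).
    images: float array of shape (N, H, W).
    """
    fg = images > threshold                    # (N, H, W) boolean mask
    p_fg = fg.mean(axis=0)                     # (H, W) foreground frequency
    p_fg = np.clip(p_fg, eps, 1.0 - eps)       # avoid log(0)
    p_state = np.where(fg, p_fg, 1.0 - p_fg)   # probability of the observed state
    return -np.log(p_state)                    # rare states get large weights

def weighted_mse(recon, target, weights):
    """Mean squared error with per-pixel weights normalized to mean 1."""
    w = weights / weights.mean()
    return float(np.mean(w * (recon - target) ** 2))

# Toy data: ten blank 4x4 images, one of which has a single bright pixel.
images = np.zeros((10, 4, 4))
images[0, 0, 0] = 1.0
weights = self_information_weights(images)
blank_recon = np.zeros_like(images)            # a "background-only" reconstruction
print(weighted_mse(blank_recon, images, weights))
```

In this toy setup, a reconstruction that outputs only background is penalized far more by the weighted loss than by plain MSE, because the single missed pixel is in a rare state at its location.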

📝 Abstract

Autoencoders can be challenged by spatially non-uniform sampling of image content. This is common in medical imaging, biology, and physics, where informative patterns occur rarely at specific image coordinates: background dominates these locations in most samples, biasing reconstructions toward the majority appearance. In practice, autoencoders favor dominant patterns, losing fine-grained detail and producing blurred reconstructions for rare spatial inputs, especially under spatial data imbalance. We address spatial imbalance with two complementary components: (i) a self-entropy-based loss that upweights statistically uncommon spatial locations and (ii) Sample Propagation, a replay mechanism that selectively re-exposes the model to hard-to-reconstruct samples across batches during training. We benchmark existing data balancing strategies, originally developed for supervised classification, in the unsupervised reconstruction setting. Drawing on the limitations of these approaches, our method specifically targets spatial imbalance by encouraging models to focus on statistically rare locations, improving reconstruction consistency compared to existing baselines. We validate on a simulated dataset with controlled spatial imbalance conditions and on three diverse, uncontrolled real-world datasets spanning physical, biological, and astronomical domains. Our approach outperforms baselines on various reconstruction metrics, particularly under spatially imbalanced distributions. These results highlight the importance of how data are represented in a batch and of emphasizing rare samples in unsupervised image reconstruction. We will make all code and related data available.
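The replay idea behind Sample Propagation can be sketched as follows. The abstract does not specify the selection rule, so this sketch assumes the simplest one: carry the `k` samples with the highest per-sample reconstruction error from one batch into the next. The function names and the top-`k` rule are hypothetical and should not be read as the authors' implementation.

```python
import numpy as np

def select_hard_samples(batch, per_sample_error, k):
    """Return the k samples with the largest reconstruction error.

    Assumed selection rule (top-k by error); the paper's rule may differ.
    batch: array of shape (B, H, W); per_sample_error: array of shape (B,).
    """
    order = np.argsort(per_sample_error)[::-1][:k]   # indices, hardest first
    return batch[order]

def propagate(next_batch, carried):
    """Prepend carried-over hard samples to the next batch."""
    if carried is None or len(carried) == 0:
        return next_batch
    return np.concatenate([carried, next_batch], axis=0)

# Toy usage: errors would normally come from the autoencoder's loss.
batch = np.arange(16, dtype=float).reshape(4, 2, 2)
errors = np.array([0.1, 0.9, 0.3, 0.7])
carried = select_hard_samples(batch, errors, k=2)
next_batch = propagate(np.ones((4, 2, 2)), carried)
print(next_batch.shape)
```

In a training loop, `select_hard_samples` would run after each step on the per-sample losses, and `propagate` would run before the next step, so that poorly reconstructed samples are seen again sooner than uniform sampling would allow.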

Problem

Research questions and friction points this paper is trying to address.

spatial imbalance

autoencoding

rare patterns

image reconstruction

data imbalance

Innovation

Methods, ideas, or system contributions that make the work stand out.

spatial imbalance

autoencoder

self-entropy loss