Shape Distribution Matters: Shape-specific Mixture-of-Experts for Amodal Segmentation under Diverse Occlusions

📅 2025-08-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
To improve amodal mask reconstruction under complex occlusion and extreme shape variation (from rigid furniture to highly deformable clothing), this paper proposes ShapeMoE, a shape-aware sparse mixture-of-experts framework. It learns a latent shape distribution space in which each object is encoded as a Gaussian embedding. A shape-aware sparse router then establishes an interpretable, dynamic shape-to-expert mapping, which mitigates the expert mismatch and insufficient specialization that arise when MoE is applied naively. Lightweight expert networks handle fine-grained, shape-adaptive prediction. On the COCOA-cls, KINS, and D2SA benchmarks, ShapeMoE consistently outperforms state-of-the-art methods, particularly in occluded-region segmentation, while remaining computationally efficient.

📝 Abstract
Amodal segmentation aims to predict complete object masks, covering both visible and occluded regions. This task poses significant challenges due to complex occlusions and extreme shape variation, from rigid furniture to highly deformable clothing. Existing one-size-fits-all approaches rely on a single model to handle all shape types and struggle to capture and reason about diverse amodal shapes because of limited representation capacity. A natural solution is to adopt a Mixture-of-Experts (MoE) framework, assigning experts to different shape patterns. However, naively applying MoE without considering the object's underlying shape distribution can lead to mismatched expert routing and insufficient expert specialization, resulting in redundant or underutilized experts. To address these issues, we introduce ShapeMoE, a shape-specific sparse Mixture-of-Experts framework for amodal segmentation. The key idea is to learn a latent shape distribution space and dynamically route each object to a lightweight expert tailored to its shape characteristics. Specifically, ShapeMoE encodes each object into a compact Gaussian embedding that captures key shape characteristics. A Shape-Aware Sparse Router then maps the object to the most suitable expert, enabling precise and efficient shape-aware expert routing. Each expert is lightweight and specializes in predicting occluded regions for specific shape patterns. ShapeMoE offers good interpretability through a clear shape-to-expert correspondence, while maintaining high capacity and efficiency. Experiments on COCOA-cls, KINS, and D2SA show that ShapeMoE consistently outperforms state-of-the-art methods, especially in occluded region segmentation. The code will be released.
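The routing idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the router scores experts, so this example assumes each expert owns a prototype diagonal Gaussian and that an object is sent to the single expert whose prototype has the lowest KL divergence from the object's Gaussian embedding. The names `kl_diag_gaussians`, `route_top1`, and `prototypes` are hypothetical.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between two diagonal Gaussians given as lists of means and variances."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def route_top1(obj_mu, obj_var, prototypes):
    """Top-1 (sparse) routing: index of the expert prototype closest in KL.

    ASSUMPTION: KL-to-prototype scoring is an illustrative choice,
    not something stated in the paper's abstract.
    """
    scores = [kl_diag_gaussians(obj_mu, obj_var, mu, var) for mu, var in prototypes]
    return min(range(len(scores)), key=scores.__getitem__)


# Hypothetical expert prototypes in a 2-D latent shape space: (means, variances).
prototypes = [
    ([0.0, 0.0], [1.0, 1.0]),  # e.g. an expert for rigid shapes
    ([3.0, 3.0], [1.0, 1.0]),  # e.g. an expert for deformable shapes
]

# Hypothetical Gaussian embedding of one object.
expert_id = route_top1([2.5, 3.2], [0.5, 0.5], prototypes)
print(expert_id)
```

Because only one expert is selected per object, only that expert's parameters are evaluated, which is what makes the mixture sparse.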
Problem

Research questions and friction points this paper is trying to address.

Handles diverse occlusions in amodal segmentation tasks
Addresses shape variation from rigid to deformable objects
Improves expert routing and specialization in Mixture-of-Experts
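One way to check for the redundant or underutilized experts mentioned in the abstract is to measure how evenly routed objects are spread across experts. The sketch below is a generic diagnostic, not a metric from the paper: it computes the fraction of objects sent to each expert and the normalized entropy of that distribution (1.0 means perfectly even use, 0.0 means one expert receives everything). The function names and the sample `assignments` are hypothetical, and `num_experts` is assumed to be at least 2.

```python
import math
from collections import Counter


def expert_utilization(assignments, num_experts):
    """Fraction of objects routed to each expert (zero for unused experts)."""
    counts = Counter(assignments)
    n = len(assignments)
    return [counts.get(e, 0) / n for e in range(num_experts)]


def normalized_entropy(fractions):
    """Entropy of the usage distribution divided by its maximum, log(num_experts)."""
    h = -sum(p * math.log(p) for p in fractions if p > 0)
    return h / math.log(len(fractions))


# Hypothetical routing decisions for 8 objects over 4 experts.
assignments = [0, 0, 0, 0, 0, 0, 1, 1]
usage = expert_utilization(assignments, 4)
balance = normalized_entropy(usage)
print(usage, round(balance, 3))
```

Here two of the four experts never receive an object, so the balance score falls well below 1.0, which is the kind of underuse that shape-aware routing is meant to reduce.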
Innovation

Methods, ideas, or system contributions that make the work stand out.

ShapeMoE uses shape-specific Mixture-of-Experts
Dynamic routing based on latent shape distribution
Lightweight experts for occluded region prediction
Authors

Zhixuan Li
Research Fellow, CCDS, Nanyang Technological University (Singapore)
Computer Vision, Scene Understanding, Occlusion Handling
Yujia Liu
School of Computer Science, Peking University, Beijing, China; National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, China; National Engineering Research Center of Visual Technology, Peking University, Beijing, China
Chen Hui
Harbin Institute of Technology & Nanyang Technological University
Image compression, quality assessment, multimedia security, image and video processing
Jeonghaeng Lee
Department of Electrical and Electronic Engineering, Yonsei University, Korea
Sanghoon Lee
Department of Electrical and Electronic Engineering, Yonsei University, Korea
Weisi Lin
President's Chair Professor in Computer Science, CCDS, Nanyang Technological University
Perception-inspired signal modeling, perceptual multimedia quality evaluation, video compression, image processing & analysis