SegCompass: Exploring Interpretable Alignment with Sparse Autoencoders for Enhanced Reasoning Segmentation

📅 2026-05-21

🤖 AI Summary

This work addresses the lack of a transparent, interpretable alignment between language reasoning and visual perception in reasoning segmentation. Existing methods rely either on opaque latent query alignment or on textual localization readouts that act as an unconstrained post-hoc step. The authors propose SegCompass, which uses a sparse autoencoder (SAE) to map both the instruction-driven chain-of-thought (CoT) trace and the visual tokens into a shared sparse concept space. A query codebook selects salient concepts from this space, and a slot mapper grounds them spatially, giving an alignment pathway that is both differentiable and interpretable. The model is trained end-to-end, combining reinforcement learning for the reasoning path with standard segmentation supervision. On five challenging benchmarks, SegCompass matches or exceeds state-of-the-art performance, and the quality of the learned sparse concepts correlates strongly with mask accuracy, which supports the effectiveness and traceability of the alignment mechanism.
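To make the idea of a shared sparse concept space concrete, here is a minimal sketch of a top-k sparse autoencoder applied to one text token and one visual token with the same encoder weights. It is an illustration under stated assumptions, not the SegCompass implementation: the top-k sparsity rule, the random weights, the dimensions, and every function name (`make_matrix`, `sae_encode`, `sae_decode`) are hypothetical, since the summary does not specify the SAE variant used.

```python
import random

def make_matrix(rows, cols, rng):
    """Random weight matrix; a stand-in for learned SAE weights (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def sae_encode(token, w_enc, k):
    """Project a token into concept space, apply ReLU, keep the k largest activations."""
    # One activation per concept: dot product of the token with that concept's encoder row.
    acts = [max(0.0, sum(w * x for w, x in zip(row, token))) for row in w_enc]
    # Indices of the k strongest concepts; all other concepts are zeroed out.
    keep = set(sorted(range(len(acts)), key=lambda i: acts[i], reverse=True)[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(acts)]

def sae_decode(code, w_dec):
    """Reconstruct the token as a weighted sum of concept directions."""
    dim = len(w_dec[0])
    return [sum(code[c] * w_dec[c][d] for c in range(len(code))) for d in range(dim)]

rng = random.Random(0)
TOKEN_DIM, NUM_CONCEPTS, TOP_K = 8, 32, 4  # hypothetical sizes
w_enc = make_matrix(NUM_CONCEPTS, TOKEN_DIM, rng)
w_dec = make_matrix(NUM_CONCEPTS, TOKEN_DIM, rng)

# Both modalities pass through the same encoder, so their codes share concept indices.
text_token = [rng.uniform(-1.0, 1.0) for _ in range(TOKEN_DIM)]
visual_token = [rng.uniform(-1.0, 1.0) for _ in range(TOKEN_DIM)]
text_code = sae_encode(text_token, w_enc, TOP_K)
visual_code = sae_encode(visual_token, w_enc, TOP_K)

shared = [i for i in range(NUM_CONCEPTS) if text_code[i] > 0 and visual_code[i] > 0]
print("active text concepts:", [i for i, a in enumerate(text_code) if a > 0])
print("concepts active for both tokens:", shared)
```

Because each code has at most `TOP_K` nonzero entries, the concepts that drive a given token can be listed and inspected directly, which is the property that makes a sparse interface easier to trace than a dense latent query.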

📝 Abstract

While large language models provide strong compositional reasoning, existing reasoning segmentation pipelines fail to transparently connect this reasoning to visual perception. Current methods, such as latent query alignment, are end-to-end yet opaque "black boxes". Conversely, textual localization readout is merely readable, not truly interpretable, often functioning as an unconstrained post-hoc step. To bridge this interpretability gap, we propose SegCompass, an end-to-end model that leverages a Sparse Autoencoder (SAE) to forge an explicit, interpretable, and differentiable alignment pathway. Given an image-instruction pair, SegCompass first generates a chain-of-thought (CoT) trace. The core of our method is an SAE that maps both the CoT and visual tokens into a shared, high-dimensional sparse concept space. A query codebook selects salient concepts from this space, which are then spatially grounded by a slot mapper into a multi-slot heatmap that guides the final mask decoder. The entire model is trained jointly, unifying reinforcement learning for the reasoning path with standard segmentation supervision. This SAE-driven interface provides a "white-box" connection that is significantly more traceable than latent queries and more coherent than textual readouts. Extensive experiments on five challenging benchmarks demonstrate that SegCompass matches or surpasses state-of-the-art performance. Crucially, our visual and quantitative analyses show a strong correlation between the quality of the learned sparse concepts and final mask accuracy, confirming that SegCompass achieves superior results through its enhanced and inspectable alignment. Code is available at https://github.com/ZhenyuLU-Heliodore/SegCompass.
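The selection and grounding steps described in the abstract can be sketched on toy data as follows. This is a simplified illustration, not the authors' code: `select_concepts` and `slot_heatmaps` are hypothetical names, all numeric values are made up for the example, and the hard argmax used here is a non-differentiable simplification of a mechanism the abstract describes as differentiable.

```python
def select_concepts(cot_code, codebook):
    """For each codebook query, pick the concept with the highest query-weighted CoT activation."""
    selected = []
    for query in codebook:
        scores = [q * a for q, a in zip(query, cot_code)]
        selected.append(max(range(len(scores)), key=lambda i: scores[i]))
    return selected

def slot_heatmaps(patch_codes, selected, height, width):
    """One heatmap per slot: each patch's activation on that slot's concept, scaled to [0, 1]."""
    maps = []
    for concept in selected:
        flat = [code[concept] for code in patch_codes]
        peak = max(flat)
        if peak > 0:
            flat = [v / peak for v in flat]
        # Reshape the flat patch list into a height x width grid.
        maps.append([flat[r * width:(r + 1) * width] for r in range(height)])
    return maps

# Toy inputs (all hypothetical): 4 concepts, a 2x2 patch grid, 2 slots.
cot_code = [0.0, 0.9, 0.0, 0.4]           # sparse code of the CoT trace
codebook = [[1.0, 1.0, 0.0, 0.0],         # query 0 attends to concepts 0 and 1
            [0.0, 0.0, 1.0, 1.0]]         # query 1 attends to concepts 2 and 3
patch_codes = [[0.0, 0.2, 0.0, 0.0],      # sparse codes of the four visual patches
               [0.0, 0.8, 0.0, 0.1],
               [0.0, 0.0, 0.0, 0.5],
               [0.0, 0.4, 0.0, 0.0]]

selected = select_concepts(cot_code, codebook)
heatmaps = slot_heatmaps(patch_codes, selected, height=2, width=2)
print("selected concepts per slot:", selected)
for slot, grid in enumerate(heatmaps):
    print("slot", slot, "heatmap:", grid)
```

In this toy setup each slot's heatmap peaks at the patch whose visual code is strongest on the concept the reasoning trace selected, so the path from a reasoning concept to an image region stays inspectable at every step. In the described model, these multi-slot heatmaps would then guide the mask decoder.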

Problem

Research questions and friction points this paper is trying to address.

reasoning segmentation

interpretability

alignment

visual perception

compositional reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse Autoencoder

Interpretable Alignment

Reasoning Segmentation