Rethinking Encoder-Decoder Flow Through Shared Structures

📅 2025-01-24
🤖 AI Summary
Problem: Conventional decoders for dense prediction tasks suffer from outdated architectural designs and insufficient cross-layer contextual sharing, limiting feature propagation efficiency and spatial consistency. Method: We propose a novel decoder architecture centered on a learnable, shared "bank", a parameterized module that is resampled and fused across multiple scales to enable explicit cross-layer contextual reuse during decoding, departing from the traditional paradigm in which individual blocks decode intermediate feature maps sequentially and independently. The bank is applied to Transformer-based architectures and optimized jointly, end to end. Contribution/Results: On depth estimation benchmarks covering both natural and synthetic images, the approach improves the performance of state-of-the-art transformer-based architectures when trained on large-scale datasets. To our knowledge, this work presents the first systematic design and empirical validation of a universal, decoder-level contextual sharing mechanism.

📝 Abstract
Dense prediction tasks have enjoyed a growing complexity of encoder architectures; decoders, however, have remained largely the same. They rely on individual blocks decoding intermediate feature maps sequentially. We introduce banks, shared structures that are used by each decoding block to provide additional context in the decoding process. Applied via resampling and feature fusion, these structures improve performance on depth estimation for state-of-the-art transformer-based architectures on natural and synthetic images whilst training on large-scale datasets.
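
The idea of one shared structure being resampled and fused at every decoding block can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not state the resampling method or the fusion operator, so nearest-neighbour resizing and element-wise addition are assumptions here, as are the names `resample_nearest`, `fuse`, and `decode_with_bank`. The bank is a fixed 2x2 grid of numbers rather than a learned tensor, and each block's output is kept separate instead of being passed on to the next block.

```python
def resample_nearest(bank, out_h, out_w):
    """Resize a 2D grid (list of rows) to out_h x out_w by nearest-neighbour lookup.

    Assumption: the source only says the bank is "resampled"; the method is not given.
    """
    in_h, in_w = len(bank), len(bank[0])
    return [
        [bank[(i * in_h) // out_h][(j * in_w) // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def fuse(features, context):
    """Fuse two same-sized grids by element-wise addition (an assumed fusion operator)."""
    return [
        [f + c for f, c in zip(f_row, c_row)]
        for f_row, c_row in zip(features, context)
    ]


def decode_with_bank(feature_maps, bank):
    """Give every decoding step the same bank, resampled to that step's resolution."""
    outputs = []
    for fmap in feature_maps:
        height, width = len(fmap), len(fmap[0])
        context = resample_nearest(bank, height, width)
        outputs.append(fuse(fmap, context))
    return outputs


# Hypothetical inputs: one shared 2x2 bank and feature maps at two resolutions.
bank = [[1.0, 2.0], [3.0, 4.0]]
coarse = [[0.0, 0.0], [0.0, 0.0]]
fine = [[10.0] * 4 for _ in range(4)]

fused_coarse, fused_fine = decode_with_bank([coarse, fine], bank)
```

Both feature maps receive context from the same `bank` object, which is the point of the sketch: the shared structure is reused across scales rather than each block working only from its own input.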
Problem

Research questions and friction points this paper is trying to address.

Information Encoding
Deep Learning
Image Authentication
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bank Methodology
Image Processing
Information Encoding and Decoding
Frederik Laboyrie
Samsung R&D Institute UK (SRUK), London, England
M. K. Yucel
Samsung R&D Institute UK (SRUK), London, England
Albert Saà-Garriga
Principal Research Engineer at Samsung Electronics
Parallel Computing, Computer Vision, Source to Source Compilers