FCN+: Global Receptive Convolution Makes FCN Great Again

Date: 2023-03-08
Venue: arXiv.org
Citations: 1
Influential: 0
AI Summary
To address the limited receptive field of fully convolutional networks (FCNs), which hinders global context modeling, this paper proposes Global Receptive Convolution (GRC). GRC splits the channels of a convolutional filter into two groups. The grid sampling locations of the first group are shifted to different spatial coordinates across the whole feature map according to channel index, while the second group keeps its original sampling locations. This gives the convolution a global receptive field without introducing extra learnable parameters, and lets it combine global context with the per-pixel location information needed for dense prediction. Evaluated on PASCAL VOC 2012, Cityscapes, and ADE20K, the resulting FCN+ achieves performance comparable to state-of-the-art semantic segmentation methods.
๐Ÿ“ Abstract
Fully convolutional network (FCN) is a seminal work for semantic segmentation. However, due to its limited receptive field, FCN cannot effectively capture global context information which is vital for semantic segmentation. As a result, it is beaten by state-of-the-art methods that leverage different filter sizes for larger receptive fields. However, such a strategy usually introduces more parameters and increases the computational cost. In this paper, we propose a novel global receptive convolution (GRC) to effectively increase the receptive field of FCN for context information extraction, which results in an improved FCN termed FCN+. The GRC provides the global receptive field for convolution without introducing any extra learnable parameters. The motivation of GRC is that different channels of a convolutional filter can have different grid sampling locations across the whole input feature map. Specifically, the GRC first divides the channels of the filter into two groups. The grid sampling locations of the first group are shifted to different spatial coordinates across the whole feature map, according to their channel indexes. This can help the convolutional filter capture the global context information. The grid sampling location of the second group remains unchanged to keep the original location information. By convolving using these two groups, the GRC can integrate the global context into the original location information of each pixel for better dense prediction results. With the GRC built in, FCN+ can achieve comparable performance to state-of-the-art methods for semantic segmentation tasks, as verified on PASCAL VOC 2012, Cityscapes, and ADE20K. Our code will be released at https://github.com/Zhongying-Deng/FCN_Plus.
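The two-group channel shift described in the abstract can be illustrated with a small pure-Python sketch. This is not the authors' implementation. The function names (`roll2d`, `grc_shift`, `pointwise_mix`), the evenly spaced offset schedule, and the circular wrap-around at the borders are all assumptions made for illustration. Shifting the input channels and then applying an ordinary convolution is used here as a stand-in for shifting the filter's sampling grid.

```python
def roll2d(plane, dy, dx):
    """Circularly shift an H x W list: out[y][x] = plane[(y + dy) % H][(x + dx) % W]."""
    h, w = len(plane), len(plane[0])
    return [[plane[(y + dy) % h][(x + dx) % w] for x in range(w)] for y in range(h)]


def grc_shift(feature, shifted_channels):
    """Apply a channel-dependent spatial shift to the first group of channels.

    feature: list of C planes, each an H x W list of numbers.
    shifted_channels: size of the first (shifted) group; the remaining
    channels keep their original sampling locations.
    """
    h, w = len(feature[0]), len(feature[0][0])
    out = []
    for c, plane in enumerate(feature):
        if c < shifted_channels:
            # Hypothetical schedule (an assumption, not from the paper):
            # spread the offsets evenly over the H*W positions of the map.
            flat = ((c + 1) * h * w) // (shifted_channels + 1)
            dy, dx = divmod(flat, w)
            out.append(roll2d(plane, dy, dx))
        else:
            # Second group: unchanged, preserving the original location information.
            out.append([row[:] for row in plane])
    return out


def pointwise_mix(feature, weights):
    """1x1 convolution with one output channel: a weighted sum over channels per pixel."""
    h, w = len(feature[0]), len(feature[0][0])
    return [[sum(wt * plane[y][x] for wt, plane in zip(weights, feature))
             for x in range(w)] for y in range(h)]


if __name__ == "__main__":
    plane = [[0, 1], [2, 3]]
    feat = [[row[:] for row in plane] for _ in range(4)]  # 4 identical channels
    shifted = grc_shift(feat, shifted_channels=2)
    # Each output pixel now mixes values from different spatial positions.
    print(pointwise_mix(shifted, [1, 1, 1, 1]))
```

No weights are added by the shift itself: the only parameters are those of the ordinary convolution that follows, which matches the abstract's claim that GRC introduces no extra learnable parameters.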
Problem

Research questions and friction points this paper is trying to address.

Enhance FCN's global context capture
Avoid the extra parameters and computational cost of larger filter sizes
Improve semantic segmentation accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Global Receptive Convolution (GRC) introduced
GRC increases FCN's receptive field
GRC integrates global context efficiently
Zhongying Deng
University of Cambridge
Deep Learning, Multi-modal Learning, Computer Vision, Medical Image Analysis
Xiaoyu Ren
Institute of Atmospheric Physics, Chinese Academy of Sciences
Jin Ye
Shanghai Artificial Intelligence Laboratory
Junjun He
Shanghai Jiao Tong University
Y. Qiao
Shanghai Artificial Intelligence Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences