UniStitch: Unifying Semantic and Geometric Features for Image Stitching

📅 2026-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the long-standing disconnect between handcrafted geometric features and learned semantic representations in image stitching, which has hindered their effective integration. To overcome this limitation, the authors propose UniStitch, a framework that unifies geometric and semantic representations. Specifically, a Neural Point Transformer converts sparse, unordered geometric keypoints into dense 2D feature maps that align with semantic representations, while an Adaptive Mixture of Experts module adaptively fuses these complementary features. The resulting end-to-end deep image stitching pipeline achieves state-of-the-art performance, significantly outperforming existing approaches, particularly in complex, challenging scenes.

📝 Abstract
Traditional image stitching methods estimate warps from hand-crafted geometric features, whereas recent learning-based solutions instead leverage semantic features from neural networks. These two lines of research have evolved along largely separate paths, with virtually no meaningful convergence to date. In this paper, we take a pioneering step to bridge this gap by unifying semantic and geometric features with UniStitch, an image stitching framework built on unified multimodal features. To align discrete geometric features (i.e., keypoints) with continuous semantic feature maps, we present a Neural Point Transformer (NPT) module, which transforms unordered, sparse 1D geometric keypoints into ordered, dense 2D semantic maps. Then, to integrate the advantages of both representations, an Adaptive Mixture of Experts (AMoE) module is designed to fuse the geometric and semantic representations. It dynamically shifts focus toward the more reliable features during fusion, allowing the model to handle complex scenes, especially when either modality is compromised. The fused representation can be plugged into common deep stitching pipelines, delivering significant performance gains over either feature alone. Experiments show that UniStitch outperforms existing state-of-the-art methods by a large margin, paving the way for a unified paradigm spanning traditional and learning-based image stitching.
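The abstract describes two operations: turning sparse keypoints into an ordered dense map (NPT), and gating between geometric and semantic features by reliability (AMoE). The paper's actual modules are learned networks whose details are not given here; the sketch below is only a toy illustration of the two ideas, with all function names, grid sizes, and the sigmoid gate being assumptions rather than the authors' design.

```python
# Toy sketch of the two ideas in UniStitch's abstract -- NOT the paper's
# implementation. rasterize_keypoints and gated_fusion are hypothetical names.
import math

def rasterize_keypoints(keypoints, h, w):
    """Scatter unordered, sparse keypoints into an ordered, dense h x w map.

    keypoints: list of (x, y) pairs in [0, 1) normalized coordinates.
    Returns a 2D list counting keypoints per cell -- a crude stand-in for
    the learned Neural Point Transformer, which would emit feature vectors.
    """
    grid = [[0.0] * w for _ in range(h)]
    for x, y in keypoints:
        col = min(int(x * w), w - 1)
        row = min(int(y * h), h - 1)
        grid[row][col] += 1.0
    return grid

def gated_fusion(geo, sem, reliability):
    """Blend geometric and semantic maps cell by cell.

    reliability: per-cell logit favoring the geometric branch; a sigmoid
    turns it into a soft gate (a mixture of experts reduced to two experts,
    so the output leans on whichever modality looks more trustworthy).
    """
    h, w = len(geo), len(geo[0])
    fused = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            g = 1.0 / (1.0 + math.exp(-reliability[i][j]))  # sigmoid gate
            fused[i][j] = g * geo[i][j] + (1.0 - g) * sem[i][j]
    return fused
```

In the real system both the point-to-map transform and the gating weights would be learned end-to-end; the fixed rasterization and hand-set reliability logits here only make the data flow concrete.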
Problem

Research questions and friction points this paper is trying to address.

image stitching
semantic features
geometric features
feature fusion
multimodal representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

image stitching
semantic-geometric fusion
Neural Point Transformer
Adaptive Mixture of Experts
multimodal feature alignment