Implicit Neural Representation Facilitates Unified Universal Vision Encoding

📅 2026-01-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing image representation methods typically support either recognition or generation, but rarely both. This work proposes a hyper-network for implicit neural representations (INRs) that maps each image to compact model weights for fast, accurate reconstruction. Knowledge distillation is combined with pixel-level and perceptual losses to build a unified visual representation framework. The authors describe the model as the first of its kind to learn representations that serve both recognition and generation from a single shared embedding space; it competes with state-of-the-art results across diverse vision tasks while keeping a highly compressed embedding dimensionality.
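To make the combined objective in the summary concrete, the following is a minimal sketch of how a pixel-level reconstruction term and a distillation term might be added together. It is not taken from the paper or its repository: the function names, the cosine-distance form of the distillation term, and the weights `w_pixel` and `w_distill` are all assumptions for illustration. The perceptual term is left out because it would need a pretrained feature network.

```python
import numpy as np

def pixel_loss(recon, target):
    """Mean squared error between reconstructed and target pixels."""
    return float(np.mean((recon - target) ** 2))

def distill_loss(student_emb, teacher_emb, eps=1e-8):
    """Cosine distance (1 - cosine similarity), averaged over the batch.

    Assumption: the paper may use a different distillation objective.
    """
    s = student_emb / (np.linalg.norm(student_emb, axis=-1, keepdims=True) + eps)
    t = teacher_emb / (np.linalg.norm(teacher_emb, axis=-1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(s * t, axis=-1)))

def total_loss(recon, target, student_emb, teacher_emb,
               w_pixel=1.0, w_distill=1.0):
    """Weighted sum of the two terms; the weights are hypothetical."""
    return (w_pixel * pixel_loss(recon, target)
            + w_distill * distill_loss(student_emb, teacher_emb))
```

In this sketch, the reconstruction term pushes the embedding to retain enough detail to redraw the image, while the distillation term pulls it toward a teacher's recognition-oriented embedding.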

📝 Abstract

Models for image representation learning are typically designed for either recognition or generation. Various forms of contrastive learning help models learn to convert images to embeddings that are useful for classification, detection, and segmentation. On the other hand, models can be trained to reconstruct images with pixel-wise, perceptual, and adversarial losses in order to learn a latent space that is useful for image generation. We seek to unify these two directions with a first-of-its-kind model that learns representations which are simultaneously useful for recognition and generation. We train our model as a hyper-network for implicit neural representation, which learns to map images to model weights for fast, accurate reconstruction. We further integrate our INR hyper-network with knowledge distillation to improve its generalization and performance. Beyond the novel training design, the model also learns an unprecedented compressed embedding space with outstanding performance for various visual tasks. The complete model competes with state-of-the-art results for image representation learning, while also enabling generative capabilities with its high-quality tiny embeddings. The code is available at https://github.com/tiktok/huvr.
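The central idea in the abstract, a hyper-network that maps an image to the weights of an implicit neural representation, can be illustrated with a toy NumPy sketch. Everything here is an assumption for illustration rather than the authors' architecture: the linear encoder, the single linear hyper-network head, the two-layer INR with a sinusoidal activation, and the sizes `HIDDEN` and `EMBED_DIM` are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 16      # hypothetical INR hidden width
EMBED_DIM = 32   # hypothetical embedding size

# Parameter shapes of the INR: (x, y) -> hidden -> RGB
SHAPES = [(2, HIDDEN), (HIDDEN,), (HIDDEN, 3), (3,)]
N_PARAMS = sum(int(np.prod(s)) for s in SHAPES)

def encode(image, proj):
    """Toy encoder: flatten the image and project it to an embedding."""
    return image.reshape(-1) @ proj

def hypernet(embedding, head):
    """Map the embedding to a flat weight vector, then split it per layer."""
    flat = embedding @ head
    params, start = [], 0
    for shape in SHAPES:
        n = int(np.prod(shape))
        params.append(flat[start:start + n].reshape(shape))
        start += n
    return params

def inr(coords, params):
    """Evaluate the INR: pixel coordinates in, RGB values out."""
    w1, b1, w2, b2 = params
    hidden = np.sin(coords @ w1 + b1)  # sinusoidal activation, common in INRs
    return hidden @ w2 + b2

def pixel_grid(height, width):
    """Coordinates in [-1, 1] for every pixel, shape (height * width, 2)."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    return np.stack([xs, ys], axis=-1).reshape(-1, 2)

H = W = 8
image = rng.random((H, W, 3))
proj = rng.normal(scale=0.1, size=(H * W * 3, EMBED_DIM))
head = rng.normal(scale=0.1, size=(EMBED_DIM, N_PARAMS))

embedding = encode(image, proj)          # compact image representation
params = hypernet(embedding, head)       # predicted INR weights
recon = inr(pixel_grid(H, W), params).reshape(H, W, 3)
```

Because the INR is a function of continuous coordinates, the same predicted weights can be queried on any grid, for example `inr(pixel_grid(16, 16), params)`. In a trained system the encoder and head would be optimized so that `recon` matches `image`, and the embedding would also be reused for recognition tasks.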

Problem

Research questions and friction points this paper is trying to address.

image representation learning

recognition

generation

unified model

implicit neural representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit Neural Representation

Hyper-network

Unified Vision Encoding