New York Smells: A Large Multimodal Dataset for Olfaction

📅 2025-11-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Machine olfaction research is hindered by the scarcity of large-scale, multimodal olfactory training data collected in real-world settings. To address this, we introduce a large multimodal dataset of 7,000 odor–image pairs covering 3,500 distinct indoor and outdoor objects, with chemical sensing and visual capture synchronized under realistic conditions. Building on this dataset, we formulate three benchmark tasks: cross-modal smell-to-image retrieval, recognition of scenes, objects, and materials from smell alone, and fine-grained discrimination between grass species, thereby systematically advancing olfactory representation learning. We propose a deep neural network-based cross-modal joint modeling framework and show that visual information substantially enhances olfactory representations; the learned features consistently outperform conventional hand-crafted features across all tasks. This work establishes a scalable data foundation and a principled methodological paradigm for machine chemical perception.
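The summary describes the cross-modal joint modeling framework only at a high level. As a concrete illustration, a CLIP-style contrastive objective is one common way to align two modalities like this; the sketch below is an assumption-laden illustration, not the paper's actual method. The class name SmellImageAligner, the encoder shapes, and the sensor dimensionality are all hypothetical.

# Hypothetical sketch: CLIP-style contrastive alignment between a gas-sensor
# (smell) encoder and an image encoder. Names and dimensions are assumptions;
# the paper does not specify this exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmellImageAligner(nn.Module):
    def __init__(self, sensor_dim=64, img_feat_dim=512, embed_dim=128):
        super().__init__()
        # Smell branch: a small MLP over raw chemical-sensor readings.
        self.smell_encoder = nn.Sequential(
            nn.Linear(sensor_dim, 256), nn.ReLU(), nn.Linear(256, embed_dim))
        # Image branch: project features from any pretrained vision backbone.
        self.image_proj = nn.Linear(img_feat_dim, embed_dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.0))

    def forward(self, sensor_readings, image_features):
        s = F.normalize(self.smell_encoder(sensor_readings), dim=-1)
        v = F.normalize(self.image_proj(image_features), dim=-1)
        logits = self.logit_scale.exp() * s @ v.t()
        # Matched smell-image pairs lie on the diagonal of the logit matrix.
        targets = torch.arange(len(logits), device=logits.device)
        return (F.cross_entropy(logits, targets) +
                F.cross_entropy(logits.t(), targets)) / 2

Under this kind of objective, matched smell-image pairs act as positives and all other pairs in the batch as negatives, which is one standard way visual supervision can shape an olfactory embedding without olfactory labels.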

📝 Abstract
While olfaction is central to how animals perceive the world, this rich chemical sensory modality remains largely inaccessible to machines. One key bottleneck is the lack of diverse, multimodal olfactory training data collected in natural settings. We present New York Smells, a large dataset of paired image and olfactory signals captured "in the wild." Our dataset contains 7,000 smell-image pairs from 3,500 distinct objects across indoor and outdoor environments, with approximately 70× more objects than existing olfactory datasets. Our benchmark has three tasks: cross-modal smell-to-image retrieval, recognizing scenes, objects, and materials from smell alone, and fine-grained discrimination between grass species. Through experiments on our dataset, we find that visual data enables cross-modal olfactory representation learning, and that our learned olfactory representations outperform widely-used hand-crafted features.
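Given embeddings from a jointly trained model like the one sketched above, the benchmark's smell-to-image retrieval task reduces to nearest-neighbor search in the shared space. A minimal sketch, assuming L2-normalized embedding vectors; the function name retrieve_images is hypothetical:

# Hypothetical retrieval sketch: given a query smell embedding, rank a gallery
# of image embeddings by cosine similarity and return the top-k matches.
import numpy as np

def retrieve_images(smell_emb, image_embs, k=5):
    smell_emb = smell_emb / np.linalg.norm(smell_emb)
    image_embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = image_embs @ smell_emb          # cosine similarity per gallery image
    return np.argsort(-sims)[:k]           # indices of the k most similar images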
Problem

Research questions and friction points this paper is trying to address.

Lack of diverse multimodal olfactory training data from natural environments
Limited capability for machines to perceive and interpret olfactory information
Need for improved olfactory representation learning beyond hand-crafted features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large multimodal olfactory-image dataset collection
Cross-modal smell-to-image retrieval benchmark tasks
Learned olfactory representations outperform hand-crafted features
👥 Authors

Ege Ozguroglu
Columbia University

Junbang Liang
Columbia University

Ruoshi Liu
Research Scientist, Meta FAIR
Computer Vision, Robot Learning

Mia Chiquier
Columbia University

Michael DeTienne
Osmo Labs

Wesley Wei Qian
Osmo Labs

Alexandra Horowitz
Columbia University

Andrew Owens
Associate Professor, Cornell Tech
Computer Vision

Carl Vondrick
Associate Professor, Columbia University
Computer Vision, Machine Learning