New York Smells: A Large Multimodal Dataset for Olfaction

📅 2025-11-25

🤖 AI Summary
Machine olfaction research is held back by the scarcity of diverse, multimodal olfactory training data collected in natural settings. To address this, we introduce New York Smells, a large multimodal dataset of 7,000 smell-image pairs captured in the wild from 3,500 distinct objects across indoor and outdoor environments, approximately 70 times more objects than existing olfactory datasets. The accompanying benchmark defines three tasks: cross-modal smell-to-image retrieval; recognition of scenes, objects, and materials from smell alone; and fine-grained discrimination between grass species. Experiments on the dataset show that visual data enables cross-modal olfactory representation learning, and that the learned olfactory representations outperform widely used hand-crafted features.

📝 Abstract
While olfaction is central to how animals perceive the world, this rich chemical sensory modality remains largely inaccessible to machines. One key bottleneck is the lack of diverse, multimodal olfactory training data collected in natural settings. We present New York Smells, a large dataset of paired image and olfactory signals captured ``in the wild.'' Our dataset contains 7,000 smell-image pairs from 3,500 distinct objects across indoor and outdoor environments, with approximately 70$\times$ more objects than existing olfactory datasets. Our benchmark has three tasks: cross-modal smell-to-image retrieval, recognizing scenes, objects, and materials from smell alone, and fine-grained discrimination between grass species. Through experiments on our dataset, we find that visual data enables cross-modal olfactory representation learning, and that our learned olfactory representations outperform widely-used hand-crafted features.
Problem

Research questions and friction points this paper is trying to address.

Lack of diverse multimodal olfactory training data from natural environments
Limited capability for machines to perceive and interpret olfactory information
Need for improved olfactory representation learning beyond hand-crafted features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large multimodal olfactory-image dataset collection
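Three-task benchmark: cross-modal smell-to-image retrieval, smell-only recognition of scenes, objects, and materials, and fine-grained grass species discrimination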
Learned olfactory representations outperform hand-crafted features
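
To make the smell-to-image retrieval task more concrete, here is a minimal sketch of how such a benchmark is commonly scored. It is not the authors' evaluation code. It assumes that a smell encoder and an image encoder (neither shown here) have already produced one embedding per sample in a shared space, and that row i of each array belongs to the same pair. The function names `l2_normalize` and `recall_at_k`, the embedding size, and the random demo data are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def recall_at_k(smell_emb, image_emb, k=1):
    """Fraction of smell queries whose paired image ranks in the top k.

    Assumption: smell_emb[i] and image_emb[i] come from the same pair.
    """
    s = l2_normalize(np.asarray(smell_emb, dtype=float))
    v = l2_normalize(np.asarray(image_emb, dtype=float))
    sim = s @ v.T  # (num_smells, num_images) cosine similarities
    true_scores = np.diag(sim)[:, None]
    # Rank of the true image = number of images scoring strictly higher.
    ranks = (sim > true_scores).sum(axis=1)
    return float((ranks < k).mean())


if __name__ == "__main__":
    # Placeholder data only: random vectors standing in for real embeddings.
    rng = np.random.default_rng(0)
    smells = rng.normal(size=(100, 32))
    images = smells + 0.5 * rng.normal(size=(100, 32))  # noisy "paired" images
    for k in (1, 5, 10):
        print(f"recall@{k}: {recall_at_k(smells, images, k):.3f}")
```

Recall@k is only one plausible metric for this task. The source summary does not state which retrieval metric the paper reports.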