AROMMA: Unifying Olfactory Embeddings for Single Molecules and Mixtures

📅 2026-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Public olfactory datasets are small and split between single molecules and mixtures, which hinders learning generalizable odor representations. This work proposes AROMMA, a unified embedding space for both: a chemical foundation model encodes each molecule, and an attention-based aggregator composes two-molecule mixtures so that the result is permutation-invariant while still capturing asymmetric interactions between the components. To fill in missing mixture annotations, the approach aligns odor descriptor sets using knowledge distillation and class-aware pseudo-labeling. On both single-molecule and molecule-pair datasets, the method achieves state-of-the-art performance, with AUROC improvements of up to 19.1%, and generalizes robustly across the two domains.
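The combination of permutation invariance and asymmetric interactions can be illustrated with a minimal NumPy sketch. It is not the paper's architecture: it assumes a single self-attention head without positional encodings followed by mean pooling, and the names `aggregate_pair`, `w_q`, `w_k`, and `w_v` are hypothetical. Random vectors stand in for the foundation-model embeddings.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_pair(e_a, e_b, w_q, w_k, w_v):
    """Hypothetical aggregator: self-attention over two molecule
    embeddings, then mean pooling into one mixture embedding."""
    x = np.stack([e_a, e_b])                 # (2, d)
    q, k, v = x @ w_q, x @ w_k, x @ w_v      # separate query/key/value maps
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (2, 2)
    attn = softmax(scores, axis=-1)          # row i: how molecule i attends
    mixed = attn @ v                         # (2, d), order-equivariant
    return mixed.mean(axis=0), attn          # pooling removes the order

rng = np.random.default_rng(0)
d = 8
# Assumed random projections, scaled to keep attention scores moderate.
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
e_a, e_b = rng.normal(size=d), rng.normal(size=d)  # stand-in embeddings

emb_ab, attn_ab = aggregate_pair(e_a, e_b, w_q, w_k, w_v)
emb_ba, attn_ba = aggregate_pair(e_b, e_a, w_q, w_k, w_v)
```

Swapping the two inputs only reorders the rows before pooling, so `emb_ab` and `emb_ba` agree. Because the query and key projections differ, the weight with which molecule A attends to B (`attn_ab[0, 1]`) is generally not the weight with which B attends to A (`attn_ab[1, 0]`), which is one way an order-invariant model can still represent asymmetric influence.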

📝 Abstract
Public olfaction datasets are small and fragmented across single molecules and mixtures, limiting the learning of generalizable odor representations. Recent work either learns single-molecule embeddings or addresses mixtures via similarity or pairwise label prediction, leaving the representations separate and unaligned. In this work, we propose AROMMA, a framework that learns a unified embedding space for single molecules and two-molecule mixtures. Each molecule is encoded by a chemical foundation model, and mixtures are composed by an attention-based aggregator that ensures permutation invariance while modeling asymmetric molecular interactions. We further align odor descriptor sets using knowledge distillation and class-aware pseudo-labeling to enrich missing mixture annotations. AROMMA achieves state-of-the-art performance on both single-molecule and molecule-pair datasets, with up to 19.1% AUROC improvement, demonstrating robust generalization across the two domains.
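One way to picture class-aware pseudo-labeling is sketched below. The abstract does not specify the procedure, so this assumes the simplest reading: missing descriptor labels (marked `NaN`) are filled from a teacher model's predicted probabilities using a separate threshold per descriptor class. The function name, the thresholds, and the toy numbers are all hypothetical.

```python
import numpy as np

def class_aware_pseudo_labels(labels, teacher_probs, thresholds):
    """Fill missing (NaN) multi-label entries from teacher probabilities,
    using one threshold per descriptor class (assumed scheme)."""
    labels = np.asarray(labels, dtype=float)
    teacher_probs = np.asarray(teacher_probs, dtype=float)
    missing = np.isnan(labels)
    # thresholds has shape (num_classes,) and broadcasts across samples.
    pseudo = (teacher_probs >= thresholds).astype(float)
    # Observed labels are kept; only missing entries take pseudo-labels.
    filled = np.where(missing, pseudo, labels)
    return filled, missing

# Toy example: 2 mixtures, 3 odor descriptors, NaN = no annotation.
labels = np.array([[1.0, np.nan, 0.0],
                   [np.nan, np.nan, 1.0]])
teacher_probs = np.array([[0.9, 0.6, 0.2],
                          [0.4, 0.8, 0.7]])
thresholds = np.array([0.5, 0.7, 0.5])  # hypothetical per-class cutoffs

filled, missing = class_aware_pseudo_labels(labels, teacher_probs, thresholds)
print(filled)
```

In this toy case the second descriptor uses a stricter cutoff, so a teacher probability of 0.6 is not promoted to a positive label while 0.8 is. The returned `missing` mask lets a training loop weight pseudo-labeled entries differently from observed ones.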
Problem

Research questions and friction points this paper is trying to address.

olfactory embeddings
single molecules
mixtures
unified representation
odor perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

unified olfactory embedding
attention-based aggregator
chemical foundation model
knowledge distillation
asymmetric molecular interactions
Authors
Dayoung Kang
Department of Electrical Engineering and Computer Science, DGIST
JongWon Kim
Artificial Intelligence Major in Department of Interdisciplinary Studies, DGIST
Jiho Park
Postdoctoral Researcher, College of Business, Stony Brook University
Financial Mathematics
Keonseock Lee
Artificial Intelligence Major in Department of Interdisciplinary Studies, DGIST
Ji-Woong Choi
Professor, Electrical Engineering and Computer Science Department, DGIST
Communication Theory & Signal Processing, Vehicular Communications, Biomedical Commun., BMI/BCI
Jinhyun So
Assistant Professor at DGIST
Distributed AI, Federated Learning, Privacy-preserving Machine Learning, Information Theory