The Cell Ontology in the age of single-cell omics

📅 2025-06-10
🤖 AI Summary
The rapid growth of single-cell omics has made cell type annotation and cross-dataset integration harder, creating demand for a standardized, FAIR-compliant, species-agnostic ontology framework. The paper describes how the Cell Ontology (CL) is used across platforms and tools and details ongoing work to improve and extend it: (1) adding transcriptomically defined cell types and harmonising them with classical definitions; (2) integrating markers; and (3) using large language models (LLMs) to improve the content and efficiency of CL curation workflows. This work is carried out in close collaboration with major atlasing efforts, including the Human Cell Atlas and the BRAIN Initiative Cell Atlas Network.

📝 Abstract
Single-cell omics technologies have transformed our understanding of cellular diversity by enabling high-resolution profiling of individual cells. However, the unprecedented scale and heterogeneity of these datasets demand robust frameworks for data integration and annotation. The Cell Ontology (CL) has emerged as a pivotal resource for achieving FAIR (Findable, Accessible, Interoperable, and Reusable) data principles by providing standardized, species-agnostic terms for canonical cell types, and it forms a core component of a wide range of platforms and tools. In this paper, we describe the wide variety of uses of CL in these platforms and tools and detail ongoing work to improve and extend CL content, including the addition of transcriptomically defined types. This work is done in close collaboration with major atlasing efforts, such as the Human Cell Atlas and the BRAIN Initiative Cell Atlas Network, to support their needs. We cover the challenges and future plans for harmonising classical and transcriptomic cell type definitions, integrating markers, and using Large Language Models (LLMs) to improve the content and efficiency of CL workflows.
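To make the annotation and integration role of CL concrete, here is a minimal sketch of mapping dataset-specific labels to shared CL term IDs and rolling two annotations up to their closest shared ancestor. It is an illustration under stated assumptions, not the paper's tooling: the hierarchy is a hand-written, simplified excerpt with intermediate terms omitted, the free-text labels are hypothetical, and the term IDs should be checked against the current CL release before reuse.

```python
# Toy excerpt of the Cell Ontology: term ID -> (label, parent ID or None).
# Hand-written and simplified for illustration; intermediate terms are omitted.
TOY_CL = {
    "CL:0000000": ("cell", None),
    "CL:0000738": ("leukocyte", "CL:0000000"),
    "CL:0000542": ("lymphocyte", "CL:0000738"),
    "CL:0000084": ("T cell", "CL:0000542"),
    "CL:0000624": ("CD4-positive, alpha-beta T cell", "CL:0000084"),
    "CL:0000236": ("B cell", "CL:0000542"),
}

# Hypothetical free-text labels from two datasets, mapped by hand to CL IDs.
LABEL_TO_CL = {
    "CD4 T": "CL:0000624",
    "T-cells": "CL:0000084",
    "B": "CL:0000236",
    "B lymphocyte": "CL:0000236",
}


def ancestors(term_id):
    """Return the term followed by its ancestors, most specific first."""
    chain = []
    while term_id is not None:
        chain.append(term_id)
        term_id = TOY_CL[term_id][1]
    return chain


def common_ancestor(term_a, term_b):
    """Return the most specific term shared by both ancestor chains."""
    chain_b = set(ancestors(term_b))
    for term in ancestors(term_a):
        if term in chain_b:
            return term
    return None


def harmonize(labels):
    """Map free-text labels to CL IDs; unmapped labels become None."""
    return [LABEL_TO_CL.get(label) for label in labels]


if __name__ == "__main__":
    shared = common_ancestor(LABEL_TO_CL["CD4 T"], LABEL_TO_CL["B"])
    print(shared, TOY_CL[shared][0])
```

Once both datasets are expressed as CL IDs, labels at different granularities ("CD4 T" and "T-cells") become comparable through the hierarchy rather than through string matching.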
Problem

Research questions and friction points this paper is trying to address.

Integrating single-cell omics data with Cell Ontology standards
Harmonizing classical and transcriptomic cell type definitions
Enhancing CL workflows using markers and Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Standardized Cell Ontology for FAIR data principles
Integration of transcriptomically defined cell types
LLMs to improve the efficiency of Cell Ontology workflows
Authors
Shawn Zheng Kai Tan
Scientific Data Registration, Novo Nordisk A/S, Måløv, Denmark; European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Saffron Walden, CB10 1SD, UK
Aleix Puig-Barbe
European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Saffron Walden, CB10 1SD, UK
Damien Goutte-Gattat
Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
Caroline Eastwood
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Saffron Walden CB10 1RQ, UK
B. Aevermann
Chan Zuckerberg Initiative
Alida Avola
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Saffron Walden CB10 1RQ, UK
J. Balhoff
Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC, USA
Ismail Ugur Bayindir
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Saffron Walden CB10 1RQ, UK
Jasmine Belfiore
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Saffron Walden CB10 1RQ, UK
Anita R. Caron
European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Saffron Walden, CB10 1SD, UK
David S Fischer
Medical University of Vienna, Institute of Artificial Intelligence, Center for Medical Data Science, Vienna, Austria
Nancy George
Syngenta, Jealott's Hill, Warfield, Bracknell, UK
Benjamin M. Gyori
Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
Melissa A. Haendel
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
C. Hoyt
RWTH Aachen University, Institute of Inorganic Chemistry, Landoltweg 1a, 52074, Aachen, Germany
Hüseyin Kir
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Saffron Walden CB10 1RQ, UK
Tiago Lubiana
University of São Paulo, São Paulo, Brazil
N. Matentzoglu
Semanticly, Athens, Greece
James A. Overton
Knocean Inc., Toronto, Ontario, Canada
Beverly Peng
Department of Informatics, J. Craig Venter Institute, La Jolla, CA, USA
B. Peters
La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA 92037, United States
Ellen M. Quardokus
Indiana University
Multidisciplinary research, Cell division, Cell cycle development, Signal Transduction, Imaging
Patrick L Ray
Allen Institute for Brain Science, Seattle, WA, United States
Paola Roncaglia
European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Saffron Walden, CB10 1SD, UK
Andrea D Rivera
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Saffron Walden CB10 1RQ, UK
Ray Stefancsik
European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Saffron Walden, CB10 1SD, UK
Wei Kheng Teh
European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Saffron Walden, CB10 1SD, UK
Sabrina Toro
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
N. Vasilevsky
Critical Path Institute, Tucson, AZ, United States
Chuan Xu
Analog Devices
Electrical and Electronic Engineering
Yun Zhang
Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
Richard Scheuermann
Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
Christopher J Mungall
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA 94720, United States
A. Diehl
Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY 14203, USA
David Osumi-Sutherland
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Saffron Walden CB10 1RQ, UK