MouseGPT: A Large-scale Vision-Language Model for Mouse Behavior Analysis

📅 2025-03-13
🤖 AI Summary
Traditional mouse behavioral analysis relies on manual annotation and suffers from poor interpretability and limited generalizability, which hinders systematic characterization of the neurobehavioral spectrum. To address this, the authors introduce MouseGPT, which they describe as the first large-scale vision-language model (VLM) designed specifically for mouse behavior. The approach integrates self-supervised pose representation learning with open-vocabulary behavioral prompting, implemented within a custom multimodal architecture trained on over 42 million frames of behavioral video spanning diverse psychiatric conditions. The model enables fine-grained behavior recognition, unsupervised behavioral clustering, and discovery of novel behaviors without labor-intensive manual annotation, while supporting context-aware semantic interpretation of behavior. It is reported to outperform existing methods in behavioral classification accuracy, cross-experimental robustness, and descriptive richness. The work proposes a high-throughput, interpretable phenotyping paradigm for neuropsychiatric disorders.

📝 Abstract
Analyzing animal behavior is crucial in advancing neuroscience, yet quantifying and deciphering its intricate dynamics remains a significant challenge. Traditional machine vision approaches, despite their ability to detect spontaneous behaviors, fall short due to limited interpretability and reliance on manual labeling, which restricts the exploration of the full behavioral spectrum. Here, we introduce MouseGPT, a Vision-Language Model (VLM) that integrates visual cues with natural language to revolutionize mouse behavior analysis. Built upon our first-of-its-kind dataset (incorporating pose dynamics and open-vocabulary behavioral annotations across over 42 million frames of diverse psychiatric conditions), MouseGPT provides a novel, context-rich method for comprehensive behavior interpretation. Our holistic analysis framework enables detailed behavior profiling, clustering, and novel behavior discovery, offering deep insights without the need for labor-intensive manual annotation. Evaluations reveal that MouseGPT surpasses existing models in precision, adaptability, and descriptive richness, positioning it as a transformative tool for ethology and for unraveling complex behavioral dynamics in animal models.
Problem

Research questions and friction points this paper is trying to address.

Quantifying and deciphering intricate mouse behavior dynamics.
Overcoming limitations of traditional machine vision in behavior analysis.
Providing a context-rich method for comprehensive behavior interpretation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates visual cues with natural language.
Utilizes a large-scale dataset with diverse conditions.
Enables detailed behavior profiling without manual annotation.
Authors

Teng Xu
Graduate Student, ShanghaiTech University
Computer Vision, Computer Graphics

Taotao Zhou
School of Information Science and Technology, ShanghaiTech University; LumiAni Technology

Youjia Wang
School of Information Science and Technology, ShanghaiTech University; LumiAni Technology

Peng Yang
School of Life Science and Technology, ShanghaiTech University

Simin Tang
School of Life Science and Technology, ShanghaiTech University

Kuixiang Shao
School of Information Science and Technology, ShanghaiTech University

Zifeng Tang
School of Life Science and Technology, ShanghaiTech University

Yifei Liu
School of Information Science and Technology, ShanghaiTech University

Xinyuan Chen
School of Life Science and Technology, ShanghaiTech University

Hongshuang Wang
Laboratory of Chemical Biology, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences

Xiaohui Wang
School of Applied Chemistry and Engineering, University of Science and Technology of China

Huoqing Luo
School of Life Science and Technology, ShanghaiTech University

Jingya Wang
Assistant Professor, ShanghaiTech University
Computer Vision, Embodied AI, Human-Object Interaction

Ji Hu
School of Life Science and Technology, ShanghaiTech University

Jingyi Yu
Professor, ShanghaiTech University
Computer Vision, Computer Graphics