FoMo4Wheat: Toward reliable crop vision foundation models with globally curated data

📅 2025-09-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
General-purpose pre-trained vision models exhibit insufficient generalization for wheat crop monitoring, primarily due to the complex coupling of fine-grained, highly variable wheat canopy structures and dynamically shifting field conditions. Method: To address this, we introduce the first wheat-specific visual foundation model, trained via self-supervised learning on a Transformer architecture using the ImAg4Wheat dataset—comprising 2.5 million high-resolution wheat images collected across 30 global field sites. Contribution/Results: The model achieves significantly improved cross-site and cross-task transferability across ten diverse field-level visual tasks, spanning both canopy- and organ-level analysis. It consistently outperforms general-domain backbone models (e.g., ViT, ResNet) on all evaluated tasks, establishing a scalable, domain-adapted representation paradigm for crop perception. This work provides a foundational framework for developing specialized agricultural vision models with enhanced robustness and generalizability under real-world farming conditions.

📝 Abstract
Vision-driven field monitoring is central to digital agriculture, yet models built on general-domain pretrained backbones often fail to generalize across tasks, owing to the interaction of fine, variable canopy structures with fluctuating field conditions. We present FoMo4Wheat, one of the first crop-domain vision foundation models, pretrained with self-supervision on ImAg4Wheat, the largest and most diverse wheat image dataset to date (2.5 million high-resolution images collected over a decade at 30 global sites, spanning >2,000 genotypes and >500 environmental conditions). This wheat-specific pretraining yields representations that are robust for wheat and transferable to other crops and weeds. Across ten in-field vision tasks at canopy and organ levels, FoMo4Wheat models consistently outperform state-of-the-art models pretrained on general-domain datasets. These results demonstrate the value of crop-specific foundation models for reliable in-field perception and chart a path toward a universal crop foundation model with cross-species and cross-task capabilities. FoMo4Wheat models and the ImAg4Wheat dataset are publicly available online: https://github.com/PheniX-Lab/FoMo4Wheat and https://huggingface.co/PheniX-Lab/FoMo4Wheat. The demonstration website is: https://fomo4wheat.phenix-lab.com/.
Problem

Research questions and friction points this paper is trying to address.

General-domain vision models fail to generalize across agricultural tasks
Lack of robust crop-specific representations for variable field conditions
Need for reliable cross-task and cross-species crop vision models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised pretraining on large crop-specific dataset
Wheat-focused foundation model with cross-crop transferability
Outperforms general-domain models in agricultural vision tasks
👥 Authors
Bing Han
Engineering Research Center of Plant Phenotyping, Ministry of Education, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Jiangsu Collaborative Innovation Center for Modern Crop Production, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China
Chen Zhu
Engineering Research Center of Plant Phenotyping, Ministry of Education, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Jiangsu Collaborative Innovation Center for Modern Crop Production, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China
Dong Han
Beijing University of Posts and Telecommunications, Beijing, China
Rui Yu
Engineering Research Center of Plant Phenotyping, Ministry of Education, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Jiangsu Collaborative Innovation Center for Modern Crop Production, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China
Songliang Cao
National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China
Jianhui Wu
College of Agronomy, Northwest A&F University, Yangling, China
Scott Chapman
School of Agriculture and Food Sustainability, The University of Queensland, Brisbane, Australia
Zijian Wang
School of Electrical Engineering and Computer Science, The University of Queensland, Brisbane, Australia
Bangyou Zheng
Agriculture and Food, Commonwealth Scientific and Industrial Research Organization, Queensland Biosciences Precinct, St Lucia, Queensland, Australia
Wei Guo
Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
Marie Weiss
INRAE
Benoit de Solan
Arvalis, LPA CAPTE, Avignon, France
Andreas Hund
Group of Crop Science, ETH Zurich
Lukas Roth
ETH Zürich
Norbert Kirchgessner
Institute of Agricultural Sciences, ETH Zurich, Zurich, Switzerland
Andrea Visioni
International Center for Agricultural Research in the Dry Areas (ICARDA), Rabat, Morocco
Yufeng Ge
University of Nebraska-Lincoln
Wenjuan Li
State Key Laboratory of Efficient Utilization of Arable Land in China, the Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing, China
Alexis Comar
Hiphen SAS, 22b rue Charrue, Avignon, France
Dong Jiang
Engineering Research Center of Plant Phenotyping, Ministry of Education, State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Jiangsu Collaborative Innovation Center for Modern Crop Production, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China
Dejun Han
College of Agronomy, Northwest A&F University, Yangling, China
Fred Baret
INRA-EMMAH-CAPTE
Yanfeng Ding
Nankai University
Hao Lu
National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China
Shouyang Liu
Professor, Nanjing Agricultural University