🤖 AI Summary
Current multimodal AI research overemphasizes vision-language (V+L) modalities and often neglects deployment constraints until late stages, hindering real-world adoption. To address this, we propose a *deployment-centric multimodal AI* paradigm that treats deployability as a first-class design objective throughout the entire development lifecycle. The paradigm extends beyond V+L to non-standard domains including healthcare, engineering, and climate science, as well as broader socio-technical systems. Methodologically, we introduce a cross-modal, multi-level fusion architecture incorporating heterogeneous multimodal data modeling, deployment-aware neural network design, domain-specific constraint embedding, and an open collaborative framework. We validate our approach on three real-world use cases: pandemic response, autonomous vehicle design, and climate change adaptation. Results demonstrate substantial improvements in model deployability and societal utility, while revealing cross-disciplinary deployment bottlenecks. Our work provides a principled methodology for sustainable, application-oriented multimodal AI.
📝 Abstract
Multimodal artificial intelligence (AI) integrates diverse types of data via machine learning to improve understanding, prediction, and decision-making across disciplines such as healthcare, science, and engineering. However, most multimodal AI advances focus on models for vision and language data, while their deployability remains a key challenge. We advocate a deployment-centric workflow that incorporates deployment constraints early to reduce the likelihood of undeployable solutions, complementing data-centric and model-centric approaches. We also emphasise deeper integration across multiple levels of multimodality and multidisciplinary collaboration to significantly broaden the research scope beyond vision and language. To facilitate this approach, we identify common multimodal-AI-specific challenges shared across disciplines and examine three real-world use cases: pandemic response, self-driving car design, and climate change adaptation, drawing expertise from healthcare, social science, engineering, science, sustainability, and finance. By fostering multidisciplinary dialogue and open research practices, our community can accelerate deployment-centric development for broad societal impact.