MBE-ARI: A Multimodal Dataset Mapping Bi-directional Engagement in Animal-Robot Interaction

📅 2025-04-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Animal–robot interaction (ARI) faces two core challenges: the difficulty of parsing multimodal animal signals (e.g., pose, motion, vocalizations) and the lack of foundational resources enabling bidirectional interaction. To address both, this work introduces the first multimodal dataset designed specifically for bidirectional cow–robot interaction, comprising over 10 hours of synchronized multi-view RGB-D video with fine-grained temporal annotations of body pose and behavior. It also proposes the first modeling framework for quantifying mutual engagement in bidirectional animal–robot interaction. Finally, it presents a full-body 39-keypoint pose estimation network tailored to quadrupeds that integrates Transformer and graph convolutional layers and achieves 92.7% mAP, outperforming prior methods and setting a new state of the art on bovine pose benchmarks. Together, these contributions deliver a reproducible, closed-loop foundation for ARI research, spanning perception, reasoning, and decision-making.
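The summary describes a pose network combining Transformer layers with graph convolutions over the 39-keypoint skeleton, but the paper's exact architecture is not reproduced on this page. The following is a minimal sketch of one plausible way to wire such a head in PyTorch; the layer sizes, adjacency matrix, pooled-feature input, and all names below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: Transformer + GCN refinement head for 39 keypoints.
# Skeleton adjacency, dimensions, and the pooled backbone feature are assumptions.
import torch
import torch.nn as nn


class GraphConv(nn.Module):
    """Single graph-convolution layer: X' = relu(A_hat @ X @ W)."""

    def __init__(self, in_dim: int, out_dim: int, adj: torch.Tensor):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        a = adj + torch.eye(adj.size(0))          # add self-loops
        deg = a.sum(dim=1, keepdim=True)
        self.register_buffer("a_hat", a / deg)    # row-normalized adjacency

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, keypoints, features); a_hat broadcasts over the batch.
        return torch.relu(self.linear(self.a_hat @ x))


class TransformerGCNPoseHead(nn.Module):
    """Refines per-keypoint tokens with self-attention, then message passing
    along the skeleton graph, and regresses (x, y) for each of 39 keypoints."""

    def __init__(self, num_kpts: int = 39, dim: int = 128,
                 adj: torch.Tensor = None):
        super().__init__()
        if adj is None:
            # Placeholder fully connected skeleton; a real model would use
            # the quadruped's anatomical adjacency.
            adj = torch.ones(num_kpts, num_kpts)
        self.kpt_queries = nn.Parameter(torch.randn(num_kpts, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.gcn = GraphConv(dim, dim, adj)
        self.coord = nn.Linear(dim, 2)             # (x, y) per keypoint

    def forward(self, img_feat: torch.Tensor) -> torch.Tensor:
        # img_feat: (batch, dim) pooled backbone feature, assumed for brevity.
        b = img_feat.size(0)
        tokens = self.kpt_queries.unsqueeze(0).expand(b, -1, -1)
        tokens = tokens + img_feat.unsqueeze(1)
        tokens = self.encoder(tokens)   # attention across all keypoints
        tokens = self.gcn(tokens)       # refinement along skeleton edges
        return self.coord(tokens)       # (batch, 39, 2)


if __name__ == "__main__":
    head = TransformerGCNPoseHead()
    print(head(torch.randn(2, 128)).shape)  # torch.Size([2, 39, 2])
```

The intuition behind this pairing: self-attention lets every keypoint token condition on every other, while the graph convolution biases refinement toward anatomically adjacent joints, which is useful when limbs are occluded.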

📝 Abstract
Animal-robot interaction (ARI) remains an unexplored challenge in robotics, as robots struggle to interpret the complex, multimodal communication cues of animals, such as body language, movement, and vocalizations. Unlike human-robot interaction, which benefits from established datasets and frameworks, animal-robot interaction lacks the foundational resources needed to facilitate meaningful bidirectional communication. To bridge this gap, we present the MBE-ARI (Multimodal Bidirectional Engagement in Animal-Robot Interaction), a novel multimodal dataset that captures detailed interactions between a legged robot and cows. The dataset includes synchronized RGB-D streams from multiple viewpoints, annotated with body pose and activity labels across interaction phases, offering an unprecedented level of detail for ARI research. Additionally, we introduce a full-body pose estimation model tailored for quadruped animals, capable of tracking 39 keypoints with a mean average precision (mAP) of 92.7%, outperforming existing benchmarks in animal pose estimation. The MBE-ARI dataset and our pose estimation framework lay a robust foundation for advancing research in animal-robot interaction, providing essential tools for developing perception, reasoning, and interaction frameworks needed for effective collaboration between robots and animals. The dataset and resources are publicly available at https://github.com/RISELabPurdue/MBE-ARI/, inviting further exploration and development in this critical area.
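The 92.7% mAP reported above is a keypoint-detection metric, conventionally computed COCO-style by thresholding Object Keypoint Similarity (OKS) at several levels and averaging precision. As a reference for how such a number is typically produced, here is a hedged sketch; the per-keypoint sigmas and the simplified one-prediction-per-animal matching are assumptions, and the paper's exact evaluation protocol may differ.

```python
# Sketch of OKS-based keypoint mAP (COCO-style). Sigmas are placeholders;
# COCO defines them per keypoint type, and this paper's protocol may differ.
import numpy as np

NUM_KPTS = 39
SIGMAS = np.full(NUM_KPTS, 0.05)  # assumed per-keypoint falloff constants


def oks(pred, gt, area, vis):
    """Object Keypoint Similarity.

    pred, gt: (39, 2) arrays of (x, y); area: object area in px^2;
    vis: (39,) boolean mask of labeled keypoints.
    """
    d2 = np.sum((pred - gt) ** 2, axis=1)             # squared pixel distances
    e = d2 / (2.0 * area * (2 * SIGMAS) ** 2 + 1e-9)  # normalized error
    return float(np.mean(np.exp(-e[vis]))) if vis.any() else 0.0


def keypoint_map(preds, gts, areas, vises,
                 thresholds=np.arange(0.5, 1.0, 0.05)):
    """Average precision over OKS thresholds 0.50:0.05:0.95, assuming one
    prediction per ground truth (no score ranking, for simplicity)."""
    scores = np.array([oks(p, g, a, v)
                       for p, g, a, v in zip(preds, gts, areas, vises)])
    return float(np.mean([(scores >= t).mean() for t in thresholds]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.uniform(0, 200, (NUM_KPTS, 2))
    pred = gt + rng.normal(0, 2.0, gt.shape)   # small localization noise
    vis = np.ones(NUM_KPTS, dtype=bool)
    print(keypoint_map([pred], [gt], [200.0 * 200.0], [vis]))
```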
Problem

Research questions and friction points this paper is trying to address.

Robots struggle to interpret complex animal communication cues
Lack of foundational datasets for animal-robot interaction research
Need for tools to enable bidirectional animal-robot collaboration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal dataset for animal-robot interaction
Full-body pose estimation for quadruped animals
Synchronized RGB-D streams with annotations
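To make the last point concrete, below is a minimal sketch of loading one synchronized RGB-D frame with its 39-keypoint annotation. The directory layout, file names, and annotation schema are assumptions for illustration only; the actual format is documented in the MBE-ARI repository linked above.

```python
# Hypothetical loader for one frame of one camera view; paths and the
# (39, 3) keypoint schema are assumed, not the dataset's actual layout.
from pathlib import Path

import cv2  # opencv-python
import numpy as np


def load_sample(root: str, view: str, frame_id: int):
    """Return (rgb, depth, keypoints) for a single synchronized frame."""
    base = Path(root) / view
    rgb = cv2.imread(str(base / "rgb" / f"{frame_id:06d}.png"))  # H x W x 3, BGR
    depth = cv2.imread(str(base / "depth" / f"{frame_id:06d}.png"),
                       cv2.IMREAD_UNCHANGED)                     # H x W, e.g. uint16 mm
    # Assumed annotation: 39 keypoints as (x, y, visibility) per frame.
    kpts = np.load(base / "annotations" / f"{frame_id:06d}.npy")  # (39, 3)
    return rgb, depth, kpts


if __name__ == "__main__":
    rgb, depth, kpts = load_sample("MBE-ARI/session_01", "view_0", 0)
    print(rgb.shape, depth.shape, kpts.shape)
```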
Authors

Ian Noronha
Department of Agricultural and Biological Engineering, Purdue University, 401 Grant Street, West Lafayette, Indiana, USA
A. Jawaji
School of Mechanical Engineering, Purdue University, 585 Purdue Mall, West Lafayette, Indiana, USA
Juan Camilo Soto
Department of Agricultural and Biological Engineering, Purdue University, 401 Grant Street, West Lafayette, Indiana, USA
Jiajun An
Department of Agricultural and Biological Engineering, Purdue University, 401 Grant Street, West Lafayette, Indiana, USA
Yan Gu
School of Mechanical Engineering, Purdue University, 585 Purdue Mall, West Lafayette, Indiana, USA
Upinder Kaur
Assistant Professor, Purdue University
Robotics, Cyber Physical Systems, Multi-Modal Perception, Artificial Intelligence