MBE-ARI: A Multimodal Dataset Mapping Bi-directional Engagement in Animal-Robot Interaction

📅 2025-04-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Animal–robot interaction (ARI) faces two core challenges: the difficulty of parsing multimodal animal signals (e.g., pose, motion, vocalizations) and the lack of foundational resources enabling bidirectional interaction. To address both, this work introduces the first multimodal dataset designed specifically for bidirectional cow–robot interaction, comprising over 10 hours of synchronized multi-view RGB-D video with fine-grained temporal annotations of body pose and behavior. It also proposes the first modeling framework for quantifying mutual engagement in bidirectional animal–robot interaction. Finally, it presents a full-body 39-keypoint pose estimation network tailored to quadrupeds that integrates Transformer and graph convolutional layers and achieves 92.7% mAP, outperforming prior methods and setting a new state of the art on bovine pose benchmarks. Together, these contributions deliver a reproducible, closed-loop foundation for ARI research, spanning perception, reasoning, and decision-making.
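The summary describes a pose network combining Transformer layers with graph convolutions over the 39-keypoint skeleton, but the paper's exact architecture is not reproduced on this page. The following is a minimal sketch of one plausible way to wire such a head in PyTorch; the layer sizes, adjacency matrix, pooled-feature input, and all names below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: Transformer + GCN refinement head for 39 keypoints.
# Skeleton adjacency, dimensions, and the pooled backbone feature are assumptions.
import torch
import torch.nn as nn


class GraphConv(nn.Module):
    """Single graph-convolution layer: X' = relu(A_hat @ X @ W)."""

    def __init__(self, in_dim: int, out_dim: int, adj: torch.Tensor):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        a = adj + torch.eye(adj.size(0))          # add self-loops
        deg = a.sum(dim=1, keepdim=True)
        self.register_buffer("a_hat", a / deg)    # row-normalized adjacency

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, keypoints, features); a_hat broadcasts over the batch.
        return torch.relu(self.linear(self.a_hat @ x))


class TransformerGCNPoseHead(nn.Module):
    """Refines per-keypoint tokens with self-attention, then message passing
    along the skeleton graph, and regresses (x, y) for each of 39 keypoints."""

    def __init__(self, num_kpts: int = 39, dim: int = 128,
                 adj: torch.Tensor = None):
        super().__init__()
        if adj is None:
            # Placeholder fully connected skeleton; a real model would use
            # the quadruped's anatomical adjacency.
            adj = torch.ones(num_kpts, num_kpts)
        self.kpt_queries = nn.Parameter(torch.randn(num_kpts, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.gcn = GraphConv(dim, dim, adj)
        self.coord = nn.Linear(dim, 2)             # (x, y) per keypoint

    def forward(self, img_feat: torch.Tensor) -> torch.Tensor:
        # img_feat: (batch, dim) pooled backbone feature, assumed for brevity.
        b = img_feat.size(0)
        tokens = self.kpt_queries.unsqueeze(0).expand(b, -1, -1)
        tokens = tokens + img_feat.unsqueeze(1)
        tokens = self.encoder(tokens)   # attention across all keypoints
        tokens = self.gcn(tokens)       # refinement along skeleton edges
        return self.coord(tokens)       # (batch, 39, 2)


if __name__ == "__main__":
    head = TransformerGCNPoseHead()
    print(head(torch.randn(2, 128)).shape)  # torch.Size([2, 39, 2])
```

The intuition behind this pairing: self-attention lets every keypoint token condition on every other, while the graph convolution biases refinement toward anatomically adjacent joints, which is useful when limbs are occluded.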

📝 Abstract
Animal-robot interaction (ARI) remains an unexplored challenge in robotics, as robots struggle to interpret the complex, multimodal communication cues of animals, such as body language, movement, and vocalizations. Unlike human-robot interaction, which benefits from established datasets and frameworks, animal-robot interaction lacks the foundational resources needed to facilitate meaningful bidirectional communication. To bridge this gap, we present the MBE-ARI (Multimodal Bidirectional Engagement in Animal-Robot Interaction), a novel multimodal dataset that captures detailed interactions between a legged robot and cows. The dataset includes synchronized RGB-D streams from multiple viewpoints, annotated with body pose and activity labels across interaction phases, offering an unprecedented level of detail for ARI research. Additionally, we introduce a full-body pose estimation model tailored for quadruped animals, capable of tracking 39 keypoints with a mean average precision (mAP) of 92.7%, outperforming existing benchmarks in animal pose estimation. The MBE-ARI dataset and our pose estimation framework lay a robust foundation for advancing research in animal-robot interaction, providing essential tools for developing perception, reasoning, and interaction frameworks needed for effective collaboration between robots and animals. The dataset and resources are publicly available at https://github.com/RISELabPurdue/MBE-ARI/, inviting further exploration and development in this critical area.
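The 92.7% mAP reported above is a keypoint-detection metric, conventionally computed COCO-style by thresholding Object Keypoint Similarity (OKS) at several levels and averaging precision. As a reference for how such a number is typically produced, here is a hedged sketch; the per-keypoint sigmas and the simplified one-prediction-per-animal matching are assumptions, and the paper's exact evaluation protocol may differ.

```python
# Sketch of OKS-based keypoint mAP (COCO-style). Sigmas are placeholders;
# COCO defines them per keypoint type, and this paper's protocol may differ.
import numpy as np

NUM_KPTS = 39
SIGMAS = np.full(NUM_KPTS, 0.05)  # assumed per-keypoint falloff constants


def oks(pred, gt, area, vis):
    """Object Keypoint Similarity.

    pred, gt: (39, 2) arrays of (x, y); area: object area in px^2;
    vis: (39,) boolean mask of labeled keypoints.
    """
    d2 = np.sum((pred - gt) ** 2, axis=1)             # squared pixel distances
    e = d2 / (2.0 * area * (2 * SIGMAS) ** 2 + 1e-9)  # normalized error
    return float(np.mean(np.exp(-e[vis]))) if vis.any() else 0.0


def keypoint_map(preds, gts, areas, vises,
                 thresholds=np.arange(0.5, 1.0, 0.05)):
    """Average precision over OKS thresholds 0.50:0.05:0.95, assuming one
    prediction per ground truth (no score ranking, for simplicity)."""
    scores = np.array([oks(p, g, a, v)
                       for p, g, a, v in zip(preds, gts, areas, vises)])
    return float(np.mean([(scores >= t).mean() for t in thresholds]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.uniform(0, 200, (NUM_KPTS, 2))
    pred = gt + rng.normal(0, 2.0, gt.shape)   # small localization noise
    vis = np.ones(NUM_KPTS, dtype=bool)
    print(keypoint_map([pred], [gt], [200.0 * 200.0], [vis]))
```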
Problem

Research questions and friction points this paper is trying to address.

Robots struggle to interpret complex animal communication cues
Lack of foundational datasets for animal-robot interaction research
Need for tools to enable bidirectional animal-robot collaboration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal dataset for animal-robot interaction
Full-body pose estimation for quadruped animals
Synchronized RGB-D streams with annotations
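To make the last point concrete, below is a minimal sketch of loading one synchronized RGB-D frame with its 39-keypoint annotation. The directory layout, file names, and annotation schema are assumptions for illustration only; the actual format is documented in the MBE-ARI repository linked above.

```python
# Hypothetical loader for one frame of one camera view; paths and the
# (39, 3) keypoint schema are assumed, not the dataset's actual layout.
from pathlib import Path

import cv2  # opencv-python
import numpy as np


def load_sample(root: str, view: str, frame_id: int):
    """Return (rgb, depth, keypoints) for a single synchronized frame."""
    base = Path(root) / view
    rgb = cv2.imread(str(base / "rgb" / f"{frame_id:06d}.png"))  # H x W x 3, BGR
    depth = cv2.imread(str(base / "depth" / f"{frame_id:06d}.png"),
                       cv2.IMREAD_UNCHANGED)                     # H x W, e.g. uint16 mm
    # Assumed annotation: 39 keypoints as (x, y, visibility) per frame.
    kpts = np.load(base / "annotations" / f"{frame_id:06d}.npy")  # (39, 3)
    return rgb, depth, kpts


if __name__ == "__main__":
    rgb, depth, kpts = load_sample("MBE-ARI/session_01", "view_0", 0)
    print(rgb.shape, depth.shape, kpts.shape)
```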
Authors

Ian Noronha
Department of Agricultural and Biological Engineering, Purdue University, 401 Grant Street, West Lafayette, Indiana, USA
A. Jawaji
School of Mechanical Engineering, Purdue University, 585 Purdue Mall, West Lafayette, Indiana, USA
Juan Camilo Soto
Department of Agricultural and Biological Engineering, Purdue University, 401 Grant Street, West Lafayette, Indiana, USA
Jiajun An
Department of Agricultural and Biological Engineering, Purdue University, 401 Grant Street, West Lafayette, Indiana, USA
Yan Gu
School of Mechanical Engineering, Purdue University, 585 Purdue Mall, West Lafayette, Indiana, USA
Upinder Kaur
Assistant Professor, Purdue University
Robotics, Cyber Physical Systems, Multi-Modal Perception, Artificial Intelligence