ARIC: An Activity Recognition Dataset in Classroom Surveillance Images

Date: 2024-10-16
Venue: arXiv.org
Citations: 0
Influential: 0
AI Summary
Existing classroom activity recognition research mainly targets manually captured videos and a small set of activity types, and pays little attention to surveillance images from real classrooms, where class imbalance and high similarity between activities make recognition difficult. To address this gap, the authors construct ARIC (Activity Recognition In Classroom), a multimodal dataset for activity recognition in classroom surveillance images. ARIC covers 32 activity categories, three modalities, multiple camera perspectives, and real-world classroom scenarios. Beyond general activity recognition, it provides settings for continual learning and few-shot continual learning. Preliminary data is publicly available for download, and the authors intend the dataset to support future research on open teaching scenarios.

๐Ÿ“ Abstract
The application of activity recognition in the "AI + Education" field is gaining increasing attention. However, current work mainly focuses on recognizing activities in manually captured videos and covers a limited number of activity types, with little attention given to recognizing activities in surveillance images from real classrooms. Activity recognition in classroom surveillance images faces multiple challenges, such as class imbalance and high activity similarity. To address this gap, we constructed a novel multimodal dataset for activity recognition in classroom surveillance images, called ARIC (Activity Recognition In Classroom). The ARIC dataset offers multiple perspectives, 32 activity categories, three modalities, and real-world classroom scenarios. In addition to general activity recognition tasks, we also provide settings for continual learning and few-shot continual learning. We hope that the ARIC dataset can facilitate future analysis and research on open teaching scenarios. You can download preliminary data from https://ivipclab.github.io/publication_ARIC/ARIC.
Problem

Research questions and friction points this paper is trying to address.

Recognize activities in classroom surveillance images
Address class imbalance and high activity similarity
Provide dataset for continual and few-shot learning
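To make the class imbalance challenge concrete, the sketch below shows one common mitigation: weighting each class inversely to its frequency so that rare activities contribute more to a training loss. This is a generic heuristic, not a method described in the ARIC paper. The activity names and label counts are made-up illustration values, not ARIC statistics, and `inverse_frequency_weights` is a hypothetical helper name.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    The scaling n_samples / (n_classes * count) keeps the average
    weight over all samples equal to 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, made-up label distribution (NOT taken from ARIC).
labels = ["listening"] * 8 + ["writing"] * 2
weights = inverse_frequency_weights(labels)

for cls, weight in sorted(weights.items()):
    print(cls, weight)
```

With these illustrative counts, the rarer class receives the larger weight, which is the intended effect when a few activities dominate the data.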
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal dataset for classroom activity recognition
Includes 32 activity categories and real-world scenarios
Supports continual and few-shot continual learning
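As a rough illustration of what a continual learning setting over 32 activity categories can look like, the sketch below partitions class ids into a base task followed by smaller incremental tasks. The split sizes (16 base classes, then 4 new classes per step), the random ordering, and the function name `make_class_incremental_tasks` are assumptions chosen for illustration; the listing above does not specify ARIC's actual protocol.

```python
import random


def make_class_incremental_tasks(num_classes, base_classes, classes_per_step, seed=0):
    """Split class ids into a base task plus equally sized incremental tasks."""
    if not 0 < base_classes <= num_classes:
        raise ValueError("base_classes must be between 1 and num_classes")
    if (num_classes - base_classes) % classes_per_step != 0:
        raise ValueError("remaining classes must divide evenly into steps")

    # Shuffle with a fixed seed so the class order is reproducible.
    class_ids = list(range(num_classes))
    random.Random(seed).shuffle(class_ids)

    # The first task holds the base classes; each later task adds new classes.
    tasks = [class_ids[:base_classes]]
    for start in range(base_classes, num_classes, classes_per_step):
        tasks.append(class_ids[start:start + classes_per_step])
    return tasks


# Assumed split: 16 base classes, then 4 new classes per step (hypothetical).
tasks = make_class_incremental_tasks(num_classes=32, base_classes=16, classes_per_step=4)

for index, task in enumerate(tasks):
    print(f"task {index}: {len(task)} classes")
```

A model trained under such a schedule sees each task in sequence and is evaluated on all classes seen so far. A few-shot variant would additionally limit the number of training samples per class in the incremental tasks.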
Authors
Linfeng Xu
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Fanman Meng
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Qingbo Wu
University of Electronic Science and Technology of China
Video coding, image and video quality assessment
Lili Pan
Associate Professor, University of Electronic Science and Technology of China
Computer vision, machine learning
Heqian Qiu
University of Electronic Science and Technology of China (UESTC)
Object detection, multimodal
Lanxiao Wang
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Kailong Chen
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Kanglei Geng
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Yilei Qian
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Haojie Wang
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Shuchang Zhou
Megvii Inc.
Artificial intelligence
Shimou Ling
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Zejia Liu
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Nanlin Chen
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Yingjie Xu
Hong Kong University of Science and Technology (Guangzhou)
Computer vision
Shaoxu Cheng
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Bowen Tan
Carnegie Mellon University
Ziyong Xu
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
Hongliang Li
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China