A Challenging Benchmark of Anime Style Recognition

📅 2022-04-29
🏛️ 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
📈 Citations: 5
Influential: 0
🤖 AI Summary
Anime style recognition (ASR) aims to determine whether character images originate from the same anime work. The task has long been underexplored because of its large semantic gap: characters within one work can look very different, while pose, illumination, and color vary widely. To address this, the authors introduce LSASRD, a large-scale ASR benchmark of 20,937 images from 190 anime works, together with a cross-role evaluation protocol in which query and gallery images must depict different roles, so that a model must learn abstract painting style rather than character identity. Two state-of-the-art person re-identification methods, AGW and TransReID, serve as baselines. Even the stronger model (TransReID) reaches only 42.24% mAP, confirming the task's difficulty. The dataset, code, and baselines are publicly released to provide a reproducible foundation for future ASR research.
📝 Abstract
Given two images of different anime roles, anime style recognition (ASR) aims to learn abstract painting style to determine whether the two images are from the same work, which is an interesting but challenging problem. Unlike biometric recognition, such as face recognition, iris recognition, and person re-identification, ASR suffers from a much larger semantic gap but receives less attention. In this paper, we propose a challenging ASR benchmark. Firstly, we collect a large-scale ASR dataset (LSASRD), which contains 20,937 images of 190 anime works, where each work has at least ten different roles. In addition to its large scale, LSASRD contains a list of challenging factors, such as complex illumination, various poses, theatrical colors, and exaggerated compositions. Secondly, we design a cross-role protocol to evaluate ASR performance, in which query and gallery images must come from different roles, to validate that an ASR model learns abstract painting style rather than discriminative features of roles. Finally, we apply two powerful person re-identification methods, namely AGW and TransReID, to construct baseline performance on LSASRD. Surprisingly, the recent transformer model (i.e., TransReID) acquires only a 42.24% mAP on LSASRD. Therefore, we believe that the ASR task, with its huge semantic gap, deserves deep and long-term research. We will release our dataset and code at https://github.com/nkjcqvcpi/ASR.
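The cross-role protocol described in the abstract can be viewed as a retrieval evaluation: a gallery image counts as a correct match for a query only if it comes from the same work while depicting a different role, so a model cannot score well by merely re-identifying characters. Below is a minimal NumPy sketch of how mAP could be computed under such a rule; the function names, the `(work_id, role_id)` metadata layout, and the cosine-similarity ranking are illustrative assumptions, not taken from the paper's released code.

```python
import numpy as np

def average_precision(matches):
    """AP for one query, given a rank-ordered boolean match list."""
    matches = np.asarray(matches, dtype=bool)
    if not matches.any():
        return 0.0
    cum_hits = np.cumsum(matches)               # hits seen up to each rank
    ranks = np.arange(1, len(matches) + 1)
    precisions = cum_hits / ranks               # precision at each rank
    return float(precisions[matches].mean())    # average over match positions

def cross_role_map(query_feats, query_meta, gallery_feats, gallery_meta):
    """mAP under a cross-role rule.

    query_meta / gallery_meta: lists of (work_id, role_id) tuples.
    Features are assumed L2-normalized, so dot product = cosine similarity.
    """
    aps = []
    for qf, (q_work, q_role) in zip(query_feats, query_meta):
        # Cross-role rule: discard gallery images showing the same role.
        keep = [i for i, (_, g_role) in enumerate(gallery_meta)
                if g_role != q_role]
        sims = gallery_feats[keep] @ qf          # similarity to each candidate
        order = np.argsort(-sims)                # best match first
        # A match = same work (style), necessarily a different role.
        matches = [gallery_meta[keep[i]][0] == q_work for i in order]
        aps.append(average_precision(matches))
    return float(np.mean(aps))
```

With this formulation, a model that only encodes character identity gains nothing, since same-role gallery images are excluded before ranking; only work-level (style) similarity is rewarded.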
Problem

Research questions and friction points this paper is trying to address.

Recognizing abstract anime painting styles across different roles
Addressing large semantic gap in anime style recognition
Evaluating cross-role anime style recognition with challenging factors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale anime dataset with challenging factors
Cross-role protocol to validate style learning
Transformer model baseline performance evaluation
🔎 Similar Papers
No similar papers found.
Haotang Li
PhD Student, Electrical and Computer Engineering, University of Arizona
Computer Vision · Pattern Recognition
S. Guo
College of Engineering, Huaqiao University
Kailin Lyu
College of Engineering, Huaqiao University
Xiao Yang
College of Engineering, Huaqiao University
Tianchen Chen
College of Engineering, Huaqiao University
Jianqing Zhu
College of Engineering, Huaqiao University
Huanqiang Zeng
Huaqiao University, China
Image Processing · Video Coding · Computer Vision