An Adaptive Data-Resilient Multi-Modal Framework for Hierarchical Multi-Label Book Genre Identification

📅 2025-05-05
🤖 AI Summary
This paper addresses robust multi-label genre classification for books under incomplete or noisy multimodal inputs—namely, book covers, summaries, metadata, and full-text content. We propose a dynamic modality-adaptive selection mechanism coupled with a hierarchical multi-label joint modeling framework, integrating ViT, BERT, and MLP encoders. Our approach incorporates gated modality weighting and a hierarchical label decoder, along with a missingness-robust training strategy. Key contributions include: (1) the first publicly available hierarchical book genre annotation dataset; (2) stable inference under extreme missingness scenarios—including zero-text and zero-image inputs; and (3) state-of-the-art performance, achieving an average F1-score improvement of 12.6% under multimodal missingness and outperforming prior work by 9.3% in hierarchical accuracy.
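
The summary mentions a missingness-robust training strategy without describing it. One common way to obtain that kind of robustness is modality dropout: modalities are randomly hidden during training so the model learns to predict from whatever remains. The sketch below is only an illustration of that general idea, not the authors' procedure; the modality names, the `drop_modalities` helper, and the `p_drop` parameter are all assumptions.

```python
import random

# Hypothetical modality names; the source lists covers, summaries, and metadata.
MODALITIES = ("cover", "blurb", "metadata")


def drop_modalities(sample, p_drop=0.3, rng=random):
    """Randomly hide modalities in one training sample.

    A hidden or absent modality is represented by None. If the sample has at
    least one modality, at least one is always kept.
    """
    present = [m for m in MODALITIES if sample.get(m) is not None]
    kept = [m for m in present if rng.random() >= p_drop]
    if not kept and present:
        # Everything was dropped by chance: restore one modality at random.
        kept = [rng.choice(present)]
    return {m: (sample.get(m) if m in kept else None) for m in MODALITIES}


# Toy feature vectors stand in for real encoder outputs (assumed values).
book = {"cover": [0.2, 0.7], "blurb": [0.1, 0.4], "metadata": [1.0, 0.0]}
augmented = drop_modalities(book, p_drop=0.5, rng=random.Random(0))
```

Applying such a function to each training sample exposes the model to the same incomplete inputs it may face at inference time, including samples with only one modality left.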

📝 Abstract
Identifying the finer details of a book's genres enhances user experience by enabling efficient book discovery and personalized recommendations, ultimately improving reader engagement and satisfaction. It also provides valuable insights into market trends and consumer preferences, allowing publishers and marketers to make data-driven decisions regarding book production and marketing strategies. While traditional book genre classification methods primarily rely on review data or textual analysis, incorporating additional modalities, such as book covers, blurbs, and metadata, can offer richer context and improve prediction accuracy. However, the presence of incomplete or noisy information across these modalities presents a significant challenge. This paper introduces IMAGINE (Intelligent Multi-modal Adaptive Genre Identification NEtwork), a framework designed to address these complexities. IMAGINE extracts robust feature representations from multiple modalities and dynamically selects the most informative sources based on data availability. It employs a hierarchical classification strategy to capture genre relationships and remains adaptable to varying input conditions. Additionally, we curate a hierarchical genre classification dataset that structures genres into a well-defined taxonomy, accommodating the diverse nature of literary works. IMAGINE integrates information from multiple sources and assigns multiple genre labels to each book, ensuring a more comprehensive classification. A key feature of our framework is its resilience to incomplete data, enabling accurate predictions even when certain modalities, such as text, images, or metadata, are missing or incomplete. Experimental results show that IMAGINE outperforms existing baselines in genre classification accuracy, particularly in scenarios with insufficient modality-specific data.
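
The abstract says IMAGINE selects the most informative sources based on data availability, but it does not give the mechanism. A minimal sketch of one way to do this is a softmax gate restricted to the modalities that are present, so a missing modality gets no weight regardless of its score. The `fuse` function, the modality names, and the gate scores below are hypothetical; in a trained model the scores would come from a learned gating network rather than being hand-set.

```python
import math


def fuse(embeddings, gate_scores):
    """Softmax-weighted average over the modalities that are present.

    embeddings:  dict name -> list[float], or None for a missing modality
    gate_scores: dict name -> float (assumed output of a learned gate)
    Returns the fused vector and the per-modality weights that were used.
    """
    present = [m for m, e in embeddings.items() if e is not None]
    if not present:
        raise ValueError("at least one modality must be available")
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(gate_scores[m] for m in present)
    exp = {m: math.exp(gate_scores[m] - peak) for m in present}
    total = sum(exp.values())
    weights = {m: exp[m] / total for m in present}
    dim = len(embeddings[present[0]])
    fused = [sum(weights[m] * embeddings[m][i] for m in present) for i in range(dim)]
    return fused, weights


# The cover is missing, so its high gate score is ignored entirely.
embeddings = {"cover": None, "blurb": [1.0, 0.0], "metadata": [0.0, 1.0]}
scores = {"cover": 2.0, "blurb": 0.0, "metadata": 0.0}
fused, weights = fuse(embeddings, scores)
```

Because the softmax is computed only over available modalities, the weights always sum to one and the fused vector keeps the same scale whether one, two, or all modalities are present.
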
Problem

Research questions and friction points this paper is trying to address.

Enhances book genre identification using multi-modal data
Addresses incomplete or noisy information across modalities
Improves classification accuracy with adaptive data resilience
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal feature extraction for genre identification
Dynamic data source selection based on availability
Hierarchical classification with resilience to incomplete data
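
To make the hierarchical multi-label idea concrete, the sketch below thresholds per-genre scores and then adds every ancestor of a selected genre, so the output always respects the taxonomy. It illustrates the general technique only and is not the paper's decoder; the `PARENT` mapping, the genre names, the scores, and the 0.5 threshold are all assumed for the example.

```python
# Hypothetical two-level taxonomy: child genre -> parent genre.
PARENT = {
    "Space Opera": "Science Fiction",
    "Cyberpunk": "Science Fiction",
    "Cozy Mystery": "Mystery",
}


def predict_genres(scores, threshold=0.5):
    """Select every label scoring at or above the threshold, plus its ancestors."""
    selected = {label for label, s in scores.items() if s >= threshold}
    for label in list(selected):
        # Walk up the taxonomy so a child genre is never predicted without its parent.
        parent = PARENT.get(label)
        while parent is not None:
            selected.add(parent)
            parent = PARENT.get(parent)
    return selected


# Made-up scores: "Science Fiction" falls below the threshold on its own,
# but it is added because its child "Space Opera" is selected.
scores = {
    "Science Fiction": 0.41,
    "Space Opera": 0.83,
    "Mystery": 0.62,
    "Cozy Mystery": 0.20,
}
labels = predict_genres(scores)
```

A book can receive several labels at once, and the ancestor step guarantees that the label set is consistent with the hierarchy even when the parent's own score is low.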
Authors

U. Nareti
Dept. of CSE, Indian Institute of Technology Patna, Bihar 801106, India
Soumiki Chattopadhyay
Dept. of CSE, Indian Institute of Technology Indore, Madhya Pradesh 453552, India
Prolay Mallick
Dept. of CSE, Indian Institute of Technology Indore, Madhya Pradesh 453552, India
Suraj Kumar
Dept. of CSE, Indian Institute of Technology Indore, Madhya Pradesh 453552, India
Ayush Vikas Daga
Sardar Vallabhbhai National Institute of Technology, Surat 395007, India
Chandranath Adak
Indian Institute of Technology Patna
Computer Vision, Deep Learning, Biometrics, Data Analytics
Adarsh Wase
Indian Institute of Technology Indore and the Indian Institute of Management Indore, Madhya Pradesh, India
Arjab Roy
Dept. of HSS, Indian Institute of Information Technology Guwahati, Assam 781015, India