Crossing Borders: A Multimodal Challenge for Indian Poetry Translation and Image Generation

📅 2025-11-17
🤖 AI Summary
Indian poetry is hard to understand across languages and to represent visually because of its linguistic complexity, dense cultural allusions, and rich morphology, especially in low-resource Indian languages. To address this, we propose the Translation and Image Generation (TAI) framework for Indian-language poetry. Its translation module uses an Odds Ratio Preference Alignment algorithm to translate morphologically rich poems into English. Its image generation module combines large language models with latent diffusion models through prompt tuning, guided by a semantic graph that links metaphors to their meanings. In both human evaluation and quantitative metrics, the framework outperforms strong baselines on poem image generation. We also release MorphoVerse, a dataset of 1,570 poems across 21 low-resource Indian languages, and position the work as supporting SDG 4 (Quality Education) and SDG 10 (Reduced Inequalities).

📝 Abstract
Indian poetry, known for its linguistic complexity and deep cultural resonance, has a rich and varied heritage spanning thousands of years. However, its layered meanings, cultural allusions, and sophisticated grammatical constructions often pose challenges for comprehension, especially for non-native speakers or readers unfamiliar with its context and language. Despite its cultural significance, existing works on poetry have largely overlooked Indian language poems. In this paper, we propose the Translation and Image Generation (TAI) framework, leveraging Large Language Models (LLMs) and Latent Diffusion Models through appropriate prompt tuning. Our framework supports the United Nations Sustainable Development Goals of Quality Education (SDG 4) and Reduced Inequalities (SDG 10) by enhancing the accessibility of culturally rich Indian-language poetry to a global audience. It includes (1) a translation module that uses an Odds Ratio Preference Alignment Algorithm to accurately translate morphologically rich poetry into English, and (2) an image generation module that employs a semantic graph to capture tokens, dependencies, and semantic relationships between metaphors and their meanings, to create visually meaningful representations of Indian poems. Our comprehensive experimental evaluation, including both human and quantitative assessments, demonstrates the superiority of TAI Diffusion in poem image generation tasks, outperforming strong baselines. To further address the scarcity of resources for Indian-language poetry, we introduce the Morphologically Rich Indian Language Poems (MorphoVerse) dataset, comprising 1,570 poems across 21 low-resource Indian languages. By addressing the gap in poetry translation and visual comprehension, this work aims to broaden accessibility and enrich the reader's experience.
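
The abstract does not spell out the Odds Ratio Preference Alignment objective. As a minimal sketch, assuming it follows the published ORPO formulation (a supervised likelihood term on the preferred output plus a weighted log odds ratio penalty against a rejected output), a per-example loss could look like the code below. The function names, the weight `lam`, and the use of length-averaged log-probabilities are illustrative assumptions, not details taken from the paper.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def odds_ratio_loss(avg_logp_chosen: float,
                    avg_logp_rejected: float,
                    lam: float = 0.1) -> float:
    """ORPO-style loss for one (chosen, rejected) translation pair.

    Inputs are length-averaged token log-probabilities under the model.
    `lam` is a hypothetical weight on the odds ratio term.
    """
    # Supervised term: negative average log-likelihood of the chosen translation.
    sft = -avg_logp_chosen
    # Log odds ratio between the chosen and the rejected translation.
    log_or = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(log_or)), written as a numerically stable softplus(-log_or).
    or_term = max(-log_or, 0.0) + math.log1p(math.exp(-abs(log_or)))
    return sft + lam * or_term
```

In this sketch the loss shrinks as the model assigns more probability to the preferred translation and less to the rejected one, so `odds_ratio_loss(-0.5, -2.0)` is smaller than `odds_ratio_loss(-2.0, -0.5)`.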
Problem

Research questions and friction points this paper is trying to address.

Translating morphologically rich Indian poetry with cultural accuracy
Generating meaningful visual representations of poetic metaphors
Addressing resource scarcity for low-resource Indian language poems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines LLMs and Latent Diffusion Models through prompt tuning
Employs an Odds Ratio Preference Alignment Algorithm for translation into English
Uses a semantic graph linking metaphors to their meanings to guide image generation
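
To make the semantic-graph idea concrete, here is a minimal sketch of a graph that stores tokens, typed relations, and metaphor-to-meaning links, and then folds the metaphor links into a text prompt for an image model. The class `SemanticGraph`, the relation label `"means"`, the helper `build_prompt`, and the example metaphor are all hypothetical; the paper's actual graph schema and prompt format are not given in this summary.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticGraph:
    """Toy graph: tokens as nodes, (head, relation, dependent) triples as edges."""
    tokens: list[str] = field(default_factory=list)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_edge(self, head: str, relation: str, dependent: str) -> None:
        # Register unseen tokens as nodes, then store the typed edge.
        for token in (head, dependent):
            if token not in self.tokens:
                self.tokens.append(token)
        self.edges.append((head, relation, dependent))

    def metaphors(self) -> dict[str, str]:
        # "means" is an assumed label for metaphor -> intended-meaning edges.
        return {h: d for h, rel, d in self.edges if rel == "means"}


def build_prompt(translation: str, graph: SemanticGraph) -> str:
    """Append metaphor hints from the graph to the translated line."""
    hints = "; ".join(f"{src} symbolizes {tgt}"
                      for src, tgt in graph.metaphors().items())
    return f"{translation}. Visual cues: {hints}" if hints else translation


# Made-up example, not taken from the MorphoVerse dataset.
graph = SemanticGraph()
graph.add_edge("moon", "means", "beloved's face")
graph.add_edge("rises", "subject", "moon")
prompt = build_prompt("The moon rises", graph)
```

The resulting `prompt` string could then be passed to any text-to-image model; keeping metaphor links as explicit edges makes it possible to inspect or edit the intended meaning before generation.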
👥 Authors

Sofia Jamil
PhD Research Scholar
Large Language Model, Natural Language Processing, Text to Image Generation Models

Kotla Sai Charan
Department of Computer Science and Engineering, Indian Institute of Technology Patna, India

Sriparna Saha
Department of Computer Science and Engineering, Indian Institute of Technology Patna, India

Koustava Goswami
Research Scientist 2 @ Adobe Research
Natural Language Processing, Language Model, Multimodal Learning

Joseph K J
Adobe Research