Crossing Borders: A Multimodal Challenge for Indian Poetry Translation and Image Generation

📅 2025-11-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Indian poetry is hard to understand across languages and to represent visually because of its linguistic complexity, cultural allusions, and morphologically rich grammar, especially in low-resource Indian languages. The paper proposes the Translation and Image Generation (TAI) framework for Indian-language poetry, built on large language models and latent diffusion models with prompt tuning. Its translation module uses an Odds Ratio Preference Alignment algorithm to translate morphologically rich poems into English; its image generation module uses a semantic graph of tokens, dependencies, and metaphor-meaning relations to guide image synthesis. Human and quantitative evaluations show TAI Diffusion outperforming strong baselines on poem image generation. The authors also release MorphoVerse, a dataset of 1,570 poems across 21 low-resource Indian languages, and position the work as supporting SDG 4 (Quality Education) and SDG 10 (Reduced Inequalities).

📝 Abstract

Indian poetry, known for its linguistic complexity and deep cultural resonance, has a rich and varied heritage spanning thousands of years. However, its layered meanings, cultural allusions, and sophisticated grammatical constructions often pose challenges for comprehension, especially for non-native speakers or readers unfamiliar with its context and language. Despite its cultural significance, existing works on poetry have largely overlooked Indian language poems. In this paper, we propose the Translation and Image Generation (TAI) framework, leveraging Large Language Models (LLMs) and Latent Diffusion Models through appropriate prompt tuning. Our framework supports the United Nations Sustainable Development Goals of Quality Education (SDG 4) and Reduced Inequalities (SDG 10) by enhancing the accessibility of culturally rich Indian-language poetry to a global audience. It includes (1) a translation module that uses an Odds Ratio Preference Alignment Algorithm to accurately translate morphologically rich poetry into English, and (2) an image generation module that employs a semantic graph to capture tokens, dependencies, and semantic relationships between metaphors and their meanings, to create visually meaningful representations of Indian poems. Our comprehensive experimental evaluation, including both human and quantitative assessments, demonstrates the superiority of TAI Diffusion in poem image generation tasks, outperforming strong baselines. To further address the scarcity of resources for Indian-language poetry, we introduce the Morphologically Rich Indian Language Poems MorphoVerse Dataset, comprising 1,570 poems across 21 low-resource Indian languages. By addressing the gap in poetry translation and visual comprehension, this work aims to broaden accessibility and enrich the reader's experience.
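The abstract names an Odds Ratio Preference Alignment Algorithm without giving its formula. As a rough illustration only, the sketch below computes the odds-ratio penalty as it is usually written in odds-ratio preference optimization: the loss shrinks when the preferred translation has higher odds than the rejected one. Whether the paper uses exactly this form is an assumption, and the function names and inputs (length-normalised log-probabilities of two candidate translations) are hypothetical.

```python
import math


def log_odds(avg_logprob: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logprob).

    avg_logprob is assumed to be the length-normalised log-likelihood of a
    candidate translation, so it must be strictly negative (p < 1).
    """
    p = math.exp(avg_logprob)
    return avg_logprob - math.log1p(-p)


def odds_ratio_loss(chosen_avg_logprob: float, rejected_avg_logprob: float) -> float:
    """Return -log(sigmoid(log_odds(chosen) - log_odds(rejected))).

    Assumption: this mirrors the usual odds-ratio preference term, not
    necessarily the exact objective used in the paper.
    """
    diff = log_odds(chosen_avg_logprob) - log_odds(rejected_avg_logprob)
    # -log(sigmoid(x)) == softplus(-x), written in a numerically stable form.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical scores: the preferred translation is more likely than the rejected one.
loss = odds_ratio_loss(chosen_avg_logprob=-0.2, rejected_avg_logprob=-1.5)
```

In a full training objective this term would typically be added, with a weighting coefficient, to the ordinary supervised loss on the preferred translation.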

Problem

Research questions and friction points this paper is trying to address.

Translating morphologically rich Indian poetry with cultural accuracy

Generating meaningful visual representations of poetic metaphors

Addressing resource scarcity for low-resource Indian language poems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLMs and Latent Diffusion Models

Employs Odds Ratio Preference Alignment Algorithm

Utilizes semantic graph for metaphor visualization
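To make the semantic-graph idea concrete, here is a minimal sketch of a graph that stores tokens, dependency edges, and metaphor-to-meaning edges, and turns the metaphor edges into a text prompt for an image model. The class, the relation labels (`nsubj`, `means`), the prompt template, and the example line are all invented for illustration; the page does not describe the authors' actual graph schema or prompt format.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticGraph:
    """Toy graph: nodes are tokens or glosses, edges are (head, relation, tail)."""

    tokens: list[str] = field(default_factory=list)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_edge(self, head: str, relation: str, tail: str) -> None:
        # Register unseen nodes so tokens keeps first-seen order.
        for node in (head, tail):
            if node not in self.tokens:
                self.tokens.append(node)
        self.edges.append((head, relation, tail))

    def metaphors(self) -> dict[str, str]:
        # "means" is a hypothetical label for metaphor -> intended meaning.
        return {h: t for h, r, t in self.edges if r == "means"}


def build_prompt(line: str, graph: SemanticGraph) -> str:
    """Append metaphor glosses to the translated line (assumed template)."""
    glosses = "; ".join(f"{m} as {meaning}" for m, meaning in graph.metaphors().items())
    base = f"Illustration of: {line}."
    return f"{base} Depict {glosses}." if glosses else base


# Invented example line, not taken from the MorphoVerse dataset.
graph = SemanticGraph()
graph.add_edge("lamp", "nsubj", "moon")                 # dependency edge
graph.add_edge("lamp", "means", "gentle guiding light")  # metaphor edge
prompt = build_prompt("The moon is a lamp in the night sky", graph)
```

The resulting string could then be passed to any text-to-image model; keeping the metaphor glosses explicit in the prompt is one simple way a graph could steer generation away from a purely literal rendering.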

🔎 Similar Papers

How Culturally Aware are Vision-Language Models?