From Plain Text to Poetic Form: Generating Metrically-Constrained Sanskrit Verses

📅 2025-06-01
🤖 AI Summary
Problem: Generating classical Sanskrit poetry that strictly adheres to metrical constraints such as Anuṣṭubh remains challenging. Sanskrit is a low-resource, morphologically rich language, high-quality training data is scarce, and existing models capture prosodic structure poorly.
Method: We introduce the first parallel corpus of Sanskrit metrical poetry; propose a metricality-aware constrained decoding strategy; and design a semantic-prosodic co-fine-tuning paradigm integrating Sanskrit morphological analysis, rule-based prosody modeling, and instruction tuning across multiple open-source and commercial LLMs.
Contribution/Results: Our decoding approach achieves >99% metrical compliance. Fine-tuned models significantly outperform baselines in semantic fidelity and stylistic appropriateness (p < 0.01, human evaluation). This work establishes a reusable data resource, methodology, and evaluation framework for structured literary generation in low-resource languages.

📝 Abstract
Recent advances in large language models (LLMs) have significantly improved natural language generation, including creative tasks like poetry composition. However, most progress remains concentrated in high-resource languages. This raises an important question: Can LLMs be adapted for structured poetic generation in a low-resource, morphologically rich language such as Sanskrit? In this work, we introduce a dataset designed for translating English prose into structured Sanskrit verse, with strict adherence to classical metrical patterns, particularly the Anuṣṭubh meter. We evaluate a range of generative models, both open-source and proprietary, under multiple settings. Specifically, we explore constrained decoding strategies and instruction-based fine-tuning tailored to metrical and semantic fidelity. Our decoding approach achieves over 99% accuracy in producing syntactically valid poetic forms, substantially outperforming general-purpose models in meter conformity. Meanwhile, instruction-tuned variants show improved alignment with source meaning and poetic style, as supported by human assessments, albeit with marginal trade-offs in metrical precision.

Problem

Research questions and friction points this paper is trying to address.

Adapting LLMs for structured Sanskrit poetry generation
Translating English prose into metrically-constrained Sanskrit verses
Ensuring metrical and semantic fidelity in low-resource languages
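
To make "metrical fidelity" concrete, here is a minimal sketch of a checker for the Anuṣṭubh meter in its common pathyā form: four quarters (pādas) of eight syllables each, where the fifth syllable is light, the sixth is heavy, and the seventh is heavy in the odd quarters and light in the even ones. The function names are hypothetical, and the input is assumed to be already scanned into light (`L`) and heavy (`G`) syllable weights. Syllabifying raw Sanskrit text, vipulā variants, and further classical restrictions are deliberately left out, so this is not the paper's evaluation code.

```python
# Hypothetical helpers: they check pre-scanned syllable weights, not raw text.
LIGHT, HEAVY = "L", "G"  # laghu (light) and guru (heavy)


def is_pathya_pada(weights: str, pada_index: int) -> bool:
    """Check one 8-syllable quarter against the simplified pathya rule.

    pada_index is 0-based (0..3); syllable positions in comments are 1-based.
    """
    if len(weights) != 8 or set(weights) - {LIGHT, HEAVY}:
        return False
    fifth, sixth, seventh = weights[4], weights[5], weights[6]
    # Odd quarters (1st, 3rd) need a heavy 7th syllable; even ones a light 7th.
    expected_seventh = HEAVY if pada_index % 2 == 0 else LIGHT
    return fifth == LIGHT and sixth == HEAVY and seventh == expected_seventh


def is_anushtubh(padas: list[str]) -> bool:
    """A verse passes if it has exactly four quarters and each one passes."""
    return len(padas) == 4 and all(
        is_pathya_pada(p, i) for i, p in enumerate(padas)
    )


# Illustrative weight patterns (assumed input, one string per quarter).
verse = ["GGGGLGGG", "LLGGLGLG", "GLGGLGGL", "LLGLLGLL"]
print(is_anushtubh(verse))
```

A metrical compliance rate like the one reported in the summary could then be computed as the fraction of generated verses for which such a check succeeds, assuming a reliable scanner produces the weights.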

Innovation

Methods, ideas, or system contributions that make the work stand out.

Constrained decoding for metrical accuracy
Instruction-based fine-tuning for semantic fidelity
Dataset for English-Sanskrit verse translation
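
The idea behind constrained decoding for meter can be sketched with a toy beam search: at each step, any continuation whose syllable weights could no longer fit the target pattern is masked out, so only metrically viable hypotheses survive. Everything below is an assumption made for illustration: the candidate tokens, their fixed weight patterns, the scores standing in for model log-probabilities, and the function names. In real Sanskrit a syllable's weight can depend on what follows it, and the authors' actual decoding strategy may differ substantially.

```python
from typing import Optional


def pada_constraints(pada_index: int) -> dict[int, str]:
    """1-based syllable position -> required weight (simplified pathya rule)."""
    return {5: "L", 6: "G", 7: "G" if pada_index % 2 == 0 else "L"}


def prefix_ok(weights: str, constraints: dict[int, str], length: int = 8) -> bool:
    """True if a partial weight string can still grow into a valid quarter."""
    if len(weights) > length:
        return False
    return all(
        weights[pos - 1] == w
        for pos, w in constraints.items()
        if pos <= len(weights)
    )


def constrained_beam(
    candidates: dict[str, tuple[str, float]],  # token -> (weights, score)
    pada_index: int,
    beam: int = 3,
) -> Optional[tuple[tuple[str, ...], str, float]]:
    """Return the best-scoring 8-syllable hypothesis, or None if none exists."""
    constraints = pada_constraints(pada_index)
    beams = [((), "", 0.0)]  # (tokens so far, weights so far, total score)
    finished = []
    while beams:
        expanded = []
        for tokens, weights, score in beams:
            for token, (w, s) in candidates.items():
                new_weights = weights + w
                # Mask step: drop empty tokens and non-metrical continuations.
                if not w or not prefix_ok(new_weights, constraints):
                    continue
                item = (tokens + (token,), new_weights, score + s)
                (finished if len(new_weights) == 8 else expanded).append(item)
        # Keep only the top-scoring partial hypotheses for the next step.
        beams = sorted(expanded, key=lambda b: b[2], reverse=True)[:beam]
    return max(finished, key=lambda b: b[2], default=None)


# Placeholder tokens with made-up weights and scores.
toy = {"a": ("GGGG", -1.0), "b": ("LGGG", -2.0), "c": ("LGLG", -1.5), "d": ("GG", -0.1)}
print(constrained_beam(toy, pada_index=0))
```

With an actual LLM, the same filter would typically be applied as a mask over next-token logits rather than over a small candidate table, which requires mapping subword tokens to syllable weights.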

Authors

Manoj Balaji Jagadeeshan
Indian Institute of Technology, Kharagpur

Samarth Bhatia
Undergraduate, IIT Delhi
Machine learning, deep learning, optimization

Pretam Ray
Indian Institute of Technology, Kharagpur
Natural Language Processing

Harshul Surana
Graduate student, AI Institute, University of South Carolina
NLP, Social computing, Knowledge graphs, AI for Social Good

P. Akhil Rajeev
Indian Heritage Language Computing Group, Centre for Development of Advanced Computing (C-DAC), Bangalore

Priya Mishra
Indian Institute of Technology, Bombay

Annarao Kulkarni
Indian Heritage Language Computing Group, Centre for Development of Advanced Computing (C-DAC), Bangalore

Ganesh Ramakrishnan
Professor, Department of Computer Science and Engineering, Indian Institute of Technology Bombay
Machine Learning, Relational Learning, Information Extraction, Question Answering, Text Analytics

AP Prathosh
Indian Institute of Science, Bangalore

Pawan Goyal
Indian Institute of Technology, Kharagpur