Leveraging Author-Specific Context for Scientific Figure Caption Generation: 3rd SciCap Challenge

📅 2025-10-09
🤖 AI Summary
To balance content accuracy with the author's writing style in scientific figure caption generation, this paper proposes a two-stage generation pipeline for the 3rd SciCap Challenge. Stage 1 filters the figure-related textual context, optimizes category-specific prompts with DSPy (MIPROv2 and SIMBA), and selects among caption candidates. Stage 2 refines style through few-shot prompting with profile figures, drawing on the LaMP-Cap dataset. Category-specific prompts outperform both zero-shot and general optimized prompts, improving ROUGE-1 recall by 8.3% while limiting precision loss to 2.8% and BLEU-4 reduction to 10.9%. Profile-informed stylistic refinement then raises BLEU scores by 40–48% and ROUGE by 25–27%.

📝 Abstract
Scientific figure captions require both accuracy and stylistic consistency to convey visual information. Here, we present a domain-specific caption generation system for the 3rd SciCap Challenge that integrates figure-related textual context with author-specific writing styles using the LaMP-Cap dataset. Our approach uses a two-stage pipeline: Stage 1 combines context filtering, category-specific prompt optimization via DSPy's MIPROv2 and SIMBA, and caption candidate selection; Stage 2 applies few-shot prompting with profile figures for stylistic refinement. Our experiments demonstrate that category-specific prompts outperform both zero-shot and general optimized approaches, improving ROUGE-1 recall by 8.3% while limiting precision loss to 2.8% and BLEU-4 reduction to 10.9%. Profile-informed stylistic refinement yields 40–48% gains in BLEU scores and 25–27% in ROUGE. Overall, our system demonstrates that combining contextual understanding with author-specific stylistic adaptation can generate captions that are both scientifically accurate and stylistically faithful to the source paper.
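The abstract reports a gain in ROUGE-1 recall alongside a small loss in precision. The sketch below illustrates why those two numbers can move in opposite directions. It is a simplified unigram-overlap ROUGE-1 with whitespace tokenization, not the challenge's official scorer, and the example captions are invented for illustration.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap, whitespace tokens, lowercased."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((cand & ref).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "f1": f1}


# Invented captions: a reference, a terse candidate, and a more verbose candidate.
reference = "accuracy of the model on the test set"
shorter = "accuracy on the test set"
longer = "accuracy of the model on the test set across five random seeds"

print(rouge1(shorter, reference))  # covers less of the reference, but adds no extra words
print(rouge1(longer, reference))   # covers all of the reference, but adds extra words
```

A caption that mentions more of the reference content raises recall, while any additional words it introduces lower precision, which is one plausible reading of the trade-off reported above.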
Problem

Research questions and friction points this paper is trying to address.

Generating accurate scientific figure captions with author-specific styles
Integrating contextual filtering and category-specific prompt optimization
Improving caption quality through stylistic refinement using profile figures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage pipeline with context filtering and prompt optimization
Category-specific prompts outperforming zero-shot approaches
Profile-informed stylistic refinement improving BLEU and ROUGE scores
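As a rough illustration of how the two stages could fit together, the sketch below picks an instruction by figure category and then assembles a few-shot refinement prompt from profile figures. Everything in it is an assumption made for illustration: the `ProfileFigure` schema, the category names, the instruction wording, and the function names are hypothetical, and the DSPy optimization and the model call are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ProfileFigure:
    """Hypothetical record: another figure from the same paper and its caption."""
    figure_type: str
    caption: str


# Hypothetical category-specific instructions. In the paper these are produced
# by prompt optimization; here they are hand-written placeholders.
CATEGORY_PROMPTS = {
    "graph plot": "Describe the axes, the plotted quantities, and the main trend.",
    "table": "Summarize what the rows and columns compare.",
}
DEFAULT_PROMPT = "Describe what the figure shows, using the paragraph context."


def select_prompt(figure_type: str) -> str:
    """Stage 1 (sketch): choose an instruction based on the figure category."""
    return CATEGORY_PROMPTS.get(figure_type.lower(), DEFAULT_PROMPT)


def build_refinement_prompt(draft: str, profiles: list, max_examples: int = 3) -> str:
    """Stage 2 (sketch): few-shot prompt that shows the author's own captions."""
    examples = "\n".join(
        f"Example caption {i}: {p.caption}"
        for i, p in enumerate(profiles[:max_examples], start=1)
    )
    return (
        "Rewrite the draft caption to match the style of the example captions "
        "without changing its factual content.\n"
        f"{examples}\n"
        f"Draft caption: {draft}\n"
        "Refined caption:"
    )


profiles = [
    ProfileFigure("graph plot", "Fig. 2: Training loss vs. epochs for all baselines."),
    ProfileFigure("table", "Table 1: Dataset statistics."),
]
print(select_prompt("Graph Plot"))
print(build_refinement_prompt("The plot shows validation accuracy over time.", profiles))
```

The resulting string would be passed to whatever language model the system uses; keeping the factual draft and the style examples in separate stages mirrors the separation of accuracy and style described above.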
Watcharapong Timklaypachara
Department of Biomedical Engineering, Faculty of Engineering, Mahidol University
Monrada Chiewhawan
Department of Biomedical Engineering, Faculty of Engineering, Mahidol University
Nopporn Lekuthai
Department of Biomedical Engineering, Faculty of Engineering, Mahidol University
Titipat Achakulvisut
Department of Biomedical Engineering, Mahidol University, Thailand
Natural Language Processing, Applied Machine Learning in Biomedicine, Science of Science