🤖 AI Summary
This work addresses the challenge of simultaneously preserving author-specific writing style and ensuring high-quality caption generation for scientific figures. We propose a multimodal large language model–based style transfer method that integrates authors’ historical texts, fine-grained stylistic features (e.g., terminology preferences and syntactic structures), and paper metadata (domain, task, figure type) into an end-to-end personalized captioning framework. To our knowledge, this is the first study to identify and formalize the intrinsic trade-off between stylistic fidelity and caption informativeness and accuracy. We mitigate this tension via an author profile enhancement mechanism. Evaluated on the 3rd SciCap Challenge, our approach achieves a +23.6% improvement in style similarity without compromising caption quality, demonstrating strong practical utility for automated scientific image understanding and AI-assisted scholarly writing systems.
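As a rough illustration of the idea described above, the sketch below assembles a style-conditioned prompt for a multimodal LLM from an author profile (historical captions and terminology preferences) plus paper metadata. All names, data structures, and the prompt wording here are hypothetical; the summary does not specify the actual prompt format or profile schema used in the paper.

```python
from dataclasses import dataclass

# Hypothetical schemas; the actual SciCap profile format may differ.
@dataclass
class AuthorProfile:
    past_captions: list[str]          # author's historical caption texts
    preferred_terms: dict[str, str]   # generic term -> author's preferred wording

@dataclass
class FigureMetadata:
    domain: str       # e.g. "NLP"
    task: str         # e.g. "machine translation"
    figure_type: str  # e.g. "line chart"

def build_style_prompt(profile: AuthorProfile,
                       meta: FigureMetadata,
                       mention_text: str,
                       n_examples: int = 3) -> str:
    """Assemble a style-conditioned captioning prompt for an MLLM.

    Combines a few of the author's past captions (few-shot style
    exemplars), their terminology preferences, and figure metadata.
    """
    examples = "\n".join(f"- {c}" for c in profile.past_captions[:n_examples])
    terms = "; ".join(f'prefer "{v}" over "{k}"'
                      for k, v in profile.preferred_terms.items())
    return (
        f"You are captioning a {meta.figure_type} from a {meta.domain} "
        f"paper on {meta.task}.\n"
        f"Match the author's style, shown in these past captions:\n"
        f"{examples}\n"
        f"Terminology preferences: {terms}\n"
        f"The figure is referenced in the text as: {mention_text}\n"
        f"Write one caption in the author's voice."
    )
```

In a real system, the returned prompt would be sent to the MLLM together with the figure image; richer profiles (e.g., syntactic-structure features) could be serialized into the same prompt in additional sections.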
📝 Abstract
We study personalized figure caption generation using author profile data from scientific papers. Our experiments demonstrate that rich author profile data, combined with relevant metadata, can significantly improve the personalization performance of multimodal large language models. However, we also reveal a fundamental trade-off between matching author style and maintaining caption quality. Our findings offer valuable insights and future directions for developing practical caption automation systems that balance both objectives. This work was conducted as part of the 3rd SciCap Challenge.