AI Summary
Existing time-series image captioning methods are generic and lack domain-specific adaptation, necessitating extensive retraining for cross-domain transfer. Method: We propose TADACap, a training-free, retrieval-based, domain-aware captioning framework tailored to highly specialized domains such as finance and healthcare, addressing both the lack of domain-specific descriptions and zero-shot cross-domain adaptation. Contribution/Results: (1) We introduce the first training-free domain adaptation mechanism for time-series images; (2) we design a diversity-aware retrieval strategy, TADACap-diverse, to enhance domain coverage and semantic robustness; (3) we construct a cross-domain image-caption pair repository with efficient nearest-neighbor matching. Across benchmarks, TADACap-diverse achieves semantic accuracy comparable to state-of-the-art methods while requiring significantly less annotation effort, and it adapts to unseen domains without fine-tuning.
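To make the retrieval step concrete, the sketch below shows one plausible form of diversity-aware nearest-neighbor retrieval over an image-caption repository: cluster the target-domain embeddings and take the example closest to the query from each cluster, rather than the k globally nearest (and often redundant) neighbors. All names here (`retrieve_diverse_pairs`, the embedding inputs) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def retrieve_diverse_pairs(query_emb, db_embs, db_captions, k=5):
    """Sketch of diversity-aware retrieval (hypothetical, not the
    paper's code): partition the target-domain database into k
    clusters, then take the member of each cluster nearest to the
    query, so the retrieved pairs cover distinct regions of the
    domain instead of near-duplicates."""
    labels = KMeans(n_clusters=k, n_init="auto").fit_predict(db_embs)
    picked = []
    for c in range(k):
        idx = np.where(labels == c)[0]
        # Within each cluster, pick the example closest to the query.
        dists = np.linalg.norm(db_embs[idx] - query_emb, axis=1)
        best = idx[np.argmin(dists)]
        picked.append((best, db_captions[best]))
    return picked
```

Cluster-then-pick is only one way to enforce diversity; maximal-marginal-relevance reranking is a common alternative with the same goal.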
Abstract
While image captioning has gained significant attention, the potential of captioning time-series images, prevalent in areas like finance and healthcare, remains largely untapped. Existing time-series captioning methods typically offer generic, domain-agnostic descriptions of time-series shapes and struggle to adapt to new domains without substantial retraining. To address these limitations, we introduce TADACap, a retrieval-based framework that generates domain-aware captions for time-series images and can adapt to new domains without retraining. Building on TADACap, we propose TADACap-diverse, a novel retrieval strategy that retrieves diverse image-caption pairs from a target-domain database. We benchmark TADACap-diverse against state-of-the-art methods and ablation variants; it demonstrates comparable semantic accuracy while requiring significantly less annotation effort.
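As a rough illustration of how retrieved pairs could drive caption generation without retraining, the following sketch serializes them into a few-shot prompt for a frozen captioning model. The function `build_caption_prompt` and its text-only interface are assumptions for illustration, not TADACap's actual pipeline.

```python
def build_caption_prompt(retrieved_pairs, query_desc):
    """Hypothetical sketch: turn retrieved target-domain examples
    into a few-shot prompt for an off-the-shelf model, so domain
    adaptation happens at inference time with no fine-tuning."""
    lines = ["Describe the time-series image in the style of these examples:"]
    for i, (example_desc, caption) in enumerate(retrieved_pairs, 1):
        lines.append(f"Example {i}: {example_desc}\nCaption: {caption}")
    lines.append(f"Now caption this image: {query_desc}\nCaption:")
    return "\n\n".join(lines)
```

Because the model itself stays frozen, swapping domains only requires swapping the image-caption database that the retrieval step draws from.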