LaSTR: Language-Driven Time-Series Segment Retrieval

📅 2026-02-28
🤖 AI Summary
Existing time-series retrieval methods either require expert-designed similarity criteria or rely on global, series-level descriptions, so they struggle to localize fine-grained segments from natural language queries. This work proposes LaSTR, a framework for language-driven retrieval of local time-series segments. To build large-scale segment-caption training data, the authors apply TV2-based segmentation to LOTSA windows and generate a description of each segment with GPT-5.2. A Conformer-based retriever is then trained with contrastive learning to embed segments and text in a shared space. On a held-out test split, LaSTR outperforms random and CLIP baselines across multiple candidate pool sizes, improving both ranking quality and the semantic agreement between retrieved segments and query intent.
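
The contrastive alignment described above can be illustrated with a small sketch. The summary does not state the exact loss, so the symmetric InfoNCE objective (as used in CLIP-style training), the temperature value, and the function names below are assumptions for illustration only; the embeddings stand in for the outputs of the segment and text encoders.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_info_nce(seg_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    seg_emb[i] and txt_emb[i] are assumed to be a matching segment/caption
    pair; every other caption in the batch acts as a negative for segment i,
    and vice versa. The temperature of 0.07 is an assumed default.
    """
    seg = [l2_normalize(v) for v in seg_emb]
    txt = [l2_normalize(v) for v in txt_emb]
    n = len(seg)
    # logits[i][j] = cosine(segment i, caption j) / temperature
    logits = [
        [sum(a * b for a, b in zip(s, t)) / temperature for t in txt]
        for s in seg
    ]
    # Segment-to-text direction: each row should peak on its diagonal entry.
    seg_to_txt = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-segment direction: same idea applied to the columns.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    txt_to_seg = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (seg_to_txt + txt_to_seg)
```

The loss is close to zero when each segment embedding is most similar to its own caption embedding, and grows when the pairs are mismatched, which is what pulls the two modalities into a shared space during training.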

📝 Abstract
Effectively searching time-series data is essential for system analysis, but existing methods often require expert-designed similarity criteria or rely on global, series-level descriptions. We study language-driven segment retrieval: given a natural language query, the goal is to retrieve relevant local segments from large time-series repositories. We build large-scale segment-caption training data by applying TV2-based segmentation to LOTSA windows and generating segment descriptions with GPT-5.2, and then train a Conformer-based contrastive retriever in a shared text-time-series embedding space. On a held-out test split, we evaluate single-positive retrieval together with caption-side consistency (SBERT and VLM-as-a-judge) under multiple candidate pool sizes. Across all settings, LaSTR outperforms random and CLIP baselines, yielding improved ranking quality and stronger semantic agreement between retrieved segments and query intent.
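
To make the single-positive evaluation under a candidate pool concrete, here is a minimal sketch. The abstract does not name the ranking metric, so Recall@K is an assumption here, as are the function names, the uniform sampling of distractors, and the use of cosine similarity; each query is assumed to have exactly one correct segment at the same index.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def rank_of_positive(query_emb, positive_emb, distractor_embs):
    """1-based rank of the positive segment among positive + distractors."""
    pos_score = cosine(query_emb, positive_emb)
    better = sum(1 for d in distractor_embs if cosine(query_emb, d) > pos_score)
    return 1 + better


def recall_at_k(query_embs, segment_embs, pool_size, k, seed=0):
    """Fraction of queries whose positive ranks in the top k of its pool.

    For query i the pool holds segment i (the single positive) plus
    pool_size - 1 distractors sampled from the other segments, so
    pool_size must not exceed the number of segments.
    """
    rng = random.Random(seed)
    n = len(query_embs)
    hits = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        distractors = rng.sample(others, pool_size - 1)
        rank = rank_of_positive(
            query_embs[i],
            segment_embs[i],
            [segment_embs[j] for j in distractors],
        )
        hits += rank <= k
    return hits / n
```

Under this setup a random retriever would be expected to score about k / pool_size, which is why results are reported per pool size: larger pools make the task harder and the chance level lower.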
Problem

Research questions and friction points this paper is trying to address.

time-series retrieval
language-driven retrieval
segment retrieval
natural language query
time-series analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

language-driven retrieval
time-series segmentation
contrastive learning
Conformer
multimodal embedding
Kota Dohi
Research and Development Group, Hitachi, Ltd.
Harsh Purohit
Research and Development Group, Hitachi, Ltd.
Tomoya Nishida
Research and Development Group, Hitachi, Ltd.
Takashi Endo
Research and Development Group, Hitachi, Ltd.
Yusuke Ohtsubo
Research and Development Group, Hitachi, Ltd.
Koichiro Yawata
Research and Development Group, Hitachi, Ltd.
Koki Takeshita
Research and Development Group, Hitachi, Ltd.
Tatsuya Sasaki
Research and Development Group, Hitachi, Ltd.
Yohei Kawaguchi
Hitachi, Ltd.
Acoustic Signal Processing, Signal Processing, Machine Learning, Speech Processing, AI