🤖 AI Summary
Existing time series retrieval methods struggle to accurately localize fine-grained segments from natural language queries because they rely on expert-designed, global similarity metrics. This work proposes LaSTR, the first framework for language-driven, fine-grained time series segment retrieval. LaSTR applies the TV2 segmentation algorithm to extract local segments and employs GPT to generate high-quality textual descriptions, thereby constructing a large-scale paired dataset. Built on a Conformer backbone and trained with contrastive learning, LaSTR embeds time series segments and natural language in a shared space, achieving cross-modal semantic alignment. Experiments demonstrate that LaSTR significantly outperforms random and CLIP-based baselines across various candidate pool sizes, substantially improving retrieval ranking quality and semantic consistency.
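The contrastive alignment described above can be sketched as a CLIP-style symmetric InfoNCE objective over batches of paired (segment, caption) embeddings. This is an illustrative sketch, not LaSTR's actual implementation; the function names, the temperature value, and the use of in-batch negatives are assumptions.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit sphere before comparison."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def symmetric_contrastive_loss(seg_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings.

    Each segment's positive is the caption at the same batch index;
    all other captions in the batch act as negatives (and vice versa).
    seg_emb, txt_emb: (B, d) arrays of segment / caption embeddings.
    """
    s = l2_normalize(seg_emb)
    t = l2_normalize(txt_emb)
    logits = s @ t.T / temperature          # (B, B) cosine-similarity logits
    labels = np.arange(len(logits))

    def cross_entropy(lg):
        # numerically stable log-softmax along each row
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average the segment-to-text and text-to-segment directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

When the paired embeddings are well aligned the loss approaches zero, and it grows as pairs are mismatched, which is the signal that pulls matching segments and captions together in the shared space.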
📝 Abstract
Effectively searching time-series data is essential for system analysis, but existing methods often require expert-designed similarity criteria or rely on global, series-level descriptions. We study language-driven segment retrieval: given a natural language query, the goal is to retrieve relevant local segments from large time-series repositories. We build large-scale segment–caption training data by applying TV2-based segmentation to LOTSA windows and generating segment descriptions with GPT-5.2, and then train a Conformer-based contrastive retriever in a shared text–time-series embedding space. On a held-out test split, we evaluate single-positive retrieval together with caption-side consistency (SBERT and VLM-as-a-judge) under multiple candidate pool sizes. Across all settings, LaSTR outperforms random and CLIP baselines, yielding improved ranking quality and stronger semantic agreement between retrieved segments and query intent.
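The single-positive retrieval evaluation over varying candidate pool sizes can be illustrated as follows: rank a pool of candidate segment embeddings by cosine similarity to the query embedding and record the rank of the one ground-truth positive. This is a hedged sketch of a standard ranking protocol, not the paper's exact evaluation code; the function names and the MRR summary metric are assumptions.

```python
import numpy as np

def retrieve_rank(query_emb, cand_embs, positive_idx):
    """Rank candidates by cosine similarity to the query and return
    the 1-based rank of the single ground-truth positive segment."""
    q = query_emb / np.linalg.norm(query_emb)
    c = cand_embs / np.linalg.norm(cand_embs, axis=1, keepdims=True)
    sims = c @ q                       # cosine similarity per candidate
    order = np.argsort(-sims)          # candidate indices, best first
    return int(np.where(order == positive_idx)[0][0]) + 1

def mean_reciprocal_rank(ranks):
    """Summarize ranks over a test split; 1.0 means the positive
    was retrieved first for every query."""
    return float(np.mean([1.0 / r for r in ranks]))
```

Growing the candidate pool makes the task harder (more distractors per query), which is why reporting results across several pool sizes, as the abstract describes, gives a fuller picture of ranking quality.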