Instruction-based Time Series Editing

📅 2025-08-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing time-series editing methods rely on predefined attribute vectors and iterative sampling, which limits the flexibility of the conditioning format and offers no control over editing intensity. This paper introduces the first time-series editing framework driven by natural-language instructions. It embeds time series and textual instructions in a shared multimodal representation space: multi-resolution joint encoding handles local and global edits together, instruction-conditioned decoding produces the edited series, and interpolating between embeddings gives continuous control over edit intensity. The method supports few-shot transfer and zero-shot generalization to unseen instructions without hand-crafted attributes or iterative sampling. On synthetic and real-world benchmarks it improves editing fidelity, semantic alignment, and controllability, demonstrating flexibility and robustness across diverse editing scenarios.
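The continuous intensity control described above comes from interpolating between embeddings. A minimal illustrative sketch of that idea (the function name and the linear blend are assumptions for illustration, not the paper's actual architecture):

```python
import numpy as np

def interpolate_edit(z_orig, z_edit, alpha):
    """Blend the original embedding with the fully edited embedding.

    alpha in [0, 1] sets the edit intensity: 0 keeps the original,
    1 applies the full edit, values in between give partial edits.
    """
    alpha = float(np.clip(alpha, 0.0, 1.0))
    return (1.0 - alpha) * z_orig + alpha * z_edit

# Toy embeddings standing in for encoder outputs.
z_orig = np.zeros(4)
z_edit = np.ones(4)
half = interpolate_edit(z_orig, z_edit, 0.5)  # halfway edit
```

Decoding `half` instead of `z_edit` would then yield a series edited at half strength, which is the kind of graded control the attribute-based editors lack.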

📝 Abstract
In time series editing, we aim to modify some properties of a given time series without altering others. For example, when analyzing a hospital patient's blood pressure, we may add a sudden early drop and observe how it impacts their future while preserving other conditions. Existing diffusion-based editors rely on rigid, predefined attribute vectors as conditions and produce all-or-nothing edits through sampling. This attribute- and sampling-based approach limits flexibility in condition format and lacks customizable control over editing strength. To overcome these limitations, we introduce Instruction-based Time Series Editing, where users specify intended edits using natural language. This allows users to express a wider range of edits in a more accessible format. We then introduce InstructTime, the first instruction-based time series editor. InstructTime takes in time series and instructions, embeds them into a shared multi-modal representation space, then decodes their embeddings to generate edited time series. By learning a structured multi-modal representation space, we can easily interpolate between embeddings to achieve varying degrees of edit. To handle local and global edits together, we propose multi-resolution encoders. In our experiments, we use synthetic and real datasets and find that InstructTime is a state-of-the-art time series editor: InstructTime achieves high-quality edits with controllable strength, can generalize to unseen instructions, and can be easily adapted to unseen conditions through few-shot learning.
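The encode-fuse-decode pipeline the abstract describes can be sketched as follows. All components here are hypothetical stand-ins: random projections play the role of the trained encoders and decoder, and the additive instruction shift is an assumption for illustration, not InstructTime's learned mapping.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8                               # shared embedding dimension
W_ts = rng.normal(size=(16, D))     # stand-in time-series encoder
W_txt = rng.normal(size=(32, D))    # stand-in instruction encoder
W_dec = rng.normal(size=(D, 16))    # stand-in decoder

def edit(series, instr_vec, alpha=1.0):
    """Embed series and instruction into a shared space, shift the
    series embedding by the instruction embedding scaled by the edit
    strength alpha, then decode back to a time series."""
    z_ts = series @ W_ts
    z_txt = instr_vec @ W_txt
    z = z_ts + alpha * z_txt        # alpha=0 leaves the series unedited
    return z @ W_dec

series = rng.normal(size=16)
instr = rng.normal(size=32)         # e.g. an encoded instruction vector
out = edit(series, instr, alpha=0.5)
```

Because the edit strength enters only through the embedding, varying `alpha` moves smoothly between "no edit" and "full edit" without resampling, which is the controllability the abstract contrasts with all-or-nothing diffusion edits.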
Problem

Research questions and friction points this paper is trying to address.

Enables flexible time series editing via natural language instructions
Overcomes rigid attribute-based editing limitations in diffusion models
Achieves controllable edit strength through multimodal representation learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses natural language for time series editing
Embeds time series and instructions multi-modally
Employs multi-resolution encoders for diverse edits
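The multi-resolution idea in the last bullet — representing a series at several temporal scales so both local and global structure are available — can be illustrated with simple average pooling. This is an illustrative sketch of the concept, not the paper's encoder; the window sizes and pooling choice are assumptions:

```python
import numpy as np

def multi_resolution_features(x, windows=(1, 4, 16)):
    """Pool a 1-D series at several window sizes so that fine (local)
    and coarse (global) structure each get their own representation."""
    feats = []
    for w in windows:
        n = len(x) // w * w                  # trim to a multiple of w
        pooled = x[:n].reshape(-1, w).mean(axis=1)
        feats.append(pooled)
    return feats  # one feature vector per resolution

x = np.arange(32, dtype=float)
f1, f4, f16 = multi_resolution_features(x)   # 32, 8, and 2 values
```

A local edit (e.g. a brief spike) mainly changes the fine resolution, while a global edit (e.g. an overall trend shift) shows up in the coarse one, which is why encoding both helps handle the two edit types together.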