From Hawkes Processes to Attention: Time-Modulated Mechanisms for Event Sequences

📅 2026-01-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing Transformer-based approaches to marked temporal point processes (MTPPs) mostly incorporate temporal information only through positional encodings, which limits their ability to capture heterogeneous, event-type-specific temporal dynamics. To address this limitation, this work proposes Hawkes Attention, an attention operator derived from multivariate Hawkes process theory. It introduces learnable, type-specific neural kernels that modulate the query, key, and value projections, so that event content and temporal dynamics are modeled jointly. The method outperforms the reported baselines on MTPP tasks and extends to other temporal structures, such as time series forecasting.
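
For context, the classical multivariate Hawkes process that the summary refers to is usually written as follows. This is the standard textbook form, not notation taken from the paper; the symbols \mu_k, \phi_{k,k'}, \alpha_{k,k'} and \beta_{k,k'} are the conventional ones, and the paper's own parameterization may differ.

\[
\lambda_k(t) \;=\; \mu_k \;+\; \sum_{i:\, t_i < t} \phi_{k,k_i}(t - t_i),
\qquad
\phi_{k,k'}(s) \;=\; \alpha_{k,k'}\, e^{-\beta_{k,k'} s} \quad \text{(common exponential choice)}
\]

Here \lambda_k(t) is the intensity of event type k at time t, \mu_k is its base rate, and \phi_{k,k'} describes how a past event of type k' excites future events of type k. Roughly speaking, the fixed parametric kernel \phi is what Hawkes Attention replaces with learnable per-type neural kernels.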

📝 Abstract

Marked temporal point processes (MTPPs) arise naturally in medical, social, commercial, and financial domains. However, existing Transformer-based methods mostly inject temporal information only through positional encodings and rely on shared or parametric decay structures, which limits their ability to capture heterogeneous, type-specific temporal effects. Motivated by this observation, we derive a novel attention operator for MTPPs, called Hawkes Attention, from multivariate Hawkes process theory. It uses learnable per-type neural kernels to modulate the query, key, and value projections, replacing the corresponding components of standard attention. As a result, Hawkes Attention unifies event timing and content interaction, learning both time-dependent behavior and type-specific excitation patterns from data. Experimental results show that our method outperforms the baselines. Beyond general MTPPs, the attention mechanism can also be readily applied to specific temporal structures, such as time series forecasting.
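
The abstract does not spell out the operator, so the following is only a rough NumPy sketch of one way per-type neural kernels could modulate attention, not the authors' implementation. Everything named here is an assumption made for illustration: the function names (`type_kernel`, `hawkes_style_attention`, `init_params`), the two-layer kernel network, the softplus gate, and the choice to gate only keys and values (the paper also modulates queries).

```python
import numpy as np


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the gates positive.
    return np.logaddexp(0.0, z)


def type_kernel(dt, types, W1, b1, W2, b2):
    """Hypothetical per-type neural kernel.

    dt[i, j] is the gap t_i - t_j; the kernel weights are selected by the
    type of the source event j. Returns positive gates of shape (n, n, d).
    """
    hidden = np.tanh(dt[..., None] * W1[types][None] + b1[types][None])  # (n, n, h)
    out = np.einsum("ijh,jhd->ijd", hidden, W2[types]) + b2[types][None]
    return softplus(out)


def hawkes_style_attention(x, times, types, params):
    """Causal single-head attention with time- and type-dependent gates.

    Assumed simplification: only keys and values are gated.
    """
    n, d = x.shape
    q, k, v = x @ params["Wq"], x @ params["Wk"], x @ params["Wv"]

    # Each event attends to itself and to earlier events only.
    mask = np.tril(np.ones((n, n), dtype=bool))
    dt = np.where(mask, times[:, None] - times[None, :], 0.0)

    gate_k = type_kernel(dt, types, *params["key_kernel"])    # (n, n, d)
    gate_v = type_kernel(dt, types, *params["value_kernel"])  # (n, n, d)

    # Score between target i and source j uses the time-gated key of j.
    scores = np.einsum("id,ijd->ij", q, k[None] * gate_k) / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)

    # Aggregate time-gated values.
    out = np.einsum("ij,ijd->id", attn, v[None] * gate_v)
    return out, attn


def init_params(rng, d, num_types, hidden):
    """Random parameters for the sketch (illustrative initialisation only)."""
    def kernel():
        return (
            rng.normal(size=(num_types, hidden)),            # W1
            rng.normal(size=(num_types, hidden)),            # b1
            rng.normal(size=(num_types, hidden, d)) * 0.1,   # W2
            np.zeros((num_types, d)),                        # b2
        )
    return {
        "Wq": rng.normal(size=(d, d)) / np.sqrt(d),
        "Wk": rng.normal(size=(d, d)) / np.sqrt(d),
        "Wv": rng.normal(size=(d, d)) / np.sqrt(d),
        "key_kernel": kernel(),
        "value_kernel": kernel(),
    }
```

With event embeddings `x` of shape `(n, d)`, increasing timestamps `times` of shape `(n,)`, and integer marks `types` of shape `(n,)`, calling `hawkes_style_attention(x, times, types, init_params(np.random.default_rng(0), d, num_types, hidden))` returns the attended representations and the causal attention matrix. Because each source type has its own kernel weights, two past events at the same time gap can contribute differently, which is the type-specific behavior the abstract describes.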

Problem

Research questions and friction points this paper is trying to address.

Marked Temporal Point Processes

Hawkes Processes

Temporal Attention

Event Sequences

Type-specific Temporal Effects

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hawkes Attention

Marked Temporal Point Processes

Neural Kernels