Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation

📅 2025-12-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the computational redundancy and loss of temporal resolution caused by converting event streams into dense frames in event-camera-based human pose estimation, this paper proposes an end-to-end, point-cloud-driven approach. Instead of using frame-based representations, it constructs spatiotemporal point clouds directly from raw asynchronous events. The authors design an Event Temporal Slicing Convolution module to capture millisecond-scale short-term dependencies, introduce an Event Slice Sequencing module for structured temporal modeling, and apply edge enhancement to the point cloud representation to improve spatial detail perception under sparse event conditions. The method is compatible with mainstream point cloud backbones, including PointNet, DGCNN, and Point Transformer. Evaluated on the DHP19 dataset, the approach outperforms existing point-cloud-based baselines, with consistent improvements in both accuracy and inference efficiency, which demonstrates the value of explicitly exploiting the spatiotemporal sparsity inherent in event streams.

📝 Abstract

Human pose estimation focuses on predicting body keypoints to analyze human motion. Event cameras provide high temporal resolution and low latency, enabling robust estimation under challenging conditions. However, most existing methods convert event streams into dense event frames, which adds extra computation and sacrifices the high temporal resolution of the event signal. In this work, we exploit the spatiotemporal properties of event streams with a point cloud-based framework designed to improve human pose estimation performance. We design an Event Temporal Slicing Convolution module to capture short-term dependencies across event slices, and combine it with an Event Slice Sequencing module for structured temporal modeling. We also apply edge enhancement to the point cloud-based event representation to strengthen spatial edge information under sparse event conditions, which further improves performance. Experiments on the DHP19 dataset show that the proposed method consistently improves performance across three representative point cloud backbones: PointNet, DGCNN, and Point Transformer.
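
To make the contrast with dense event frames concrete, the following is a minimal sketch of how raw events might be turned into a spatiotemporal point cloud and binned into temporal slices. It is an illustration only, not the paper's implementation: the function name `events_to_slices`, the `(x, y, timestamp, polarity)` event layout, the equal-duration binning, and the sensor size used in the usage note are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple

# Assumed event layout (not taken from the paper): (x, y, timestamp, polarity).
Event = Tuple[int, int, float, int]
# A point in the spatiotemporal cloud: normalized (x, y, t), each in [0, 1].
Point = Tuple[float, float, float]


def events_to_slices(
    events: Sequence[Event], width: int, height: int, num_slices: int
) -> List[List[Point]]:
    """Normalize events to (x, y, t) points and bin them into equal-duration slices.

    Polarity is dropped here for brevity; a real pipeline might keep it
    as an extra point feature.
    """
    slices: List[List[Point]] = [[] for _ in range(num_slices)]
    if not events:
        return slices

    t_start = min(e[2] for e in events)
    t_end = max(e[2] for e in events)
    span = (t_end - t_start) or 1.0  # avoid division by zero for a single timestamp

    for x, y, t, _polarity in events:
        t_norm = (t - t_start) / span
        # The last event has t_norm == 1.0, so clamp it into the final slice.
        index = min(int(t_norm * num_slices), num_slices - 1)
        slices[index].append((x / (width - 1), y / (height - 1), t_norm))
    return slices
```

For example, `events_to_slices(events, width=346, height=260, num_slices=4)` returns four lists of points (the sensor size here is only a placeholder). Each event keeps its own timestamp as a coordinate, so no temporal resolution is lost to frame accumulation, and empty regions of the sensor cost nothing to store.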

Problem

Research questions and friction points this paper is trying to address.

Exploiting spatiotemporal properties for efficient event-driven human pose estimation

Designing modules to capture short-term dependencies and structured temporal modeling

Enhancing spatial edge information in point cloud representation for sparse events

Innovation

Methods, ideas, or system contributions that make the work stand out.

Event Temporal Slicing Convolution captures short-term dependencies

Event Slice Sequencing enables structured temporal modeling

Edge enhancement in point cloud representation improves spatial information
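
As a rough illustration of what capturing short-term dependencies across ordered event slices could look like, the sketch below summarizes each slice with a toy descriptor and then mixes neighboring slices with a small 1D kernel along the slice axis. This is a simplified stand-in under stated assumptions, not the paper's Event Temporal Slicing Convolution or Event Slice Sequencing modules: the helper names `slice_features` and `temporal_conv1d`, the centroid-and-count descriptor, and the fixed kernel are hypothetical, whereas a real model would use learned backbone features and learned weights.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # normalized (x, y, t)


def slice_features(slices: Sequence[Sequence[Point]]) -> List[List[float]]:
    """Toy per-slice descriptor: [centroid_x, centroid_y, event_count].

    Empty slices map to zeros. A learned point cloud backbone would
    replace this step in practice.
    """
    feats: List[List[float]] = []
    for points in slices:
        n = len(points)
        if n == 0:
            feats.append([0.0, 0.0, 0.0])
        else:
            cx = sum(p[0] for p in points) / n
            cy = sum(p[1] for p in points) / n
            feats.append([cx, cy, float(n)])
    return feats


def temporal_conv1d(
    feats: Sequence[Sequence[float]], kernel: Sequence[float]
) -> List[List[float]]:
    """Apply an odd-length kernel along the slice axis, channel by channel.

    Uses zero padding so the output has one row per input slice. As in
    most deep learning libraries, this is cross-correlation (no kernel flip).
    """
    if not feats:
        return []
    pad = len(kernel) // 2
    num_slices, dim = len(feats), len(feats[0])
    out: List[List[float]] = []
    for i in range(num_slices):
        row = [0.0] * dim
        for j, weight in enumerate(kernel):
            src = i + j - pad
            if 0 <= src < num_slices:  # positions outside the sequence count as zero
                for c in range(dim):
                    row[c] += weight * feats[src][c]
        out.append(row)
    return out
```

Calling `temporal_conv1d(slice_features(slices), [0.25, 0.5, 0.25])` gives each slice a descriptor that blends in its immediate temporal neighbors. Keeping the slices in chronological order is what makes the kernel meaningful, since it only ever combines adjacent time windows.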
