Predicting Road Crossing Behaviour using Pose Detection and Sequence Modelling

📅 2025-08-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the problem of long-range pedestrian crossing-intention prediction for autonomous driving. We propose an end-to-end deep learning framework that first employs a high-accuracy pose estimation model to extract pedestrian keypoint sequences, then integrates multiple temporal modeling approaches—namely GRU, LSTM, and 1D CNN—for dynamic behavior modeling. Experimental results demonstrate that the 1D CNN variant achieves 92.3% prediction accuracy while accelerating inference by 5.8× and 4.2× compared to LSTM and GRU, respectively, thereby striking an effective balance between real-time performance and robustness. The core contribution lies in empirically validating and establishing the superiority of lightweight, efficient 1D convolutional architectures for pedestrian intention prediction—offering a practical, deployable solution for embedded automotive systems.

Technology Category

Application Category

📝 Abstract
The world is constantly moving towards AI based systems and autonomous vehicles are now reality in different parts of the world. These vehicles require sensors and cameras to detect objects and maneuver according to that. It becomes important to for such vehicles to also predict from a distant if a person is about to cross a road or not. The current study focused on predicting the intent of crossing the road by pedestrians in an experimental setup. The study involved working with deep learning models to predict poses and sequence modelling for temporal predictions. The study analysed three different sequence modelling to understand the prediction behaviour and it was found out that GRU was better in predicting the intent compared to LSTM model but 1D CNN was the best model in terms of speed. The study involved video analysis, and the output of pose detection model was integrated later on to sequence modelling techniques for an end-to-end deep learning framework for predicting road crossing intents.
Problem

Research questions and friction points this paper is trying to address.

Predicting pedestrian road crossing intent using pose detection
Comparing sequence models like GRU, LSTM and 1D CNN
Developing end-to-end deep learning framework for autonomous vehicles
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pose detection for pedestrian behavior analysis
Sequence modeling with GRU and LSTM networks
Integrated end-to-end deep learning framework