ARCANE-PedSynth: Synthetic Multi-Pedestrian Datasets with Behavioural Crossing Annotations

📅 2026-05-24
🤖 AI Summary
This work addresses the scarcity of densely annotated multi-pedestrian crossing data in autonomous driving simulation: in CARLA, the native pedestrian crossing rate is only about 9%. The authors present ARCANE-PedSynth, an open-source synthetic data generation framework built on CARLA whose hybrid control architecture combines AI-driven and manual pedestrian control, allowing the target crossing rate to be configured up to 75%. A 12-state behavioural finite state machine with five character archetypes produces diverse crossing behaviours. The framework generates synchronised RGB, LiDAR, and DVS data with per-frame crossing labels, behavioural states, and estimated 2D pose keypoints, and it is reproducible through CLI parameterisation and Docker containerisation. PedSynth++, an example dataset generated with the framework, comprises 533 multi-pedestrian clips across 12 weather conditions.
📝 Abstract
We present ARCANE-PedSynth, an open-source CARLA-based software framework for generating synthetic multi-pedestrian datasets with dense behavioural annotations for pedestrian crossing prediction in autonomous driving. The framework overcomes CARLA's native 9% crossing rate through a hybrid AI-manual pedestrian control architecture, enabling configurable target rates up to 75%. A 12-state behavioural finite state machine with five character archetypes produces diverse crossing behaviours. The framework generates synchronised RGB, LiDAR, and DVS data with per-frame crossing labels, behavioural states, and estimated 2D pose keypoints. We demonstrate ARCANE-PedSynth through PedSynth++, an example dataset generated with the framework, comprising 533 multi-pedestrian clips across 12 weather conditions with RGB, LiDAR, and DVS streams. ARCANE-PedSynth is fully reproducible via CLI parameterisation and Docker containerisation.
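To make the idea of a configurable target crossing rate concrete, the following is a minimal sketch of how pedestrians in one clip might be split between manual and AI control. It is not the framework's actual implementation: the names `PedestrianPlan` and `plan_pedestrians`, the `seed` parameter, and the rule that every manually controlled pedestrian crosses while every AI-controlled one does not are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class PedestrianPlan:
    """Hypothetical per-pedestrian plan; not a type from ARCANE-PedSynth."""
    ped_id: int
    control: str      # "manual" (scripted to cross) or "ai" (simulator navigation)
    will_cross: bool


def plan_pedestrians(n_pedestrians: int, target_rate: float, seed: int = 0):
    """Assign control modes so the share of crossers approximates target_rate.

    Simplifying assumption: AI-controlled pedestrians are treated as
    non-crossers, so the realised rate is exactly n_crossers / n_pedestrians.
    """
    if not 0.0 <= target_rate <= 1.0:
        raise ValueError("target_rate must lie in [0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    n_crossers = round(n_pedestrians * target_rate)
    crosser_ids = set(rng.sample(range(n_pedestrians), n_crossers))
    return [
        PedestrianPlan(
            ped_id=i,
            control="manual" if i in crosser_ids else "ai",
            will_cross=i in crosser_ids,
        )
        for i in range(n_pedestrians)
    ]


if __name__ == "__main__":
    plans = plan_pedestrians(n_pedestrians=8, target_rate=0.75, seed=1)
    rate = sum(p.will_cross for p in plans) / len(plans)
    print(f"realised crossing rate: {rate:.2f}")
```

With 8 pedestrians and a target rate of 0.75, six pedestrians are marked as manually controlled crossers. Seeding the random generator mirrors the abstract's emphasis on reproducibility, although how the framework itself seeds or parameterises a run is not stated in this listing.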
Problem

Research questions and friction points this paper is trying to address.

pedestrian crossing prediction
synthetic dataset
autonomous driving
behavioural annotation
multi-pedestrian simulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic dataset
pedestrian crossing prediction
behavioural annotation
multi-modal sensor data
CARLA simulation