From Gaming to Research: GTA V for Synthetic Data Generation for Robotics and Navigations

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high acquisition cost and poor generalizability of real-world visual data, this paper proposes a framework for generating large-scale, unlabeled synthetic data with the *Grand Theft Auto V* (GTA V) game engine, tailored to robotic SLAM and visual place recognition (VPR). It presents the first systematic validation of GTA V as a high-fidelity, scalable synthetic data source for SLAM/VPR tasks. The method introduces an unsupervised VPR data generation algorithm that uses in-game API control, automated trajectory sampling, and multi-view rendering to construct diverse, photorealistic datasets. Evaluation with ORB-SLAM2 and NetVLAD shows that models trained solely on synthetic data achieve localization accuracy and recall rates comparable to those of models trained on real data. Furthermore, hybrid training that combines synthetic and real data yields consistent performance gains, substantially reducing dependency on costly real-world data collection and annotation.
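The recall rates mentioned in the summary are typically reported as Recall@K in VPR work. The following is a minimal sketch of that metric, not the paper's evaluation code: the function name `recall_at_k` and the input layout (one ranked list of database indices per query, one set of true-match indices per query) are assumptions made for illustration.

```python
from typing import Sequence, Set


def recall_at_k(
    ranked: Sequence[Sequence[int]],
    positives: Sequence[Set[int]],
    k: int,
) -> float:
    """Fraction of queries with at least one true match in the top-k results.

    ranked[i]    -- database indices retrieved for query i, best match first
    positives[i] -- database indices that are true matches for query i
    Queries without any true match are skipped (an assumed convention).
    """
    if len(ranked) != len(positives):
        raise ValueError("ranked and positives must have the same length")

    evaluated = 0
    hits = 0
    for retrieved, truth in zip(ranked, positives):
        if not truth:
            continue  # nothing to find for this query
        evaluated += 1
        if any(idx in truth for idx in retrieved[:k]):
            hits += 1
    return hits / evaluated if evaluated else 0.0
```

With `ranked = [[3, 1, 2], [0, 2, 1]]` and `positives = [{1}, {1}]`, Recall@2 is 0.5 because only the first query has its true match among the top two results.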

📝 Abstract
In computer vision, the development of robust algorithms capable of generalizing effectively in real-world scenarios increasingly requires large-scale datasets collected under diverse environmental conditions. However, acquiring such datasets is time-consuming, costly, and sometimes infeasible. To address these limitations, the use of synthetic data has gained attention as a viable alternative, allowing researchers to generate vast amounts of data while simulating various environmental contexts in a controlled setting. In this study, we investigate the use of synthetic data in robotics and navigation, specifically focusing on Simultaneous Localization and Mapping (SLAM) and Visual Place Recognition (VPR). In particular, we introduce a synthetic dataset created using the virtual environment of the video game Grand Theft Auto V (GTA V), along with an algorithm designed to generate a VPR dataset without human supervision. Through a series of experiments centered on SLAM and VPR, we demonstrate that synthetic data derived from GTA V are qualitatively comparable to real-world data. Furthermore, these synthetic data can complement or even substitute real-world data in these applications. This study sets the stage for the creation of large-scale synthetic datasets, offering a cost-effective and scalable solution for future research and development.
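One reason a simulator can label VPR data without human supervision is that the camera position of every rendered frame is known exactly. The sketch below illustrates that general idea only; it is not the algorithm from the paper, which this page does not describe. The function name `positives_by_distance`, the 2D position format, and the 25 m radius are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Set, Tuple

Position = Tuple[float, float]  # (x, y) in metres, assumed to come from the simulator


def positives_by_distance(
    queries: Sequence[Position],
    database: Sequence[Position],
    radius_m: float = 25.0,  # illustrative threshold, not taken from the paper
) -> List[Set[int]]:
    """For each query frame, return the indices of database frames whose
    recorded camera position lies within radius_m of the query position."""
    labels: List[Set[int]] = []
    for qx, qy in queries:
        matches = {
            i
            for i, (dx, dy) in enumerate(database)
            if math.hypot(qx - dx, qy - dy) <= radius_m
        }
        labels.append(matches)
    return labels
```

For a database at `[(0, 0), (10, 0), (100, 0)]` and a query at `(5, 0)`, the first two database frames are labeled as the same place and the third is not. A real pipeline would likely also compare camera headings, since two frames taken at the same spot but facing opposite directions share little visual content.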
Problem

Research questions and friction points this paper is trying to address.

Synthetic data generation for robotics
Using GTA V for navigation research
SLAM and VPR dataset creation
Innovation

Methods, ideas, or system contributions that make the work stand out.

GTA V for synthetic data
Unsupervised VPR dataset algorithm
Synthetic data substitutes real data
Matteo Scucchia
Department of Computer Science and Engineering, University of Bologna, Italy
Matteo Ferrara
Department of Computer Science and Engineering, University of Bologna, Italy
pattern recognition, biometric systems, image processing, computer vision, machine learning
Davide Maltoni
Department of Computer Science and Engineering, University of Bologna, Italy