Simulating Tracking Data to Advance Sports Analytics Research

📅 2025-03-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
High-resolution tracking data for continuous adversarial sports (e.g., soccer, ice hockey) is scarce and expensive to acquire, severely limiting AI-driven analytics. Method: This paper introduces a simulation-based data generation framework built upon the Google Research Football environment. It constructs the first large-scale, open-source soccer tracking dataset featuring schema compatibility with real-world data and spatiotemporal consistency, accompanied by standardized kinematic feature extraction and event detection pipelines. Contribution/Results: Empirical evaluation on ball possession identification and pass prediction demonstrates that models trained on this synthetic data achieve performance within 5% error margin of those trained on real data. The framework significantly enhances reproducibility and scalability in sports AI research. Its core innovation lies in establishing a publicly available, structurally aligned, and task-ready paradigm for simulated tracking data.

Technology Category

Application Category

📝 Abstract
Advanced analytics have transformed how sports teams operate, particularly in episodic sports like baseball. Their impact on continuous invasion sports, such as soccer and ice hockey, has been limited due to increased game complexity and restricted access to high-resolution game tracking data. In this demo, we present a method to collect and utilize simulated soccer tracking data from the Google Research Football environment to support the development of models designed for continuous tracking data. The data is stored in a schema that is representative of real tracking data and we provide processes that extract high-level features and events. We include examples of established tracking data models to showcase the efficacy of the simulated data. We address the scarcity of publicly available tracking data, providing support for research at the intersection of artificial intelligence and sports analytics.
Problem

Research questions and friction points this paper is trying to address.

Simulating soccer tracking data for AI sports analytics
Overcoming limited access to real continuous sports data
Supporting model development with synthetic tracking datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simulated soccer tracking data generation
High-level feature and event extraction
Publicly accessible tracking data support
🔎 Similar Papers
No similar papers found.