🤖 AI Summary
This paper addresses per-round outcome prediction in VALORANT esports. Unlike conventional approaches that rely on match logs and aggregated statistics, the authors model tactics from minimap video taken from match footage. The approach extracts fine-grained spatiotemporal information, including player positions, movement trajectories, and key tactical events (e.g., crosshair holds, spike site control, smoke grenade deployments), and uses it to build temporally structured tactical representations. Methodologically, the model is based on TimeSformer, which captures the spatiotemporal dynamics of minimap frame sequences, and is trained on a dataset augmented with tactical event labels. Preliminary results show approximately 81% round-win prediction accuracy with this label-augmented dataset, particularly from the middle phase of a round onward, significantly outperforming a model trained on the minimap information alone. These results suggest that tactical features drawn from match footage are highly effective for predicting round outcomes in first-person shooter (FPS) esports such as VALORANT.
📝 Abstract
Research on predicting match outcomes in esports has been active in recent years, but much of it relies on match log data and statistical information. This study targets VALORANT, an FPS game that demands complex strategies, and aims to build a round outcome prediction model by analyzing the minimap information in match footage. Specifically, building on the video recognition model TimeSformer, we attempt to improve prediction accuracy by incorporating detailed tactical features extracted from the minimap, such as character positions and other in-game events. This paper reports preliminary results: a model trained on a dataset augmented with such tactical event labels achieved approximately 81% prediction accuracy, especially from the middle phases of a round onward, significantly outperforming a model trained on a dataset containing only the minimap information itself. This suggests that leveraging tactical features from match footage is highly effective for predicting round outcomes in VALORANT.
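The observation that accuracy is highest from the middle phases of a round onward implies an evaluation broken down by how far into the round each prediction is made. The following is a minimal sketch of such a breakdown, not the authors' evaluation code. The three equal-width phases, the `(elapsed_fraction, predicted_winner, actual_winner)` record layout, and the names `PHASES`, `phase_of`, and `accuracy_by_phase` are all assumptions introduced for illustration, and the toy records are made up rather than taken from the paper.

```python
from collections import defaultdict

# Hypothetical phase boundaries, expressed as fractions of the round length.
# The abstract does not say how phases are delimited; thirds are an assumption.
PHASES = (("early", 0.0, 1 / 3), ("middle", 1 / 3, 2 / 3), ("late", 2 / 3, 1.0))


def phase_of(elapsed_fraction):
    """Map a position in the round (0.0 = start, 1.0 = end) to a phase name."""
    if not 0.0 <= elapsed_fraction <= 1.0:
        raise ValueError("elapsed_fraction must be in [0, 1]")
    for name, lo, hi in PHASES:
        if lo <= elapsed_fraction < hi:
            return name
    return PHASES[-1][0]  # elapsed_fraction == 1.0 falls in the last phase


def accuracy_by_phase(records):
    """Compute per-phase accuracy.

    records: iterable of (elapsed_fraction, predicted_winner, actual_winner).
    Returns a dict mapping each observed phase name to its accuracy.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for elapsed_fraction, predicted, actual in records:
        phase = phase_of(elapsed_fraction)
        total[phase] += 1
        correct[phase] += int(predicted == actual)
    return {phase: correct[phase] / total[phase] for phase in total}


# Toy records, invented purely for illustration (not results from the paper).
toy = [
    (0.10, "attackers", "defenders"),
    (0.20, "attackers", "attackers"),
    (0.50, "defenders", "defenders"),
    (0.60, "attackers", "attackers"),
    (0.90, "defenders", "defenders"),
]
print(accuracy_by_phase(toy))
```

Reporting accuracy per phase rather than as a single number makes it possible to check whether a gain from tactical event labels is concentrated later in the round, as the abstract indicates.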