SCOPE: A Lightweight-training LLM Framework for Air Traffic Control Readback Monitoring

📅 2026-05-28
🤖 AI Summary
This work addresses anomaly detection in air traffic control readback speech, where highly variable phraseology hinders generalization. The authors propose SCOPE, a framework that couples a lightweight plug-in open-set classifier with a few-shot in-context learning mechanism on top of a frozen large language model. The approach targets strong generalization, low latency, and interpretability without fine-tuning the large model. Evaluated on a semi-synthetic dataset under a few-shot setting, SCOPE attains an open-set detection accuracy of 91.05% and an anomalous readback correction rate of 96.63%, outperforming the strongest available baselines while also providing explanations for its decisions.
📝 Abstract
Pilot readback of Air Traffic Control (ATC) voice instructions is a primary safeguard against miscommunication in air transportation. However, readback anomalies remain implicated in approximately 80% of aviation incidents. This vulnerability is further exacerbated by rising traffic volume and elevated cognitive workload, motivating automated readback monitoring. Traditional rule-based and machine learning approaches struggle to generalize across the highly variable and evolving phraseology of controller-pilot communications. While Large Language Models (LLMs) have opened a new avenue through their strong reasoning and generalization capabilities, existing approaches still face deployment and computational barriers in practice. In this work, we propose Semantic reasoning for Communication via Open-set Plug-in with Examples (SCOPE), a novel lightweight-training LLM framework that advances both the efficiency and accuracy of machine-based ATC readback monitoring. The core idea is to couple a plug-in open-set classifier with a carefully designed in-context learning mechanism on top of a frozen LLM. Extensive experiments on a semi-synthetic communication dataset show that SCOPE attains superior accuracy while delivering the low-latency response required for operational environments. Under a few-shot setting, SCOPE achieves 91.05% accuracy in open-set detection and corrects 96.63% of anomalous readbacks, outperforming the strongest available baselines while providing explanations for its decisions. These findings demonstrate the potential of our framework as a practical pathway toward interpretable and controllable ATC readback monitoring.
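The abstract does not specify how the open-set classifier or the in-context prompt is built, so the following is only a minimal sketch of the two general ideas it names, not the authors' method. It assumes a nearest-centroid classifier with a distance threshold as a stand-in for open-set detection, toy 2-D feature vectors in place of real utterance embeddings, and a hypothetical prompt layout for a frozen LLM. All function names, labels, and the threshold value are illustrative assumptions.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def fit_centroids(labeled: Dict[str, List[Vector]]) -> Dict[str, Vector]:
    """Average the feature vectors of each known class (the only 'training')."""
    return {
        label: [sum(col) / len(vectors) for col in zip(*vectors)]
        for label, vectors in labeled.items()
    }


def open_set_predict(x: Vector, centroids: Dict[str, Vector], threshold: float) -> str:
    """Return the nearest known class, or 'unknown' if x is far from all of them."""
    label, dist = min(
        ((name, math.dist(x, c)) for name, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else "unknown"


def build_prompt(
    examples: List[Tuple[str, str, str]], instruction: str, readback: str
) -> str:
    """Assemble a few-shot prompt; the LLM itself stays frozen (no weight updates).

    Each example is (instruction, readback, verdict). The layout is hypothetical.
    """
    parts = ["Decide whether the pilot readback matches the ATC instruction."]
    for ex_instruction, ex_readback, ex_verdict in examples:
        parts.append(
            f"Instruction: {ex_instruction}\nReadback: {ex_readback}\nVerdict: {ex_verdict}"
        )
    parts.append(f"Instruction: {instruction}\nReadback: {readback}\nVerdict:")
    return "\n\n".join(parts)


# Toy data: two known classes described by made-up 2-D features.
centroids = fit_centroids(
    {
        "correct": [[0.0, 0.0], [0.2, 0.0]],
        "anomalous": [[1.0, 1.0], [1.0, 0.8]],
    }
)

prompt = build_prompt(
    examples=[
        ("climb flight level 320", "climb flight level 320", "correct"),
        ("descend flight level 180", "descend flight level 160", "anomalous"),
    ],
    instruction="turn left heading 270",
    readback="turn left heading 210",
)
```

In this sketch the threshold decides when an input is treated as outside the known classes, and the prompt ends with an open `Verdict:` slot for the frozen model to complete. How SCOPE actually routes between its classifier and the LLM is not stated in the abstract.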
Problem

Research questions and friction points this paper is trying to address.

readback monitoring
air traffic control
Large Language Models
anomaly detection
aviation safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

lightweight-training LLM
open-set classification
in-context learning
readback monitoring
air traffic control
Qihan Deng
Department of Mechanical and Aerospace Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, 999077, Hong Kong
Minghua Zhang
Professor of Atmospheric Sciences, Stony Brook University
climate modeling, climate change, atmospheric sciences
Yang Yang
School of Electronic and Information Engineering, Beihang University, Beijing, 100191, China; State Key Laboratory of CNS/ATM, Beijing, 100191, China
Zhenyu Gao
Department of Mechanical and Aerospace Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, 999077, Hong Kong