SCOPE: A Lightweight-training LLM Framework for Air Traffic Control Readback Monitoring

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses anomaly detection in air traffic control (ATC) pilot readbacks, where highly variable phraseology hinders generalization. The authors propose SCOPE, a framework that couples a lightweight plug-in open-set classifier with few-shot in-context learning on a frozen large language model (LLM). Because the LLM itself is never fine-tuned, the approach keeps training cost low while aiming for strong generalization, low latency, and interpretable decisions. Evaluated on a semi-synthetic dataset under a few-shot setting, SCOPE attains an open-set detection accuracy of 91.05% and corrects 96.63% of anomalous readbacks, outperforming the strongest available baselines while also providing explanations for its decisions.

📝 Abstract

Pilot readback of Air Traffic Control (ATC) voice instructions is a primary safeguard against miscommunication in air transportation. However, readback anomalies remain implicated in approximately 80% of aviation incidents. Rising traffic volume and elevated cognitive workload exacerbate this vulnerability, motivating automated readback monitoring. Traditional rule-based and machine learning approaches struggle to generalize across the highly variable and evolving phraseology of controller-pilot communications. While Large Language Models (LLMs) have opened a new avenue through their strong reasoning and generalization capabilities, existing LLM-based approaches still face deployment and computational barriers in practice. In this work, we propose Semantic reasoning for Communication via Open-set Plug-in with Examples (SCOPE), a novel lightweight-training LLM framework that advances both the efficiency and accuracy of automated ATC readback monitoring. The core idea is to couple a plug-in open-set classifier with a carefully designed in-context learning mechanism on top of a frozen LLM. Extensive experiments on a semi-synthetic communication dataset show that SCOPE attains superior accuracy while delivering the low-latency response required for operational environments. Under a few-shot setting, SCOPE achieves 91.05% accuracy in open-set detection and corrects 96.63% of anomalous readbacks, outperforming the strongest available baselines while providing explanations for its decisions. These findings demonstrate the potential of our framework as a practical pathway toward interpretable and controllable ATC readback monitoring.
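
The abstract names two components, an open-set classifier and few-shot in-context learning on a frozen LLM, without giving their details. The following is a minimal Python sketch of the general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the max-softmax rejection rule, the 0.7 threshold, the label names, the example phrases, the prompt wording, and the routing in `monitor` (skip the LLM when the classifier is confident the readback is correct). No LLM is called; the sketch stops at the prompt that would be sent to a frozen model.

```python
import math
from typing import Dict, List, Optional, Tuple

UNKNOWN = "unknown"  # hypothetical label for inputs outside the known classes

# (instruction, readback, verdict) triples; made-up illustrative examples.
Example = Tuple[str, str, str]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(logits: Dict[str, float], threshold: float = 0.7) -> str:
    """Return the top known label, or UNKNOWN if its probability is too low.

    Max-softmax thresholding is one common open-set baseline; it is an
    assumption here, not necessarily the rule SCOPE uses.
    """
    labels = list(logits)
    probs = softmax([logits[label] for label in labels])
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best] if probs[best] >= threshold else UNKNOWN


def build_few_shot_prompt(
    examples: List[Example], instruction: str, readback: str
) -> str:
    """Assemble a few-shot prompt for a frozen LLM (prompt wording is assumed)."""
    lines = [
        "Decide whether the pilot readback matches the ATC instruction.",
        "Answer with a verdict and a one-sentence explanation.",
        "",
    ]
    for ex_instruction, ex_readback, ex_verdict in examples:
        lines += [
            f"Instruction: {ex_instruction}",
            f"Readback: {ex_readback}",
            f"Verdict: {ex_verdict}",
            "",
        ]
    lines += [f"Instruction: {instruction}", f"Readback: {readback}", "Verdict:"]
    return "\n".join(lines)


def monitor(
    logits: Dict[str, float],
    examples: List[Example],
    instruction: str,
    readback: str,
    threshold: float = 0.7,
) -> Tuple[str, Optional[str]]:
    """Assumed routing: only non-'correct' or unknown cases get an LLM prompt."""
    label = open_set_predict(logits, threshold)
    if label == "correct":
        return label, None
    return label, build_few_shot_prompt(examples, instruction, readback)


EXAMPLES: List[Example] = [
    ("climb and maintain flight level 350",
     "climb and maintain flight level 350",
     "correct"),
    ("climb and maintain flight level 350",
     "climb and maintain flight level 330",
     "anomalous: flight level read back as 330 instead of 350"),
]
```

In this sketch, a call such as `monitor({"correct": 1.0, "altitude_mismatch": 0.9}, EXAMPLES, "descend to 5000 feet", "descend to 3000 feet")` falls below the confidence threshold, so the classifier abstains and the case is handed to the LLM as a few-shot prompt. Keeping the classifier as the only trained part is what would make such a design lightweight to train, since the LLM's weights stay untouched.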

Problem

Research questions and friction points this paper is trying to address.

readback monitoring

air traffic control

large language models

anomaly detection

aviation safety

Innovation

Methods, ideas, or system contributions that make the work stand out.

lightweight-training LLM

open-set classification

in-context learning