🤖 AI Summary
Manual evaluation of 911 calls suffers from low efficiency, insufficient coverage, and delayed feedback. To address this, the paper proposes LogiDebrief, the first fully automated emergency call debriefing framework that integrates Signal Temporal Logic (STL) with large language models (LLMs). Methodologically, it brings STL-based formal specifications to emergency communication quality assessment, making the quality assurance process interpretable and verifiable. It also designs an STL-guided LLM runtime verification mechanism that combines logical rigor with semantic understanding. The system chains context-aware parsing, rule-driven validation, and automated report generation. Deployed at the Metro Nashville Department of Emergency Communications, it has assisted in debriefing 1,701 real-world calls and saved 311.85 hours of manual effort, improving the coverage and timeliness of assessments and the consistency and accuracy of dispatcher feedback.
📝 Abstract
Emergency response services are critical to public safety, with 9-1-1 call-takers playing a key role in ensuring timely and effective emergency operations. To keep call-taking performance consistent, quality assurance is implemented to evaluate and refine call-takers' skillsets. However, traditional human-led evaluations struggle with high call volumes, leading to low coverage and delayed assessments. We introduce LogiDebrief, an AI-driven framework that automates traditional 9-1-1 call debriefing by integrating Signal Temporal Logic (STL) with Large Language Models (LLMs) for rigorous, full-coverage performance evaluation. LogiDebrief formalizes call-taking requirements as logical specifications, enabling systematic assessment of 9-1-1 calls against procedural guidelines. It employs a three-step verification process: (1) contextual understanding to identify responder types, incident classifications, and critical conditions; (2) STL-based runtime checking with LLM integration to ensure compliance; and (3) automated aggregation of results into quality assurance reports. Beyond its technical contributions, LogiDebrief has demonstrated real-world impact. Deployed at the Metro Nashville Department of Emergency Communications, it has assisted in debriefing 1,701 real-world calls, saving 311.85 hours of active engagement. Empirical evaluation with real-world data confirms its accuracy, while a case study and an extensive user study highlight its effectiveness in enhancing call-taking performance.
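To make the three-step process concrete, the following is a minimal Python sketch, not LogiDebrief's actual implementation. Everything in it is an assumption made for illustration: the `Utterance` record, the two rules and their 30-second and 60-second windows, and the keyword matching that stands in for both the LLM-based contextual understanding and the LLM-evaluated predicates. The STL "eventually" operator is given Boolean semantics over timestamped utterances rather than a quantitative robustness score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Utterance:
    t: float       # seconds since the call was answered
    speaker: str   # "call_taker" or "caller"
    text: str


Predicate = Callable[[Utterance], bool]


def eventually(trace: List[Utterance], pred: Predicate, lo: float, hi: float) -> bool:
    """Boolean STL F_[lo,hi] pred: some utterance with lo <= t <= hi satisfies pred."""
    return any(lo <= u.t <= hi and pred(u) for u in trace)


def said_by_call_taker(keyword: str) -> Predicate:
    # Hypothetical stand-in for an LLM judgment about what the call-taker asked.
    return lambda u: u.speaker == "call_taker" and keyword in u.text.lower()


def classify_incident(trace: List[Utterance]) -> str:
    # Step 1 (assumed): contextual understanding, reduced here to a keyword test.
    caller_text = " ".join(u.text.lower() for u in trace if u.speaker == "caller")
    return "medical" if "breathing" in caller_text else "general"


# Hypothetical rules: name -> (incident types the rule applies to, STL check).
RULES = {
    "address_requested_within_30s": (
        {"medical", "general"},
        lambda tr: eventually(tr, said_by_call_taker("address"), 0, 30),
    ),
    "breathing_checked_within_60s": (
        {"medical"},
        lambda tr: eventually(tr, said_by_call_taker("breathing"), 0, 60),
    ),
}


def debrief(trace: List[Utterance]) -> Dict[str, bool]:
    # Step 2: check only the rules relevant to the classified incident.
    # Step 3: aggregate the outcomes into a simple pass/fail report.
    incident = classify_incident(trace)
    return {
        name: check(trace)
        for name, (types, check) in RULES.items()
        if incident in types
    }


call = [
    Utterance(2.0, "call_taker", "911, what is the address of your emergency?"),
    Utterance(6.0, "caller", "12 Main Street, my father is not breathing"),
    Utterance(75.0, "call_taker", "Is he breathing now?"),
]
report = debrief(call)
print(report)
```

In this toy call, the address question arrives at 2 seconds and satisfies its rule, while the breathing check arrives at 75 seconds, outside the assumed 60-second window, so that rule is flagged. A real deployment would replace the keyword predicates with LLM calls and take the rules from the agency's procedural guidelines.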