🤖 AI Summary
Clinical EEG interpretation is time-consuming and suffers from poor inter-rater reliability, and existing automated methods are largely limited to single disorders (e.g., epilepsy), which hinders joint multi-disorder screening. To address this, we establish the first real-world clinical EEG benchmark covering 11 neurological disorder categories, spanning acute, chronic, and electrophysiologically subtle conditions. To mitigate severe class imbalance, we propose a sensitivity-prioritizing threshold calibration strategy and design disorder-aware machine learning models that integrate multi-domain features (time-domain, spectral, complexity, and inter-channel correlation measures) from bipolar-montage EEG recordings. Evaluated on a large, heterogeneous clinical dataset, our method achieves >80% recall for most disorder categories and yields absolute recall gains of 15–30 percentage points for several low-prevalence disorders. Feature importance analysis aligns with established clinical neurophysiological markers, supporting physiological plausibility and clinical interpretability.
📝 Abstract
Clinical electroencephalography is routinely used to evaluate patients with diverse and often overlapping neurological conditions, yet interpretation remains manual, time-intensive, and variable across experts. While automated EEG analysis has been widely studied, most existing methods target isolated diagnostic problems, particularly seizure detection, and provide limited support for multi-disorder clinical screening.
This study examines automated EEG-based classification across eleven clinically relevant neurological disorder categories, encompassing acute time-critical conditions, chronic neurocognitive and developmental disorders, and disorders with indirect or weak electrophysiological signatures. EEG recordings are processed using a standard longitudinal bipolar montage and represented through a multi-domain feature set capturing temporal statistics, spectral structure, signal complexity, and inter-channel relationships. Disorder-aware machine learning models are trained under severe class imbalance, with decision thresholds explicitly calibrated to prioritize diagnostic sensitivity.
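As an illustration of the multi-domain representation described above, the following minimal sketch derives bipolar channels from a referential recording and computes one feature of each kind: variance (temporal), band power (spectral), Hjorth parameters (complexity), and Pearson correlation (inter-channel). The channel pairs, band edges, specific features, and all function names are assumptions chosen for illustration; the abstract does not specify the study's actual feature definitions.

```python
import math

# Hypothetical subset of a longitudinal bipolar chain; the study's exact
# channel list and frequency bands are not given in the abstract.
BIPOLAR_PAIRS = [("Fp1", "F3"), ("F3", "C3"), ("C3", "P3"), ("P3", "O1")]
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}


def bipolar_montage(recording, pairs=BIPOLAR_PAIRS):
    """Derive bipolar channels as sample-wise differences of electrode pairs."""
    return {
        f"{a}-{b}": [x - y for x, y in zip(recording[a], recording[b])]
        for a, b in pairs
    }


def variance(x):
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def hjorth(x):
    """Hjorth mobility and complexity (assumes a non-constant signal)."""
    d1 = [b - a for a, b in zip(x, x[1:])]
    d2 = [b - a for a, b in zip(d1, d1[1:])]
    mobility = math.sqrt(variance(d1) / variance(x))
    complexity = math.sqrt(variance(d2) / variance(d1)) / mobility
    return mobility, complexity


def band_power(x, fs, lo, hi):
    """Sum of squared DFT magnitudes (scaled by 1/n^2) over bins in [lo, hi) Hz.

    Uses a naive DFT, which is adequate only for short windows.
    """
    n = len(x)
    total = 0.0
    for k in range(1, n // 2 + 1):
        if lo <= k * fs / n < hi:
            re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(x))
            im = sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(x))
            total += (re * re + im * im) / (n * n)
    return total


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant signals."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sx * sy)


def extract_features(recording, fs, pairs=BIPOLAR_PAIRS):
    """Build a flat feature dictionary for one EEG window."""
    chans = bipolar_montage(recording, pairs)
    feats = {}
    for name, x in chans.items():
        feats[f"{name}_var"] = variance(x)
        feats[f"{name}_mobility"], feats[f"{name}_complexity"] = hjorth(x)
        for band, (lo, hi) in BANDS.items():
            feats[f"{name}_{band}"] = band_power(x, fs, lo, hi)
    names = list(chans)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            feats[f"corr_{a}_{b}"] = pearson(chans[a], chans[b])
    return feats
```

In practice a production pipeline would more likely use an FFT-based spectral estimate and a dedicated EEG library for montage handling; the pure-Python version here only keeps the sketch self-contained.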
Evaluation on a large, heterogeneous clinical EEG dataset demonstrates that sensitivity-oriented modeling achieves recall exceeding 80% for the majority of disorder categories, with several low-prevalence conditions showing absolute recall gains of 15 to 30 percentage points after threshold calibration, relative to default operating points. Feature importance analysis reveals physiologically plausible patterns consistent with established clinical EEG markers.
These results establish realistic performance baselines for multi-disorder EEG classification and provide quantitative evidence that sensitivity-prioritized automated analysis can support scalable EEG screening and triage in real-world clinical settings.
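The sensitivity-oriented threshold calibration described in the abstract can be illustrated with a minimal sketch. It assumes a per-disorder binary classifier that outputs a score in [0, 1] and a held-out validation set, and it picks the highest threshold that still reaches a target recall. The function names, the target-recall rule, and the toy data are assumptions for illustration, not the authors' actual procedure.

```python
import math


def recall_at(y_true, scores, threshold):
    """Recall when every sample with score >= threshold is predicted positive."""
    positives = sum(y_true)
    if positives == 0:
        return 0.0
    hits = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    return hits / positives


def calibrate_threshold(y_true, scores, target_recall=0.8):
    """Return the highest threshold whose validation recall meets target_recall.

    Choosing the highest such threshold limits the number of extra false
    positives introduced by lowering the operating point.
    """
    pos_scores = sorted((s for y, s in zip(y_true, scores) if y == 1), reverse=True)
    if not pos_scores:
        raise ValueError("calibration needs at least one positive example")
    # Number of positives that must score at or above the threshold.
    needed = max(1, math.ceil(target_recall * len(pos_scores)))
    return pos_scores[needed - 1]


# Toy validation set for one hypothetical low-prevalence disorder.
y_val = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
s_val = [0.9, 0.6, 0.4, 0.2, 0.7, 0.3, 0.1, 0.1, 0.05, 0.3]

default_recall = recall_at(y_val, s_val, 0.5)
threshold = calibrate_threshold(y_val, s_val, target_recall=0.75)
calibrated_recall = recall_at(y_val, s_val, threshold)
```

Under severe class imbalance, scores for rare positives tend to fall below the default 0.5 cut-off, so lowering the threshold per disorder trades some specificity for recall. The calibrated threshold should be chosen on validation data and then applied unchanged to the test set.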