🤖 AI Summary
To address the complex interactions between haze and dust pollution events and the poor recognition of rare classes, this paper proposes a joint classification framework based on the Factorial Hidden Markov Model (FHMM). The method assumes statistically independent hidden state chains and applies the Walsh-Hadamard transform to reduce computational and memory costs; uses a Gaussian copula to separate marginal distributions from the dependence structure among meteorological and pollution indicators; weights observed variables by mutual information; and balances emission and transition contributions in Viterbi decoding with a single global weight hyperparameter. In the reported experiment, the model achieves a Micro-F1 of 0.9459, with the Dust F1 rising from 0.19 to 0.75 and the Haze F1 from 0.32 to 0.68 relative to a baseline FHMM. The framework supports fine-grained identification of pollution events for targeted environmental governance.
📝 Abstract
Haze and dust pollution events have significant adverse impacts on human health and ecosystems. Their formation-impact interactions are complex, creating substantial modeling and computational challenges for joint classification. To address the state-space explosion faced by conventional Hidden Markov Models in multivariate dynamic settings, this study develops a classification framework based on the Factorial Hidden Markov Model (FHMM). The framework assumes statistical independence across multiple latent chains and applies the Walsh-Hadamard transform to reduce computational and memory costs. A Gaussian copula decouples the marginal distributions from the dependence structure to capture nonlinear correlations among meteorological and pollution indicators. Algorithmically, mutual information weights the observational variables to increase the sensitivity of Viterbi decoding to salient features, and a single global weight hyperparameter balances emission and transition contributions in the decoding objective. In an empirical application, the model attains a Micro-F1 of 0.9459; for the low-frequency classes Dust (prevalence below 1%) and Haze (prevalence below 10%), the F1-scores improve from 0.19 and 0.32 under a baseline FHMM to 0.75 and 0.68, respectively. The framework provides a scalable pathway for statistical modeling of complex air-pollution events and supplies quantitative evidence for decision-making in outdoor activity management and fine-grained environmental governance.
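The decoding step described above can be illustrated with a minimal sketch. The abstract does not state the exact objective, so the form used here is an assumption: per-feature emission log-likelihoods are combined with feature weights (standing in for mutual-information weights), and a hypothetical global weight `lam` trades off emission against transition terms. The function name, argument names, and the two-chain toy setup are all illustrative and not taken from the paper. The only model fact relied on is that, for independent chains, the joint transition matrix is the Kronecker product of the per-chain matrices.

```python
import numpy as np

def weighted_viterbi(log_emis, log_trans, log_init, feat_weights, lam=0.5):
    """Most likely joint-state path under an assumed weighted objective.

    log_emis:     (T, K, D) per-feature emission log-likelihoods (assumed given)
    log_trans:    (K, K) log transition matrix, rows index the previous state
    log_init:     (K,) log initial state distribution
    feat_weights: (D,) nonnegative feature weights (e.g. normalized mutual information)
    lam:          hypothetical global weight in [0, 1], emission vs. transition
    """
    T, K, _ = log_emis.shape
    # Weighted sum over features: one emission score per (time, state).
    emis = lam * (log_emis @ feat_weights)      # shape (T, K)
    trans = (1.0 - lam) * log_trans             # shape (K, K)

    score = (1.0 - lam) * log_init + emis[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans           # cand[i, j]: arrive in j from i
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + emis[t]

    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = score.argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Toy setup: two independent binary chains (haze, dust), 4 joint states.
A_haze = np.array([[0.9, 0.1], [0.2, 0.8]])
A_dust = np.array([[0.99, 0.01], [0.3, 0.7]])
log_trans = np.log(np.kron(A_haze, A_dust))     # joint index = 2 * haze + dust
log_init = np.log(np.full(4, 0.25))

rng = np.random.default_rng(0)
log_emis = np.log(rng.uniform(0.1, 1.0, size=(6, 4, 3)))  # synthetic, not real data
weights = np.array([0.5, 0.3, 0.2])

path = weighted_viterbi(log_emis, log_trans, log_init, weights, lam=0.6)
haze_path, dust_path = path // 2, path % 2      # recover per-chain labels
```

This sketch enumerates the joint state space explicitly, which is only practical for a few small chains; it does not include the Walsh-Hadamard acceleration or the copula emission model mentioned in the abstract.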