Predicting COVID-19 Prevalence Using Wastewater RNA Surveillance: A Semi-Supervised Learning Approach with Temporal Feature Trust

📅 2025-11-27

🤖 AI Summary

Problem: Following COVID-19's transition to endemicity, wastewater RNA surveillance data vary in reliability over time because of changes in sampling protocols, assay sensitivity, and transmission dynamics. Method: The authors propose a semi-supervised deep neural network that models the temporal reliability (trust) of its features. The network takes wastewater viral RNA concentrations as its primary input, augmented with confounding factors, and learns a nonlinear mapping to the daily case count. A dynamic feature reliability weighting mechanism quantifies data quality across epidemic phases, and semi-supervised learning compensates for the scarcity of high-confidence labeled data. Results: The reported experiments show improved generalization under fluctuating data quality, with a daily-prediction MAE below 8.2 during high-reliability periods. The authors present the approach as an interpretable and robust way to monitor epidemiological trends non-invasively.

📝 Abstract

As COVID-19 transitions into an endemic disease that remains constantly present in the population at a stable level, monitoring its prevalence without invasive measures becomes increasingly important. In this paper, we present a deep neural network estimator for the COVID-19 daily case count based on wastewater surveillance data and other confounding factors. This work builds upon the study by Jiang, Kolozsvary, and Li (2024), which connects the COVID-19 case counts with testing data collected early in the pandemic. Using the COVID-19 testing data and the wastewater surveillance data from the period when both data sources were highly reliable, one can train an artificial neural network that learns the nonlinear relation between the COVID-19 daily case count and the wastewater viral RNA concentration. From a machine learning perspective, the main challenge lies in addressing temporal feature reliability, since the reliability of the training data varies across time periods.
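
One common way to account for training data whose reliability changes over time is to give each day a trust weight and let it scale that day's contribution to the error. The snippet below is a minimal sketch of that idea, not the authors' implementation: the function name `trust_weighted_mae`, the toy case counts, and the trust values are all hypothetical, and the paper may define and learn its reliability weights differently.

```python
import numpy as np

def trust_weighted_mae(y_true, y_pred, trust):
    """Mean absolute error in which each day counts in proportion to its trust weight."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    trust = np.asarray(trust, dtype=float)
    if np.any(trust < 0) or trust.sum() == 0:
        raise ValueError("trust weights must be non-negative and not all zero")
    return float(np.sum(trust * np.abs(y_true - y_pred)) / np.sum(trust))

# Hypothetical data: four days of reported vs. predicted daily case counts.
reported = [100.0, 120.0, 90.0, 300.0]
predicted = [110.0, 115.0, 95.0, 200.0]

# Hypothetical trust: the last day falls in a period where reported cases are unreliable.
trust = [1.0, 1.0, 1.0, 0.0]

weighted = trust_weighted_mae(reported, predicted, trust)
unweighted = trust_weighted_mae(reported, predicted, [1.0, 1.0, 1.0, 1.0])
print(weighted, unweighted)
```

With these toy numbers the untrusted day dominates the unweighted error, while the weighted error reflects only the three trusted days.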

Problem

Research questions and friction points this paper is trying to address.

Estimates COVID-19 daily cases from wastewater RNA data

Addresses temporal reliability of training data features

Uses deep learning to model the nonlinear relationship between viral RNA concentration and case count

Innovation

Methods, ideas, or system contributions that make the work stand out.

Semi-supervised learning with temporal feature trust

Deep neural network for wastewater RNA surveillance

Nonlinear relation modeling between case count and RNA concentration
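
To illustrate why temporal trust matters when fitting the relation between RNA concentration and case count, the sketch below fits a power law (a straight line in log-log space) by weighted least squares. This is a deliberately simple stand-in for the paper's deep neural network, and everything in it is an assumption made for illustration: the synthetic data, the under-reporting in the later period, the trust value of 0.05, and the helper `fit_weighted_line`.

```python
import numpy as np

def fit_weighted_line(x, y, w):
    """Weighted least-squares fit of y ~ a*x + b; returns (a, b)."""
    X = np.column_stack([x, np.ones_like(x)])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) minimizes sum(w * residual**2)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return float(coef[0]), float(coef[1])

# Hypothetical synthetic data: log RNA concentration rises over 100 days.
true_a, true_b = 1.2, 0.5
log_rna = np.linspace(2.0, 6.0, 100)
log_cases = true_a * log_rna + true_b

# Assumed scenario: in the later half, reported cases are systematically too low.
late = np.arange(100) >= 50
log_cases_reported = log_cases.copy()
log_cases_reported[late] -= 1.0

# Hypothetical trust weights: full trust early, low trust late.
trust = np.where(late, 0.05, 1.0)

a_unweighted, _ = fit_weighted_line(log_rna, log_cases_reported, np.ones(100))
a_weighted, _ = fit_weighted_line(log_rna, log_cases_reported, trust)
print(a_unweighted, a_weighted)
```

In this toy setup the down-weighted fit lands closer to the true exponent than the unweighted one, because the biased later labels pull less on the estimate.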
