Rapid Machine Learning-Driven Detection of Pesticides and Dyes Using Raman Spectroscopy

📅 2025-11-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Raman spectroscopy for detecting pesticide and synthetic dye residues suffers from fluorescence interference, high noise, and band overlap, which lower identification accuracy. To address these challenges, the paper proposes MLRaman, a framework that uses ResNet-18 for deep spectral feature extraction and passes the features to XGBoost, SVM, or a hybrid of the two for classification. Class separability of the resulting embeddings is examined with PCA, t-SNE, and UMAP, and a Streamlit application provides real-time prediction. On 10 analytes (7 pesticides and 3 dyes), the CNN-XGBoost model reaches 97.4% classification accuracy and an AUC of 1.0. The application also correctly identified unseen spectra from the authors' independent experiments and from literature sources, which the authors take as evidence of generalization and of deployment feasibility for food safety and environmental monitoring.

📝 Abstract
The extensive use of pesticides and synthetic dyes poses critical threats to food safety, human health, and environmental sustainability, necessitating rapid and reliable detection methods. Raman spectroscopy offers molecularly specific fingerprints but suffers from spectral noise, fluorescence background, and band overlap, limiting its real-world applicability. Here, we propose MLRaman, a deep learning framework that combines ResNet-18 feature extraction with advanced classifiers (XGBoost, SVM, and their hybrid integration) to detect pesticides and dyes from Raman spectra. With the CNN-XGBoost model, MLRaman achieved a predictive accuracy of 97.4% and a perfect AUC of 1.0; with the CNN-SVM model, it provided competitive results with robust class-wise discrimination. Dimensionality reduction analyses (PCA, t-SNE, UMAP) confirmed the separability of Raman embeddings across 10 analytes, including 7 pesticides and 3 dyes. Finally, we developed a user-friendly Streamlit application for real-time prediction, which successfully identified unseen Raman spectra from our independent experiments and from literature sources, underscoring strong generalization capacity. This study establishes a scalable, practical MLRaman model for multi-residue contaminant monitoring, with significant potential for deployment in food safety and environmental surveillance.
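
The separability check mentioned in the abstract can be illustrated with a small sketch. The example below is not the authors' code: it applies PCA (one of the three methods named) to synthetic stand-in embeddings, and the class count (3), feature dimension (64), and the centroid-distance heuristic are all assumptions chosen only to keep the example short.

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project row-wise feature vectors onto their top principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]

# Synthetic stand-in for CNN embeddings (assumption: 3 classes, 64-dim features).
rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 3, 40, 64
centers = rng.normal(scale=5.0, size=(n_classes, dim))
embeddings = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])
labels = np.repeat(np.arange(n_classes), n_per_class)

projected, explained = pca_project(embeddings)

# Crude separability heuristic (assumption, not from the paper):
# smallest distance between class centroids vs. average spread within a class.
centroids = np.array([projected[labels == k].mean(axis=0) for k in range(n_classes)])
within = float(np.mean([projected[labels == k].std(axis=0).mean() for k in range(n_classes)]))
between = min(
    float(np.linalg.norm(centroids[i] - centroids[j]))
    for i in range(n_classes)
    for j in range(i + 1, n_classes)
)
print(f"explained variance ratio: {explained}")
print(f"between-class distance: {between:.2f}, within-class spread: {within:.2f}")
```

When the between-class distance is much larger than the within-class spread, the classes form visibly distinct clusters in the 2D projection, which is the kind of evidence a PCA scatter plot provides.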
Problem

Research questions and friction points this paper is trying to address.

Detecting pesticides and dyes rapidly using Raman spectroscopy
Overcoming spectral noise and fluorescence in Raman data
Developing a deep learning model for accurate contaminant identification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning framework with ResNet-18 feature extraction
Hybrid CNN-XGBoost model achieving 97.4% accuracy
User-friendly Streamlit application for real-time prediction
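
To make the headline metrics concrete, the sketch below computes multi-class accuracy and a macro-averaged one-vs-rest AUC on a toy example. It is an illustration only: the source does not state how its AUC was averaged, the macro one-vs-rest definition is an assumption, and the labels and probabilities are invented. The toy data shows that an AUC of 1.0 can coexist with an accuracy below 100%, because AUC depends on how scores rank samples rather than on the final argmax decision.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def binary_auc(is_positive, scores):
    """AUC as the probability that a random positive outscores a random negative (ties count half)."""
    is_positive = np.asarray(is_positive, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[is_positive], scores[~is_positive]
    diff = pos[:, None] - neg[None, :]  # every positive compared with every negative
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size))

def macro_ovr_auc(y_true, probs):
    """Average of per-class one-vs-rest AUCs (assumed averaging scheme)."""
    y_true = np.asarray(y_true)
    return float(np.mean([binary_auc(y_true == k, probs[:, k]) for k in range(probs.shape[1])]))

# Invented toy data: 6 samples, 3 classes; the second sample is misclassified.
y_true = np.array([0, 0, 1, 1, 2, 2])
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],  # true class 0, but argmax picks class 1
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
])
y_pred = probs.argmax(axis=1)

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")      # 5 of 6 correct
print(f"macro OvR AUC: {macro_ovr_auc(y_true, probs):.3f}")
```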
Quach Thi Thai Binh
Faculty of Physics and Physics Engineering, University of Science, Ho Chi Minh City 700000, Viet Nam
Thuan Phuoc
Faculty of Physics and Physics Engineering, University of Science, Ho Chi Minh City 700000, Viet Nam
Xuan Hai
Faculty of Physics and Physics Engineering, University of Science, Ho Chi Minh City 700000, Viet Nam
Thang Bach Phan
Center for Innovative Materials and Architectures (INOMAR)
Vu Thi Hanh Thu
Faculty of Physics and Physics Engineering, University of Science, Ho Chi Minh City 700000, Viet Nam
Nguyen Tuan Hung
National Taiwan University, Tohoku University, MIT
Thermoelectronics, Quantum materials, Raman spectroscopy, DFT, AI