PFed-Signal: An ADR Prediction Model based on Federated Learning

📅 2025-12-29
🤖 AI Summary
To address spurious adverse drug reaction (ADR) signals caused by reporting bias in the FDA Adverse Event Reporting System (FAERS), this paper proposes PFed-Signal, a privacy-preserving federated learning framework. Biased samples are first identified by their Euclidean distance and removed to construct a clean dataset; a Transformer-based model is then trained on that dataset to predict ADR signals. The key contributions are: (i) a federated bias-identification mechanism that operates without sharing raw data; (ii) Pfed-Split, a data partitioning method that splits the original dataset by ADR; and (iii) ADR-signal, a two-module model that combines biased-data identification with Transformer-based signal prediction. Experiments report that the ROR and PRR computed on the clean dataset improve on those of traditional methods, and that PFed-Signal exceeds the baselines with an accuracy of 0.887, an F1-score of 0.890, a recall of 0.913, and an AUC of 0.957. The work is aimed at trustworthy, privacy-aware pharmacovigilance.

📝 Abstract
Adverse drug reactions (ADRs) predicted from biased records in the U.S. Food and Drug Administration Adverse Event Reporting System (FAERS) may mislead online diagnosis. Such problems are generally addressed by optimizing the reporting odds ratio (ROR) or the proportional reporting ratio (PRR). However, these statistical methods cannot eliminate the biased data, which leads to inaccurate signal prediction. In this paper, we propose PFed-Signal, a federated learning-based ADR signal prediction model that uses the Euclidean distance to eliminate biased data from FAERS, thereby improving the accuracy of ADR prediction. Specifically, we first propose Pfed-Split, a method that splits the original dataset by ADR. We then propose ADR-signal, which comprises a federated learning-based method for identifying biased data and a Transformer-based ADR prediction model. The former identifies biased data according to the Euclidean distance and generates a clean dataset by deleting it. The latter is trained on the clean dataset. The results show that the ROR and PRR on the clean dataset are better than those of traditional methods. Furthermore, the accuracy, F1 score, recall, and AUC of PFed-Signal are 0.887, 0.890, 0.913, and 0.957, respectively, all higher than the baselines.
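For readers unfamiliar with the two disproportionality measures named in the abstract, the sketch below computes ROR and PRR from a standard 2x2 contingency table. The function name `ror_prr`, the cell names `a`, `b`, `c`, `d`, and the example counts are illustrative assumptions, not values or code from the paper.

```python
import math


def ror_prr(a, b, c, d):
    """Return (ROR, PRR, 95% CI of ROR) for a 2x2 contingency table.

    a: reports with the drug of interest and the reaction of interest
    b: reports with the drug of interest and any other reaction
    c: reports with any other drug and the reaction of interest
    d: reports with any other drug and any other reaction
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    # Reporting odds ratio: odds of the reaction with the drug vs. without it.
    ror = (a * d) / (b * c)
    # Proportional reporting ratio: reaction share with the drug vs. without it.
    prr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(ROR), used for the usual 95% confidence interval.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(ror) - 1.96 * se),
          math.exp(math.log(ror) + 1.96 * se))
    return ror, prr, ci


# Made-up counts, for illustration only.
ror, prr, ci = ror_prr(a=20, b=80, c=100, d=9800)
print(f"ROR={ror:.2f}  PRR={prr:.2f}  95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```

Both measures depend only on the reported counts, so any systematic reporting bias in those counts carries straight into the signal, which is the limitation the abstract points to.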
Problem

Research questions and friction points this paper is trying to address.

Eliminates biased data in FAERS using Euclidean distance
Improves ADR prediction accuracy with federated learning
Replaces statistical methods with Transformer-based model
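The idea of removing biased records by Euclidean distance can be sketched as follows. The summary does not state the reference point or the cutoff the authors use, so this example assumes distance to the centroid of the feature vectors and a threshold of mean plus `k` population standard deviations; `filter_by_distance`, `k`, and the toy vectors are all hypothetical.

```python
import math
import statistics


def filter_by_distance(vectors, k=2.0):
    """Split vectors into (kept, removed) by distance to their centroid.

    A vector is removed when its Euclidean distance to the centroid exceeds
    mean(distances) + k * pstdev(distances). This rule is an assumption made
    for illustration; the paper may define the cutoff differently.
    """
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    dists = [math.dist(v, centroid) for v in vectors]
    threshold = statistics.mean(dists) + k * statistics.pstdev(dists)
    kept = [v for v, dist in zip(vectors, dists) if dist <= threshold]
    removed = [v for v, dist in zip(vectors, dists) if dist > threshold]
    return kept, removed


# Toy 2-D feature vectors: five similar records and one far-away record.
records = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [1.2, 1.0], [8.0, 8.0]]
kept, removed = filter_by_distance(records, k=1.5)
print("kept:", len(kept), "removed:", removed)
```

In a federated setting, a filter like this would run on each client's local subset so that raw records never leave the client; how the paper coordinates that step across clients is not described in this summary.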
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated learning eliminates biased data via Euclidean distance
Transformer model predicts ADRs using cleaned dataset
Dataset splitting method isolates ADR-related data for analysis
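The dataset-splitting step can be pictured as grouping reports by their ADR label. The sketch below shows only that grouping; it is not the authors' Pfed-Split algorithm, and the function `split_by_adr` and the field names `drug` and `adr` are hypothetical.

```python
from collections import defaultdict


def split_by_adr(reports, key="adr"):
    """Group report dicts into per-ADR subsets keyed by the ADR label."""
    groups = defaultdict(list)
    for report in reports:
        groups[report[key]].append(report)
    return dict(groups)


# Toy reports with placeholder drug names and reactions.
sample_reports = [
    {"drug": "drug_A", "adr": "nausea"},
    {"drug": "drug_B", "adr": "rash"},
    {"drug": "drug_A", "adr": "rash"},
]
subsets = split_by_adr(sample_reports)
for adr, items in subsets.items():
    print(adr, len(items))
```

Each resulting subset could then be cleaned and used for training independently, which matches the summary's description of isolating ADR-related data for analysis.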
Tao Li
School of Computer Science, Qufu Normal University, Rizhao, China
Peilin Li
National University of Singapore
Machine Learning, Architecture, Generative Design
Kui Lu
School of Computer Science, Qufu Normal University, Rizhao, China
Yilei Wang
Alibaba Cloud
Junliang Shang
School of Computer Science, Qufu Normal University, Rizhao, China
Guangshun Li
School of Computer Science, Qufu Normal University, Rizhao, China
Huiyu Zhou
Professor of Machine Learning, University of Leicester, UK
Machine learning, computer vision, medical image analysis, human-computer interface