A Generalised and Adaptable Reinforcement Learning Stopping Method

📅 2025-05-03
🤖 AI Summary
Existing TAR stopping methods struggle to balance recall against review cost flexibly, rely on fixed recall targets, and are only weakly coupled with the underlying classifier. This paper introduces GRLStop, the first generalizable, reinforcement learning-based TAR stopping framework. It formalizes TAR stopping as a generalized RL environment, so that a single model can adapt dynamically to arbitrary recall targets; employs online policy learning to jointly optimize recall gain and review cost; and natively integrates mainstream classifiers while keeping stopping decisions separate from classification. Experiments on six benchmark datasets (including CLEF e-Health, TREC, and Reuters RCV1) under multiple recall targets indicate that GRLStop significantly outperforms conventional stopping methods. It achieves an average 8.2% higher recall and allows the target recall to be switched without retraining, improving both effectiveness and deployment flexibility in TAR systems.

📝 Abstract
This paper presents a Technology Assisted Review (TAR) stopping approach based on Reinforcement Learning (RL). Previous such approaches offered limited control over stopping behaviour, such as fixing the target recall and tradeoff between preferring to maximise recall or cost. These limitations are overcome by introducing a novel RL environment, GRLStop, that allows a single model to be applied to multiple target recalls, balances the recall/cost tradeoff and integrates a classifier. Experiments were carried out on six benchmark datasets (CLEF e-Health datasets 2017-9, TREC Total Recall, TREC Legal and Reuters RCV1) at multiple target recall levels. Results showed the proposed approach to be effective compared to multiple baselines, in addition to offering greater flexibility.
Problem

Research questions and friction points this paper is trying to address.

Develops RL-based stopping method for Technology Assisted Review
Overcomes limitations in recall/cost tradeoff control
Validates effectiveness across multiple benchmark datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning based TAR stopping approach
Novel RL environment GRLStop for flexibility
Balances recall/cost tradeoff with classifier integration
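
To make the idea of "stopping as an RL environment" more concrete, here is a minimal toy sketch in Python. It is not the paper's GRLStop environment: the class name `ToyStoppingEnv`, the observation tuple, the `cost_weight` parameter and the reward shape are all assumptions chosen for illustration, and no classifier is integrated. It only shows how the target recall can be part of the observation, so that one policy could in principle be trained across several targets, and how a single weight can trade review cost against recall.

```python
from dataclasses import dataclass, field

CONTINUE, STOP = 0, 1  # the two actions available to the agent


@dataclass
class ToyStoppingEnv:
    """Illustrative TAR stopping environment (hypothetical, not GRLStop)."""
    relevance: list            # non-empty ranked list of 0/1 relevance labels
    target_recall: float       # recall goal, exposed in the observation
    batch_size: int = 10       # documents reviewed per CONTINUE action
    cost_weight: float = 0.5   # assumed knob for the recall/cost tradeoff
    position: int = field(default=0, init=False)
    found: int = field(default=0, init=False)

    def reset(self):
        """Start a new review episode and return the first observation."""
        self.position = 0
        self.found = 0
        return self._observation()

    def recall(self):
        """True recall so far (known to the simulator, not the reviewer)."""
        total = sum(self.relevance)
        return self.found / total if total else 1.0

    def _observation(self):
        # (fraction of the ranking reviewed, relevant found, target recall)
        return (self.position / len(self.relevance), self.found,
                self.target_recall)

    def step(self, action):
        """Apply an action and return (observation, reward, done)."""
        if action == STOP:
            # Reward 1.0 for meeting the target, else penalise the shortfall.
            shortfall = max(0.0, self.target_recall - self.recall())
            reward = 1.0 if shortfall == 0.0 else -shortfall
            return self._observation(), reward, True
        # CONTINUE: review the next batch and pay a proportional cost.
        batch = self.relevance[self.position:self.position + self.batch_size]
        self.position += len(batch)
        self.found += sum(batch)
        reward = -self.cost_weight * len(batch) / len(self.relevance)
        done = self.position >= len(self.relevance)
        return self._observation(), reward, done


env = ToyStoppingEnv(relevance=[1] * 5 + [0] * 15, target_recall=0.8,
                     batch_size=5)
obs = env.reset()
obs, reward, done = env.step(CONTINUE)  # review the first five documents
obs, reward, done = env.step(STOP)      # stop once the target is met
```

In this sketch, raising `cost_weight` makes each extra batch more expensive and so favours earlier stopping, while changing `target_recall` alters both the observation and the stopping reward without changing the environment's structure.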
Reem Bin-Hezam
Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia
Mark Stevenson
Sheffield University
Natural Language Processing, Computational Linguistics, Information Retrieval, AI