Identifying Evidence-Based Nudges in Biomedical Literature with Large Language Models

📅 2026-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of efficiently identifying evidence-based behavioral nudges among more than 8 million PubMed articles. The authors propose a scalable AI system that automatically extracts structured nudge evidence from unstructured text through a multi-stage pipeline. Key methodological contributions include a domain-customized retrieval module that narrows the corpus to about 81,000 candidates, a single-pass inference framework based on OpenScholar (a quantized LLaMA 3.1 8B model) that performs classification and field extraction together, and reliability safeguards via JSON Schema validation and self-consistency checks. On a labeled test set (N=197), the best configuration achieves an F1 score of 67.0% and recall of 72.0%, while a high-precision mode using seven randomized passes attains 100% precision at 12% recall. The system is being integrated into the Agile Nudge+ platform to support evidence-based intervention design.
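The reliability safeguards (schema validation plus self-consistency) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names in `REQUIRED_FIELDS`, the hand-rolled type check (a stand-in for a real JSON Schema validator), and the unanimous-agreement rule are all assumptions made for illustration, since the summary does not state how the passes are aggregated.

```python
import json

# Hypothetical schema: field names and types are assumptions, not the paper's.
REQUIRED_FIELDS = {"is_nudge": bool, "nudge_type": str, "target_behavior": str}


def parse_and_check(raw):
    """Return the parsed record if `raw` is valid JSON with the required
    typed fields; otherwise return None."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(record.get(name), expected_type):
            return None
    return record


def self_consistent_label(raw_outputs, min_agree=None):
    """Accept a paper as a nudge only if at least `min_agree` passes produced
    a valid record with is_nudge == True. By default every pass must agree
    (an assumed rule that favors precision over recall)."""
    if not raw_outputs:
        return False
    if min_agree is None:
        min_agree = len(raw_outputs)
    records = [parse_and_check(raw) for raw in raw_outputs]
    positives = sum(1 for r in records if r is not None and r["is_nudge"])
    return positives >= min_agree
```

Requiring all passes to agree rejects any paper with a single malformed or negative output, which is one plausible way to trade recall for precision; lowering `min_agree` relaxes that trade-off.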

📝 Abstract

We present a scalable, AI-powered system that identifies and extracts evidence-based behavioral nudges from unstructured biomedical literature. Nudges are subtle, non-coercive interventions that influence behavior without limiting choice, and they have shown strong impact on health outcomes such as medication adherence. However, identifying these interventions among PubMed's 8 million+ articles is a bottleneck. Our system uses a novel multi-stage pipeline. First, hybrid filtering (keywords, TF-IDF, cosine similarity, and a "nudge-term bonus") reduces the corpus to about 81,000 candidates. Second, we use OpenScholar (quantized LLaMA 3.1 8B) to classify papers and extract structured fields such as nudge type and target behavior in a single pass, validated against a JSON schema. We evaluated four configurations on a labeled test set (N=197). The best setup (Title/Abstract/Intro) achieved a 67.0% F1 score and 72.0% recall, making it well suited to discovery. A high-precision variant using self-consistency (7 randomized passes) achieved 100% precision with 12% recall, demonstrating a tunable trade-off for high-trust use cases. This system is being integrated into Agile Nudge+, a real-world platform, to ground LLM-generated interventions in peer-reviewed evidence. This work demonstrates interpretable, domain-specific retrieval pipelines for evidence synthesis and personalized healthcare.
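The hybrid filtering stage can be illustrated with a minimal, standard-library-only Python sketch that scores each abstract by TF-IDF cosine similarity to a query and then adds a bonus for nudge-related terms. The term list, the bonus size, the smoothed IDF formula, and the choice to fit IDF on the query together with the abstracts are all assumptions for illustration; the abstract does not specify these details.

```python
import math
import re
from collections import Counter

# Hypothetical values: the abstract gives neither the term list nor the bonus.
NUDGE_TERMS = {"nudge", "nudges", "nudging", "default", "reminder", "framing"}
BONUS_PER_TERM = 0.05


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        term_freq = Counter(toks)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_candidates(query, abstracts):
    """Score each abstract: cosine similarity to the query plus a bonus
    for every distinct nudge-related term it contains."""
    vectors = tfidf_vectors([query] + abstracts)
    query_vec = vectors[0]
    scores = []
    for text, vec in zip(abstracts, vectors[1:]):
        bonus = BONUS_PER_TERM * len(NUDGE_TERMS & set(tokenize(text)))
        scores.append(cosine(query_vec, vec) + bonus)
    return scores
```

Candidates would then be kept when their score exceeds a chosen threshold; raising or lowering that threshold is one way such a filter could shift between precision and recall.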

Problem

Research questions and friction points this paper is trying to address.

nudges

biomedical literature

evidence-based interventions

behavioral interventions

literature mining

Innovation

Methods, ideas, or system contributions that make the work stand out.

evidence-based nudges

large language models

structured information extraction