Can Broad Biomedical Knowledge be Contextualized into Scenario-Grounded Propositions?

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Connecting general biomedical knowledge to actionable, testable hypotheses for a specific experimental or clinical context remains difficult. This work proposes SCENE, a framework that formalizes knowledge contextualization as an iterative search carried out by a bi-level multi-agent architecture, integrating knowledge-driven and data-driven reasoning. The upper-level agent generates search directions and anchors them to relevant data patterns, while the lower-level agent uses knowledge graph guidance and multi-objective optimization to produce verifiable propositions that balance evidential strength with empirical support. Evaluated in two settings, SCENE identified patient subgroups with heterogeneous treatment effects in clinical trials, where it outperformed existing baselines, and discovered perturbation contexts with strong target-response alignment in LINCS L1000 studies. The generated hypotheses are traceable and can be inspected, replayed, and validated by domain experts.

📝 Abstract

Biomedical discovery often requires connecting broad biomedical knowledge with specific experimental or clinical data. Background knowledge suggests relevant mechanisms but is usually too general to map directly onto dataset variables, while data-driven patterns can be dataset-specific and hard to interpret mechanistically. We study this missing link as knowledge contextualization: transforming broad biomedical knowledge into evidence-supported, scenario-grounded propositions that domain experts can inspect, replay, and validate. We propose SCENE, a bi-level multi-agent framework that treats knowledge contextualization as iterative search. The upper level converts broad knowledge into search directions and grounds them in the dataset schema. The lower level executes these directions through multi-objective optimization to identify concrete propositions that balance evidential strength and data support. Feedback between the two levels progressively refines the search. We evaluate SCENE in two settings: discovering patient subgroups with heterogeneous treatment benefits in clinical trial scenarios, and identifying context-specific biological responses in LINCS L1000 studies. In clinical trials, SCENE discovers specific, well-supported subgroups and outperforms existing baselines. In L1000 studies, SCENE identifies perturbational contexts with strong target-response matching and high positive rates. These results show that SCENE bridges broad knowledge and scenario-specific evidence, producing traceable, inspectable hypotheses for follow-up validation.
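The bi-level search described in the abstract can be illustrated with a small sketch. This is not the SCENE implementation: the `Proposition` fields, the scoring values, and the `toy_propose` / `toy_evaluate` helpers are all hypothetical, and Pareto dominance is assumed here as one common way to "balance" two objectives. The sketch only shows the shape of the loop: an upper level proposes directions, a lower level scores candidate propositions, and the non-dominated set is fed back to refine the next round.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    text: str
    evidence: float  # hypothetical evidential-strength score, higher is better
    support: float   # hypothetical data-support score, higher is better


def dominates(a, b):
    """True if a is at least as good as b on both objectives and strictly better on one."""
    return (a.evidence >= b.evidence and a.support >= b.support
            and (a.evidence > b.evidence or a.support > b.support))


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates (input order preserved)."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


def bilevel_search(propose, evaluate, rounds=3):
    """Alternate between an upper level (propose) and a lower level (evaluate)."""
    front = []
    for _ in range(rounds):
        directions = propose(front)  # upper level: feedback -> search directions
        candidates = [p for d in directions for p in evaluate(d)]  # lower level
        # Drop exact duplicates, then keep the non-dominated propositions.
        front = pareto_front(list(dict.fromkeys(front + candidates)))
    return front


# Toy stand-ins for the two levels (assumptions for illustration only).
def toy_propose(feedback):
    if not feedback:
        return ["age", "biomarker"]  # start broad
    # Afterwards, follow only directions that still appear on the front.
    return sorted({p.text.split(":")[0] for p in feedback})


def toy_evaluate(direction):
    table = {
        "age": [Proposition("age: >= 65", 0.4, 0.9),
                Proposition("age: < 40", 0.3, 0.5)],
        "biomarker": [Proposition("biomarker: high", 0.8, 0.6),
                      Proposition("biomarker: low", 0.2, 0.2)],
    }
    return table[direction]


if __name__ == "__main__":
    for p in bilevel_search(toy_propose, toy_evaluate):
        print(p)
```

In this toy run, two propositions survive because each is best on one objective. A real system would replace the lookup table with statistical estimates on the dataset and the direction generator with a knowledge-driven agent.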

Problem

Research questions and friction points this paper is trying to address.

knowledge contextualization

biomedical knowledge

scenario-grounded propositions

data interpretation

hypothesis generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

knowledge contextualization

multi-agent framework

scenario-grounded propositions