🤖 AI Summary
Connecting general biomedical knowledge to actionable, testable hypotheses for a specific experimental or clinical context remains difficult. This work proposes SCENE, a bi-level multi-agent framework that formalizes knowledge contextualization as an iterative search integrating knowledge-driven and data-driven reasoning. The upper level generates search directions and grounds them in the dataset schema, while the lower level uses knowledge graph guidance and multi-objective optimization to produce verifiable propositions that balance evidential strength with data support. In evaluation, SCENE identified patient subgroups with heterogeneous treatment effects in clinical trials, where it outperformed existing baselines, and found perturbational contexts with strong target-response matching and high positive rates in LINCS L1000 studies. The generated hypotheses are traceable and can be inspected, replayed, and validated by domain experts.
📝 Abstract
Biomedical discovery often requires connecting broad biomedical knowledge with specific experimental or clinical data. Background knowledge suggests relevant mechanisms but is usually too general to map directly onto dataset variables, while data-driven patterns can be dataset-specific and hard to interpret mechanistically. We study this missing link as knowledge contextualization: transforming broad biomedical knowledge into evidence-supported, scenario-grounded propositions that domain experts can inspect, replay, and validate. We propose SCENE, a bi-level multi-agent framework that treats knowledge contextualization as iterative search. The upper level converts broad knowledge into search directions and grounds them in the dataset schema. The lower level executes these directions through multi-objective optimization to identify concrete propositions that balance evidential strength and data support. Feedback between the two levels progressively refines the search. We evaluate SCENE in two settings: discovering patient subgroups with heterogeneous treatment benefits in clinical trial scenarios, and identifying context-specific biological responses in LINCS L1000 studies. In clinical trials, SCENE discovers specific, well-supported subgroups and outperforms existing baselines. In L1000 studies, SCENE identifies perturbational contexts with strong target-response matching and high positive rates. These results show that SCENE bridges broad knowledge and scenario-specific evidence, producing traceable, inspectable hypotheses for follow-up validation.
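As an illustration of what "balancing evidential strength and data support" can mean in practice, the sketch below keeps only the candidate propositions that are not dominated on either objective. The abstract does not state which multi-objective method SCENE uses, so Pareto dominance is an assumption here, and the `Proposition` class, its score fields, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Proposition:
    """Hypothetical candidate proposition with two scores (not SCENE's actual data model)."""
    text: str
    evidence_strength: float  # assumed: higher means stronger knowledge-side evidence
    data_support: float       # assumed: higher means stronger support in the dataset


def dominates(a: Proposition, b: Proposition) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    at_least_as_good = (
        a.evidence_strength >= b.evidence_strength
        and a.data_support >= b.data_support
    )
    strictly_better = (
        a.evidence_strength > b.evidence_strength
        or a.data_support > b.data_support
    )
    return at_least_as_good and strictly_better


def pareto_front(candidates: List[Proposition]) -> List[Proposition]:
    """Return the candidates that no other candidate dominates, in input order."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up scores for illustration only.
candidates = [
    Proposition("subgroup A", evidence_strength=0.9, data_support=0.2),
    Proposition("subgroup B", evidence_strength=0.6, data_support=0.7),
    Proposition("subgroup C", evidence_strength=0.5, data_support=0.6),  # dominated by B
    Proposition("subgroup D", evidence_strength=0.3, data_support=0.9),
]

front = pareto_front(candidates)
print([p.text for p in front])
```

In a bi-level setup like the one described, a front of this kind could be what the lower level hands back to the upper level as feedback for the next search direction. That wiring is also an assumption rather than something the abstract specifies.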