🤖 AI Summary
This work proposes MechPert, a lightweight framework for predicting transcriptional responses to unseen genetic perturbations and for guiding experimental design. In MechPert, multiple large language model (LLM) agents independently propose directed regulatory hypotheses with associated confidence scores; a mechanistic consensus step then aggregates these proposals, filtering spurious associations and constructing a weighted neighborhood for downstream prediction. Unlike conventional approaches that rely on functional similarity or static knowledge graphs, MechPert uses mechanistic consensus as an inductive bias that prioritizes directed regulatory logic over symmetric co-occurrence relationships. Evaluated on Perturb-seq benchmarks across four human cell lines, MechPert improves Pearson correlation by up to 10.5% over similarity-based baselines in low-data regimes (N=50 observed perturbations), and its selected anchor genes outperform standard network centrality heuristics by up to 46% for experimental design in well-characterized cell lines.
📝 Abstract
Predicting transcriptional responses to unseen genetic perturbations is essential for understanding gene regulation and prioritizing large-scale perturbation experiments. Existing approaches either rely on static, potentially incomplete knowledge graphs, or prompt language models for functionally similar genes, retrieving associations shaped by symmetric co-occurrence in scientific text rather than directed regulatory logic. We introduce MechPert, a lightweight framework that encourages LLM agents to generate directed regulatory hypotheses rather than relying solely on functional similarity. Multiple agents independently propose candidate regulators with associated confidence scores; these are aggregated through a consensus mechanism that filters spurious associations, producing weighted neighborhoods for downstream prediction. We evaluate MechPert on Perturb-seq benchmarks across four human cell lines. For perturbation prediction in low-data regimes ($N=50$ observed perturbations), MechPert improves Pearson correlation by up to 10.5% over similarity-based baselines. For experimental design, MechPert-selected anchor genes outperform standard network centrality heuristics by up to 46% in well-characterized cell lines.
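
The abstract does not specify the consensus rule or the downstream predictor, so the following Python sketch is only one plausible reading of the aggregation step, not the authors' implementation. It assumes that each agent returns a mapping from candidate regulator to a confidence in [0, 1], that a candidate is kept only if at least `min_votes` agents propose it, and that the prediction is a weighted average of the observed responses of the retained genes. The function names, the `min_votes` threshold, the weighting formula, and the placeholder gene names are all hypothetical.

```python
from collections import defaultdict


def consensus_neighborhood(agent_proposals, min_votes=2):
    """Aggregate per-agent {gene: confidence} proposals into normalized weights.

    Assumed rule (not taken from the paper): keep genes proposed by at least
    `min_votes` agents; weight = (fraction of agents agreeing) * (mean confidence).
    """
    votes = defaultdict(list)
    for proposal in agent_proposals:
        for gene, confidence in proposal.items():
            votes[gene].append(confidence)

    n_agents = len(agent_proposals)
    raw = {
        gene: (len(confs) / n_agents) * (sum(confs) / len(confs))
        for gene, confs in votes.items()
        if len(confs) >= min_votes  # drop candidates that lack consensus
    }
    total = sum(raw.values())
    return {gene: w / total for gene, w in raw.items()} if total > 0 else {}


def predict_response(neighborhood, observed):
    """Weighted average of observed response vectors for neighborhood genes.

    `observed` maps a perturbed gene to its expression-change vector.
    Returns None if no neighborhood gene has an observed response.
    """
    usable = {g: w for g, w in neighborhood.items() if g in observed}
    total = sum(usable.values())
    if total == 0:
        return None
    dim = len(next(iter(observed.values())))
    prediction = [0.0] * dim
    for gene, weight in usable.items():
        for i, value in enumerate(observed[gene]):
            prediction[i] += (weight / total) * value
    return prediction


# Hypothetical proposals from three agents for one unseen perturbation.
proposals = [
    {"GENE_A": 0.9, "GENE_B": 0.6},
    {"GENE_A": 0.7, "GENE_C": 0.4},
    {"GENE_A": 0.8, "GENE_B": 0.4},
]
neighborhood = consensus_neighborhood(proposals, min_votes=2)

# Hypothetical observed responses (two readout genes each).
observed = {"GENE_A": [1.0, -1.0], "GENE_B": [0.0, 2.0]}
prediction = predict_response(neighborhood, observed)
```

In this toy input, `GENE_C` is proposed by only one agent and is filtered out, while `GENE_A` receives the largest weight because all three agents propose it with high confidence.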