The Gaining Paths to Investment Success: Information-Driven LLM Graph Reasoning for Venture Capital Prediction

📅 2025-12-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Venture capital (VC) startup success prediction is an off-graph forecasting task: the target lies outside the investment network. Existing methods struggle to combine graph-structured evidence with the interpretable reasoning capabilities of large language models (LLMs). Method: We propose MIRAGE-VC, which pairs the first information-gain-driven graph path retrieval mechanism tailored for VC prediction with a multi-agent architecture that fuses heterogeneous evidence through a learnable gate. This keeps chain-of-thought reasoning compact while mitigating path explosion. Our approach combines RAG-enhanced prompting, multi-perspective graph path sampling, learnable gating-based fusion, and joint LLM–GNN reasoning. Contribution/Results: Under strict anti-leakage evaluation, our method achieves a 5.0% absolute improvement in F1 score and a 16.6% gain in Precision@5. The results also shed light on other off-graph prediction tasks, such as recommendation and risk assessment, by unifying structural evidence modeling with stepwise LLM reasoning.
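For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how F1 and Precision@K are commonly computed. The function names and the convention of dividing by K are assumptions for illustration; the paper's exact evaluation protocol is not described on this page.

```python
from typing import Sequence


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from confusion counts: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    # Convention (assumption): F1 is 0 when there are no positives at all.
    return 0.0 if denom == 0 else 2 * tp / denom


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int = 5) -> float:
    """Fraction of successes (label 1) among the k highest-scored startups.

    Divides by k, a common convention; other definitions divide by the
    number of items actually returned when fewer than k exist.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank startups by predicted score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:k])
    return hits / k
```

With six candidate startups ranked by score, `precision_at_k(scores, labels, k=5)` reports the share of the top five that actually succeeded, which mirrors an investor choosing a small portfolio from a ranked list.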

📝 Abstract

Most venture capital (VC) investments fail, while a few deliver outsized returns. Accurately predicting startup success requires synthesizing complex relational evidence, including company disclosures, investor track records, and investment network structures, through explicit reasoning to form coherent, interpretable investment theses. Traditional machine learning and graph neural networks both lack this reasoning capability. Large language models (LLMs) offer strong reasoning but face a modality mismatch with graphs. Recent graph-LLM methods target in-graph tasks where answers lie within the graph, whereas VC prediction is off-graph: the target exists outside the network. The core challenge is selecting graph paths that maximize predictor performance on an external objective while enabling step-by-step reasoning. We present MIRAGE-VC, a multi-perspective retrieval-augmented generation framework that addresses two obstacles: path explosion (thousands of candidate paths overwhelm LLM context) and heterogeneous evidence fusion (different startups need different analytical emphasis). Our information-gain-driven path retriever iteratively selects high-value neighbors, distilling investment networks into compact chains for explicit reasoning. A multi-agent architecture integrates three evidence streams via a learnable gating mechanism based on company attributes. Under strict anti-leakage controls, MIRAGE-VC achieves +5.0% F1 and +16.6% Precision@5, and sheds light on other off-graph prediction tasks such as recommendation and risk assessment. Code: https://anonymous.4open.science/r/MIRAGE-VC-323F.
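The idea of iteratively selecting high-value neighbors can be illustrated with a small greedy sketch. This is not the MIRAGE-VC retriever: it assumes that "information gain" means the drop in binary entropy of a predictor's success probability when a neighbor is appended to the path, and `predict_proba`, `greedy_path`, `max_hops`, and `min_gain` are hypothetical names introduced here.

```python
import math
from typing import Callable, Dict, List, Sequence


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a Bernoulli(p) outcome."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def greedy_path(
    graph: Dict[str, List[str]],
    start: str,
    predict_proba: Callable[[Sequence[str]], float],
    max_hops: int = 3,
    min_gain: float = 1e-3,
) -> List[str]:
    """Grow one path from `start`, adding the neighbor that most reduces
    the predictor's uncertainty at each hop (assumed gain definition)."""
    path = [start]
    for _ in range(max_hops):
        current_entropy = binary_entropy(predict_proba(path))
        best_node, best_gain = None, min_gain
        for neighbor in graph.get(path[-1], []):
            if neighbor in path:  # avoid cycles
                continue
            gain = current_entropy - binary_entropy(predict_proba(path + [neighbor]))
            if gain > best_gain:
                best_node, best_gain = neighbor, gain
        if best_node is None:  # no neighbor is informative enough; stop early
            break
        path.append(best_node)
    return path
```

Because only one neighbor survives each hop, the result is a single compact chain rather than thousands of candidate paths, which is the property the abstract attributes to the retriever. In practice `predict_proba` would be a trained model or an LLM-based scorer; here it is any callable that maps a path to a probability.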

Problem

Research questions and friction points this paper is trying to address.

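Predicting startup success for venture capital, an off-graph target, through reasoning over graph-structured evidence

Selecting high-value graph paths that maximize predictor performance on an external objective

Fusing heterogeneous evidence into interpretable investment theses when different startups need different analytical emphasis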

Innovation

Methods, ideas, or system contributions that make the work stand out.

Information-gain-driven path retriever for compact reasoning chains

Multi-agent architecture fusing evidence via learnable gating

Framework addresses path explosion and heterogeneous evidence fusion
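A learnable gate over several evidence streams can be sketched as a softmax over a linear function of company attributes. This is an illustrative assumption, not the paper's architecture: the function names, the linear gate, and the use of one scalar score per agent are all hypothetical, and the parameters shown as plain lists would be trained in a real system.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate_weights(
    attrs: Sequence[float],
    weight_matrix: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """One weight per evidence stream, computed from company attributes.

    weight_matrix has one row per stream and one column per attribute.
    """
    logits = [
        sum(w * a for w, a in zip(row, attrs)) + b
        for row, b in zip(weight_matrix, bias)
    ]
    return softmax(logits)


def fuse(
    agent_scores: Sequence[float],
    attrs: Sequence[float],
    weight_matrix: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> float:
    """Gate-weighted average of the per-agent success scores."""
    weights = gate_weights(attrs, weight_matrix, bias)
    return sum(w * s for w, s in zip(weights, agent_scores))
```

With all gate parameters at zero the three streams are weighted equally; training would shift weight toward whichever stream is most predictive for companies with given attributes, which is one way to realize "different startups need different analytical emphasis".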

🔎 Similar Papers

LLM-Enhanced User-Item Interactions: Leveraging Edge Information for Optimized Recommendations