🤖 AI Summary
Conventional language models, trained top-down on general corpora, lack deep domain-specific abstraction and reasoning capabilities—in medicine, this shows up as poor generalization across subdomains. Method: We propose a knowledge graph (KG)-driven bottom-up paradigm for building medical-domain superintelligence: (1) constructing a structured medical KG from ICD and UMLS; (2) designing a path-guided task generation pipeline that automatically synthesizes 24,000 hierarchical reasoning tasks with corresponding chain-of-thought annotations; and (3) introducing a KG-supported curriculum learning framework for progressive training—from foundational concepts to complex multi-step reasoning. Contribution/Results: Building on QwQ-32B, we develop QwQ-Med-3, which achieves significant gains on ICD-Bench, our newly constructed multi-domain medical reasoning benchmark (+12.7% on the most challenging tasks), outperforming all existing models. It also demonstrates strong transferability across multiple medical QA benchmarks.
📝 Abstract
Language models traditionally used for cross-domain generalization have recently demonstrated task-specific reasoning. However, their top-down training approach on general corpora is insufficient for acquiring the abstractions needed for deep domain expertise. This may require a bottom-up approach that acquires expertise by learning to compose simple domain concepts into more complex ones. A knowledge graph (KG) provides this compositional structure, where domain primitives are represented as head-relation-tail edges and their paths encode higher-level concepts. We present a task generation pipeline that synthesizes tasks directly from KG primitives, enabling models to acquire and compose them for reasoning. We fine-tune language models on the resulting KG-grounded curriculum to demonstrate domain-specific superintelligence. While broadly applicable, we validate our approach in medicine, where reliable KGs exist. Using a medical KG, we curate 24,000 reasoning tasks paired with thinking traces derived from diverse medical primitives. We fine-tune the QwQ-32B model on this curriculum to obtain QwQ-Med-3, which takes a step toward medical superintelligence. We also introduce ICD-Bench, an evaluation suite that quantifies reasoning abilities across 15 medical domains. Our experiments demonstrate that QwQ-Med-3 significantly outperforms state-of-the-art reasoning models across ICD-Bench categories. Further analysis reveals that QwQ-Med-3 utilizes the acquired primitives to widen the performance gap on the hardest tasks of ICD-Bench. Finally, evaluation on medical question-answering benchmarks shows that QwQ-Med-3 transfers its acquired expertise to enhance the base model's performance. While the industry's approach to artificial general intelligence (AGI) emphasizes broad expertise, we envision a future in which AGI emerges from the composable interaction of efficient domain-specific superintelligent agents.
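To make the path-guided idea concrete, here is a minimal sketch of turning KG paths into reasoning tasks. It is not the paper's actual pipeline: the toy triples, function names, and question template are all illustrative assumptions (the real system builds its KG from ICD and UMLS and generates full chain-of-thought annotations), but it shows the core mechanism—sampling a multi-hop chain of head-relation-tail primitives and using it as both the question's endpoints and the reference thinking trace.

```python
import random

# Hypothetical miniature medical KG: adjacency from head entity to a list of
# (relation, tail) edges. Each edge is one domain "primitive"; real pipelines
# would derive these from ICD/UMLS rather than hand-written placeholders.
KG = {
    "aspirin": [("inhibits", "COX-1")],
    "COX-1": [("produces", "thromboxane A2")],
    "thromboxane A2": [("promotes", "platelet aggregation")],
}

def sample_path(kg, start, hops, rng=random):
    """Sample a multi-hop path of (head, relation, tail) triples from the KG."""
    path, node = [], start
    for _ in range(hops):
        edges = kg.get(node)
        if not edges:
            break  # dead end: stop with a shorter path
        relation, tail = rng.choice(edges)
        path.append((node, relation, tail))
        node = tail
    return path

def path_to_task(path):
    """Turn a KG path into a QA-style task: the question links the path's
    endpoints, and the chained edges serve as the reference thinking trace.
    Path length acts as a natural difficulty knob for curriculum ordering."""
    head, tail = path[0][0], path[-1][2]
    question = f"How does {head} relate to {tail}?"
    trace = " -> ".join(f"{h} {r} {t}" for h, r, t in path)
    return {"question": question, "thinking_trace": trace, "hops": len(path)}

path = sample_path(KG, "aspirin", hops=3)
task = path_to_task(path)
print(task["question"])
print(task["thinking_trace"])
```

Longer sampled paths yield harder multi-step tasks, which is what lets a curriculum progress from single-edge primitives to complex compositional reasoning.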