MedCoG: Maximizing LLM Inference Density in Medical Reasoning via Meta-Cognitive Regulation

📅 2026-02-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the inefficiency of large language models in medical reasoning, where indiscriminate inference scaling incurs high computational costs with diminishing returns. To mitigate this, the study introduces a meta-cognitive mechanism into medical large language models for the first time, proposing a dynamic evaluation framework that assesses task complexity, familiarity, and knowledge density to selectively activate procedural, episodic, and factual knowledge on demand. The authors develop a knowledge graph–driven medical meta-cognitive agent and define a novel "inference density" metric to quantify reasoning efficiency. Experimental results demonstrate a 5.5× improvement in inference density across five challenging medical benchmarks, significantly reducing computational overhead while simultaneously improving accuracy.

📝 Abstract
Large Language Models (LLMs) have shown strong potential in complex medical reasoning yet face diminishing gains under inference scaling laws. While existing studies augment LLMs with various knowledge types, it remains unclear how effectively the additional costs translate into accuracy. In this paper, we explore how the meta-cognition of LLMs, i.e., their self-awareness of their own knowledge states, can regulate the reasoning process. Specifically, we propose MedCoG, a Medical Meta-Cognition Agent with Knowledge Graph, where meta-cognitive assessments of task complexity, familiarity, and knowledge density dynamically regulate the utilization of procedural, episodic, and factual knowledge. This LLM-centric on-demand reasoning aims to mitigate scaling-law limitations by (1) reducing costs by avoiding indiscriminate scaling and (2) improving accuracy by filtering out distracting knowledge. To validate this, we empirically characterize the scaling curve and introduce inference density to quantify inference efficiency, defined as the ratio of theoretically effective cost to actual cost. Experiments demonstrate the effectiveness and efficiency of MedCoG on five hard sets of medical benchmarks, yielding 5.5x inference density. Furthermore, an Oracle study highlights the significant potential of meta-cognitive regulation.
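The inference density metric described above can be sketched in a few lines. This is a minimal illustration of the stated definition (theoretically effective cost divided by actual cost), not the paper's implementation: the cost unit (e.g. generated tokens) and how the "theoretically effective" cost is obtained from the scaling curve are assumptions here.

```python
def inference_density(effective_cost: float, actual_cost: float) -> float:
    """Inference density = theoretically effective cost / actual cost.

    A density of 1.0 means every unit of inference compute was needed to
    reach the observed accuracy; lower values indicate wasted scaling.
    Costs are assumed to be in the same unit (e.g. generated tokens).
    """
    if actual_cost <= 0:
        raise ValueError("actual cost must be positive")
    return effective_cost / actual_cost


# Toy example (numbers are illustrative, not from the paper):
# a baseline spends 2200 tokens where 400 would have sufficed for the
# same accuracy, while a regulated method spends only 800 tokens.
baseline = inference_density(400, 2200)   # ~0.18
regulated = inference_density(400, 800)   # 0.5
improvement = regulated / baseline        # 2.75x in this toy case
```

Under this reading, "5.5x inference density" means the regulated method wastes far fewer inference tokens relative to the minimum needed for the same accuracy.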
Problem

Research questions and friction points this paper is trying to address.

LLM inference scaling
medical reasoning
inference efficiency
cost-accuracy trade-off
diminishing returns
Innovation

Methods, ideas, or system contributions that make the work stand out.

meta-cognitive regulation
inference density
medical reasoning
knowledge graph
LLM efficiency