HyperGVL: Benchmarking and Improving Large Vision-Language Models in Hypergraph Understanding and Reasoning

📅 2026-04-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing large vision-language models (LVLMs) lack systematic evaluation on hypergraph understanding and reasoning, and no dedicated benchmark exists to delineate their performance boundaries. This work introduces HyperGVL, the first LVLM evaluation benchmark specifically designed for hypergraphs, encompassing 12 tasks and 84,000 image–text question-answer pairs that integrate multi-scale synthetic and real-world hypergraph data. The study also systematically investigates 12 textual and visual hypergraph representation schemes and proposes WiseHyGR, a generalizable adaptive representation routing module that substantially enhances LVLMs’ comprehension, reasoning, and generalization capabilities on complex hypergraph topologies. Experimental results demonstrate that WiseHyGR consistently outperforms baseline approaches across diverse hypergraph tasks.

📝 Abstract

Large Vision-Language Models (LVLMs) continually need new testbeds to chart their expanding capabilities, yet their abilities on hypergraphs remain unexplored. Hypergraphs have significant practical applications in areas such as life sciences and social communities. Recent LVLMs have shown promise in understanding complex topologies, but no benchmark yet delineates what they can do with hypergraphs, leaving the boundaries of their abilities unclear. To fill this gap, we introduce $\texttt{HyperGVL}$, the first benchmark to evaluate the proficiency of LVLMs in hypergraph understanding and reasoning. $\texttt{HyperGVL}$ provides a comprehensive assessment of 12 advanced LVLMs across 84,000 vision-language question-answering (QA) samples spanning 12 tasks, ranging from basic component counting to complex NP-hard problem reasoning. The hypergraphs involved include multiscale synthetic structures and real-world citation and protein networks. Moreover, we examine the effects of 12 textual and visual hypergraph representations and introduce $\texttt{WiseHyGR}$, a generalizable router that improves LVLM performance on hypergraph tasks by learning adaptive representations. We believe that this work is a step forward in connecting hypergraphs with LVLMs.
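To make the idea of a "textual hypergraph representation" concrete, here is a minimal Python sketch. It is not taken from the paper: the two text formats, the function names, and the toy hypergraph are all illustrative assumptions, and the 12 representations studied in HyperGVL may look quite different. The sketch serializes one small hypergraph in two ways and computes the ground-truth answer for a basic component-counting question.

```python
from typing import Dict, List, Set

# Hypothetical toy hypergraph: each hyperedge joins an arbitrary set of vertices.
Hypergraph = Dict[str, Set[int]]

example: Hypergraph = {"e1": {0, 1, 2}, "e2": {2, 3}, "e3": {4, 5}}


def to_edge_list_text(hg: Hypergraph) -> str:
    """Edge-centric text: one sentence per hyperedge (assumed format)."""
    lines: List[str] = []
    for name in sorted(hg):
        members = ", ".join(str(v) for v in sorted(hg[name]))
        lines.append(f"Hyperedge {name} connects vertices {members}.")
    return "\n".join(lines)


def to_incidence_text(hg: Hypergraph) -> str:
    """Vertex-centric text: one sentence per vertex (assumed format)."""
    vertices = sorted(set().union(*hg.values()))
    lines: List[str] = []
    for v in vertices:
        edges = ", ".join(name for name in sorted(hg) if v in hg[name])
        lines.append(f"Vertex {v} belongs to hyperedges {edges}.")
    return "\n".join(lines)


def count_components(hg: Hypergraph) -> int:
    """Count connected components; vertices sharing a hyperedge are connected."""
    parent: Dict[int, int] = {}

    def find(x: int) -> int:
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for members in hg.values():
        ordered = sorted(members)
        for v in ordered:
            parent.setdefault(v, v)
        # Merge every vertex of the hyperedge into the first one's component.
        for v in ordered[1:]:
            parent[find(v)] = find(ordered[0])

    return len({find(v) for v in parent})


if __name__ == "__main__":
    print(to_edge_list_text(example))
    print(to_incidence_text(example))
    print("Components:", count_components(example))
```

The same hypergraph yields different prompts depending on the serialization, which is the kind of variation a representation study compares; a router in the spirit of WiseHyGR would choose among such formats per input, though how it does so is not described in this summary.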

Problem

Research questions and friction points this paper is trying to address.

Large Vision-Language Models

Hypergraph Understanding

Reasoning

Benchmark

Visual-Language QA

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypergraph

Large Vision-Language Models

Benchmarking