Semantic Distance Measurement based on Multi-Kernel Gaussian Processes

📅 2025-12-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Classical semantic distance measures are fixed, so they generalize poorly and are hard to adapt to task-specific objectives and data distributions. To address this, we propose an adaptive semantic distance framework based on multi-kernel Gaussian processes (MK-GP), which models the latent semantic function of a text as a Gaussian process and so provides principled uncertainty estimates alongside the representation. The approach uses a learnable hybrid covariance kernel that combines Matérn and polynomial components, with hyperparameters learned from data under supervision. Because the metric is learned rather than fixed, the resulting distance adapts to the task and data at hand. The measure is evaluated on fine-grained sentiment classification with large language models under in-context learning (ICL), where the reported results support the claim that an adaptive semantic distance improves downstream discrimination.

📝 Abstract
Semantic distance measurement is a fundamental problem in computational linguistics. It quantifies the similarity or relatedness between text segments and underpins tasks such as text retrieval and text classification. From a mathematical perspective, a semantic distance can be viewed as a metric defined on a space of texts or on a representation space derived from them. However, most classical semantic distance methods are essentially fixed, which makes them difficult to adapt to specific data distributions and task requirements. This paper proposes a semantic distance measure based on multi-kernel Gaussian processes (MK-GP). The latent semantic function associated with texts is modeled as a Gaussian process whose covariance function is a combined kernel with Matérn and polynomial components. The kernel parameters are learned automatically from data under supervision rather than being hand-crafted. The semantic distance is instantiated and evaluated on fine-grained sentiment classification with large language models under an in-context learning (ICL) setup. The experimental results demonstrate the effectiveness of the proposed measure.
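To make the combined kernel concrete, here is a minimal sketch of one way a Matérn plus polynomial kernel can induce a distance between text embeddings. The abstract does not give the exact form, so everything specific here is an assumption: the Matérn smoothness of 5/2, the weighted-sum combination, the quadratic polynomial, the parameter names, and the use of the standard kernel-induced distance d(x, y) = sqrt(k(x, x) + k(y, y) - 2 k(x, y)).

```python
import math

def matern52(x, y, length_scale=1.0):
    """Matern kernel with smoothness nu = 5/2 (the value of nu is an assumption)."""
    s = math.sqrt(5.0) * math.dist(x, y) / length_scale
    return (1.0 + s + s * s / 3.0) * math.exp(-s)

def polynomial(x, y, degree=2, offset=1.0):
    """Polynomial kernel (<x, y> + offset) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + offset) ** degree

def hybrid_kernel(x, y, w_matern=0.5, w_poly=0.5, length_scale=1.0, degree=2):
    """Weighted sum of the two kernels; a sum of valid kernels is a valid kernel."""
    return (w_matern * matern52(x, y, length_scale)
            + w_poly * polynomial(x, y, degree))

def kernel_distance(x, y, **params):
    """Distance induced by the hybrid kernel in its feature space."""
    sq = (hybrid_kernel(x, x, **params) + hybrid_kernel(y, y, **params)
          - 2.0 * hybrid_kernel(x, y, **params))
    return math.sqrt(max(sq, 0.0))  # clamp tiny negatives caused by rounding

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
a = [0.2, 0.9, 0.1]
b = [0.25, 0.85, 0.1]
c = [0.9, 0.1, 0.4]
print(kernel_distance(a, b), kernel_distance(a, c))
```

With learnable weights and length scale, the same pair of texts can end up closer or farther apart depending on the parameters, which is what distinguishes this kind of measure from a fixed metric such as cosine distance.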
Problem

Research questions and friction points this paper is trying to address.

Proposes a semantic distance measure using multi-kernel Gaussian processes.
Adapts distance to data distributions and tasks via learned kernel parameters.
Evaluates the measure in fine-grained sentiment classification with LLMs.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multi-kernel Gaussian processes for semantic distance measurement.
Automatically learns kernel parameters from supervised data.
Applies the method to fine-grained sentiment classification with LLMs.
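As a rough illustration of learning kernel parameters from labeled data, the sketch below picks a length scale and a polynomial mixing weight by grid search, scoring each candidate by leave-one-out nearest-neighbor accuracy. This is a substituted, simplified procedure, not the paper's training objective, which this page does not describe (Gaussian process hyperparameters are more commonly fit by maximizing the marginal likelihood). The kernel form, the function names, and the toy data are all assumptions.

```python
import itertools
import math

def hybrid_distance(x, y, length_scale, w_poly):
    """Distance induced by (1 - w_poly) * Matern-5/2 + w_poly * quadratic kernel."""
    def k(a, b):
        s = math.sqrt(5.0) * math.dist(a, b) / length_scale
        matern = (1.0 + s + s * s / 3.0) * math.exp(-s)
        poly = (sum(p * q for p, q in zip(a, b)) + 1.0) ** 2
        return (1.0 - w_poly) * matern + w_poly * poly
    return math.sqrt(max(k(x, x) + k(y, y) - 2.0 * k(x, y), 0.0))

def loo_accuracy(X, y, **params):
    """Leave-one-out 1-nearest-neighbor accuracy under the given parameters."""
    hits = 0
    for i, xi in enumerate(X):
        others = (j for j in range(len(X)) if j != i)
        nearest = min(others, key=lambda j: hybrid_distance(xi, X[j], **params))
        hits += y[nearest] == y[i]
    return hits / len(X)

def fit_by_grid_search(X, y, length_scales, poly_weights):
    """Return the (length_scale, w_poly) pair with the best supervised score."""
    grid = itertools.product(length_scales, poly_weights)
    return max(grid, key=lambda p: loo_accuracy(X, y, length_scale=p[0], w_poly=p[1]))

# Toy labeled "embeddings" (hypothetical values, for illustration only).
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = ["neg", "neg", "pos", "pos"]
best = fit_by_grid_search(X, y, [0.5, 1.0, 2.0], [0.0, 0.5, 1.0])
print(best, loo_accuracy(X, y, length_scale=best[0], w_poly=best[1]))
```

The point of the sketch is only that the labels drive the choice of kernel parameters; a gradient-based optimizer over a differentiable objective would replace the grid in any realistic setting.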
Yinzhu Cheng
Haihua Xie
Yaqing Wang
Miao He
Mingming Sun
BIMSA
Cognitive Computing, World Model, Multi-Agent System