Kernel Affine Hull Machines for Compute-Efficient Query-Side Semantic Encoding

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the high computational cost of semantic encoding for online queries, a major bottleneck in Transformer-based retrieval systems. The authors propose a lightweight analytical estimator that replaces repeated neural inference to enable efficient query-side adaptation under a fixed teacher model. The core innovation lies in the first integration of affine hull modeling in a reproducing kernel Hilbert space (RKHS) with normalized least mean squares (NLMS) optimization, establishing an interpretable error decomposition framework for query encoding. Efficiency is further enhanced through RKHS prototype mixture weight estimation and a geometrically driven mapping from lexical features to a frozen semantic space. Evaluated on an Austrian legal benchmark, the method achieves an MSE of 0.000091, R² of 0.9071, cosine similarity of 0.9536, MRR@20 of 0.504, and an 8.5× reduction in query latency.
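For readers unfamiliar with normalized least mean squares, the following is a minimal sketch of the generic textbook NLMS update for a scalar-output linear model. It is not the authors' prototype-refinement procedure; the function name `nlms_fit`, the step size `mu`, the regularizer `eps`, and the toy data are all assumptions made for illustration.

```python
def nlms_fit(samples, targets, mu=0.5, eps=1e-8, epochs=50):
    """Generic NLMS for a linear model y ~ w . x (illustrative only).

    Each update moves w along the current input x, with the step
    divided by the squared norm of x so that the effective step size
    does not depend on the input scale.
    """
    dim = len(samples[0])
    w = [0.0] * dim
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = t - pred
            norm_sq = sum(xi * xi for xi in x)
            step = mu * err / (eps + norm_sq)  # normalized step
            w = [wi + step * xi for wi, xi in zip(w, x)]
    return w


# Toy, noise-free data generated from hypothetical weights [2.0, -1.0].
toy_samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_targets = [2.0, -1.0, 1.0]
toy_weights = nlms_fit(toy_samples, toy_targets)
```

On consistent, noise-free data such as this toy set, the iterates approach the generating weights; with noisy targets the normalization mainly serves to keep the update stable across inputs of different magnitude.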

📝 Abstract

Transformer-based semantic retrieval is highly effective, yet in many deployments the dominant cost lies in online query encoding rather than corpus indexing. We study the fixed-teacher query-adaptation problem and ask whether repeated neural inference can be replaced by a lightweight, analytically explicit estimator without degrading decision-relevant retrieval quality. We propose Kernel Affine Hull Machines (KAHMs), which map inexpensive lexical features into a frozen semantic embedding space by estimating prototype-mixture weights in a rigorously specified RKHS and refining prototypes via normalized least-mean-squares, yielding a transparent decomposition of encoding error into posterior-approximation, generalization, and teacher-noise components. On a controlled Austrian-law benchmark (5,000 queries; 84 laws; 10,762 units), KAHM attains the strongest teacher-space reconstruction among matched learned adapters (MSE 0.000091, R² 0.9071, cosine 0.9536) and consistently leads rank-sensitive metrics, including mean reciprocal rank at 20 (MRR@20, the average inverse rank of the first relevant result within the top 20), hit rate at 20 (Hit@20, the fraction of queries with at least one relevant result in the top 20), and Top-1 accuracy (the fraction of queries whose correct item is ranked first), with scores of 0.504, 0.694, and 0.411, respectively. It also reduces per-query latency by a factor of 8.5 relative to direct transformer encoding. These results demonstrate that, in fixed-teacher regimes, lightweight geometric estimators can substitute for online neural encoding, preserving retrieval performance while substantially improving efficiency and interpretability.
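The three rank-sensitive metrics defined in the abstract can be sketched as follows. This is an illustrative implementation of those definitions, not the authors' evaluation code; the function name `rank_metrics`, the input layout (one ranked list and one set of relevant ids per query), and the toy data are assumptions.

```python
def rank_metrics(rankings, relevant, k=20):
    """Compute MRR@k, Hit@k, and Top-1 accuracy.

    rankings: one list of retrieved ids per query, best first.
    relevant: one set of relevant ids per query.
    A query with no relevant id in its top k contributes 0 to all three.
    """
    n = len(rankings)
    rr_sum = 0.0
    hits = 0
    top1 = 0
    for ranking, rel in zip(rankings, relevant):
        # 1-based rank of the first relevant result within the top k.
        first_rank = next(
            (i + 1 for i, doc in enumerate(ranking[:k]) if doc in rel),
            None,
        )
        if first_rank is not None:
            rr_sum += 1.0 / first_rank
            hits += 1
            if first_rank == 1:
                top1 += 1
    return {"mrr": rr_sum / n, "hit": hits / n, "top1": top1 / n}


# Toy example with hypothetical ids: ranks 1, 2, and a miss.
toy_rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
toy_relevant = [{"a"}, {"y"}, {"r"}]
toy_scores = rank_metrics(toy_rankings, toy_relevant)
```

In the toy example the reciprocal ranks are 1, 1/2, and 0, so MRR@20 is 0.5, Hit@20 is 2/3, and Top-1 accuracy is 1/3. Because Top-1 counts only rank-1 results, it can never exceed MRR@k, which in turn can never exceed Hit@k; the reported scores (0.411, 0.504, 0.694) follow that ordering.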

Problem

Research questions and friction points this paper is trying to address.

semantic retrieval

query encoding

compute efficiency

fixed-teacher adaptation

neural inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

Kernel Affine Hull Machines

semantic retrieval

query-side encoding