🤖 AI Summary
This work addresses the vocabulary and modality gaps that defeat keyword and embedding filters when matching content among agents in the edge-cloud continuum, proposing large language models (LLMs) as the semantic-matching engine of a content-based publish/subscribe broker. Matching is framed as offline multi-label retrieval and evaluated on social-media, legal, and smart-home sensor datasets with six LLMs and seven baselines. The central result is a two-crossover cost-accuracy characterisation: an analytical context-window crossover below which the CoverAndMerge compression pipeline reduces LLM invocations, and an empirical discrimination-capacity crossover above which matching accuracy collapses regardless of context budget. The work also contributes three composable algorithms and a per-cluster Quality-of-Experience framework for autonomic LLM-tier selection. Experiments show that, above the discrimination crossover, compression cannot recover the lost accuracy, only frontier-scale models handle large subscription sets, and backend choice outweighs configuration choice.
📝 Abstract
Large language models (LLMs) can serve as the semantic-matching engine of a content-based publish/subscribe broker for agentic AI across the edge-cloud computing continuum, bridging the vocabulary and modality gaps that defeat keyword and embedding filters. We frame matching as offline multi-label retrieval over three public datasets spanning social-media, legal, and smart-home sensor domains, with six LLMs and seven baselines. Our central contribution is a two-crossover cost-accuracy characterisation: an analytical context-window crossover below which a CoverAndMerge compression pipeline reduces LLM invocations, and an empirical discrimination-capacity crossover above which matching accuracy collapses independently of context budget, at a model-dependent point that varies with parameter count and training generation. Two findings carry practical weight. First, above the discrimination crossover, compression cannot recover accuracy and only frontier-scale models clear large subscription sets. Second, in that regime backend choice dominates configuration choice, so model selection, not pipeline tuning, is the primary operator lever. We accompany this with three composable algorithms and a per-cluster Quality-of-Experience framework for autonomic LLM-tier selection.
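To make the multi-label retrieval framing concrete, the following minimal sketch scores a matcher by comparing, for each publication, the set of subscriptions it returned against the set that is actually relevant. The function name `micro_prf`, the subscription identifiers, and the choice of micro-averaged precision, recall, and F1 are illustrative assumptions; the abstract does not state which metrics the work reports.

```python
from typing import List, Set, Tuple


def micro_prf(predicted: List[Set[str]], relevant: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over publications.

    predicted[i] is the set of subscription ids a matcher returned for
    publication i; relevant[i] is the ground-truth set (hypothetical data).
    """
    if len(predicted) != len(relevant):
        raise ValueError("predicted and relevant must have the same length")

    # Pool true positives, false positives and false negatives across publications.
    tp = sum(len(p & r) for p, r in zip(predicted, relevant))
    fp = sum(len(p - r) for p, r in zip(predicted, relevant))
    fn = sum(len(r - p) for p, r in zip(predicted, relevant))

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical example: two publications, four subscriptions.
predicted = [{"s1", "s2"}, {"s3"}]
relevant = [{"s1"}, {"s3", "s4"}]
precision, recall, f1 = micro_prf(predicted, relevant)
```

Here one returned subscription is spurious and one relevant subscription is missed, so precision and recall both come out at 2/3. Micro-averaging pools counts across publications, which weights publications with many matching subscriptions more heavily; a per-publication (macro) average would be an equally plausible choice.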