🤖 AI Summary
This work proposes LACONIC, a family of learned sparse retrieval models based on Llama-3 (1B/3B/8B), to bridge the performance gap between sparse and dense retrievers while maintaining computational efficiency. Although dense retrieval models achieve strong performance, their high computational and memory demands hinder efficient deployment on CPUs; in contrast, traditional sparse models are efficient but underperform. LACONIC employs a two-stage training strategy: first, weakly supervised pre-finetuning adapts the causal language model to produce bidirectional contextual representations; then the model is fine-tuned with high-quality hard negatives. The LACONIC-8B variant achieves an nDCG of 60.2 on the MTEB Retrieval benchmark, ranking 15th on the leaderboard, while reducing index memory usage by 71% compared to an equivalent dense model. This enables efficient CPU-based inference without sacrificing retrieval effectiveness.
📝 Abstract
While dense retrieval models have become the standard for state-of-the-art information retrieval, their deployment is often constrained by high memory requirements and reliance on GPU accelerators for vector similarity search. Learned sparse retrieval offers a compelling alternative by enabling efficient search via inverted indices, yet it has historically received less attention than dense approaches. In this report, we introduce LACONIC, a family of learned sparse retrievers based on the Llama-3 architecture (1B, 3B, and 8B). We propose a streamlined two-phase training curriculum consisting of (1) weakly supervised pre-finetuning to adapt causal LLMs for bidirectional contextualization and (2) high-signal finetuning using curated hard negatives. Our results demonstrate that LACONIC effectively bridges the performance gap with dense models: the 8B variant achieves a strong 60.2 nDCG on the MTEB Retrieval benchmark, ranking 15th on the leaderboard as of January 1, 2026, while using 71% less index memory than an equivalent dense model. By delivering high retrieval effectiveness on commodity CPU hardware with a fraction of the compute budget required by competing models, LACONIC provides a scalable and efficient solution for real-world search applications.
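To make the efficiency argument concrete, the following is a minimal sketch of how a learned sparse retriever can be served with an inverted index: each document and query is a sparse map from vocabulary tokens to learned weights, and scoring touches only the posting lists of tokens that appear in the query. The vocabulary, weights, and documents below are purely illustrative and are not taken from the paper; LACONIC's actual encoder and index format are not shown.

```python
from collections import defaultdict

# Hypothetical sparse vectors (token -> learned weight), as a learned
# sparse retriever would emit for each document. Values are made up
# for illustration only.
docs = {
    "d1": {"llama": 1.2, "sparse": 0.8, "retrieval": 1.5},
    "d2": {"dense": 1.1, "retrieval": 0.9, "gpu": 0.4},
}

# Build an inverted index: token -> list of (doc_id, weight) postings.
index = defaultdict(list)
for doc_id, vec in docs.items():
    for token, weight in vec.items():
        index[token].append((doc_id, weight))

def search(query_vec):
    """Score documents by a sparse dot product, visiting only the
    posting lists of tokens present in the query."""
    scores = defaultdict(float)
    for token, q_weight in query_vec.items():
        for doc_id, d_weight in index[token]:
            scores[doc_id] += q_weight * d_weight
    # Highest-scoring documents first.
    return sorted(scores.items(), key=lambda kv: -kv[1])

results = search({"sparse": 1.0, "retrieval": 0.7})
```

Because only query-token posting lists are traversed, this kind of search runs efficiently on CPUs and stores only the nonzero weights, which is the source of the index-memory savings the report highlights.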