OwlerLite: Scope- and Freshness-Aware Web Retrieval for LLM Assistants

📅 2026-01-25
🤖 AI Summary
This work addresses a limitation of current browser-based large language model (LLM) assistants that use retrieval-augmented generation (RAG): they rely on static, often outdated indices and give users no control over which sources are consulted or how fresh the data is, which can produce stale or untrustworthy responses. The paper introduces OwlerLite, a browser-based RAG system that combines user-defined, reusable source scopes with a freshness-aware crawler. The crawler monitors live web pages, uses semantic change detection to identify meaningful updates, and selectively re-indexes changed content. Retrieval then combines textual relevance, the user's scope choice, and content recency in a single model. The stated aim is more timely, controllable, and trustworthy retrieval, with provenance that users can inspect and control.
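
The change-detection step can be illustrated with a minimal sketch. The summary does not say how the paper's semantic change detector works, so the code below substitutes a simple token-set Jaccard similarity as a stand-in. The function names (`tokens`, `similarity`, `needs_reindex`) and the 0.8 threshold are hypothetical, chosen only for illustration.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens; a crude stand-in for a semantic representation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(old: str, new: str) -> float:
    """Jaccard similarity between the token sets of two page snapshots."""
    a, b = tokens(old), tokens(new)
    if not a and not b:
        return 1.0  # two empty snapshots count as identical
    return len(a & b) / len(a | b)


def needs_reindex(old: str, new: str, threshold: float = 0.8) -> bool:
    """Flag a page for re-indexing only when its content changed meaningfully.

    The threshold is an assumed value, not one taken from the paper.
    """
    return similarity(old, new) < threshold


old = "The library supports Python 3.9 and 3.10."
cosmetic = "The  library supports  PYTHON 3.9 and 3.10!"
changed = "Version 2 drops Python 3.9 and adds async retrieval with new scopes."

print(needs_reindex(old, cosmetic))  # cosmetic edits leave the token set unchanged
print(needs_reindex(old, changed))   # a substantive rewrite falls below the threshold
```

With this kind of check, whitespace, casing, and punctuation edits do not trigger re-indexing, while a rewritten page does. A real detector would more plausibly compare embeddings than token sets.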

📝 Abstract
Browser-based language models often use retrieval-augmented generation (RAG) but typically rely on fixed, outdated indices that give users no control over which sources are consulted. This can lead to answers that mix trusted and untrusted content or draw on stale information. We present OwlerLite, a browser-based RAG system that makes user-defined scopes and data freshness central to retrieval. Users define reusable scopes (sets of web pages or sources) and select them when querying. A freshness-aware crawler monitors live pages, uses a semantic change detector to identify meaningful updates, and selectively re-indexes changed content. OwlerLite integrates text relevance, scope choice, and recency into a unified retrieval model. Implemented as a browser extension, it represents a step toward more controllable and trustworthy web assistants.
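
One way to picture a retrieval model that combines text relevance, scope choice, and recency is the sketch below. The abstract does not give the actual formula, so everything here is an assumption: the scope acts as a hard filter on URL prefixes, recency is an exponential decay with a 30-day half-life, and the final score is a weighted sum controlled by `alpha`. The names `Doc`, `in_scope`, `recency`, `score`, and `rank` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Doc:
    url: str
    relevance: float  # text relevance, assumed normalized to [0, 1]
    age_days: float   # days since the last detected content change


def in_scope(url: str, scope: list[str]) -> bool:
    """A scope is modeled here as a list of URL prefixes (an assumption)."""
    return any(url.startswith(prefix) for prefix in scope)


def recency(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for fresh content, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def score(doc: Doc, scope: list[str], alpha: float = 0.7) -> float:
    """Weighted sum of relevance and recency; out-of-scope documents score 0."""
    if not in_scope(doc.url, scope):
        return 0.0
    return alpha * doc.relevance + (1 - alpha) * recency(doc.age_days)


def rank(docs: list[Doc], scope: list[str], alpha: float = 0.7) -> list[Doc]:
    """Drop out-of-scope documents, then sort the rest by combined score."""
    scoped = [d for d in docs if in_scope(d.url, scope)]
    return sorted(scoped, key=lambda d: score(d, scope, alpha), reverse=True)


docs = [
    Doc("https://docs.example.org/a", relevance=0.8, age_days=60),
    Doc("https://docs.example.org/b", relevance=0.7, age_days=0),
    Doc("https://other.example.com/c", relevance=0.95, age_days=0),
]
scope = ["https://docs.example.org/"]

for d in rank(docs, scope):
    print(d.url, round(score(d, scope), 3))
```

In this toy setup the fresher page `b` outranks the slightly more relevant but older page `a`, and page `c` is excluded despite its high relevance because it lies outside the selected scope.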
Problem

Research questions and friction points this paper is trying to address.

retrieval-augmented generation
web retrieval
data freshness
source control
trustworthy assistants
Innovation

Methods, ideas, or system contributions that make the work stand out.

scope-aware retrieval
freshness-aware crawling
semantic change detection
retrieval-augmented generation
controllable web assistant
Saber Zerhoudi
Universität Passau
Information Retrieval, User Simulation
Michael Dinzinger
University of Passau, Passau, Germany
Michael Granitzer
University of Passau, Passau, Germany; IT:U Austria, Linz, Austria
Jelena Mitrović
University of Passau
Natural Language Processing, Artificial Intelligence, Computational Rhetoric, Legal NLP