🤖 AI Summary
Question answering over large-scale knowledge bases (e.g., Wikipedia/Wikidata) suffers from hallucination and inefficiency. Method: We propose a “question-question matching” retrieval paradigm: an instruction-tuned LLM (e.g., Llama-3) generates multi-perspective questions for each knowledge unit, and these questions are embedded into a dense vector space using Sentence-BERT or ColBERT. User queries are matched directly against the precomputed question index, enabling zero-shot, generation-free, semantically aligned knowledge access. Crucially, this approach replaces document-level retrieval with question-level retrieval and integrates Wikidata’s RDF schema for structured fact mapping. Contributions/Results: Experiments on Wikipedia and Wikidata report >90% top-1 accuracy and sub-100 ms latency, and the method supports multimodal (text + multimedia) QA. The method improves scalability, reliability, and retrieval precision, and it avoids hallucination at answer time by returning stored content instead of generated text.
📝 Abstract
This paper introduces an approach to question answering over knowledge bases like Wikipedia and Wikidata by performing "question-to-question" matching and retrieval from a dense vector embedding store. Instead of embedding document content, we generate a comprehensive set of questions for each logical content unit using an instruction-tuned LLM. These questions are vector-embedded and stored, each mapping to its corresponding content. Vector embeddings of user queries are then matched against this question vector store. The stored question with the highest similarity score leads to direct retrieval of the associated article content, eliminating the need for answer generation. Our method achieves high cosine similarity (>0.9) for relevant question pairs, enabling highly precise retrieval. This approach offers several advantages, including computational efficiency, rapid response times, and increased scalability. We demonstrate its effectiveness on Wikipedia and Wikidata, including multimedia content through structured fact retrieval from Wikidata, opening up new pathways for multimodal question answering.
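The question-to-question retrieval loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: the bag-of-words `embed` function is a toy stand-in for a dense sentence encoder, the entries in `QUESTION_INDEX` are hand-written placeholders for LLM-generated questions, and the names `retrieve`, `CONTENT`, and the `threshold` value of 0.5 are assumptions introduced only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for LLM-generated questions. Each question maps to
# the id of the content unit it was generated from.
QUESTION_INDEX = [
    ("What is the capital of France?", "paris"),
    ("Which city is the seat of the French government?", "paris"),
    ("How tall is Mount Everest?", "everest"),
    ("What is the highest mountain on Earth?", "everest"),
]

# Hypothetical content units; a real system would store article sections.
CONTENT = {
    "paris": "Paris is the capital and most populous city of France.",
    "everest": "Mount Everest is Earth's highest mountain above sea level.",
}


def embed(text):
    """Toy bag-of-words vector (assumption: replaces a dense sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Precompute the question vectors once, as the question index would be.
INDEX_VECTORS = [(embed(question), cid) for question, cid in QUESTION_INDEX]


def retrieve(query, threshold=0.5):
    """Return (content, score) for the best-matching stored question.

    Content is None when no stored question clears the (assumed) threshold,
    so nothing is returned instead of a weak match.
    """
    query_vec = embed(query)
    score, cid = max(
        ((cosine(query_vec, vec), cid) for vec, cid in INDEX_VECTORS),
        key=lambda pair: pair[0],
    )
    if score < threshold:
        return None, score
    return CONTENT[cid], score
```

A call such as `retrieve("How tall is Everest?")` returns the stored Everest passage verbatim. No text is generated at query time, which is the property the abstract relies on. With a real encoder the index would typically live in a vector store, and the threshold would be tuned on held-out question pairs.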