Trustworthy Answers, Messier Data: Bridging the Gap in Low-Resource Retrieval-Augmented Generation for Domain Expert Systems

πŸ“… 2025-02-26
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
To address low credibility, poor traceability, and insufficient answer diversity of Retrieval-Augmented Generation (RAG) in low-resource domain expert systems handling heterogeneous multimodal data, this paper proposes an end-to-end trustworthy RAG framework. First, it introduces a structured corpus construction and Q&A auto-generation pipeline tailored for messy, real-world data. Second, it designs a semantic- and evidence-confidence-driven two-stage re-ranking mechanism to enhance retrieval precision. Third, it pioneers an LLM-guided reference matching algorithm that ensures answer grounding in retrieved evidence and enables fully traceable generation. Experiments in automotive engineering demonstrate significant improvements over non-RAG baselines: +1.94 in factual correctness, +1.16 in informativeness, and +1.67 in helpfulness (5-point scale, evaluated by LLM judges), validating the framework’s effectiveness in trustworthy retrieval, interpretable generation, and auditability.
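The semantic- and evidence-confidence-driven two-stage re-ranking described above can be sketched roughly as follows. The summary does not specify the paper's scoring functions, cutoff, or weighting, so `top_k`, `alpha`, and the linear blend here are illustrative assumptions, not the authors' implementation: stage one filters candidates by semantic similarity to the query, stage two re-orders the survivors by a blend of semantic score and evidence confidence.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    semantic_score: float       # stage 1: query-passage similarity (e.g. from a retriever/cross-encoder)
    evidence_confidence: float  # stage 2: confidence that the passage actually supports an answer

def two_stage_rerank(passages, top_k=3, alpha=0.6):
    """Stage 1: keep the top_k passages by semantic score.
    Stage 2: re-rank survivors by a weighted blend of semantic score
    and evidence confidence (alpha is a hypothetical mixing weight)."""
    stage1 = sorted(passages, key=lambda p: p.semantic_score, reverse=True)[:top_k]
    return sorted(
        stage1,
        key=lambda p: alpha * p.semantic_score + (1 - alpha) * p.evidence_confidence,
        reverse=True,
    )
```

The point of the second stage is visible when a semantically strong passage carries weak evidence: it survives stage one but drops below better-grounded passages in the final ordering.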

πŸ“ Abstract
RAG has become a key technique for enhancing LLMs by reducing hallucinations, especially in domain expert systems where LLMs may lack sufficient inherent knowledge. However, developing these systems in low-resource settings introduces several challenges: (1) handling heterogeneous data sources, (2) optimizing the retrieval phase for trustworthy answers, and (3) evaluating generated answers across diverse aspects. To address these, we introduce a data generation pipeline that transforms raw multi-modal data into a structured corpus and Q&A pairs, an advanced re-ranking phase that improves retrieval precision, and a reference matching algorithm that enhances answer traceability. Applied to the automotive engineering domain, our system improves factual correctness (+1.94), informativeness (+1.16), and helpfulness (+1.67) over a non-RAG baseline, on a 1-5 scale judged by an LLM. These results highlight the effectiveness of our approach across distinct aspects, with strong answer grounding and transparency.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLMs in low-resource settings
Optimizing retrieval for trustworthy answers
Improving answer evaluation in domain expert systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data generation pipeline for multi-modal data
Advanced re-ranking phase for precision
Reference matching algorithm for traceability
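The reference matching idea, linking each generated statement back to the retrieved evidence that supports it, can be illustrated with a minimal sketch. The paper's algorithm is LLM-guided; the token-overlap (Jaccard) matcher below is a cheap stand-in chosen only to make the input/output shape concrete, and the `threshold` value is a hypothetical parameter:

```python
def match_references(answer_sentences, evidence_chunks, threshold=0.3):
    """For each answer sentence, return the index of the best-overlapping
    evidence chunk (Jaccard similarity over lowercased word sets), or None
    if no chunk clears the threshold. A simple stand-in for LLM-guided
    matching: the real system would ask an LLM which chunk supports which
    sentence, but the traceable output shape is the same."""
    matched = []
    chunk_tokens = [set(c.lower().split()) for c in evidence_chunks]
    for sent in answer_sentences:
        tokens = set(sent.lower().split())
        scores = [
            len(tokens & ct) / len(tokens | ct) if tokens | ct else 0.0
            for ct in chunk_tokens
        ]
        best = max(range(len(scores)), key=scores.__getitem__)
        matched.append(best if scores[best] >= threshold else None)
    return matched
```

A sentence that matches no evidence chunk is flagged with `None`, which is what makes ungrounded generations auditable rather than silently accepted.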