🤖 AI Summary
To address coarse-grained information localization and non-attributable answers in PDF document question answering, this paper proposes a fine-grained, sentence-level retrieval-augmented generation (RAG) framework. First, PDF documents are parsed and encoded into sentence-level embeddings using Sentence-BERT to construct a dense vector index. Then, RAG is employed to enable precise paragraph retrieval and context-aware answer generation for natural language queries. Our key innovation lies in the first deep integration of RAG with sentence-level retrieval, enabling answer traceability and interactive, highlightable key-sentence extraction. Evaluated on standard PDF QA and summarization benchmarks, our method achieves a ROUGE-L score of 0.521—significantly outperforming baseline models—and improves passage localization accuracy by 23.6%. The framework thus advances accuracy, interpretability, and user experience in PDF-based QA.
📝 Abstract
This study introduces a system leveraging Large Language Models (LLMs) to extract text and enhance user interaction with PDF documents via a conversational interface. Utilizing Retrieval-Augmented Generation (RAG), the system provides informative responses to user inquiries while highlighting relevant passages within the PDF. Upon user upload, the system processes the PDF, employing sentence embeddings to create a document-specific vector store. This vector store enables efficient retrieval of pertinent sections in response to user queries. The LLM then engages in a conversational exchange, using the retrieved information to extract text and generate comprehensive, contextually aware answers. While our approach demonstrates competitive ROUGE values compared to existing state-of-the-art techniques for text extraction and summarization, we acknowledge that further qualitative evaluation is necessary to fully assess its effectiveness in real-world applications. The proposed system gives competitive ROUGE values as compared to existing state-of-the-art techniques for text extraction and summarization, thus offering a valuable tool for researchers, students, and anyone seeking to efficiently extract knowledge and gain insights from documents through an intuitive question-answering interface.