🤖 AI Summary
This work addresses the lack of a systematic taxonomy for retrieval-augmented generation (RAG) applications. We propose what is, to the best of the authors' knowledge, the first holistic, lifecycle-spanning classification framework for RAG applications. Methodologically, we introduce a four-stage iterative construction paradigm (multi-round expert collaboration, systematic literature review, dimensional abstraction, and empirical validation) that is foreign to the ACL community and thereby adds to the set of methods available for taxonomy creation. The framework comprises five meta-dimensions and sixteen fine-grained dimensions, balancing structural clarity with extensibility. Empirical validation across education, healthcare, and legal domains demonstrates its effectiveness in supporting design decisions, technical evaluation, and cross-domain understanding of RAG applications. By providing a foundational taxonomic infrastructure, this work advances the engineering-oriented deployment and standardization of RAG systems.
📝 Abstract
In this research, we develop a taxonomy that gives a comprehensive overview of the constituting characteristics that define retrieval-augmented generation (RAG) applications, facilitating the adoption of this technology in different application domains. To the best of our knowledge, no holistic RAG application taxonomy has been developed so far. We employ a method foreign to ACL and thus contribute to the set of methods for taxonomy creation. The method comprises four iterative phases designed to refine and enhance our understanding and presentation of RAG's core dimensions. We have developed a total of five meta-dimensions and sixteen dimensions to comprehensively capture the concept of RAG applications. Thus, the taxonomy can be used to better understand RAG applications and to derive design knowledge for future solutions in specific application domains.