🤖 AI Summary
To address the time-consuming and non-scalable nature of traditional qualitative customer profiling, this paper proposes a persona-driven RAG (Retrieval-Augmented Generation) chatbot framework for the construction machinery industry, developed at Volvo Construction Equipment. Methodologically, it integrates synthetic customer personas generated by large language models into a RAG architecture. The personas are produced with two prompting strategies, Few-Shot and Chain-of-Thought (CoT), and are evaluated for completeness, relevance, and consistency using McNemar's test. Few-Shot prompting yields more complete personas, while CoT is more efficient in response time and token usage. Augmenting the knowledge base with synthetic personas and additional segment information raises the chatbot's average accuracy rating from 5.88 to 6.42 on a 10-point scale, and 81.82% of participants found the updated system useful in business contexts. This work offers a reusable, domain-specific approach to customer persona modeling in vertical-industry RAG systems.
📝 Abstract
The introduction of Large Language Models (LLMs) has significantly transformed Natural Language Processing (NLP) applications by enabling more advanced analysis of customer personas. At Volvo Construction Equipment (VCE), customer personas have traditionally been developed through qualitative methods, which are time-consuming and lack scalability. The main objective of this paper is to generate synthetic customer personas and integrate them into a Retrieval-Augmented Generation (RAG) chatbot to support decision-making in business processes. To this end, we first develop a persona-based RAG chatbot built on verified personas. Next, we generate synthetic personas using Few-Shot and Chain-of-Thought (CoT) prompting techniques and evaluate them for completeness, relevance, and consistency using McNemar's test. In the final step, we augment the chatbot's knowledge base with synthetic personas and additional segment information to assess improvements in response accuracy and practical utility. Key findings indicate that Few-Shot prompting produced more complete personas than CoT, while CoT was more efficient in terms of response time and token usage. After the knowledge base was augmented, the chatbot's average accuracy rating increased from 5.88 to 6.42 on a 10-point scale, and 81.82% of participants found the updated system useful in business contexts.
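The abstract compares two prompting techniques with McNemar's test, which applies to paired pass/fail judgments on the same items. The sketch below illustrates one way such a comparison could be computed. It assumes the exact (binomial) variant of the test, since the abstract does not say which variant was used. The function names and the toy pass/fail lists are hypothetical and are not data from the paper.

```python
from math import comb


def discordant_counts(few_shot_ok, cot_ok):
    """Count pairs where exactly one technique passed a criterion.

    Both arguments are equal-length lists of booleans, one entry per
    persona, e.g. "did this persona meet the completeness criterion?".
    """
    only_few_shot = sum(1 for f, c in zip(few_shot_ok, cot_ok) if f and not c)
    only_cot = sum(1 for f, c in zip(few_shot_ok, cot_ok) if c and not f)
    return only_few_shot, only_cot


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis each discordant pair is equally likely to
    favor either technique, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy judgments (NOT results from the paper).
few_shot_ok = [True, True, False, True]
cot_ok = [True, False, False, False]

b, c = discordant_counts(few_shot_ok, cot_ok)
p_value = mcnemar_exact(b, c)
print(f"discordant pairs: {b} vs {c}, p = {p_value:.3f}")
```

Only the discordant pairs enter the test. Personas on which both techniques pass, or both fail, carry no information about which technique is better, which is why the test suits a paired comparison of this kind.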