OlaMind: Towards Human-Like and Hallucination-Safe Customer Service for Retrieval-Augmented Dialogue

📅 2025-10-24

🤖 AI Summary

Problem: Intelligent customer service systems built on retrieval-augmented generation (RAG) frequently hallucinate and produce rigid responses, degrading user experience and increasing business risk. Method: We propose a human-like two-stage learning framework: (1) expert reasoning modeling, which supervises imitation of human inference processes; and (2) cold-start fine-tuning combined with progressive reinforcement learning for difficulty-structured self-refinement, moving from simple to complex cases. The approach integrates RAG, supervised fine-tuning (SFT), and reinforcement learning (RL) to jointly optimize factual consistency and linguistic naturalness. Contribution/Results: In large-scale online A/B tests, the framework improved intelligent resolution rates by a relative 28.92% in community support and 18.42% in live-streaming interaction scenarios, while reducing human takeover rates by 6.08% and 7.12%, respectively, improving both system safety and human-likeness.

📝 Abstract

Intelligent customer service (ICS) systems built on retrieval-augmented generation (RAG) have been widely adopted in Web-based domains such as social platforms and e-commerce, achieving remarkable improvements in automation and efficiency. However, notable limitations remain: these systems are prone to hallucinations and often generate rigid, mechanical responses, which can introduce business risks and undermine user experience in Web-based customer service interactions. In this paper, we introduce OlaMind, a human-like and hallucination-safe customer service framework for retrieval-augmented dialogue. Specifically, it first uses a Learn-to-Think stage to learn the reasoning processes and response strategies of human experts, and then employs a Learn-to-Respond stage that performs cold-start supervised fine-tuning (SFT) combined with reinforcement learning (RL) for basic-to-hard self-refinement. Our method significantly enhances human-likeness and naturalness while effectively mitigating hallucinations and critical business risks. We conducted large-scale online A/B experiments in an industry-level social customer service setting. The results show that OlaMind achieves significant cumulative relative improvements, with intelligent resolution rates of +28.92%/+18.42% and human takeover rates of -6.08%/-7.12% in community-support/livestream-interaction scenarios, respectively, which highlights its consistent effectiveness across diverse real-world applications. The code and data will be publicly available.
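
The A/B figures above are reported as relative changes. As a reading aid, the sketch below shows how a relative change between a control and a treatment arm is commonly computed. The function name and the example rates are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's exact definition of "cumulative" is not given in the abstract.

```python
def relative_change(control: float, treatment: float) -> float:
    """Return the relative change of treatment over control, in percent."""
    if control == 0:
        raise ValueError("control rate must be non-zero")
    return (treatment - control) / control * 100.0


# Hypothetical rates for illustration only (not reported in the paper).
control_rate = 0.50    # assumed resolution rate in the control arm
treatment_rate = 0.60  # assumed resolution rate in the treatment arm

lift = relative_change(control_rate, treatment_rate)
print(f"relative change: {lift:+.2f}%")
```

Under this reading, a relative gain of 28.92% is measured against the control arm's rate and is not an increase of 28.92 percentage points.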

Problem

Research questions and friction points this paper is trying to address.

Mitigating hallucinations in retrieval-augmented customer service systems

Reducing rigid mechanical responses in web-based dialogue interactions

Addressing business risks from poor automated customer service quality

Innovation

Methods, ideas, or system contributions that make the work stand out.

Learn-to-Think stage learns expert reasoning processes

Learn-to-Respond stage combines cold-start SFT with RL for basic-to-hard self-refinement

Framework enhances human-likeness while reducing hallucinations
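
The basic-to-hard ordering mentioned above can be illustrated with a minimal curriculum sketch. It assumes each training sample already carries a difficulty score; the `Sample` class, the `difficulty` field, and `curriculum_stages` are hypothetical names, and the paper's actual difficulty criterion and RL procedure are not described on this page.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    query: str
    difficulty: float  # hypothetical score; higher means harder


def curriculum_stages(samples: list[Sample], num_stages: int) -> list[list[Sample]]:
    """Sort samples from easy to hard and split them into at most num_stages chunks."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ordered = sorted(samples, key=lambda s: s.difficulty)
    if not ordered:
        return []
    stage_size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


# Illustrative data only; queries and scores are made up.
samples = [
    Sample("a", 0.9),
    Sample("b", 0.1),
    Sample("c", 0.5),
    Sample("d", 0.3),
    Sample("e", 0.7),
]
stages = curriculum_stages(samples, num_stages=2)
for index, stage in enumerate(stages, start=1):
    print(index, [s.query for s in stage])
```

A training loop would then visit the stages in order, so that the policy sees simple cases before complex ones.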
