🤖 AI Summary
This work addresses the challenges small businesses face in deploying large language model (LLM)-powered customer service chatbots, including high infrastructure costs, engineering complexity, and prompt injection vulnerabilities in retrieval-augmented generation (RAG) scenarios. To overcome these barriers, the authors propose an open-source, no-code, multi-tenant platform that delivers secure, isolated LLM services on low-cost heterogeneous hardware using lightweight k3s clusters. The platform integrates an encrypted overlay network, container-based isolation, and tenant-level data access controls, while translating state-of-the-art prompt injection defenses into practical, model-agnostic mechanisms that require no retraining. Evaluation in a real-world e-commerce setting demonstrates that the solution efficiently enables customized LLM chatbot deployment under the stringent resource and security constraints typical of small enterprises.
📝 Abstract
Large Language Model (LLM)-based question-answering systems offer significant potential for automating customer support and internal knowledge access in small businesses, yet their practical deployment remains challenging due to infrastructure costs, engineering complexity, and security risks, particularly in retrieval-augmented generation (RAG)-based settings. This paper presents an industry case study of an open-source, multi-tenant platform that enables small businesses to deploy customised LLM-based support chatbots via a no-code workflow. The platform is built on distributed, lightweight k3s clusters spanning heterogeneous, low-cost machines and interconnected through an encrypted overlay network, enabling cost-efficient resource pooling while enforcing container-based isolation and per-tenant data access controls. In addition, the platform integrates practical, platform-level defences against prompt injection attacks in RAG-based chatbots, translating insights from recent prompt injection research into deployable security mechanisms without requiring model retraining or enterprise-scale infrastructure. We evaluate the proposed platform through a real-world e-commerce deployment, demonstrating that secure and efficient LLM-based chatbot services can be achieved under realistic cost, operational, and security constraints faced by small businesses.