Institutional Platform for Secure Self-Service Large Language Model Exploration

📅 2024-02-01
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of securely, efficiently, and accessibly deploying large-scale, customized large language models (LLMs) in multi-tenant academic research environments. We propose a tenant-aware, proxy-based computational network architecture that integrates zero-trust principles, secure sandboxing, role-based fine-grained access control (RBAC), and end-to-end HTTPS/TLS encryption to enable logically unified scheduling of physical resource pools while enforcing strict inter-tenant process and data isolation. Our approach innovatively combines parallel multi-LoRA inference, agent-driven resource orchestration, and an encrypted inference pipeline. Deployed at the University of Kentucky's AI Center, the platform supports secure, cross-disciplinary LLM usage by research teams. Evaluation shows a 37% reduction in inference latency and near-zero inter-tenant resource leakage risk, demonstrating robust security, scalability, and operational efficiency in production academic settings.

📝 Abstract
This paper introduces a user-friendly platform developed by the University of Kentucky Center for Applied AI, designed to make large, customized language models (LLMs) more accessible. By capitalizing on recent advancements in multi-LoRA inference, the system efficiently accommodates custom adapters for a diverse range of users and projects. The paper outlines the system's architecture and key features, encompassing dataset curation, model training, secure inference, and text-based feature extraction. We illustrate the establishment of a tenant-aware computational network using agent-based methods, securely utilizing islands of isolated resources as a unified system. The platform strives to deliver secure LLM services, emphasizing process and data isolation, end-to-end encryption, and role-based resource authentication. This contribution aligns with the overarching goal of enabling simplified access to cutting-edge AI models and technology in support of scientific discovery.
Problem

Research questions and friction points this paper is trying to address.

Enhances accessibility to large language models
Secures multi-LoRA inference for diverse users
Facilitates secure, isolated AI resource utilization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-LoRA inference adapters
Agent-based tenant-aware network
End-to-end encryption security
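
As a rough illustration of the multi-LoRA idea named above: a single set of base weights is shared, and each request adds a small low-rank correction (B times A) belonging to its own adapter. The sketch below is a toy, pure-Python version under stated assumptions; the names `lora_forward`, `serve_batch`, `ADAPTERS`, and the tenant IDs are hypothetical and are not taken from the paper or the platform.

```python
# Toy sketch of multi-LoRA inference: shared base weights, per-tenant adapters.
# All names and values here are illustrative assumptions, not the platform's code.

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(base_w, adapter, x, scale=1.0):
    """Compute base_w @ x + scale * B @ (A @ x) for a single request."""
    a, b = adapter                       # A: r x d_in, B: d_out x r
    base_out = matvec(base_w, x)         # output of the shared base layer
    delta = matvec(b, matvec(a, x))      # low-rank, adapter-specific update
    return [y + scale * d for y, d in zip(base_out, delta)]

# Shared 2x2 base weights (identity, to keep the arithmetic easy to follow).
BASE_W = [[1.0, 0.0], [0.0, 1.0]]

# One rank-1 adapter (A, B) per hypothetical tenant.
ADAPTERS = {
    "tenant_a": ([[1.0, 1.0]], [[0.5], [0.0]]),
    "tenant_b": ([[1.0, -1.0]], [[0.0], [2.0]]),
}

def serve_batch(requests):
    """Each request is (tenant_id, input_vector); the adapter is chosen per request."""
    return [lora_forward(BASE_W, ADAPTERS[tenant], x) for tenant, x in requests]

outputs = serve_batch([("tenant_a", [1.0, 2.0]), ("tenant_b", [1.0, 2.0])])
print(outputs)
```

The two requests carry the same input but produce different outputs because each one is routed through its own adapter, while the base weights are stored only once. Production multi-LoRA servers batch this computation on GPUs; the sketch only shows the arithmetic.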
V. K. Cody Bumgardner
Center for Applied AI, Institute for Biomedical Informatics, University of Kentucky, Lexington, Kentucky, USA
Mitchell A. Klusty
Center for Applied AI, Institute for Biomedical Informatics, University of Kentucky, Lexington, Kentucky, USA
W. Vaiden Logan
Center for Applied AI, Institute for Biomedical Informatics, University of Kentucky, Lexington, Kentucky, USA
Samuel E. Armstrong
Center for Applied AI, Institute for Biomedical Informatics, University of Kentucky, Lexington, Kentucky, USA
Caylin Hickey
Department of Computer Science, University of Kentucky, Lexington, Kentucky, USA
Jeff Talbert
Center for Applied AI, Institute for Biomedical Informatics, University of Kentucky, Lexington, Kentucky, USA