Institutional Platform for Secure Self-Service Large Language Model Exploration

📅 2024-02-01

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of deploying large, customized language models (LLMs) securely, efficiently, and accessibly in multi-tenant academic research environments. It proposes a tenant-aware, proxy-based computational network architecture that combines zero-trust principles, secure sandboxing, fine-grained role-based access control (RBAC), and end-to-end HTTPS/TLS encryption. The architecture schedules physically separate resource pools as one logical system while enforcing strict process and data isolation between tenants. The approach combines parallel multi-LoRA inference, agent-driven resource orchestration, and an encrypted inference pipeline. Deployed at the University of Kentucky Center for Applied AI, the platform supports secure, cross-disciplinary LLM usage by research teams. Evaluation shows a 37% reduction in inference latency and near-zero inter-tenant resource leakage risk, demonstrating robust security, scalability, and operational efficiency in production academic settings.
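The combination of tenant isolation and role-based access control described above can be illustrated with a minimal Python sketch. This is not the platform's implementation: the `User` and `Adapter` classes, the role names, and the `ROLE_PERMISSIONS` table are all hypothetical, chosen only to show a deny-by-default check that first enforces the tenant boundary and then consults the user's roles.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Adapter:
    name: str
    tenant: str  # owning tenant (hypothetical field)


@dataclass
class User:
    name: str
    tenant: str
    roles: set = field(default_factory=set)


# Hypothetical role-to-permission table; the paper does not list its roles.
ROLE_PERMISSIONS = {
    "viewer": {"infer"},
    "trainer": {"infer", "train"},
}


def is_allowed(user: User, adapter: Adapter, action: str) -> bool:
    """Allow an action only inside the user's own tenant and granted roles."""
    if user.tenant != adapter.tenant:
        return False  # deny by default across tenant boundaries
    return any(
        action in ROLE_PERMISSIONS.get(role, set()) for role in user.roles
    )
```

Under these assumptions, a user with only the `viewer` role could run inference against an adapter owned by the same tenant, but could not train it, and could not touch another tenant's adapter at all.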

📝 Abstract

This paper introduces a user-friendly platform developed by the University of Kentucky Center for Applied AI, designed to make large, customized language models (LLMs) more accessible. By capitalizing on recent advancements in multi-LoRA inference, the system efficiently accommodates custom adapters for a diverse range of users and projects. The paper outlines the system's architecture and key features, encompassing dataset curation, model training, secure inference, and text-based feature extraction. We illustrate the establishment of a tenant-aware computational network using agent-based methods, securely utilizing islands of isolated resources as a unified system. The platform strives to deliver secure LLM services, emphasizing process and data isolation, end-to-end encryption, and role-based resource authentication. This contribution aligns with the overarching goal of enabling simplified access to cutting-edge AI models and technology in support of scientific discovery.
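The idea behind multi-LoRA inference can be sketched in a few lines of Python: every request shares one set of base weights, and each request adds its own low-rank update B(Ax) selected by adapter name. This is a simplified illustration, not the platform's code. The function names are hypothetical, the LoRA scaling factor is omitted for brevity, and the plain loop stands in for the batched computation a real serving system would likely use.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def multi_lora_forward(base_weight, adapters, batch):
    """Apply one shared base layer plus a per-request low-rank update.

    adapters maps an adapter name to (A, B), where A is r x d_in and
    B is d_out x r. batch is a list of (adapter_name, input_vector);
    an adapter_name of None means "base model only".
    """
    outputs = []
    for adapter_name, x in batch:
        y = matvec(base_weight, x)  # shared by every request
        if adapter_name is not None:
            a, b = adapters[adapter_name]
            delta = matvec(b, matvec(a, x))  # B(Ax); never forms B @ A
            y = [yi + di for yi, di in zip(y, delta)]
        outputs.append(y)
    return outputs
```

Because the base weights are shared and only the small A and B matrices differ per adapter, many custom adapters can be served from one copy of the base model.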

Problem

Research questions and friction points this paper is trying to address.

Enhances accessibility to large language models

Secures multi-LoRA inference for diverse users

Facilitates secure, isolated AI resource utilization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-LoRA inference adapters

Agent-based tenant-aware network

End-to-end encryption security

🔎 Similar Papers

LangBiTe: A Platform for Testing Bias in Large Language Models