HKGAI-V1: Towards Regional Sovereign Large Language Model for Hong Kong

📅 2025-07-14
🤖 AI Summary
Problem: Adapting large language models (LLMs) to Hong Kong’s multilingual context (Cantonese, Mandarin, English), “One Country, Two Systems” legal framework, and locally rooted cultural values poses significant alignment and governance challenges. Method: We propose a governance-embedded AI architecture and introduce HKGAI-V1, the first sovereign LLM tailored for Hong Kong, built through full-parameter fine-tuning of the DeepSeek architecture and integrated with retrieval-augmented generation (RAG) and a regionally aligned safety framework; we further establish the first adversarial benchmark evaluating value alignment against Hong Kong-specific cultural and ethical norms. Contribution/Results: HKGAI-V1 achieves substantial improvements in cultural sensitivity and factual accuracy on locally relevant queries, ensures high regulatory compliance across critical domains (e.g., judiciary, education, public administration), and provides a reusable paradigm for regional sovereign AI development. Experiments demonstrate its superior performance over general-purpose baselines in both localization alignment and value consistency.

📝 Abstract
This paper presents HKGAI-V1, a foundational sovereign large language model (LLM) developed as part of an initiative to establish value-aligned AI infrastructure specifically tailored for Hong Kong. To address the region's unique multilingual environment (Cantonese, Mandarin, and English), its distinct socio-legal context under the "one country, two systems" framework, and specific local cultural and value considerations, the model is built upon the DeepSeek architecture and systematically aligned with regional norms through a multifaceted full-parameter fine-tuning process. It is further integrated with a retrieval-augmented generation (RAG) system to ensure timely and factually grounded information access. The core contribution lies in the design and implementation of a comprehensive, region-specific AI alignment and safety framework, demonstrated through two key achievements: 1) The successful development of HKGAI-V1 itself, which outperforms general-purpose models in handling Hong Kong-specific culturally sensitive queries and embodies a "governance-embedded" approach to digital sovereignty, empowering Hong Kong to exercise control over AI applications in critical sectors including public services, legal systems, and education. 2) The development of the proprietary Adversarial HK Value Benchmark, a rigorous tool for evaluating model alignment with local ethical and legal standards under challenging conditions. By documenting these achievements, the paper provides not only a technological artifact but also a replicable blueprint for developing advanced, regionally focused AI systems deeply rooted in their local identities.
Problem

Research questions and friction points this paper is trying to address.

Develops a regional sovereign LLM for Hong Kong's multilingual needs
Aligns AI with Hong Kong's socio-legal and cultural norms
Ensures digital sovereignty in critical sectors via governance-embedded AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

DeepSeek-based LLM for Hong Kong
Multilingual full-parameter fine-tuning
Retrieval-augmented generation integration
Authors

Sirui Han
The Hong Kong University of Science and Technology
Large Language Model, Interdisciplinary Artificial Intelligence
Junqi Zhu
Hong Kong Generative AI R&D Center, Hong Kong SAR, China
Ruiyuan Zhang
Zhejiang University
MultiModal, 3D Part Assembly, Mixture-of-Expert
Yike Guo
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China