HKGAI-V1: Towards Regional Sovereign Large Language Model for Hong Kong

📅 2025-07-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Adapting large language models (LLMs) to Hong Kong’s multilingual context (Cantonese, Mandarin, English), “One Country, Two Systems” legal framework, and locally rooted cultural values poses significant alignment and governance challenges. Method: We propose a governance-embedded AI architecture and introduce HKGAI-V1—the first sovereign LLM tailored for Hong Kong—built upon full-parameter fine-tuning of the DeepSeek architecture, integrated with retrieval-augmented generation (RAG) and a regionally aligned safety framework; we further establish the first adversarial benchmark evaluating value alignment against Hong Kong-specific cultural and ethical norms. Contribution/Results: HKGAI-V1 achieves substantial improvements in cultural sensitivity and factual accuracy on locally relevant queries, ensures high regulatory compliance across critical domains (e.g., judiciary, education, public administration), and provides a reusable paradigm for regional sovereign AI development. Experiments demonstrate its superior performance over general-purpose baselines in both localization alignment and value consistency.

📝 Abstract

This paper presents the development of HKGAI-V1, a foundational sovereign large language model (LLM), developed as part of an initiative to establish value-aligned AI infrastructure specifically tailored for Hong Kong. Addressing the region's unique multilingual environment (Cantonese, Mandarin, and English), its distinct socio-legal context under the "one country, two systems" framework, and specific local cultural and value considerations, the model is built upon the DeepSeek architecture and systematically aligned with regional norms through a multifaceted full parameter fine-tuning process. It is further integrated with a retrieval-augmented generation (RAG) system to ensure timely and factually grounded information access. The core contribution lies in the design and implementation of a comprehensive, region-specific AI alignment and safety framework, demonstrated through two key achievements: 1) The successful development of HKGAI-V1 itself - which outperforms general-purpose models in handling Hong Kong-specific culturally sensitive queries, and embodies a "governance-embedded" approach to digital sovereignty - empowers Hong Kong to exercise control over AI applications in critical sectors including public services, legal systems, and education. 2) The development of the proprietary Adversarial HK Value Benchmark, a rigorous tool for evaluating model alignment with local ethical and legal standards under challenging conditions. By documenting these achievements, the paper provides not only a technological artifact but also a replicable blueprint for developing advanced, regionally focused AI systems deeply rooted in their local identities.

Problem

Research questions and friction points this paper is trying to address.

Develops a regional sovereign LLM for Hong Kong's multilingual needs

Aligns AI with Hong Kong's socio-legal and cultural norms

Ensures digital sovereignty in critical sectors via governance-embedded AI

Innovation

Methods, ideas, or system contributions that make the work stand out.

DeepSeek-based LLM for Hong Kong

Multilingual full parameter fine-tuning

Retrieval-augmented generation integration
