LiveSecBench: A Dynamic and Culturally-Relevant AI Safety Benchmark for LLMs in Chinese Context

📅 2025-11-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing safety evaluation benchmarks for Chinese large language models (LLMs) lack dynamism and cultural adaptability, failing to reflect evolving legal, ethical, and societal norms. Method: This paper introduces LiveSecBench, the first multidimensional, dynamic safety evaluation framework tailored to Chinese law and social conventions. It assesses six dimensions: legality, ethics, factual consistency, privacy protection, adversarial robustness, and reasoning safety. Innovations include real-time threat monitoring, versioned benchmark updates, and a human–machine collaborative evaluation pipeline, with built-in extensibility for emerging modalities (e.g., text-to-image generation) and agent-based systems. Contribution/Results: The v251030 release provides a systematic evaluation of 18 mainstream Chinese LLMs and hosts a publicly accessible, reproducible real-time leaderboard, significantly improving the timeliness, objectivity, and practical utility of safety assessment.

📝 Abstract
In this work, we propose LiveSecBench, a dynamic and continuously updated safety benchmark specifically for Chinese-language LLM application scenarios. LiveSecBench evaluates models across six critical dimensions (Legality, Ethics, Factuality, Privacy, Adversarial Robustness, and Reasoning Safety) rooted in the Chinese legal and social frameworks. This benchmark maintains relevance through a dynamic update schedule that incorporates new threat vectors, such as the planned inclusion of Text-to-Image Generation Safety and Agentic Safety in the next update. For now, LiveSecBench (v251030) has evaluated 18 LLMs, providing a landscape of AI safety in the context of Chinese language. The leaderboard is publicly accessible at https://livesecbench.intokentech.cn/.
Problem

Research questions and friction points this paper is trying to address.

Evaluates AI safety across six dimensions in Chinese legal frameworks
Dynamically updates to incorporate emerging threats like text-to-image safety
Assesses 18 LLMs for security risks in Chinese language contexts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic safety benchmark for Chinese LLM scenarios
Evaluates six dimensions within Chinese legal frameworks
Continuously updates to include emerging threat vectors
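
The six evaluation dimensions lend themselves to a simple per-model score record. Below is a minimal sketch in Python, assuming a pass-rate score per dimension and an unweighted aggregate; the field names, weighting, and schema are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical illustration of LiveSecBench's six dimensions; the paper's
# real schema and aggregation method are not specified here.
DIMENSIONS = [
    "legality", "ethics", "factuality",
    "privacy", "adversarial_robustness", "reasoning_safety",
]

@dataclass
class SafetyReport:
    model: str
    benchmark_version: str          # e.g. "v251030" (versioned updates)
    scores: dict = field(default_factory=dict)  # dimension -> pass rate in [0, 1]

    def overall(self) -> float:
        """Unweighted mean over evaluated dimensions (an assumption)."""
        vals = [self.scores[d] for d in DIMENSIONS if d in self.scores]
        return sum(vals) / len(vals) if vals else 0.0

report = SafetyReport(
    model="example-llm",
    benchmark_version="v251030",
    scores={d: 0.9 for d in DIMENSIONS},
)
print(round(report.overall(), 2))  # 0.9
```

A versioned `benchmark_version` field mirrors the benchmark's dynamic update schedule: leaderboard entries remain comparable only within the same release.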
Authors

Yudong Li
Tsinghua University
Zhongliang Yang
Associate Professor, Beijing University of Posts and Telecommunications
AI Security, FinTech
Kejiang Chen
Department of Electronic Engineering and Information Science, University of Science and Technology
information hiding, steganography, privacy-preserving
Wenxuan Wang
Renmin University of China
Tianxin Zhang
IntokenTech
Sifang Wan
IntokenTech
Kecheng Wang
IntokenTech
Haitian Li
IntokenTech
Xu Wang
IntokenTech
Lefan Cheng
IntokenTech
Youdan Yang
IntokenTech
Baocheng Chen
IntokenTech
Ziyu Liu
IntokenTech
Yufei Sun
Beijing University of Posts and Telecommunications
Liyan Wu
IntokenTech
Wen Wen
IntokenTech
Xingchi Gu
IntokenTech
Peiru Yang
Tsinghua University