🤖 AI Summary
This work addresses the longstanding neglect of low-resource languages in large language model safety evaluations, focusing on the absence of dedicated benchmarks for Albanian. We present the first structured safety evaluation dataset for Albanian, comprising 2,951 human-crafted prompts spanning 11 risk categories. Each prompt has been reviewed by linguistic experts and annotated with its risk category and an English translation. The dataset fills a critical gap in AI safety research for low-resource languages and supports model safety assessment, red-teaming, fine-tuning, and the development of mitigation strategies, advancing more inclusive and equitable AI systems.
📝 Abstract
Safety evaluation of Large Language Models (LLMs) has largely focused on high-resource languages, leaving low-resource languages critically underserved. We present AlbanianLLMSafety, the first publicly available safety evaluation dataset for LLMs in Albanian, a linguistically distinct low-resource language with approximately 7.5 million speakers across Albania, Kosovo, North Macedonia, and the diaspora. The dataset contains 2,951 prompts spanning 11 safety categories, including self-harm, violence, racist content, child exploitation, and radicalization, with an average of 268 prompts per category. Each prompt is provided in Albanian with an English reference translation and a detailed category label. This resource addresses a significant gap in safety evaluation infrastructure for low-resource languages and provides an essential benchmark for developing safer, more inclusive LLMs. The dataset will be provided upon request to support safety evaluation, fine-tuning, red-teaming, and guardrail development for Albanian-speaking communities.