LongSafety: Enhance Safety for Long-Context LLMs

📅 2024-11-11
📈 Citations: 2
Influential: 0
📄 PDF
🤖 AI Summary
Research on safety alignment for large language models (LLMs) in long-context settings remains severely underexplored. Method: We introduce LongSafety, the first benchmark dataset specifically designed for long-context safety evaluation and training—comprising 10 safety task categories, 17K high-quality samples, and an average context length of 40.9K tokens. We establish rigorous long-text risk modeling for annotation and propose a multi-task mixed training strategy that jointly optimizes safety performance across both long and short contexts without compromising general capabilities. Contribution/Results: Our work empirically demonstrates that long-context safety is not reducible to short-context safety, revealing critical failures in cross-length generalization and scenario transfer. Evaluated on multiple long-text safety benchmarks, our approach achieves state-of-the-art results, substantially improving long-context safety while also enhancing short-context safety—thereby enabling bidirectional safety improvement.

📝 Abstract
Recent advancements in model architectures and length extrapolation techniques have significantly extended the context length of large language models (LLMs), paving the way for their application in increasingly complex tasks. However, despite the growing capabilities of long-context LLMs, safety issues in long-context scenarios remain underexplored. While safety alignment has been widely studied in short-context settings, the safety concerns of long-context LLMs have not been adequately addressed. In this work, we introduce LongSafety, a comprehensive safety alignment dataset for long-context LLMs, containing 10 tasks and 17k samples with an average length of 40.9k tokens. Our experiments demonstrate that training with LongSafety improves long-context safety performance while also enhancing short-context safety and preserving general capabilities. Furthermore, we show that long-context safety cannot be achieved simply by performing long-context alignment with short-context safety data, and that LongSafety generalizes across context lengths and long-context safety scenarios.
Problem

Research questions and friction points this paper is trying to address.

Addresses safety concerns in long-context LLMs
Introduces LongSafety dataset for safety alignment
Enhances safety performance across context lengths
Innovation

Methods, ideas, or system contributions that make the work stand out.

LongSafety dataset creation
Enhances long-context safety
Generalizes across context lengths
Mianqiu Huang
School of Computer Science, Fudan University
Xiaoran Liu
Fudan University
Shaojun Zhou
Fudan University
Mozhi Zhang
ByteDance Seed
Chenkun Tan
Fudan University
Pengyu Wang
School of Computer Science, Fudan University
Qipeng Guo
Fudan University
Zhe Xu
School of Computer Science, Fudan University
Linyang Li
Shanghai AI Lab
Zhikai Lei
Shanghai AI Lab
Linlin Li
Huawei Noah’s Ark Lab
Qun Liu
Huawei Noah’s Ark Lab
Yaqian Zhou
School of Computer Science, Fudan University
Xipeng Qiu
School of Computer Science, Fudan University
Xuanjing Huang
School of Computer Science, Fudan University