Evaluating Nova 2.0 Lite model under Amazon's Frontier Model Safety Framework

📅 2026-01-27
📈 Citations: 2
✨ Influential: 0
🤖 AI Summary
This study presents a systematic safety evaluation of the multimodal large language model Nova 2.0 Lite in high-risk domains, including chemical, biological, radiological, and nuclear (CBRN) threats, offensive cyber operations, and automated AI development. For the first time, Amazon's Frontier Model Safety Framework (FMSF) is applied to a multimodal model with million-token context length, integrating automated benchmarking, expert red-teaming, and uplift analysis to comprehensively map its safety boundaries across text, image, and video inputs. The research constructs a holistic risk profile of the model across these three critical threat areas, assesses its compliance with FMSF release safety thresholds, and offers a methodological foundation for future safety evaluations of multimodal large models.

πŸ“ Abstract
Amazon published its Frontier Model Safety Framework (FMSF) as part of the Paris AI summit, following which we presented a report on Amazon's Premier model. In this report, we present an evaluation of Nova 2.0 Lite, a generally available model in the Nova 2.0 series and one of the series' most capable reasoning models. The model processes text, images, and video with a context length of up to 1M tokens, enabling analysis of large codebases, documents, and videos in a single prompt. We present a comprehensive evaluation of Nova 2.0 Lite's critical risk profile under the FMSF. Evaluations target three high-risk domains: Chemical, Biological, Radiological and Nuclear (CBRN), Offensive Cyber Operations, and Automated AI R&D. They combine automated benchmarks, expert red-teaming, and uplift studies to determine whether the model exceeds release thresholds. We summarize our methodology and report core findings. We will continue to enhance our safety evaluation and mitigation pipelines as new risks and capabilities associated with frontier models are identified.
Problem

Research questions and friction points this paper is trying to address.

Frontier Model Safety
Risk Evaluation
CBRN
Offensive Cyber Operations
Automated AI R&D
Innovation

Methods, ideas, or system contributions that make the work stand out.

Frontier Model Safety Framework
multimodal reasoning
long-context evaluation
red-teaming
AI safety benchmarking
Satyapriya Krishna
Harvard University
Trustworthy AI, Large Language Models, Explainable & Fair ML
Matteo Memelli
Amazon Nova Responsible AI
Tong Wang
Amazon
Natural Language Processing, Deep Learning, Topic Model
Abhinav Mohanty
Amazon Nova Responsible AI
Claire O'Brien Rajkumar
Amazon Nova Responsible AI
Payal Motwani
Amazon Nova Responsible AI
Rahul Gupta
Amazon Nova Responsible AI
Responsible AI
S. Matsoukas
Amazon Nova Responsible AI