JT-Safe-V2: Safety-by-Design Foundation Model with World-Context Data

📅 2026-05-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of combining strong general capability with robust safety in large language models for enterprise applications. It extends the earlier JT-Safe model toward a more comprehensive “safety-by-design” paradigm that jointly optimizes general intelligence and safety through three components: pre-training data enriched with contextual world knowledge, high-certainty pre-training procedures, and safety-strengthening post-training for enterprise-oriented agentic capabilities. Building on these models, the authors propose Safe-MoMA (Safe Mixture of Models and Agents), a framework that orchestrates multiple models and agents for traceable, efficient inference; it reduces inference costs by more than 30% relative to the largest standalone model baseline while maintaining comparable performance. The authors report state-of-the-art results on both general intelligence and safety benchmarks and publicly release the post-trained JT-Safe-V2-35B checkpoint.

📝 Abstract

We introduce JT-Safe-V2, a large language model designed to advance the safety and trustworthiness of foundation models, extending our previous JT-Safe model toward a more comprehensive safety-by-design paradigm. JT-Safe-V2 emphasizes the joint optimization of general intelligence and safety-by-design through several key innovations: enriching pre-training data with contextual world knowledge, adopting high-certainty pre-training procedures, and applying safety-strengthening post-training mechanisms for enterprise-oriented agentic capabilities. Building on these safety-enhanced foundation models, we propose Safe-MoMA (Safe Mixture of Models and Agents), a framework that enables traceable and efficient inference through the orchestrated deployment of multiple models and agents. Extensive evaluations demonstrate that JT-Safe-V2 achieves state-of-the-art performance across both general intelligence and safety benchmarks. Moreover, Safe-MoMA reduces inference costs by more than 30% compared to the largest standalone model baseline while maintaining comparable performance. To facilitate future research on safety-by-design foundation models, we publicly release the post-trained JT-Safe-V2-35B model checkpoint.
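The abstract does not describe how Safe-MoMA orchestrates its models, so the following is only an illustrative sketch of one common way a multi-model setup can lower inference cost while keeping a trace of each decision: a confidence-gated cascade that tries a cheaper model first and escalates only when needed. Everything here is an assumption for illustration, not the authors' method: the `Model`, `Trace`, and `route` names, the confidence threshold, the per-query costs, and the toy models are all hypothetical, and the resulting saving is unrelated to the 30% figure reported in the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Model:
    name: str
    cost: float  # hypothetical cost per query, in arbitrary units
    answer: Callable[[str], Tuple[str, float]]  # returns (text, confidence in [0, 1])


@dataclass
class Trace:
    query: str
    steps: List[Tuple[str, float]] = field(default_factory=list)  # (model name, confidence)
    cost: float = 0.0
    output: str = ""


def route(query: str, models: List[Model], threshold: float = 0.8) -> Trace:
    """Try models from cheapest to most expensive; stop at the first confident answer."""
    trace = Trace(query)
    for model in sorted(models, key=lambda m: m.cost):
        text, confidence = model.answer(query)
        trace.steps.append((model.name, confidence))  # record every hop for traceability
        trace.cost += model.cost
        trace.output = text
        if confidence >= threshold:
            break
    return trace


# Toy stand-ins (assumptions): the small model is only confident on short queries.
small = Model("small", 1.0, lambda q: ("short answer", 0.9 if len(q) < 20 else 0.4))
large = Model("large", 4.0, lambda q: ("detailed answer", 0.95))

queries = [
    "2 + 2?",
    "capital of France?",
    "summarize this long contract clause by clause",
]
traces = [route(q, [small, large]) for q in queries]

total = sum(t.cost for t in traces)       # cost of the cascade
baseline = large.cost * len(queries)      # cost of always calling the largest model
saving = 1 - total / baseline             # fraction saved on this toy workload
```

In this toy setup the saving depends entirely on how many queries the cheaper model can handle confidently; `traces` keeps the per-query record of which models were consulted and with what confidence.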

Problem

Research questions and friction points this paper is trying to address.

safety-by-design

foundation model

trustworthiness

enterprise agentic capabilities

world-context data

Innovation

Methods, ideas, or system contributions that make the work stand out.

safety-by-design

world-context data

Safe-MoMA