🤖 AI Summary
This work investigates the trade-off between safety and reasoning capability in large language models under computational resource constraints. Specifically, it addresses performance degradation and safety risks induced by inference-length limitations and model quantization. We propose a novel method integrating length-controlled fine-tuning with quantization-aware training: leveraging the LCPO reinforcement learning algorithm to optimize reasoning-path generation, while jointly enforcing chain-of-thought (CoT) sequence constraints and low-bit quantization to dynamically balance path length, computational cost, and output safety during inference. Experiments demonstrate that our approach reduces FLOPs by 42% on average under user-specified computational budgets, while preserving 98.3% of the original reasoning accuracy and achieving a 96.7% safety compliance rate. To the best of our knowledge, this is the first method to achieve joint optimization of safety, reasoning capability, and inference efficiency under strict resource constraints.
📝 Abstract
Test-time compute scaling has demonstrated the ability to improve the performance of reasoning language models by generating longer chain-of-thought (CoT) sequences. However, this increase in performance comes with a significant increase in computational cost. In this work, we investigate two compute-constraint strategies, (1) reasoning-length restriction and (2) model quantization, as methods to reduce the compute demand of reasoning models, and study their impact on safety performance. Specifically, we explore two approaches to applying compute constraints to reasoning models: (1) fine-tuning reasoning models with a length-controlled policy optimization (LCPO)-based reinforcement learning method to satisfy a user-defined CoT reasoning length, and (2) applying quantization to maximize the number of CoT tokens generated within a user-defined compute budget. Furthermore, we study the trade-off between the computational efficiency and the safety of the model.
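The two compute-constraint strategies in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the reward follows the general LCPO form (task correctness minus a penalty proportional to the deviation from the target CoT length), and the assumption that per-token FLOPs scale linearly with bit-width relative to a 16-bit baseline is a simplification introduced here; `alpha` and all constants are illustrative.

```python
def lcpo_reward(is_correct: bool, target_len: int, actual_len: int,
                alpha: float = 0.001) -> float:
    """LCPO-style reward: correctness score minus a penalty proportional
    to how far the generated CoT length deviates from the user's target.
    (alpha is an illustrative hyperparameter, not the paper's value.)"""
    return float(is_correct) - alpha * abs(target_len - actual_len)


def max_cot_tokens(flop_budget: float, flops_per_token_fp16: float,
                   bits: int) -> int:
    """Rough CoT token budget under quantization, assuming (simplistically)
    that per-token compute scales linearly with bit-width vs. a 16-bit
    baseline. Lower-bit weights leave room for longer CoT sequences
    within the same user-defined compute budget."""
    flops_per_token = flops_per_token_fp16 * (bits / 16)
    return int(flop_budget // flops_per_token)


# A correct answer at exactly the target length earns the full reward...
print(lcpo_reward(True, target_len=1024, actual_len=1024))   # -> 1.0
# ...while overshooting the target by 500 tokens is penalized.
print(lcpo_reward(True, target_len=1024, actual_len=1524))   # -> 0.5
# Halving precision (16 -> 8 bits) roughly doubles the affordable tokens.
print(max_cot_tokens(1e12, flops_per_token_fp16=1e9, bits=8))  # -> 2000
```

The reward term is what the RL fine-tuning optimizes; the token-budget calculation is the lens through which quantization trades numerical precision for longer reasoning traces under a fixed compute constraint.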