Towards Sensitivity-Aware Language Models

📅 2026-01-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the risk of sensitive information leakage in large language models (LLMs) deployed for enterprise data management, and the difficulty such models have in adhering to predefined access control policies. The study introduces the first formalization of “sensitivity awareness” and establishes its theoretical connection to differential privacy. Building on this foundation, the authors propose an efficient supervised fine-tuning method tailored for 4-bit quantized LLMs. Experimental results demonstrate that the fine-tuned models achieve up to a 21.7% improvement on sensitivity-awareness tasks, outperforming both open-source and commercial full-precision models of comparable size. Crucially, these gains are attained without compromising performance on general instruction-following, mathematical reasoning, or commonsense reasoning benchmarks, thereby effectively balancing privacy preservation with broad functional capabilities.

📝 Abstract
With LLMs increasingly deployed in corporate data management, it is crucial to ensure that these models do not leak sensitive information. In this context, the concept of sensitivity awareness has been introduced, enabling LLMs to adhere to predefined access rights rules. However, it remains unclear how sensitivity awareness relates to established notions of privacy, such as differential privacy (DP), making it difficult to deploy meaningfully in real-world applications. In this work, we formalize the notion of sensitivity awareness and theoretically establish its connection to DP. Additionally, we develop a supervised fine-tuning recipe to make existing, four-bit quantized LLMs more sensitivity-aware. With a performance boost of up to 21.7%, the fine-tuned LLMs not only substantially improve over their baselines but also outperform other full-precision open-source and commercial models of similar size in achieving sensitivity awareness, demonstrating the effectiveness of our proposed approach. At the same time, our method largely preserves the models' performance on other tasks, such as general instruction-following, mathematical reasoning, and commonsense reasoning.
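The abstract does not reproduce the paper's formalization. For reference, the standard (ε, δ)-differential-privacy guarantee that sensitivity awareness is connected to reads:

```latex
% Standard (\varepsilon, \delta)-DP definition (textbook form, not the paper's notation):
% A randomized mechanism $\mathcal{M}$ is $(\varepsilon, \delta)$-differentially private
% if for all neighboring datasets $D, D'$ (differing in one record) and all measurable
% output sets $S$:
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \Pr[\mathcal{M}(D') \in S] + \delta
```

Smaller ε and δ mean the mechanism's output distribution reveals less about any individual record; the paper's contribution is relating this record-level guarantee to rule-based access control over sensitive fields.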
Problem

Research questions and friction points this paper is trying to address.

sensitivity awareness
large language models
differential privacy
data privacy
access control
Innovation

Methods, ideas, or system contributions that make the work stand out.

sensitivity awareness
differential privacy
quantized LLMs
supervised fine-tuning
privacy-preserving language models
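The fine-tuning recipe itself is not detailed on this page, but the target behavior — answering only with fields the requester's access rights permit — can be illustrated with a minimal, purely hypothetical sketch (the record, roles, and policy below are invented for illustration and are not from the paper):

```python
# Illustrative sketch of sensitivity-aware answering: a query is served only
# when the requester's clearance covers the requested field's sensitivity level.
# All names, levels, and policies here are hypothetical examples.

RECORD = {
    "name": "Jane Doe",        # low sensitivity
    "department": "Finance",   # low sensitivity
    "salary": "95,000 EUR",    # high sensitivity
}

FIELD_LEVEL = {"name": 1, "department": 1, "salary": 3}
ROLE_CLEARANCE = {"intern": 1, "manager": 2, "hr_admin": 3}

def answer(field: str, role: str) -> str:
    """Return the field value only if the role's clearance covers its level."""
    clearance = ROLE_CLEARANCE.get(role, 0)
    # Unknown fields default to the highest sensitivity (deny by default).
    if FIELD_LEVEL.get(field, 4) <= clearance:
        return RECORD[field]
    return "[REDACTED: insufficient access rights]"
```

A sensitivity-aware LLM internalizes this kind of policy through fine-tuning rather than an external filter, which is why the paper measures both policy adherence and preservation of general capabilities.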