Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models

📅 2024-06-08
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
While edge-deployed small language models (SLMs) offer privacy preservation and low-latency inference on smartphones and other end devices, they exhibit heightened security and ethical vulnerabilities compared to cloud-based counterparts. Method: This work introduces the first dual-dimension trustworthiness–ethics evaluation framework specifically designed for on-device SLMs, assessing fairness, privacy protection, and harmful content mitigation. We curate a novel unethical question dataset and conduct comparative edge–cloud experiments, zero-intervention robustness tests, and empirical behavioral analysis. Results: Our evaluation reveals that on-device SLMs generate high-risk content—including illegal, hateful, self-harm, and phishing material—without jailbreaking or adversarial prompting; they produce stereotypical and discriminatory responses significantly more frequently than cloud models and universally lack basic content filtering mechanisms. This study provides critical empirical evidence and a foundational methodology for governing trustworthy AI at the edge.

Technology Category

Application Category

📝 Abstract
In this paper, we present a very first study to investigate trust and ethical implications of on-device artificial intelligence (AI), focusing on small language models (SLMs) amenable for personal devices like smartphones. While on-device SLMs promise enhanced privacy, reduced latency, and improved user experience compared to cloud-based services, we posit that they might also introduce significant risks and vulnerabilities compared to their on-server counterparts. As part of our trust assessment study, we conduct a systematic evaluation of the state-of-the-art on-devices SLMs, contrasted to their on-server counterparts, based on a well-established trustworthiness measurement framework. Our results show on-device SLMs to be significantly less trustworthy, specifically demonstrating more stereotypical, unfair and privacy-breaching behavior. Informed by these findings, we then perform our ethics assessment study using a dataset of unethical questions, that depicts harmful scenarios. Our results illustrate the lacking ethical safeguards in on-device SLMs, emphasizing their capabilities of generating harmful content. Further, the broken safeguards and exploitable nature of on-device SLMs is demonstrated using potentially unethical vanilla prompts, to which the on-device SLMs answer with valid responses without any filters and without the need for any jailbreaking or prompt engineering. These responses can be abused for various harmful and unethical scenarios like: societal harm, illegal activities, hate, self-harm, exploitable phishing content and many others, all of which indicates the severe vulnerability and exploitability of these on-device SLMs.
Problem

Research questions and friction points this paper is trying to address.

Assessing trust and ethical risks in on-device small language models.
Evaluating vulnerabilities and harmful content generation in on-device AI.
Identifying exploitable safeguards in on-device SLMs for unethical scenarios.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic evaluation of on-device SLMs trustworthiness
Ethics assessment using unethical question dataset
Demonstration of on-device SLMs exploitable vulnerabilities