Black-Box Privacy Attacks on Shared Representations in Multitask Learning

📅 2025-06-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work uncovers an implicit task-level privacy leakage in multitask learning (MTL): adversaries can infer whether a specific task was included in model training solely via black-box queries that return embeddings of fresh samples in the shared representation space. To demonstrate this, the authors propose the first black-box task-inference attack framework that requires neither shadow models nor labeled reference data, and theoretically establish a strict separation in attack capability between adversaries holding training samples and those holding only fresh samples. By modeling dependencies within the embedding space and integrating statistical significance testing, the method achieves high-accuracy task-existence detection, yielding AUC scores of 0.85–0.97 across vision and language MTL benchmarks. These results demonstrate that shared representations themselves constitute a substantive privacy threat, independent of access to model parameters or auxiliary data.
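The listing does not include reference code, so the sketch below is only a minimal, hypothetical illustration of the attack shape the summary describes: query the shared encoder on fresh samples from the target task, measure how strongly their embeddings depend on one another (here, a simple mean pairwise cosine similarity), and assess significance with a permutation test. The `query_encoder` callable, the choice of statistic, and the reference-sample construction are assumptions for illustration, not the authors' exact method.

```python
# Hypothetical sketch of a black-box task-inference test (not the paper's exact attack).
# Assumptions: `query_encoder` gives black-box access to the shared representation,
# `task_samples` are fresh samples from the target task, and `reference_samples`
# are background samples used only to build a null distribution.
import numpy as np

def embed(query_encoder, samples):
    """Query the shared representation in a purely black-box fashion."""
    return np.stack([query_encoder(x) for x in samples])

def mean_pairwise_cosine(E):
    """Dependence statistic: average off-diagonal cosine similarity.
    Embeddings of a task seen during training are conjectured to cluster
    more tightly, inflating this statistic."""
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    S = E @ E.T
    n = len(E)
    return (S.sum() - n) / (n * (n - 1))

def task_inference_pvalue(query_encoder, task_samples, reference_samples,
                          n_perm=1000, seed=0):
    """Permutation test: is the target task's embedding clustering significantly
    higher than that of random same-size subsets of reference samples?"""
    rng = np.random.default_rng(seed)
    E_task = embed(query_encoder, task_samples)
    E_ref = embed(query_encoder, reference_samples)
    observed = mean_pairwise_cosine(E_task)
    k = len(E_task)
    null = np.array([
        mean_pairwise_cosine(E_ref[rng.choice(len(E_ref), size=k, replace=False)])
        for _ in range(n_perm)
    ])
    # One-sided p-value: a small p suggests the task was present during training.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)
```

A real attack would need a more careful dependence model and null construction (the paper models dependencies within the embedding space explicitly), but the overall shape — black-box queries, a dependence statistic, a significance test — follows the description above.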

📝 Abstract
Multitask learning (MTL) has emerged as a powerful paradigm that leverages similarities among multiple learning tasks, each with insufficient samples to train a standalone model, to solve them simultaneously while minimizing data sharing across users and organizations. MTL typically accomplishes this goal by learning a shared representation that captures common structure among the tasks by embedding data from all tasks into a common feature space. Despite being designed to be the smallest unit of shared information necessary to effectively learn patterns across multiple tasks, these shared representations can inadvertently leak sensitive information about the particular tasks they were trained on. In this work, we investigate what information is revealed by the shared representations through the lens of inference attacks. Towards this, we propose a novel, black-box task-inference threat model where the adversary, given the embedding vectors produced by querying the shared representation on samples from a particular task, aims to determine whether that task was present when training the shared representation. We develop efficient, purely black-box attacks on machine learning models that exploit the dependencies between embeddings from the same task without requiring shadow models or labeled reference data. We evaluate our attacks across vision and language domains for multiple use cases of MTL and demonstrate that even with access only to fresh task samples rather than training data, a black-box adversary can successfully infer a task's inclusion in training. To complement our experiments, we provide theoretical analysis of a simplified learning setting and show a strict separation between adversaries with training samples and fresh samples from the target task's distribution.
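For readers unfamiliar with the setup the abstract describes, here is a minimal, hypothetical PyTorch sketch of hard-parameter-sharing MTL: a shared trunk embeds data from all tasks into a common feature space, and lightweight per-task heads sit on top. The module names and dimensions are illustrative assumptions only; the point is that the shared trunk's output is the entire attack surface available to the black-box adversary.

```python
# Minimal hard-parameter-sharing MTL sketch (illustrative; names and dims are assumptions).
import torch
import torch.nn as nn

class SharedRepresentationMTL(nn.Module):
    def __init__(self, input_dim=128, embed_dim=64, num_tasks=5, num_classes=10):
        super().__init__()
        # Shared representation: embeds samples from *all* tasks into one space.
        # This is the component the black-box adversary can query.
        self.shared = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )
        # Per-task heads: small, task-specific, and not exposed to the adversary.
        self.heads = nn.ModuleList(
            [nn.Linear(embed_dim, num_classes) for _ in range(num_tasks)]
        )

    def embed(self, x):
        """Black-box query surface: returns only the shared embedding."""
        return self.shared(x)

    def forward(self, x, task_id):
        return self.heads[task_id](self.embed(x))
```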
Problem

Research questions and friction points this paper is trying to address.

Black-box attacks reveal sensitive information from shared MTL representations
Adversaries infer task inclusion in training using only embedding vectors
Privacy risks in multitask learning without shadow models or labeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Black-box task-inference threat model
Exploiting dependencies between task embeddings
No shadow models or labeled data needed
Authors
John Abascal, Khoury College of Computer Sciences, Northeastern University
Nicolás Berrios, Khoury College of Computer Sciences, Northeastern University
Alina Oprea, Northeastern University (Computer Security, Adversarial Machine Learning, AI Security)
Jonathan Ullman, Associate Professor of Computer Science, Northeastern University (Differential Privacy, Machine Learning Theory, Cryptography)
Adam Smith, Department of Computer Science, Boston University
Matthew Jagielski, Anthropic (adversarial machine learning, differential privacy, security)