🤖 AI Summary
This work reveals severe privacy vulnerabilities in image autoregressive models (IARs), demonstrating substantially higher information leakage than in diffusion models (DMs). Addressing the lack of systematic privacy evaluation for IARs, the authors propose a novel membership inference attack (MIA) that exploits IARs' probabilistic modeling, identifying IAR training images with high accuracy (TPR@FPR=1% of 86.38%, vs. 4.91% for DMs). Building on this MIA, dataset inference detects dataset membership from as few as six samples for IARs (vs. 200 for DMs), and a data extraction attack recovers hundreds of training images (e.g., 698 from VAR-d30). Experiments confirm that while IARs achieve superior FID (1.48 vs. 1.58 for DMs) and faster generation, they incur substantial privacy costs. The study provides the first quantitative characterization of IAR privacy fragility and suggests borrowing techniques from DMs, such as per-token probability modeling via diffusion, to mitigate these risks, offering practical guidance on the privacy-utility trade-off in generative modeling.
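To make the attack idea concrete: an autoregressive model assigns an explicit likelihood to every token of an image's token sequence, and training images tend to receive higher likelihood than unseen ones. Below is a minimal sketch of a generic likelihood-based membership score, not the paper's exact attack; `model` is a placeholder assumed to map a token prefix to next-token logits.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def membership_score(model, tokens):
    """Average per-token negative log-likelihood (NLL) of an image's
    token sequence under an autoregressive model. Lower NLL suggests
    the image was seen during training.

    tokens: (B, T) integer token ids; model(prefix) -> (B, T-1, vocab) logits.
    """
    logits = model(tokens[:, :-1])                    # predict each next token
    logp = F.log_softmax(logits, dim=-1)              # per-token log-probabilities
    nll = -logp.gather(-1, tokens[:, 1:].unsqueeze(-1)).squeeze(-1)
    return nll.mean(dim=-1)                           # one score per image
```

Thresholding this score at a value calibrated on known non-members yields operating points such as the TPR@FPR=1% reported above; the paper's attack refines this basic likelihood signal using IAR-specific probabilistic discrepancies.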
📝 Abstract
Image autoregressive (IAR) models have surpassed diffusion models (DMs) in both image quality (FID: 1.48 vs. 1.58) and generation speed. However, their privacy risks remain largely unexplored. To address this, we conduct a comprehensive privacy analysis comparing IARs to DMs. We develop a novel membership inference attack (MIA) that achieves a significantly higher success rate in detecting training images (TPR@FPR=1%: 86.38% for IARs vs. 4.91% for DMs). Using this MIA, we perform dataset inference (DI) and find that IARs require as few as six samples to detect dataset membership, compared to 200 for DMs, indicating higher information leakage. Additionally, we extract hundreds of training images from an IAR (e.g., 698 from VAR-d30). Our findings highlight a fundamental privacy-utility trade-off: while IARs excel in generation quality and speed, they are significantly more vulnerable to privacy attacks. This suggests that incorporating techniques from DMs, such as per-token probability modeling using diffusion, could help mitigate IARs' privacy risks. Our code is available at https://github.com/sprintml/privacy_attacks_against_iars.
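For readers unfamiliar with the two evaluation protocols: TPR@FPR=1% fixes the false-positive rate on non-members at 1% and reports how many true members the attack still flags, while dataset inference (DI) aggregates per-sample MIA scores into a hypothesis test over a suspect set. A minimal sketch of both, assuming lower membership scores indicate training membership; the function names and the specific Welch t-test are illustrative choices, not necessarily the paper's exact procedure.

```python
import numpy as np
from scipy import stats

def tpr_at_fpr(member_scores, nonmember_scores, fpr=0.01):
    """TPR at a fixed FPR: pick the threshold that misclassifies only
    `fpr` of non-members (lower score = predicted member), then report
    the fraction of true members falling below it."""
    threshold = np.quantile(nonmember_scores, fpr)
    return float(np.mean(member_scores <= threshold))

def dataset_inference(suspect_scores, calibration_scores, alpha=0.01):
    """Hypothesis-test flavor of dataset inference: are the suspect
    samples' scores significantly lower (more member-like) than those
    of known non-members? One-sided Welch t-test."""
    _, p_value = stats.ttest_ind(suspect_scores, calibration_scores,
                                 equal_var=False, alternative='less')
    return p_value < alpha  # True -> dataset was likely used in training
```

Under this framing, the paper's finding that six samples suffice for IARs (vs. 200 for DMs) means the per-sample membership signal is strong enough for the test to reach significance with a very small suspect set.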