🤖 AI Summary
Existing membership inference attacks rely on static, handcrafted strategies with limited generalization capabilities. This work proposes the first agent-based framework that decouples strategy reasoning from execution, reformulating the attack as a process of self-exploration and strategy evolution guided by high-level scenario descriptions. The framework automatically generates and iteratively refines attack strategies at the logits level, eliminating the need for manual feature engineering and enabling model-agnostic, systematic strategy search. Experimental results demonstrate that the proposed method consistently matches or surpasses current state-of-the-art baselines across diverse large models, significantly enhancing both the universality and effectiveness of membership inference attacks.
📝 Abstract
Membership Inference Attacks (MIAs) serve as a fundamental auditing tool for evaluating training data leakage in machine learning models. However, existing methodologies predominantly rely on static, handcrafted heuristics that lack adaptability, often leading to suboptimal performance when transferred across different large models. In this work, we propose AutoMIA, an agentic framework that reformulates membership inference as an automated process of self-exploration and strategy evolution. Given high-level scenario specifications, AutoMIA self-explores the attack space by generating executable logits-level strategies and progressively refining them through closed-loop evaluation feedback. By decoupling abstract strategy reasoning from low-level execution, our framework enables a systematic, model-agnostic traversal of the attack search space. Extensive experiments demonstrate that AutoMIA consistently matches or outperforms state-of-the-art baselines while eliminating the need for manual feature engineering.
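To make the abstract's notions of "executable logits-level strategies" and "closed-loop evaluation feedback" concrete, here is a minimal, hypothetical sketch: two classic logits-based membership scores (average negative log-likelihood and a Min-K%-style variant) are treated as candidate strategies, and a selection loop keeps whichever achieves the best AUC on held-out data. This is an illustration of the general idea, not the paper's actual implementation; all function names, the candidate pool, and the data format are assumptions.

```python
import math


def log_softmax(logits):
    # Numerically stable log-softmax over one token's logit vector.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def nll_score(token_logits, token_ids):
    # Average negative log-likelihood of the target tokens.
    # Lower loss suggests membership (the model has likely seen the text).
    nlls = [-log_softmax(l)[t] for l, t in zip(token_logits, token_ids)]
    return sum(nlls) / len(nlls)


def min_k_score(token_logits, token_ids, k=0.5):
    # Min-K%-style score: average NLL over the k fraction of
    # lowest-probability (highest-NLL) tokens.
    nlls = sorted(-log_softmax(l)[t] for l, t in zip(token_logits, token_ids))
    keep = max(1, int(len(nlls) * k))
    return sum(nlls[-keep:]) / keep


def auc(member_scores, nonmember_scores):
    # Probability that a random member scores lower than a random
    # non-member -- a simple attack-quality signal for the feedback loop.
    wins = sum(1 for m in member_scores for n in nonmember_scores if m < n)
    ties = sum(1 for m in member_scores for n in nonmember_scores if m == n)
    return (wins + 0.5 * ties) / (len(member_scores) * len(nonmember_scores))


def select_strategy(candidates, members, nonmembers):
    # Closed-loop selection (one simplified refinement step): evaluate
    # every candidate strategy on labeled data and keep the best.
    return max(
        candidates,
        key=lambda f: auc([f(*x) for x in members],
                          [f(*x) for x in nonmembers]),
    )
```

In this toy setting each sample is a pair `(token_logits, token_ids)`; an agentic framework in the spirit of the paper would instead *generate* new candidate scoring functions and feed their evaluation results back into the next generation round.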