🤖 AI Summary
In cloud-based AI training, the intrinsic nondeterminism of stochastic operations (e.g., dropout) lets adversaries conceal malicious tampering as "natural randomness." Existing logging-and-audit mechanisms can neither verify that random values were generated and applied honestly, nor do so while simultaneously preserving the privacy of the training data and the model.
Method: This paper gives the first formulation of dropout randomness as verifiable cryptographic statements and embeds a zero-knowledge auditing protocol in the training pipeline. Using cryptographic hashing and a deterministic pseudorandom number generator (PRNG), it binds each dropout mask to a trusted seed and generates succinct zero-knowledge proofs that allow post-hoc verification that the stochastic operations were executed as claimed.
Contribution/Results: The scheme guarantees that the randomness is unbiased and consistently applied, incurs verification overhead below 0.8% of total training time, and leaks neither model parameters nor training data. It is the first solution to enable auditable stochastic operations in deep learning under strong privacy constraints.
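The seed-binding idea in the Method paragraph can be illustrated with a minimal plaintext sketch (the paper's actual construction wraps this in zero-knowledge proofs; the function names, the SHA-256 commitment, and the per-step domain separation below are illustrative assumptions, not the paper's API):

```python
import hashlib
import random

def commit_seed(seed: bytes) -> str:
    """Hash commitment to the dropout seed, published before training starts.
    (SHA-256 is an assumed choice of commitment hash.)"""
    return hashlib.sha256(seed).hexdigest()

def dropout_mask(seed: bytes, step: int, size: int, p: float = 0.5) -> list:
    """Derive the step's dropout mask deterministically from the committed seed."""
    # Domain-separate by step so each training step gets an independent mask.
    digest = hashlib.sha256(seed + step.to_bytes(8, "big")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [1 if rng.random() >= p else 0 for _ in range(size)]  # 1 = keep, 0 = drop
```

Because the mask is a pure function of the committed seed and the step index, the trainer has no freedom to cherry-pick or bias it after the commitment is published, which is exactly the property the zero-knowledge proofs later attest to without revealing the seed.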
📝 Abstract
Modern cloud-based AI training relies on extensive telemetry and logs to ensure accountability. While these audit trails enable retrospective inspection, they struggle to address the inherent non-determinism of deep learning. Stochastic operations, such as dropout, create an ambiguity surface where attackers can mask malicious manipulations as natural random variance, granting them plausible deniability. Consequently, existing logging mechanisms cannot verify whether stochastic values were generated and applied honestly without exposing sensitive training data. To close this integrity gap, we introduce Verifiable Dropout, a privacy-preserving mechanism based on zero-knowledge proofs. We treat stochasticity not as an excuse but as a verifiable claim. Our approach binds dropout masks to a deterministic, cryptographically verifiable seed and proves the correct execution of the dropout operation. This design enables users to audit the integrity of stochastic training steps post-hoc, ensuring that randomness was neither biased nor cherry-picked, while strictly preserving the confidentiality of the model and data.
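The post-hoc audit the abstract describes can be sketched in plaintext form. In the real protocol this check is carried out inside a zero-knowledge proof, so the seed and mask are never revealed; the stand-in below (with an assumed SHA-256 commitment and per-step seed derivation matching a deterministic PRNG) only shows what the proof attests:

```python
import hashlib
import random

def audit_dropout_step(commitment: str, revealed_seed: bytes, step: int,
                       claimed_mask: list, p: float = 0.5) -> bool:
    """Plaintext stand-in for the ZK audit: check that (1) the seed matches the
    pre-published commitment and (2) the claimed mask is exactly the one that
    seed determines for this training step."""
    if hashlib.sha256(revealed_seed).hexdigest() != commitment:
        return False  # seed was swapped after the commitment was published
    digest = hashlib.sha256(revealed_seed + step.to_bytes(8, "big")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    expected = [1 if rng.random() >= p else 0 for _ in range(len(claimed_mask))]
    return expected == claimed_mask
```

A tampered mask or a substituted seed fails this check, which is how the audit rules out both biased and cherry-picked randomness; the zero-knowledge wrapper adds the privacy guarantee that neither the seed, the mask, nor the training data is exposed to the auditor.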