🤖 AI Summary
Despite the rapid evolution of deep neural architectures for AI-generated art, the field lacks a systematic analysis of their architectural principles, performance trade-offs, and aesthetic implications.
Method: We conduct the first large-scale, reproducible cross-generational benchmarking of representative models—including CNNs, VAEs, GANs, and diffusion models (e.g., Stable Diffusion, DALL-E 3)—quantitatively evaluating image fidelity, semantic controllability, and computational efficiency under a unified evaluation framework.
Contribution/Results: We uncover a fundamental coupling between model architecture and artistic expressivity: pixel-level reconstruction (CNN/VAE) → adversarial semantics (GAN) → probabilistic text-image alignment (diffusion), revealing how architectural advances progressively enhance semantic precision and creative autonomy. Our framework enables rigorous, comparable assessment across generations, providing methodological foundations for theoretical modeling, algorithmic refinement, and human-AI co-creation in generative art.
📝 Abstract
This paper surveys the field of AI-generated art, examining the deep neural network architectures and models used to create it, from classic convolutional networks to state-of-the-art diffusion models. We explain the general structure and working principles of each family of networks, then trace milestone systems, starting with the dreamlike imagery of DeepDream and continuing to recent models such as Stable Diffusion and DALL-E 3. We provide a detailed comparison of these models, highlighting their strengths and limitations, and assessing the remarkable progress deep neural networks have made in a short period of time. By combining technical explanations with an overview of the current state of AI-generated art, this paper illustrates how art and computer science interact.