🤖 AI Summary
Existing automatic image cropping methods are limited in aesthetic quality and compositional diversity: rule-based approaches lack flexibility, while data-driven methods rely heavily on large-scale annotated datasets and generalize poorly. This paper proposes ProCrop, the first retrieval-based cropping framework guided by professional composition, which fuses compositional features from professional photographs with those of the query image to guide cropping decisions. The authors also introduce the first large-scale (242K images) weakly annotated, composition-aware dataset, built by diffusion-based outpainting of professional photographs combined with iterative optimization to generate high-quality, diverse crop proposals. The framework further incorporates retrieval-augmented feature fusion and aesthetics-driven composition modeling. Experiments demonstrate that ProCrop significantly outperforms state-of-the-art methods in both fully supervised and weakly supervised settings; notably, training solely on the new dataset achieves performance on par with fully supervised baselines. The code and dataset will be publicly released.
📝 Abstract
Image cropping is crucial for enhancing the visual appeal and narrative impact of photographs, yet existing rule-based and data-driven approaches often lack diversity or require annotated training data. We introduce ProCrop, a retrieval-based method that leverages professional photography to guide cropping decisions. By fusing features from professional photographs with those of the query image, ProCrop learns from professional compositions, significantly boosting performance. Additionally, we present a large-scale dataset of 242K weakly annotated images, generated by outpainting professional images and iteratively refining diverse crop proposals. This composition-aware generation process yields diverse, high-quality crop proposals guided by aesthetic principles, making ours the largest publicly available dataset for image cropping. Extensive experiments show that ProCrop significantly outperforms existing methods in both supervised and weakly supervised settings. Notably, when trained on the new dataset, ProCrop surpasses previous weakly supervised methods and even matches fully supervised approaches. Both the code and dataset will be made publicly available to advance research in image aesthetics and composition analysis.