🤖 AI Summary
This work addresses the scarcity of native 4K-resolution datasets with sufficient scale to effectively train state-of-the-art super-resolution and text-to-image diffusion models. To bridge this gap, the authors construct a large-scale, multi-category dataset comprising 129,484 high-quality native 4K images spanning diverse scenes such as natural landscapes, urban environments, and human subjects, accompanied by dedicated validation and test sets of 2,000 and 1,984 images. Data quality is ensured through a hybrid pipeline combining automated filtering, human curation, and annotation assisted by large multimodal models. Images are drawn from open datasets including Photo Concept Bucket, Laion2B, and PD12M, and the resulting collection serves as a 4K benchmark for high-fidelity image restoration and generation. Experiments show that models trained on this dataset achieve significantly improved performance on native 4K benchmarks for both super-resolution and diffusion-based synthesis, underscoring the role of authentic high-resolution data in enhancing image fidelity.
📝 Abstract
High-resolution datasets are essential for advancing super-resolution (SR) and text-to-image (T2I) diffusion research. However, current publicly available datasets lack both the native 4K resolution and the extensive scale necessary for training state-of-the-art models. To address this gap, we introduce the 4K Large Scale Dataset and Benchmark (4KLSDB), a large-scale, diverse dataset consisting of 129,484 carefully curated 4K-resolution images spanning multiple categories such as nature, urban scenes, people, food, artwork, and CGI, alongside distinct validation and test sets containing 2,000 and 1,984 images, respectively. Images were sourced from established open datasets including Photo Concept Bucket, Laion2B, and PD12M. 4KLSDB underwent rigorous multi-stage automated filtering and annotation pipelines involving both human annotators and Large Multimodal Models (LMMs) to ensure high aesthetic quality and dataset consistency. We demonstrate 4KLSDB's effectiveness by training representative super-resolution and diffusion models, observing significant improvements in performance on native 4K benchmarks. Comprehensive experiments show a positive correlation between training on true 4K-resolution data and improved fidelity in image restoration tasks, especially at 4K resolution. By releasing 4KLSDB, we provide the research community with a valuable resource to drive progress toward genuinely high-fidelity image synthesis and restoration. Our project page is available at: https://4klsdb.github.io/.