Infinite Leagues Under the Sea: Photorealistic 3D Underwater Terrain Generation by Latent Fractal Diffusion Models

πŸ“… 2025-03-09
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses the dual challenges of insufficient photorealism and scarcity of real-world data in underwater 3D terrain generation. We propose an implicit diffusion-based generative framework guided by fractal priors. Methodologically, we pioneer the integration of fractal distribution modeling into the latent-space diffusion process, jointly optimizing RGB-D conditional generation and 2D diffusion prior–guided 3D Gaussian Splatting (3DGS) reconstruction to achieve geometry-semantic coherent, hyper-realistic underwater scene synthesis. Our technical pipeline unifies vision foundation models for geometric and semantic feature extraction, fractal-driven implicit embedding, and multimodal co-optimization. Experiments demonstrate that our method enables large-scale, highly consistent, and diverse underwater scene generation, significantly outperforming state-of-the-art approaches in novel-view rendering fidelity. We further validate its effectiveness across applications including film production, gaming, and underwater robot simulation.

πŸ“ Abstract
This paper tackles the problem of generating representations of underwater 3D terrain. Off-the-shelf generative models, trained on Internet-scale data but not on specialized underwater images, exhibit degraded realism, as images of the seafloor are relatively uncommon. To this end, we introduce DreamSea, a generative model for hyper-realistic underwater scenes. DreamSea is trained on real-world image databases collected from underwater robot surveys. Images from these surveys contain massive numbers of real seafloor observations covering large areas, but are prone to noise and artifacts from the real world. We extract 3D geometry and semantics from the data with visual foundation models, and train a diffusion model that generates realistic seafloor images in RGBD channels, conditioned on novel fractal distribution-based latent embeddings. We then fuse the generated images into a 3D map, building a 3DGS model supervised by 2D diffusion priors which allows photorealistic novel view rendering. DreamSea is rigorously evaluated, demonstrating the ability to robustly generate large-scale underwater scenes that are consistent, diverse, and photorealistic. Our work drives impact in multiple domains, spanning filming, gaming, and robot simulation.
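The abstract's "fractal distribution-based latent embeddings" can be illustrated with a classic fractal field: power-law (1/f) spectral noise, whose self-similar statistics resemble natural terrain. The sketch below is an assumption for illustration only, not the paper's actual embedding; the function name `fractal_latent` and the spectral exponent `beta` are hypothetical.

```python
import numpy as np

def fractal_latent(size=64, beta=2.0, seed=0):
    """Sample a 2D fractal noise field via spectral synthesis.

    Hypothetical sketch of a fractal-distribution prior: Fourier
    amplitudes are shaped as f^(-beta/2), giving fractional-
    Brownian-motion-like, terrain-looking spatial statistics.
    """
    rng = np.random.default_rng(seed)
    # Start from white Gaussian noise in the spatial domain
    noise = rng.standard_normal((size, size))
    # Radial frequency magnitude grid (guard the DC bin against /0)
    fx = np.fft.fftfreq(size)[:, None]
    fy = np.fft.fftfreq(size)[None, :]
    f = np.sqrt(fx**2 + fy**2)
    f[0, 0] = 1.0
    # Shape the spectrum with a power law, then invert the transform
    spectrum = np.fft.fft2(noise) * f ** (-beta / 2.0)
    field = np.real(np.fft.ifft2(spectrum))
    # Normalize to zero mean, unit variance for use as a latent code
    return (field - field.mean()) / field.std()

z = fractal_latent()
print(z.shape)  # (64, 64)
```

Larger `beta` concentrates energy at low frequencies, producing smoother, more correlated fields; `beta = 0` recovers white noise.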
Problem

Research questions and friction points this paper is trying to address.

Generating photorealistic 3D underwater terrain.
Overcoming the degraded realism of off-the-shelf generative models on underwater imagery.
Leveraging noisy real-world underwater survey images for training.
Innovation

Methods, ideas, or system contributions that make the work stand out.

DreamSea generates hyper-realistic underwater scenes.
Uses fractal distribution-based latent embeddings.
Fuses generated RGBD images into a photorealistic 3DGS map.
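Supervising a 3DGS reconstruction with a 2D diffusion prior is commonly done in the spirit of score distillation: a rendered view is noised, a frozen denoiser predicts that noise, and the residual drives the gradient on the scene parameters. The sketch below is a generic score-distillation-style gradient under assumed names (`sds_gradient`, a placeholder `denoiser` callable), not the paper's actual supervision.

```python
import numpy as np

def sds_gradient(render, denoiser, t=0.5, seed=0):
    """Score-distillation-style gradient for one rendered view.

    `denoiser(x_t, t)` stands in for a frozen 2D diffusion model's
    noise prediction; this is an illustrative assumption, not the
    paper's prior.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(render.shape)
    # DDPM-style forward noising of the render at timestep t
    alpha = 1.0 - t
    x_t = np.sqrt(alpha) * render + np.sqrt(1.0 - alpha) * eps
    eps_hat = denoiser(x_t, t)
    # The residual is backpropagated into the 3DGS parameters
    return eps_hat - eps

# Toy usage: a dummy "denoiser" that always predicts zero noise
grad = sds_gradient(np.zeros((8, 8, 3)), lambda x, t: np.zeros_like(x))
print(grad.shape)  # (8, 8, 3)
```

In a real pipeline the residual would multiply the Jacobian of the differentiable 3DGS renderer; here only the image-space gradient is shown.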
Tianyi Zhang
Carnegie Mellon University
Weiming Zhi
Brigham Young University
Joshua Mangelson
Matthew Johnson-Roberson
Professor of Robotics, Carnegie Mellon University
Robotics, Field Robotics, Autonomous Vehicles, Marine Robotics