ReassembleNet: Learnable Keypoints and Diffusion for 2D Fresco Reconstruction

📅 2025-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing deep learning methods for reassembling archaeological mural fragments, a canonical 2D puzzle reconstruction task, suffer from limited scalability, inadequate multimodal fusion, and poor modeling of complex erosion patterns. Method: We propose the first end-to-end reassembly framework tailored to real-world archaeological scenarios. It introduces learnable contour keypoint representations, leverages graph neural networks for adaptive pooling and structure-aware matching, fuses geometric and textural modalities, and employs a diffusion model for high-precision pose estimation. Contribution/Results: After pretraining on semi-synthetic data, our method achieves state-of-the-art performance, reducing rotation and translation RMSE by 55% and 86%, respectively, relative to prior approaches. It demonstrates significantly enhanced robustness to non-rigid deformations, missing fragment regions, and intricate edge erosion, all of which are critical challenges in archaeological conservation.
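The keypoint-selection idea can be illustrated with a generic top-k pooling step in the style of Graph U-Nets. This is a minimal sketch under stated assumptions, not the paper's selection mechanism: the function `topk_select`, the projection vector, and the toy features are all hypothetical, and a real model would learn the projection and score graph features rather than raw lists.

```python
import math

def topk_select(features, projection, k):
    """Keep the k highest-scoring keypoints of one fragment.

    features:   list of per-keypoint feature vectors (hypothetical layout)
    projection: scoring vector; learnable in a real model, fixed here
    """
    norm = math.sqrt(sum(w * w for w in projection))
    # Score each keypoint by its projection onto the scoring vector.
    scores = [sum(f * w for f, w in zip(feat, projection)) / norm
              for feat in features]
    # Indices of the k best scores, restored to contour order.
    keep = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)[:k]
    keep.sort()
    # Gate the kept features by tanh(score) so the scoring vector
    # would receive gradients in a trainable setting.
    gated = [[f * math.tanh(scores[i]) for f in features[i]] for i in keep]
    return keep, gated

# Toy fragment with four contour keypoints and 2-D features.
feats = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.5, 0.5]]
keep, gated = topk_select(feats, [1.0, 0.0], k=2)
print(keep)  # [0, 2]
```

Selecting a fixed number of keypoints per fragment is what bounds the cost of later matching, since the number of candidate keypoint pairs then depends on k rather than on contour length.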

📝 Abstract

The task of reassembly is a significant challenge across multiple domains, including archaeology, genomics, and molecular docking, requiring the precise placement and orientation of elements to reconstruct an original structure. In this work, we address key limitations in state-of-the-art Deep Learning methods for reassembly, namely i) scalability; ii) multimodality; and iii) real-world applicability, that is, handling pieces beyond square or simple geometric shapes, realistic and complex erosion, and other real-world problems. We propose ReassembleNet, a method that reduces complexity by representing each input piece as a set of contour keypoints and learning to select the most informative ones using techniques inspired by Graph Neural Network pooling. ReassembleNet effectively lowers computational complexity while enabling the integration of features from multiple modalities, including both geometric and texture data. The model is further enhanced through pretraining on a semi-synthetic dataset. We then apply diffusion-based pose estimation to recover the original structure. We improve on prior methods by 55% and 86% for rotation and translation RMSE, respectively.
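To make the reported metrics concrete, the sketch below computes a rotation RMSE, a translation RMSE, and a relative reduction. The exact metric definitions are not given in this summary, so the choices here are assumptions: angles are in degrees and wrapped to [-180, 180), translation error is the Euclidean distance per fragment, and all function names are hypothetical.

```python
import math

def wrap_angle_deg(delta):
    """Map an angle difference in degrees to the interval [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0

def rmse_rotation(pred_deg, true_deg):
    """RMSE over per-fragment rotation errors (assumed definition)."""
    errs = [wrap_angle_deg(p - t) for p, t in zip(pred_deg, true_deg)]
    return math.sqrt(sum(e * e for e in errs) / len(errs))

def rmse_translation(pred_xy, true_xy):
    """RMSE over per-fragment Euclidean position errors (assumed definition)."""
    sq = [(px - tx) ** 2 + (py - ty) ** 2
          for (px, py), (tx, ty) in zip(pred_xy, true_xy)]
    return math.sqrt(sum(sq) / len(sq))

def relative_reduction(baseline, ours):
    """Fraction by which `ours` lowers an error relative to `baseline`."""
    return (baseline - ours) / baseline

# Illustrative values only, not results from the paper.
print(rmse_rotation([350.0], [10.0]))            # 20.0
print(rmse_translation([(3.0, 4.0)], [(0.0, 0.0)]))  # 5.0
```

Under this reading, a 55% improvement means `relative_reduction(baseline, ours)` equals 0.55, so the new error is 45% of the baseline error.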

Problem

Research questions and friction points this paper is trying to address.

Addressing scalability in deep learning for 2D reassembly

Enabling multimodality by integrating geometric and texture data

Improving real-world applicability for complex erosion scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Neural Networks for keypoint selection

Diffusion-based pose estimation technique

Semi-synthetic dataset pretraining enhancement
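The diffusion-based pose estimation listed above can be pictured through the standard DDPM forward (noising) process applied to fragment poses. This is only a sketch under assumptions: the summary does not state the paper's noise schedule or pose parameterization, so the linear beta schedule, the `(x, y, cos θ, sin θ)` pose layout, and the function names are hypothetical. A trained denoiser, not shown here, would learn to reverse this process and recover the poses.

```python
import math
import random

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bar, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar

def noise_poses(poses, t, alpha_bar, rng):
    """Sample x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps per pose."""
    a = math.sqrt(alpha_bar[t])
    s = math.sqrt(1.0 - alpha_bar[t])
    noisy, eps_all = [], []
    for pose in poses:
        eps = [rng.gauss(0.0, 1.0) for _ in pose]  # standard normal noise
        noisy.append([a * v + s * e for v, e in zip(pose, eps)])
        eps_all.append(eps)
    return noisy, eps_all

# Two fragments with assumed (x, y, cos θ, sin θ) ground-truth poses.
poses = [[0.2, -0.1, 1.0, 0.0], [-0.4, 0.3, 0.0, 1.0]]
alpha_bar = linear_alpha_bar(100)
noisy, eps = noise_poses(poses, t=50, alpha_bar=alpha_bar, rng=random.Random(0))
```

During training, a network would be asked to predict `eps` from `noisy`, the step `t`, and the fragment features; at inference, poses start from pure noise and are denoised step by step.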
