🤖 AI Summary
To address the high parameter count and inference latency of foundation models for molecular property prediction, this paper proposes a lightweight compression approach for the JMP-L foundation model. A layer-wise contribution analysis first reveals significant diminishing returns in JMP's later interaction blocks. Leveraging this insight, the authors design a structured interaction-block pruning strategy: redundant blocks are removed from the pre-trained model, which is then fine-tuned. Experiments demonstrate that the pruned model reduces model size by 32%, improves inference throughput by 1.3×, and incurs an average accuracy drop of less than 0.5% across downstream tasks, which is negligible in practice. Crucially, the approach requires neither full retraining nor architectural redesign, making it efficient and plug-and-play. The implementation is publicly available.
📝 Abstract
Advancements in machine learning for molecular property prediction have improved accuracy but at the expense of higher computational cost and longer training times. Recently, the Joint Multi-domain Pre-training (JMP) foundation model has demonstrated strong performance across various downstream tasks with reduced training time over previous models. Despite JMP's advantages, fine-tuning it on molecular datasets ranging from small-scale to large-scale requires considerable time and computational resources. In this work, we investigate strategies to enhance efficiency by reducing model size while preserving performance. To better understand the model's efficiency, we analyze the layer contributions of JMP and find that later interaction blocks provide diminishing returns, suggesting an opportunity for model compression. We explore block reduction strategies by pruning the pre-trained model and evaluating its impact on efficiency and accuracy during fine-tuning. Our analysis reveals that removing two interaction blocks results in a minimal performance drop, reducing the model size by 32% while increasing inference throughput by 1.3x. These results suggest that JMP-L is over-parameterized and that a smaller, more efficient variant can achieve comparable performance with lower computational cost. Our study provides insights for developing lighter, faster, and more scalable foundation models for molecular and materials discovery. The code is publicly available at: https://github.com/Yasir-Ghunaim/efficient-jmp.
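The pruning strategy described above, dropping the last interaction blocks of a pre-trained stack before fine-tuning, can be sketched as follows. This is a minimal, hypothetical illustration: `InteractionBlock`, `ToyJMP`, and the block/parameter counts are illustrative stand-ins, not the actual JMP (GemNet-based) implementation.

```python
# Minimal sketch of structured interaction-block pruning.
# All class and parameter names here are hypothetical placeholders,
# not the real JMP codebase.

class InteractionBlock:
    """Toy stand-in for one message-passing interaction block."""
    def __init__(self, n_params: int):
        self.n_params = n_params

    def __call__(self, h):
        return h + 1  # placeholder feature update


class ToyJMP:
    """Toy backbone: a sequential stack of interaction blocks."""
    def __init__(self, n_blocks: int = 6, params_per_block: int = 1_000_000):
        self.blocks = [InteractionBlock(params_per_block) for _ in range(n_blocks)]

    def num_params(self) -> int:
        return sum(b.n_params for b in self.blocks)

    def forward(self, h):
        for block in self.blocks:
            h = block(h)
        return h


def prune_last_blocks(model: ToyJMP, k: int = 2) -> ToyJMP:
    """Remove the k final interaction blocks (diminishing returns),
    keeping all earlier pre-trained blocks intact for fine-tuning."""
    model.blocks = model.blocks[:-k]
    return model


model = ToyJMP()
before = model.num_params()
prune_last_blocks(model, k=2)
after = model.num_params()
print(f"blocks kept: {len(model.blocks)}, size reduction: {1 - after / before:.0%}")
```

In practice the same idea applies to a PyTorch `nn.ModuleList`: slice off the final blocks of the pre-trained checkpoint, then fine-tune the remaining weights on the downstream task.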