adabmDCA 2.0 -- a flexible but easy-to-use package for Direct Coupling Analysis

📅 2025-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work introduces the first open-source Direct Coupling Analysis (DCA) framework supporting multitask modeling of proteins and RNA, unifying key challenges including contact prediction, mutation effect estimation, sequence library scoring, and de novo sequence design. Methodologically, it integrates Boltzmann machine modeling with maximum-likelihood estimation and L₂ regularization, and pioneers a unified cross-language (C++/Julia/Python) and cross-hardware (CPU/GPU) interface. It natively supports both dense and sparse learning protocols and enables end-to-end execution of downstream tasks. Experiments demonstrate state-of-the-art performance: >70% Top-L/5 contact prediction accuracy on standard protein family benchmarks; scalable training on sequence libraries exceeding ten million sequences; and over 10× throughput acceleration on GPU-accelerated hardware compared to CPU-only execution.
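As an illustration of the contact-prediction task mentioned above, here is a minimal sketch of a scoring scheme widely used in the DCA literature: the Frobenius norm of each coupling block in the zero-sum gauge, followed by an average product correction (APC). The function name `contact_scores` and the array layout `J[i, j, a, b]` are assumptions made for this example and are not taken from the adabmDCA 2.0 interface; the summary does not state which score the package uses.

```python
import numpy as np

def contact_scores(J):
    """Return an (L, L) matrix of APC-corrected Frobenius contact scores.

    J is assumed to have shape (L, L, q, q), with J[i, j, a, b] the coupling
    between state a at site i and state b at site j (hypothetical layout).
    """
    L = J.shape[0]
    # Shift each (q, q) block to the zero-sum gauge.
    Jz = (J
          - J.mean(axis=2, keepdims=True)
          - J.mean(axis=3, keepdims=True)
          + J.mean(axis=(2, 3), keepdims=True))
    # Frobenius norm of every block; self-pairs carry no contact signal.
    F = np.sqrt((Jz ** 2).sum(axis=(2, 3)))
    np.fill_diagonal(F, 0.0)
    # Average product correction: subtract the background expected from
    # the per-site mean scores.
    site_mean = F.sum(axis=1) / (L - 1)
    global_mean = F.sum() / (L * (L - 1))
    scores = F - np.outer(site_mean, site_mean) / global_mean
    np.fill_diagonal(scores, 0.0)
    return scores

# Toy couplings (random, for illustration only), symmetrized so that
# J[i, j, a, b] == J[j, i, b, a].
rng = np.random.default_rng(0)
J_toy = rng.normal(size=(6, 6, 4, 4))
J_toy = (J_toy + J_toy.transpose(1, 0, 3, 2)) / 2
toy_scores = contact_scores(J_toy)
```

Pairs would then be ranked by score, usually after discarding pairs that are close along the sequence. Many pipelines also exclude the gap state before taking the norm; this sketch omits that step for brevity.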

📝 Abstract
In this methods article, we provide a flexible but easy-to-use implementation of Direct Coupling Analysis (DCA) based on Boltzmann machine learning, together with a tutorial on how to use it. The package adabmDCA 2.0 is available in different programming languages (C++, Julia, Python) and runs on different architectures (single-core and multi-core CPU, GPU) through a common front-end interface. In addition to several learning protocols for dense and sparse generative DCA models, it directly addresses common downstream tasks such as residue-residue contact prediction, mutational-effect prediction, scoring of sequence libraries, and generation of artificial sequences for sequence design. It is readily applicable to protein and RNA sequence data.
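To make the mutational-effect and sequence-scoring tasks concrete, the following is a minimal sketch of how a DCA (Potts) model assigns an energy to a sequence and how a single mutation is scored as an energy difference. It assumes the standard form E(a) = -Σ_i h_i(a_i) - Σ_{i<j} J_ij(a_i, a_j); the function names, the integer encoding of sequences, and the array layouts are hypothetical and do not come from the adabmDCA 2.0 interface.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Energy of an integer-encoded sequence under fields h and couplings J.

    h has shape (L, q) and J has shape (L, L, q, q) (assumed layout).
    Lower energy means higher probability under the model.
    """
    seq = np.asarray(seq)
    idx = np.arange(seq.shape[0])
    field_term = h[idx, seq].sum()
    # pair[i, j] = J[i, j, seq[i], seq[j]]; keep only i < j.
    pair = J[idx[:, None], idx[None, :], seq[:, None], seq[None, :]]
    coupling_term = np.triu(pair, k=1).sum()
    return -(field_term + coupling_term)

def mutation_delta_energy(seq, pos, new_state, h, J):
    """Energy change E(mutant) - E(wild type) for one substitution."""
    mutant = np.array(seq, copy=True)
    mutant[pos] = new_state
    return potts_energy(mutant, h, J) - potts_energy(seq, h, J)

# Toy parameters (random, for illustration only).
rng = np.random.default_rng(0)
L, q = 5, 4
h = rng.normal(size=(L, q))
J = rng.normal(size=(L, L, q, q))
J = (J + J.transpose(1, 0, 3, 2)) / 2  # J[i, j, a, b] == J[j, i, b, a]

wild_type = np.array([0, 1, 2, 3, 0])
delta = mutation_delta_energy(wild_type, pos=2, new_state=1, h=h, J=J)
```

Under this sign convention, a positive `delta` means the model considers the mutant less likely than the wild type. Scoring a sequence library amounts to calling `potts_energy` on every sequence.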
Problem

Research questions and friction points this paper is trying to address.

Direct Coupling Analysis
Protein-RNA Contact Prediction
Sequence Design
Innovation

Methods, ideas, or system contributions that make the work stand out.

Boltzmann Machine Learning
Multi-platform Compatibility
Versatile Task Execution
Lorenzo Rosset
Laboratory of Computational and Quantitative Biology, Sorbonne Université, CNRS, 75005 Paris, France and Laboratoire de Physique Théorique, École Normale Supérieure, 75231 Paris, France
Roberto Netti
Laboratory of Computational and Quantitative Biology, Sorbonne Université, CNRS, 75005 Paris, France
Anna Paola Muntoni
DISAT, Politecnico di Torino, 10129 Torino, Italy
Martin Weigt
Laboratory of Computational and Quantitative Biology, Sorbonne Université, CNRS, 75005 Paris, France
Francesco Zamponi
Sapienza Università di Roma
Physics, Statistical mechanics