adabmDCA 2.0 -- a flexible but easy-to-use package for Direct Coupling Analysis

📅 2025-01-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work introduces the first open-source Direct Coupling Analysis (DCA) framework supporting multitask modeling of proteins and RNA, unifying key challenges including contact prediction, mutation effect estimation, sequence library scoring, and de novo sequence design. Methodologically, it integrates Boltzmann machine modeling with maximum-likelihood estimation and L₂ regularization, and pioneers a unified cross-language (C++/Julia/Python) and cross-hardware (CPU/GPU) interface. It natively supports both dense and sparse learning protocols and enables end-to-end execution of downstream tasks. Experiments demonstrate state-of-the-art performance: >70% Top-L/5 contact prediction accuracy on standard protein family benchmarks; scalable training on sequence libraries exceeding ten million sequences; and over 10× throughput acceleration on GPU-accelerated hardware compared to CPU-only execution.

📝 Abstract

In this methods article, we provide a flexible but easy-to-use implementation of Direct Coupling Analysis (DCA) based on Boltzmann machine learning, together with a tutorial on how to use it. The package adabmDCA 2.0 is available in several programming languages (C++, Julia, Python) and runs on different architectures (single-core and multi-core CPU, GPU) through a common front-end interface. In addition to several learning protocols for dense and sparse generative DCA models, it directly addresses common downstream tasks such as residue-residue contact prediction, mutational-effect prediction, scoring of sequence libraries, and generation of artificial sequences for sequence design. It is readily applicable to protein and RNA sequence data.
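The downstream tasks named in the abstract can be illustrated with a minimal NumPy sketch of the standard Potts-model formulation of DCA. This is not adabmDCA's API: the function names (`potts_energy`, `contact_scores`, `mutation_delta_energy`), the array layout (`h` of shape `(L, q)`, `J` of shape `(L, L, q, q)`), and the random toy parameters are all assumptions made for illustration. Contact scores use the commonly used Frobenius norm with average-product correction, and mutational effects are taken as the energy difference between mutant and wild type.

```python
import numpy as np


def potts_energy(seq, h, J):
    """E(a) = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j) for an integer-encoded sequence."""
    idx = np.arange(len(seq))
    fields = h[idx, seq].sum()
    # pair[i, j] = J[i, j, a_i, a_j]; keep only the i < j terms
    pair = J[idx[:, None], idx[None, :], seq[:, None], seq[None, :]]
    return -(fields + np.triu(pair, k=1).sum())


def contact_scores(J):
    """Frobenius norm of zero-sum-gauge couplings with average-product correction (APC)."""
    # Shift each (q, q) coupling block to the zero-sum gauge
    Jz = (J
          - J.mean(axis=2, keepdims=True)
          - J.mean(axis=3, keepdims=True)
          + J.mean(axis=(2, 3), keepdims=True))
    F = np.sqrt((Jz ** 2).sum(axis=(2, 3)))
    np.fill_diagonal(F, 0.0)
    L = F.shape[0]
    row_mean = F.sum(axis=1) / (L - 1)      # mean over j != i
    total_mean = F.sum() / (L * (L - 1))    # mean over all i != j
    S = F - np.outer(row_mean, row_mean) / total_mean
    np.fill_diagonal(S, 0.0)
    return S


def mutation_delta_energy(seq, pos, new_state, h, J):
    """Energy change of a single-site mutation; positive means less favorable under the model."""
    mutant = seq.copy()
    mutant[pos] = new_state
    return potts_energy(mutant, h, J) - potts_energy(seq, h, J)


# Toy parameters (random, NOT a trained model): L sites, q states per site
rng = np.random.default_rng(0)
L, q = 5, 4
h = rng.normal(size=(L, q))
J = rng.normal(size=(L, L, q, q))
J = 0.5 * (J + J.transpose(1, 0, 3, 2))    # enforce J_ij(a, b) = J_ji(b, a)
seq = rng.integers(0, q, size=L)

energy = potts_energy(seq, h, J)            # sequence score
scores = contact_scores(J)                  # (L, L) matrix of contact scores
delta = mutation_delta_energy(seq, 2, (seq[2] + 1) % q, h, J)
```

In a real analysis, `h` and `J` would come from a trained model, with `q = 21` for proteins (20 amino acids plus gap) and `q = 5` for RNA (4 nucleotides plus gap); the same energy function can then be used to score every sequence in a library.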

Problem

Research questions and friction points this paper is trying to address.

Direct Coupling Analysis

Protein and RNA Contact Prediction

Sequence Design

Innovation

Methods, ideas, or system contributions that make the work stand out.

Boltzmann Machine Learning

Multi-platform Compatibility

Versatile Task Execution
