Learning the PTM Code through a Coarse-to-Fine, Mechanism-Aware Framework

📅 2025-10-27

🤖 AI Summary

Deciphering the post-translational modification (PTM) “combinatorial code”, that is, the mapping between modified sites and the enzymes that catalyze them, remains a central challenge in understanding cellular signaling and disease mechanisms. This work introduces COMPASS-PTM, a mechanism-aware, coarse-to-fine framework that jointly models multi-label PTM site prediction and zero-shot enzyme assignment, explicitly encoding cooperative and antagonistic dependencies among PTMs. The method integrates evolutionary representations from protein language models, physicochemical priors, and crosstalk-aware prompting to mitigate the dual long-tail distribution of PTM data. Across multiple proteome-scale benchmarks, it achieves a 122% relative F1 improvement in multi-label site prediction and a 54% gain in zero-shot enzyme assignment. It also recovers canonical kinase motifs and predicts PTM rewiring caused by disease-associated missense variants.

📝 Abstract

Post-translational modifications (PTMs) form a combinatorial "code" that regulates protein function, yet deciphering this code - linking modified sites to their catalytic enzymes - remains a central unsolved problem in understanding cellular signaling and disease. We introduce COMPASS-PTM, a mechanism-aware, coarse-to-fine learning framework that unifies residue-level PTM profiling with enzyme-substrate assignment. COMPASS-PTM integrates evolutionary representations from protein language models with physicochemical priors and a crosstalk-aware prompting mechanism that explicitly models inter-PTM dependencies. This design allows the model to learn biologically coherent patterns of cooperative and antagonistic modifications while addressing the dual long-tail distribution of PTM data. Across multiple proteome-scale benchmarks, COMPASS-PTM establishes new state-of-the-art performance, including a 122% relative F1 improvement in multi-label site prediction and a 54% gain in zero-shot enzyme assignment. Beyond accuracy, the model demonstrates interpretable generalization, recovering canonical kinase motifs and predicting disease-associated PTM rewiring caused by missense variants. By bridging statistical learning with biochemical mechanism, COMPASS-PTM unifies site-level and enzyme-level prediction into a single framework that learns the grammar underlying protein regulation and signaling.
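
The 122% figure is a relative gain, so it describes a ratio to the baseline rather than an absolute change in F1. A minimal Python sketch of that arithmetic follows; the baseline and model scores are hypothetical placeholders chosen only to illustrate the calculation, not values reported in the paper.

```python
def relative_gain_percent(baseline: float, model: float) -> float:
    """Relative improvement of `model` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (model - baseline) / baseline * 100.0


# Hypothetical scores for illustration only (not taken from the paper).
baseline_f1 = 0.25
model_f1 = 0.555

gain = relative_gain_percent(baseline_f1, model_f1)
print(f"{gain:.0f}% relative gain, i.e. {model_f1 / baseline_f1:.2f}x the baseline")
```

Read this way, a 122% relative improvement means the new score is 2.22 times the baseline score, whatever the absolute values are.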

Problem

Research questions and friction points this paper is trying to address.

Deciphering the PTM code linking modified sites to enzymes

Modeling inter-PTM dependencies and long-tail data distributions

Unifying site-level and enzyme-level prediction into one framework

Innovation

Methods, ideas, or system contributions that make the work stand out.

Coarse-to-fine framework integrating residue profiling and enzyme assignment

Protein language model representations combined with physicochemical priors and crosstalk-aware prompting

Handles the dual long-tail distribution of PTM data and generalizes interpretably, recovering canonical kinase motifs
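
The coarse-to-fine idea can be sketched as two stages in Python. This is an illustrative toy under stated assumptions, not the COMPASS-PTM implementation: the summary does not describe the architecture, so the per-type sigmoid threshold, the cosine-similarity ranking, and all names (`PTM_TYPES`, `coarse_site_labels`, `fine_enzyme_ranking`, the enzyme labels) are hypothetical.

```python
import math

# Illustrative subset of PTM types (assumption, not the paper's label set).
PTM_TYPES = ["phosphorylation", "acetylation", "ubiquitination"]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def coarse_site_labels(logits, threshold=0.5):
    """Coarse stage: one independent sigmoid per PTM type,
    so a single residue can receive several labels (multi-label)."""
    return [ptm for ptm, z in zip(PTM_TYPES, logits) if sigmoid(z) >= threshold]


def cosine(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def fine_enzyme_ranking(site_vec, enzyme_vecs):
    """Fine stage: rank candidate enzymes by cosine similarity to the site
    embedding. An enzyme only needs an embedding to be scored, which is one
    common way to support zero-shot assignment (assumed here)."""
    return sorted(
        enzyme_vecs,
        key=lambda name: cosine(site_vec, enzyme_vecs[name]),
        reverse=True,
    )


# Toy inputs, made up for illustration.
site_logits = [2.0, -1.5, 0.3]
site_embedding = [0.9, 0.1]
enzymes = {"enzyme_A": [1.0, 0.0], "enzyme_B": [0.0, 1.0]}

labels = coarse_site_labels(site_logits)
ranking = fine_enzyme_ranking(site_embedding, enzymes)
print(labels, ranking)
```

The point of the split is that the coarse stage decides which modifications a residue carries, and the fine stage then asks which enzyme is the likeliest writer, including enzymes never seen with labeled substrates.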
