PUFFIN: Protein Unit Discovery with Functional Supervision

📅 2026-04-16

🤖 AI Summary
Existing methods struggle to identify biologically meaningful, intermediate-scale functional units in protein structures while also making use of functional information. This work incorporates functional label supervision into structure-based graph partitioning, without relying on curated annotations to define the units. Using a graph neural network with a structure-aware pooling mechanism, the method jointly learns to partition residue-level structure graphs and to extract functional signal, discovering structurally coherent, multi-residue protein units. The learned units show organized statistical associations with molecular function and meaningful correspondence with curated InterPro annotations, supporting interpretable modeling of structure-function relationships.

📝 Abstract
Proteins carry out biological functions through the coordinated action of groups of residues organized into structural arrangements. These arrangements, which we refer to as protein units, exist at an intermediate scale: larger than individual residues yet smaller than entire proteins. Identifying these units and their associations with function can deepen our understanding of protein function. However, existing approaches either focus on residue-level signals, rely on curated annotations, or segment protein structures without incorporating functional information, which limits interpretable analysis of structure-function relationships. We introduce PUFFIN, a data-driven framework for discovering protein units by jointly learning structural partitioning and functional supervision. PUFFIN represents proteins as residue-level structure graphs and applies a graph neural network with a structure-aware pooling mechanism that partitions each protein into multi-residue units, with functional supervision shaping the partition. We show that the learned units are structurally coherent, exhibit organized associations with molecular function, and correspond meaningfully to curated InterPro annotations. Together, these results demonstrate that PUFFIN provides an interpretable framework for analyzing structure-function relationships using learned protein units and their statistical function associations. Our source code is available at https://github.com/boun-tabi-lifelu/puffin.
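
To make the pipeline in the abstract more concrete, the following is a minimal sketch of the forward pass only: building a residue-level contact graph, one round of neighborhood aggregation, and a soft assignment of residues to units. It is not PUFFIN's implementation. The 8 Å cutoff, the mean-aggregation step, the softmax (DiffPool-style) assignment used as a stand-in for the structure-aware pooling, the random weights, and all function names are assumptions chosen for illustration, and plain Python is used instead of a GNN library to keep the example self-contained.

```python
import math
import random


def contact_graph(coords, cutoff=8.0):
    """Link residues whose coordinates lie within `cutoff` (assumed value, in angstroms)."""
    n = len(coords)
    neighbors = [[i] for i in range(n)]  # self-loop keeps each residue's own features
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def message_pass(features, neighbors):
    """One round of mean aggregation over each residue's neighborhood."""
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbors
    ]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_assign(hidden, unit_weights):
    """Row i holds residue i's membership probabilities over the units."""
    return [
        softmax([sum(h * w for h, w in zip(row, unit)) for unit in unit_weights])
        for row in hidden
    ]


def pool_units(hidden, assign):
    """Unit features as assignment-weighted sums of residue features."""
    n_units, dim = len(assign[0]), len(hidden[0])
    return [
        [sum(assign[i][k] * hidden[i][d] for i in range(len(hidden)))
         for d in range(dim)]
        for k in range(n_units)
    ]


def hard_units(assign):
    """Assign each residue to its most probable unit."""
    return [max(range(len(row)), key=row.__getitem__) for row in assign]


rng = random.Random(0)
n_residues, dim, n_units = 12, 4, 3
# Hypothetical toy chain: residues spaced 3.8 apart along a straight line.
coords = [(3.8 * i, 0.0, 0.0) for i in range(n_residues)]
# Random residue features and unit weights stand in for learned quantities.
features = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_residues)]
unit_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_units)]

neighbors = contact_graph(coords)
hidden = message_pass(features, neighbors)
assign = soft_assign(hidden, unit_weights)
units = pool_units(hidden, assign)
labels = hard_units(assign)
```

In a trained model of this kind, a classifier over the pooled unit features would predict function labels, and the gradient of that loss would update the assignment weights, which is one way functional supervision could shape the partition. The sketch omits training entirely, so the unit labels it produces here are arbitrary.
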
Problem

Research questions and friction points this paper is trying to address.

protein units
structure-function relationship
functional supervision
protein structure partitioning
interpretable analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

protein units
graph neural network
structure-function relationship
functional supervision
structure-aware pooling