AGNOMIN - Architecture Agnostic Multi-Label Function Name Prediction

📅 2025-09-29

🤖 AI Summary

This work addresses cross-architecture function name prediction in stripped binaries, a task hampered by architecture-specific limitations, scarce labeled data, and diverse naming conventions. We propose AGNOMIN, a novel architecture-agnostic framework for multi-label function name prediction. AGNOMIN builds Feature-Enriched Hierarchical Graphs (FEHGs) that combine control flow graphs, function call graphs, and dynamically learned PCode features; a hierarchical graph neural network processes them, and a Renée-inspired decoder with an attention-based head layer predicts the names. Evaluated on 9,000 ELF executable binaries across three architectures, AGNOMIN improves on state-of-the-art approaches by up to 27.17% in precision and 55.86% in recall. On unseen architectures, it achieves 5.89% higher recall than the closest baseline. Its practical utility was validated in security hackathons, where it aided reverse engineers in analyzing and patching vulnerable binaries across different architectures.

📝 Abstract

Function name prediction is crucial for understanding stripped binaries in software reverse engineering, a key step for enabling subsequent vulnerability analysis and patching. However, existing approaches often struggle with architecture-specific limitations, data scarcity, and diverse naming conventions. We present AGNOMIN, a novel architecture-agnostic approach for multi-label function name prediction in stripped binaries. AGNOMIN builds Feature-Enriched Hierarchical Graphs (FEHGs), combining Control Flow Graphs, Function Call Graphs, and dynamically learned PCode features. A hierarchical graph neural network processes this enriched structure to generate consistent function representations across architectures, vital for scalable security assessments. For function name prediction, AGNOMIN employs a Renée-inspired decoder, enhanced with an attention-based head layer and algorithmic improvements. We evaluate AGNOMIN on a comprehensive dataset of 9,000 ELF executable binaries across three architectures, demonstrating its superior performance compared to state-of-the-art approaches, with improvements of up to 27.17% in precision and 55.86% in recall across the testing dataset. Moreover, AGNOMIN generalizes well to unseen architectures, achieving 5.89% higher recall than the closest baseline. AGNOMIN's practical utility has been validated through security hackathons, where it successfully aided reverse engineers in analyzing and patching vulnerable binaries across different architectures.
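
To make the precision and recall figures concrete, here is a minimal Python sketch of how multi-label scores can be computed for one function. It assumes, purely for illustration, that the labels are lowercase word tokens obtained by splitting a function name. The abstract does not specify the label vocabulary or the exact metric variant, so `tokenize` and `precision_recall` are hypothetical helpers and not part of AGNOMIN.

```python
import re


def tokenize(name):
    """Split a function name into lowercase word tokens (assumed label scheme)."""
    tokens = set()
    # First split on underscores and other non-alphanumeric separators.
    for part in re.split(r"[^A-Za-z0-9]+", name):
        # Then split camelCase, keeping acronyms such as "HTTP" together.
        for tok in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part):
            tokens.add(tok.lower())
    return tokens


def precision_recall(predicted, actual):
    """Set-based precision and recall for one function's label set."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: the model predicts three labels, two of which are right.
actual = tokenize("parse_http_header")
predicted = {"parse", "http", "request"}
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Treating the name as a set of labels lets a prediction earn partial credit: a model that recovers `parse` and `http` but misses `header` is still rewarded, which an exact-match metric would not do.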

Problem

Research questions and friction points this paper is trying to address.

Predicts function names in stripped binaries for vulnerability analysis

Overcomes architecture-specific limitations in binary reverse engineering

Generates cross-architecture function representations for scalable security assessments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Architecture-agnostic approach for multi-label function name prediction

Feature-Enriched Hierarchical Graphs combining CFG, FCG and PCode features

Hierarchical graph neural network with attention-based decoder enhancements
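
The two-level structure described above can be sketched as a data structure: basic blocks carrying PCode operations sit inside per-function control flow graphs, and functions are linked by a call graph. The Python sketch below is illustrative only. The class names, the small opcode vocabulary, and the mean-pooling step are assumptions standing in for the learned PCode features and the hierarchical graph neural network, whose details this summary does not give.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical vocabulary: a small subset of P-Code operation names.
VOCAB = ["COPY", "LOAD", "STORE", "INT_ADD", "CBRANCH", "CALL", "RETURN"]


@dataclass
class BasicBlock:
    pcode_ops: List[str]  # PCode operations lifted from this block


@dataclass
class Function:
    name: str
    blocks: List[BasicBlock]  # assumed non-empty
    # Lower level: CFG edges between block indices (stored, unused by the pooling below).
    cfg_edges: List[Tuple[int, int]] = field(default_factory=list)


@dataclass
class Binary:
    functions: Dict[str, Function]
    call_edges: List[Tuple[str, str]]  # upper level: (caller, callee) pairs


def block_features(block):
    """Bag-of-opcodes vector; a stand-in for learned PCode features."""
    counts = Counter(block.pcode_ops)
    return [float(counts[op]) for op in VOCAB]


def mean(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def binary_embeddings(binary):
    """Pool blocks into functions, then average once over call-graph neighbors."""
    local = {n: mean([block_features(b) for b in f.blocks])
             for n, f in binary.functions.items()}
    neighbors = {n: [] for n in binary.functions}
    for caller, callee in binary.call_edges:
        neighbors[caller].append(callee)
        neighbors[callee].append(caller)
    return {n: mean([local[n]] + [local[m] for m in neighbors[n]]) for n in local}


main = Function("main", [BasicBlock(["COPY", "CALL"]), BasicBlock(["RETURN"])], [(0, 1)])
helper = Function("helper", [BasicBlock(["LOAD", "INT_ADD", "RETURN"])])
binary = Binary({"main": main, "helper": helper}, [("main", "helper")])
embeddings = binary_embeddings(binary)
print(embeddings["main"])
```

Because the features are built from an intermediate representation rather than from raw machine instructions, the same vocabulary applies regardless of the source architecture, which is the intuition behind an architecture-agnostic representation. A real model would replace the mean pooling with learned message passing over both the CFG and call-graph edges.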

🔎 Similar Papers

BLens: Contrastive Captioning of Binary Functions using Ensemble Embedding