Tail-aware N-version Machine Learning Models for Reliable API Recommendation

📅 2026-04-30

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This work addresses the unreliability of API recommendation models on low-frequency (tail) APIs, which stems from long-tailed training data distributions. The authors propose NvRec, an approach that applies N-version machine learning to API recommendation. NvRec combines five code models (CodeBERT, CodeT5, MulaRec, UniXcoder, and CodeT5+), profiles each model's performance on individual API methods together with their tail properties, and uses those profiles at inference time, along with majority voting, to suppress unreliable recommendations. In the evaluation, the best three-model ensemble (CodeT5, MulaRec, and UniXcoder) reaches a true accept rate of 83.8% with a rejection rate of 80.7% when voting is restricted to highly reliable candidates. The five-model configuration with simple majority voting reaches a true accept rate of 83.1% with a lower rejection rate of 69.0%, giving a better balance between the two metrics.

📝 Abstract

Machine learning (ML)-based API recommendation helps developers efficiently identify suitable APIs to complement the application code. However, code datasets used to train ML models often exhibit a long-tail distribution, leading to unreliable API recommendations, especially for infrequently used API methods at the tail of the distribution. To address this issue, we propose N-version API Recommendation (NvRec), which leverages N different versions of ML models to enhance the reliability of API sequence recommendations by suppressing unreliable outputs involving tail APIs. NvRec takes a set of available ML models and profiles their performance on individual API methods together with their tail properties. The generated model profile is used at inference time to filter out unreliable API recommendations and determine the final output. We implement NvRec using five API recommendation models, namely CodeBERT, CodeT5, MulaRec, UniXcoder, and CodeT5+, and evaluate it on a public benchmark dataset constructed from compilable Java projects. For the three-version NvRec, we find that the combination of CodeT5, MulaRec, and UniXcoder achieves the highest true accept rate of 83.8%, with a rejection rate of 80.7%, when majority voting is restricted to highly reliable candidates. In contrast, the five-version configuration achieves its highest true accept rate of 83.1% with simple majority voting, while reducing the rejection rate to 69.0%. Overall, the five-version configuration offers a better balance between true accept rate and rejection rate.
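
The filtering-and-voting idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`is_reliable`, `nv_recommend`), the profile layout (a per-model map from API method to a reliability score), the `threshold` parameter, and the strict-majority rule are all assumptions made for illustration. Setting `threshold` to 0.0 lets every output through the filter and corresponds to plain majority voting; a higher value restricts voting to candidates the profile considers reliable.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

ApiSeq = Tuple[str, ...]  # a recommended API sequence, e.g. ("List.add", "List.size")


def is_reliable(seq: ApiSeq, profile: Dict[str, float], threshold: float) -> bool:
    """Assumed rule: a sequence is reliable only if the model's profiled
    score for every API method in it meets the threshold.
    Methods missing from the profile are scored 0.0."""
    return all(profile.get(api, 0.0) >= threshold for api in seq)


def nv_recommend(
    outputs: Dict[str, ApiSeq],
    profiles: Dict[str, Dict[str, float]],
    threshold: float = 0.0,
) -> Optional[ApiSeq]:
    """Return the sequence backed by a strict majority of all N model
    versions after profile-based filtering, or None to reject."""
    # Keep only outputs that the producing model's own profile deems reliable.
    candidates = [
        seq
        for model, seq in outputs.items()
        if is_reliable(seq, profiles.get(model, {}), threshold)
    ]
    if not candidates:
        return None  # every output was filtered out: reject
    seq, votes = Counter(candidates).most_common(1)[0]
    # Hypothetical acceptance rule: more than half of all N versions must agree.
    return seq if votes > len(outputs) / 2 else None


# Hypothetical outputs and profiles for three model versions.
outputs = {
    "model_a": ("List.add", "List.size"),
    "model_b": ("List.add", "List.size"),
    "model_c": ("Map.put",),
}
profiles = {
    "model_a": {"List.add": 0.9, "List.size": 0.8},
    "model_b": {"List.add": 0.9, "List.size": 0.4},
    "model_c": {"Map.put": 0.2},
}

print(nv_recommend(outputs, profiles))                 # simple majority voting
print(nv_recommend(outputs, profiles, threshold=0.5))  # voting over reliable candidates only
```

In this toy example the stricter threshold removes one of the two agreeing outputs, so the input is rejected. That mirrors the trade-off reported above, where restricting the vote to highly reliable candidates goes hand in hand with a higher rejection rate.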

Problem

Research questions and friction points this paper is trying to address.

long-tail distribution

API recommendation

reliability

tail APIs

machine learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

N-version learning

API recommendation

long-tail distribution