AI Summary
This work addresses the computational inefficiency and deployment difficulty of biomedical entity linking (BEL) with large language models. It introduces, for the first time, an instruction-tuned open-source generative model into the BEL re-ranking stage and proposes a set-wise instruction-tuning formulation that enables fast and accurate candidate entity selection. The approach is integrated into BeLink, a modular end-to-end BEL system. Evaluated on multiple BEL benchmarks, the method improves linking accuracy by 3%-24% over state-of-the-art techniques while reducing inference time, giving a better balance between accuracy and efficiency and making BEL more practical in real-world applications.
Abstract
Despite recent progress, Biomedical Entity Linking (BEL) with large language models (LLMs) remains computationally inefficient and challenging to deploy in practical settings. In this work, we demonstrate that instruction-tuning of open-source generative models can offer an effective solution when applied at the re-ranking stage of the BEL pipeline. We propose a set-wise instruction-tuning formulation that enables fast and accurate candidate selection. Our method demonstrates strong performance on multiple BEL benchmarks, yielding significant improvements in linking accuracy (3%-24%) while reducing inference time compared to the state-of-the-art. We integrate our generative re-ranker into BeLink, a modular, end-to-end system designed for practical real-world BEL applications.
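To make the set-wise idea concrete, the following is a minimal sketch of how a re-ranking prompt might present the whole candidate set to a generative model in a single pass and then map the generated answer back to one candidate. It is an illustration under stated assumptions, not the authors' implementation: the prompt wording, the `Candidate` type, the functions `build_setwise_prompt` and `parse_choice`, and the placeholder identifiers are all hypothetical, and the model call itself is left out.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One candidate entity from an upstream retriever (hypothetical type)."""
    concept_id: str  # placeholder identifier, not a real ontology ID
    name: str


def build_setwise_prompt(mention: str, context: str,
                         candidates: Sequence[Candidate]) -> str:
    """Build one prompt that lists every candidate at once (set-wise),
    rather than scoring each mention-candidate pair separately.
    The instruction wording is an assumption for illustration."""
    lines = [
        "Select the candidate that best matches the mention.",
        f"Context: {context}",
        f"Mention: {mention}",
        "Candidates:",
    ]
    for number, cand in enumerate(candidates, start=1):
        lines.append(f"{number}. {cand.name}")
    lines.append("Answer with the number of the best candidate.")
    return "\n".join(lines)


def parse_choice(generated: str,
                 candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Map the model's generated text back to a candidate.
    Returns None when no valid candidate number is found."""
    match = re.search(r"\d+", generated)
    if match is None:
        return None
    index = int(match.group()) - 1
    if 0 <= index < len(candidates):
        return candidates[index]
    return None
```

In this sketch the model sees all candidates in one prompt and emits a short answer, so a single generation replaces one forward pass per candidate. That is one plausible reason a set-wise formulation could reduce inference time, though the abstract does not spell out the mechanism.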