🤖 AI Summary
Methods for designing biological sequences have lacked a single, shared software library for evaluation, which makes fair and reproducible comparison between them difficult. To address this, the authors introduce seqme, a modular, extendable, open-source Python library of model-agnostic metrics in three groups: sequence-based, embedding-based, and property-based. It supports evaluation of both one-shot and iterative design methods across several sequence types, including small molecules, DNA, ncRNA, mRNA, peptides, and proteins. The library provides embedding models and property models for biological sequences, together with diagnostics and visualization functions for inspecting results. The library is intended to make evaluation more standardized, comparable across methods, and transparent.
📝 Abstract
Recent advances in computational methods for designing biological sequences have sparked the development of metrics to evaluate these methods' performance, both in terms of the fidelity of the designed sequences to a target distribution and in terms of their attainment of desired properties. However, a single software library implementing these metrics has been lacking. In this work, we introduce seqme, a modular and highly extendable open-source Python library containing model-agnostic metrics for evaluating computational methods for biological sequence design. seqme considers three groups of metrics (sequence-based, embedding-based, and property-based) and is applicable to a wide range of biological sequences: small molecules, DNA, ncRNA, mRNA, peptides, and proteins. The library offers a number of embedding and property models for biological sequences, as well as diagnostics and visualization functions to inspect the results. seqme can be used to evaluate both one-shot and iterative computational design methods.
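To make the three metric groups concrete, here is a minimal sketch in plain Python. It does not use seqme and does not reflect its API: every function name, the toy composition "embedding", and the GC-content "property" are illustrative assumptions, standing in for the pretrained embedding models and property predictors that a real evaluation would use.

```python
from math import sqrt
from statistics import mean

ALPHABET = "ACGU"  # assumed RNA alphabet for this toy example


def uniqueness(designed):
    """Sequence-based: fraction of designed sequences that are distinct."""
    return len(set(designed)) / len(designed)


def novelty(designed, reference):
    """Sequence-based: fraction of designed sequences absent from the reference set."""
    known = set(reference)
    return sum(seq not in known for seq in designed) / len(designed)


def composition_embedding(seq):
    """Toy embedding (an assumption): per-letter frequency vector over ALPHABET."""
    return [seq.count(ch) / len(seq) for ch in ALPHABET]


def mean_embedding_distance(designed, reference):
    """Embedding-based: Euclidean distance between the centroids of the two
    sets in embedding space, a crude proxy for distribution-level distances."""
    def centroid(seqs):
        vectors = [composition_embedding(s) for s in seqs]
        return [mean(column) for column in zip(*vectors)]

    a, b = centroid(designed), centroid(reference)
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_gc_content(designed):
    """Property-based: average of a per-sequence property (here, GC fraction)."""
    return mean((s.count("G") + s.count("C")) / len(s) for s in designed)


if __name__ == "__main__":
    designed = ["ACGU", "ACGU", "GGCC"]
    reference = ["ACGU", "AAUU"]
    print("uniqueness:", uniqueness(designed))
    print("novelty:", novelty(designed, reference))
    print("embedding distance:", mean_embedding_distance(designed, reference))
    print("mean GC content:", mean_gc_content(designed))
```

All four functions take only lists of sequences and never see the generating model, which is what model-agnostic means here. The same calls would apply to the output of a one-shot method or to each round of an iterative one.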