torchmil: A PyTorch-based library for deep Multiple Instance Learning

📅 2025-09-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Deep multiple instance learning (MIL) research lacks standardized tooling for model development, evaluation, and comparison, which hinders reproducibility and accessibility. To address this, the authors introduce *torchmil*, an open-source Python library for deep MIL built on PyTorch. Designed to be unified, modular, and extensible, *torchmil* provides basic building blocks for MIL models, a standardized data format, and a curated collection of benchmark datasets and models. It ships with documentation and tutorials aimed at both practitioners and researchers. The stated goal is to accelerate progress in MIL and lower the entry barrier for new users.

📝 Abstract

Multiple Instance Learning (MIL) is a powerful framework for weakly supervised learning, particularly useful when fine-grained annotations are unavailable. Despite growing interest in deep MIL methods, the field lacks standardized tools for model development, evaluation, and comparison, which hinders reproducibility and accessibility. To address this, we present torchmil, an open-source Python library built on PyTorch. torchmil offers a unified, modular, and extensible framework, featuring basic building blocks for MIL models, a standardized data format, and a curated collection of benchmark datasets and models. The library includes comprehensive documentation and tutorials to support both practitioners and researchers. torchmil aims to accelerate progress in MIL and lower the entry barrier for new users. Available at https://torchmil.readthedocs.io.
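To make the setting concrete, here is a minimal sketch of a deep MIL model in which only bag-level labels are available and the model must aggregate instance features into one bag prediction. It is written in plain PyTorch and uses attention-based pooling as one common aggregation choice. It is not torchmil's API: the class name `AttentionMILClassifier`, the tensor layout, and the `mask` convention are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class AttentionMILClassifier(nn.Module):
    """Hypothetical bag-level classifier with attention-based pooling."""

    def __init__(self, in_dim: int, hidden_dim: int = 64) -> None:
        super().__init__()
        # Produces one unnormalized attention score per instance.
        self.attention = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        # Maps the pooled bag embedding to a single logit.
        self.classifier = nn.Linear(in_dim, 1)

    def forward(self, bags: torch.Tensor, mask: torch.Tensor):
        # bags: (batch, max_instances, in_dim)
        # mask: (batch, max_instances), True where an instance is real
        scores = self.attention(bags).squeeze(-1)
        # Padded positions get -inf so softmax assigns them zero weight.
        scores = scores.masked_fill(~mask, float("-inf"))
        weights = torch.softmax(scores, dim=1)
        # Weighted sum of instance features gives the bag embedding.
        bag_embedding = torch.einsum("bn,bnd->bd", weights, bags)
        logits = self.classifier(bag_embedding).squeeze(-1)
        return logits, weights


torch.manual_seed(0)
bags = torch.randn(2, 5, 16)  # two bags, up to five instances each
mask = torch.tensor([
    [True, True, True, True, True],
    [True, True, True, False, False],  # second bag has three instances
])
model = AttentionMILClassifier(in_dim=16)
logits, weights = model(bags, mask)
```

The returned `weights` sum to one within each bag and can be read as a rough indication of which instances drove the bag-level prediction. The sketch assumes every bag contains at least one real instance; an all-padding bag would produce NaN weights.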

Problem

Research questions and friction points this paper is trying to address.

Deep Multiple Instance Learning lacks standardized tools for model development

MIL research suffers from limited reproducibility

No unified framework exists for evaluating and comparing MIL methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

PyTorch-based library for MIL

Modular framework with a standardized data format

Includes benchmark datasets and documentation
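As an illustration of why a standardized data format matters for MIL, the sketch below batches bags of different sizes by padding them to a common length and recording which positions are real instances. This is one plausible convention, not torchmil's actual format: the function name `collate_bags` and the `(features, label)` sample layout are hypothetical.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def collate_bags(samples):
    """Pad variable-sized bags to a common length and build a boolean mask.

    samples: list of (features, label) pairs, where features has shape
    (num_instances, dim) and label is a bag-level scalar.
    """
    features = [x for x, _ in samples]
    labels = torch.tensor([y for _, y in samples], dtype=torch.float32)
    lengths = torch.tensor([x.shape[0] for x in features])
    # Zero-pad along the instance axis: (batch, max_instances, dim).
    padded = pad_sequence(features, batch_first=True)
    # mask[i, j] is True when instance j of bag i is real, False for padding.
    mask = torch.arange(padded.shape[1])[None, :] < lengths[:, None]
    return padded, mask, labels


samples = [(torch.ones(3, 4), 1), (torch.ones(5, 4), 0)]
padded, mask, labels = collate_bags(samples)
```

A function like this could be passed as `collate_fn` to `torch.utils.data.DataLoader`, so that downstream models receive bags in one consistent layout regardless of bag size.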
