AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks

📅 2024-02-23

🏛️ arXiv.org

📈 Citations: 5

✨ Influential: 1

🤖 AI Summary

Problem: Computer vision research currently lacks an end-to-end system that automatically generates deployable models directly from natural language requirements.

Method: This paper proposes a novel "request-to-model" paradigm and introduces AutoMMLab, the first open-source platform for end-to-end vision modeling, together with the LAMP benchmark. It further designs HPO-LLaMA, an LLM-driven hyperparameter optimization algorithm. The platform integrates natural language understanding, automated training and deployment pipelines, and a multi-stage evaluation framework.

Contributions/Results: (1) The first fully automated, language-instruction-driven vision modeling pipeline; (2) HPO-LLaMA achieves over 40% improvement in hyperparameter search efficiency across multiple CV tasks; (3) the datasets, code, and benchmarks are open-sourced to support accessible, reproducible vision model development.

📝 Abstract

Automated machine learning (AutoML) is a collection of techniques designed to automate the machine learning development process. While traditional AutoML approaches have been successfully applied to several critical steps of model development (e.g., hyperparameter optimization), no existing AutoML system automates the entire end-to-end model production workflow for computer vision. To fill this gap, we propose a novel request-to-model task, which involves understanding the user's natural language request and executing the entire workflow to output production-ready models. This empowers non-experts to easily build task-specific models via a user-friendly language interface. To facilitate development and evaluation, we develop a new experimental platform called AutoMMLab and a new benchmark called LAMP for studying key components of the end-to-end request-to-model pipeline. Hyperparameter optimization (HPO) is one of the most important components of AutoML. Traditional approaches mostly rely on trial and error, which leads to inefficient parameter search. To solve this problem, we propose a novel LLM-based HPO algorithm called HPO-LLaMA. Equipped with extensive knowledge and experience in model hyperparameter tuning, HPO-LLaMA significantly improves HPO efficiency. Dataset and code are available at https://github.com/yang-ze-kang/AutoMMLab.
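The contrast the abstract draws between trial-and-error search and a proposer that uses prior tuning experience can be illustrated with a small sketch. This is not the HPO-LLaMA implementation: every name below (`toy_objective`, `random_proposer`, `history_aware_proposer`, `run_hpo`) is hypothetical, the objective is a toy function standing in for "train a model and report a validation score", and the history-aware proposer is a simple heuristic placed where an LLM call might go.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
History = List[Tuple[Config, float]]


def toy_objective(cfg: Config) -> float:
    """Hypothetical stand-in for training plus validation (higher is better)."""
    # Peaks at lr = 1e-2 and weight_decay = 1e-4; purely illustrative.
    lr_term = ((cfg["lr"] - 1e-2) ** 2) * 1e4
    wd_term = ((cfg["weight_decay"] - 1e-4) ** 2) * 1e6
    return -lr_term - wd_term


def random_proposer(history: History, rng: random.Random) -> Config:
    """Trial-and-error baseline: ignores all previous results."""
    return {"lr": rng.uniform(1e-4, 1e-1), "weight_decay": rng.uniform(0.0, 1e-3)}


def history_aware_proposer(history: History, rng: random.Random) -> Config:
    """Placeholder for an LLM call: perturbs the best configuration seen so far."""
    if not history:
        return {"lr": 1e-3, "weight_decay": 5e-4}
    best_cfg, _ = max(history, key=lambda item: item[1])
    lr = best_cfg["lr"] * rng.uniform(0.5, 2.0)
    wd = best_cfg["weight_decay"] * rng.uniform(0.5, 2.0)
    # Clamp both values to the same search range the baseline samples from.
    return {"lr": min(1e-1, max(1e-4, lr)), "weight_decay": min(1e-3, max(0.0, wd))}


def run_hpo(
    proposer: Callable[[History, random.Random], Config],
    objective: Callable[[Config], float],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Config, float]:
    """Propose, evaluate, record, repeat; return the best (config, score) pair."""
    rng = random.Random(seed)
    history: History = []
    for _ in range(n_trials):
        cfg = proposer(history, rng)
        history.append((cfg, objective(cfg)))
    return max(history, key=lambda item: item[1])


if __name__ == "__main__":
    for name, proposer in [("random", random_proposer), ("history-aware", history_aware_proposer)]:
        best_cfg, best_score = run_hpo(proposer, toy_objective, n_trials=5)
        print(name, best_cfg, round(best_score, 4))
```

Both proposers share one loop, so the only design difference is whether the next configuration depends on the recorded history. In a real system the objective would be a full training run, which is why reducing the number of trials matters.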

Problem

Research questions and friction points this paper is trying to address.

Automatic Model Generation

Natural Language Processing

Computer Vision

Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated Machine Learning

HPO-LLaMA Algorithm

Natural Language to Vision Model

🔎 Similar Papers

AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML