Template-based Object Detection Using a Foundation Model

📅 2026-03-20
🤖 AI Summary
This work proposes a training-free object detection method for scenarios with little data variation where annotating data and training a model are impractical, such as automated GUI testing. It uses a segmentation foundation model (e.g., SAM) to generate image segments and classifies them with classical feature engineering, so it adapts to new targets or interface changes without training or labeled data. Evaluated on an in-vehicle navigation icon detection task, the method performs almost on par with learning-based detectors such as YOLO while eliminating model training entirely, which reduces deployment time and cost.

📝 Abstract
Most object detection methods in current use are learning-based and can detect objects under varying appearances. Such models require training and a training dataset. We focus on use cases with less data variation in which neither generating training data nor training a model is acceptable. Such a setup is desired, for example, in the automatic testing of graphical interfaces during software development, especially in continuous integration testing. In our approach, we take segments from segmentation foundation models and combine them with a simple feature-based classification method. This saves time and cost when the object to be searched for or its design changes, as nothing has to be retrained and no dataset has to be created. We evaluate our method on the task of detecting and classifying icons in navigation maps, a task used to simplify and automate the testing of user interfaces in the automotive industry. Our methods achieve results almost on par with learning-based object detection methods like YOLO, without the need for training.
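As a rough illustration of the pipeline described above, the following Python sketch classifies precomputed segments by comparing a simple color-histogram feature against template icons. It is not the authors' implementation: the abstract does not say which features or matching rule they use, so the histogram feature, the histogram-intersection score, the `min_score` threshold, and all function names here are assumptions. The masks are assumed to come from a segmentation foundation model and are passed in as boolean arrays; NumPy is the only dependency.

```python
import numpy as np


def mask_to_bbox(mask):
    """Return (top, left, bottom, right) of the True region of a boolean mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1


def color_histogram(patch, bins=4):
    """L1-normalized joint RGB histogram of an (H, W, 3) uint8 patch."""
    pixels = patch.reshape(-1, 3).astype(float)
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=((0, 256),) * 3)
    hist = hist.ravel()
    return hist / max(hist.sum(), 1.0)


def classify_segments(image, masks, templates, min_score=0.8):
    """Label each segment with its best-matching template, or drop it.

    image:     (H, W, 3) uint8 screenshot
    masks:     iterable of (H, W) boolean arrays, one per segment
               (assumed output of a segmentation model)
    templates: dict mapping a class name to an (h, w, 3) uint8 template icon
    min_score: hypothetical acceptance threshold on the similarity score
    """
    template_feats = {name: color_histogram(t) for name, t in templates.items()}
    detections = []
    for mask in masks:
        if not mask.any():
            continue  # skip empty segments
        top, left, bottom, right = mask_to_bbox(mask)
        feat = color_histogram(image[top:bottom, left:right])
        # Histogram intersection: 1.0 means identical color distributions.
        scores = {
            name: float(np.minimum(feat, tf).sum())
            for name, tf in template_feats.items()
        }
        best = max(scores, key=scores.get)
        if scores[best] >= min_score:
            detections.append((best, (top, left, bottom, right), scores[best]))
    return detections
```

Because the templates are the only class-specific input, changing an icon's design in this sketch means replacing one entry in `templates`; nothing is retrained, which is the property the abstract emphasizes. A color histogram ignores shape, so a practical variant would likely need richer features.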
Problem

Research questions and friction points this paper is trying to address.

template-based object detection
foundation model
training-free detection
GUI testing
icon detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

template-based object detection
foundation model
zero-shot detection
UI testing automation
feature-based classification
Valentin Braeutigam
Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058 Erlangen, Germany
Matthias Stock
e.solutions GmbH, 91058 Erlangen, Germany
Bernhard Egger
Professor, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
Computational Cognitive Science, Computer Vision, Machine Learning, Faces, Statistical Shape