General Hazard Detection

📅 2026-05-22

🤖 AI Summary
Existing hazard detection systems are constrained by predefined categories and a reliance on large-scale annotated data, so they struggle with abstract safety concepts when data is sparse, definitions evolve, and models must generalize across scenarios. This work proposes a language-rule-based framework for general-purpose hazard detection in which safety requirements are expressed as natural language rules, decoupled from image examples. By integrating a vision-language model (LLaVA), rule-driven compliance evaluation, active learning, and human-in-the-loop feedback, the approach targets context-sensitive, fine-grained judgments. To support this paradigm, the authors introduce CompliVision, a dataset of 3,006 multi-domain images, each annotated with rule-compliance labels and natural language explanations, with the aim of improving model generalization to unseen scenarios.
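To make the idea of rules decoupled from image examples concrete, here is a minimal Python sketch of what a rule record and a per-image compliance annotation might look like. Every name in it (`SafetyRule`, `ComplianceAnnotation`, `build_prompt`, `parse_verdict`), the example rule text, and the prompt wording are hypothetical illustrations, not the CompliVision schema or the authors' prompts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyRule:
    """A safety requirement stated in natural language (hypothetical schema)."""
    rule_id: str
    domain: str  # e.g. "traffic", "construction", "warehouse"
    text: str


@dataclass(frozen=True)
class ComplianceAnnotation:
    """One image judged against one rule, with a written justification."""
    image_path: str
    rule_id: str
    compliant: bool
    explanation: str


def build_prompt(rule: SafetyRule) -> str:
    """Turn a rule into a text prompt for a vision-language model.

    The wording is an assumption; the image is passed to the model separately.
    """
    return (
        f"Safety rule ({rule.domain}): {rule.text}\n"
        "Does the image comply with this rule? "
        "Answer 'compliant' or 'non-compliant', then cite the visual evidence."
    )


def parse_verdict(answer: str) -> bool:
    """Map a model answer to a boolean; check 'non-compliant' first."""
    normalized = answer.strip().lower()
    if normalized.startswith("non-compliant"):
        return False
    if normalized.startswith("compliant"):
        return True
    raise ValueError(f"Unrecognised verdict: {answer!r}")


# Illustrative rule text, not taken from the dataset.
rule = SafetyRule("R1", "construction", "Workers must wear hard hats on site.")
prompt = build_prompt(rule)
```

Because the rule lives in its own record, a changed definition only requires editing `text`; the images and the evaluation code stay the same.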
📝 Abstract
Hazard, as an abstract concept, is typically defined through cognitive-level logical reasoning rather than concrete examples. In contrast, existing hazard detection systems rely on predefined hazard categories and require intensive collection of labelled examples within detection or classification architectures. This approach faces three fundamental challenges when addressing abstract safety concepts: (1) noisy and sparse training data, (2) dynamically evolving definitions that change across contexts and time, and (3) limited generalisation to unseen or novel scenarios. To address these limitations, we present the CompliVision dataset, the first general-purpose hazard dataset designed for rule-based compliance assessment, along with a baseline framework for hazard evaluation. Our key innovation is decoupling the hazard concept from image-based examples by expressing safety requirements through language-based rules. We ground our approach in authoritative domain regulations and ISO standards to define diverse hazard concepts across multiple domains. The CompliVision dataset comprises 3,006 images spanning traffic, construction, and warehouse environments, with each image annotated for compliance against specific safety rules and accompanied by natural language explanations highlighting the supporting visual evidence. To achieve robust generalisation, we develop an active learning framework that guides and refines vision-language models (VLMs) in assessing hazard compliance. While state-of-the-art VLMs demonstrate strong capabilities, they struggle with the fine-grained, context-dependent interpretation required for accurate safety assessment. To address this limitation, we propose a general hazard detection framework that combines LLaVA-based visual reasoning with human-in-the-loop feedback.
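The abstract does not say how the active learning loop chooses which images to send to human reviewers. As a rough illustration only, the Python sketch below assumes plain uncertainty sampling, a common acquisition strategy that may differ from the authors' method. The function names and the example probabilities are hypothetical.

```python
def uncertainty(p_compliant: float) -> float:
    """Score in [0, 1]: highest at p = 0.5, zero at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p_compliant - 1.0)


def select_for_review(predictions: dict[str, float], budget: int) -> list[str]:
    """Pick the `budget` images whose compliance probability is least certain.

    `predictions` maps an image path to the model's estimated probability
    that the image complies with the rule under evaluation (assumed input).
    """
    ranked = sorted(
        predictions,
        key=lambda path: uncertainty(predictions[path]),
        reverse=True,
    )
    return ranked[:budget]


def apply_feedback(labels: dict[str, bool], feedback: dict[str, bool]) -> dict[str, bool]:
    """Merge human verdicts into the label set; human labels take precedence."""
    merged = dict(labels)
    merged.update(feedback)
    return merged


# Made-up probabilities for four images.
predictions = {"a.jpg": 0.95, "b.jpg": 0.52, "c.jpg": 0.10, "d.jpg": 0.40}
queue = select_for_review(predictions, budget=2)

# A reviewer labels the queued images; the merged labels would feed the next
# fine-tuning round in a human-in-the-loop setup.
labels = apply_feedback({"a.jpg": True}, {"b.jpg": False, "d.jpg": False})
```

With these inputs the two images nearest the decision boundary, `b.jpg` and `d.jpg`, are queued first, so reviewer effort goes to the cases the model finds hardest.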
Problem

Research questions and friction points this paper is trying to address.

hazard detection
abstract safety concepts
generalisation
dynamic definitions
sparse training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

rule-based hazard detection
vision-language models
compliance assessment
active learning
general-purpose hazard dataset