🤖 AI Summary
To address the challenge of promptly detecting diverse and sporadic violent behaviors, where conventional surveillance methods fall short, this paper proposes an end-to-end video violence detection and classification system tailored for real-time security applications. Methodologically, it introduces a hybrid architecture that combines 3D CNNs with a separable 3D convolutional feature extractor followed by a bidirectional LSTM, enabling fine-grained temporal modeling. It establishes a unified frame-level annotation protocol for heterogeneous, cross-source videos (e.g., surveillance footage, smartphone recordings, sports broadcasts, and synthetic data) and curates a customized multi-source dataset accordingly. The system is deployed on a Raspberry Pi edge platform, supporting a fully automated pipeline from video acquisition and feature extraction to multi-class violence recognition. Evaluated on a multi-source mixed test set, it achieves 92.7% accuracy with a 38% reduction in inference latency, significantly improving edge resource efficiency and real-time responsiveness.
📝 Abstract
The increasing global crime rate, coupled with substantial human and property losses, highlights the limitations of traditional surveillance methods in promptly detecting diverse and unexpected acts of violence. Addressing this pressing need for automatic violence detection, we leverage Machine Learning to detect and categorize violent events in video streams. This paper introduces a comprehensive framework for violence detection and classification, employing Supervised Learning for both binary and multi-class violence classification. The detection model relies on 3D Convolutional Neural Networks, while the classification model uses a separable 3D convolutional model for feature extraction and a bidirectional LSTM for temporal processing. Training is conducted on a diverse, customized dataset with frame-level annotations, incorporating videos from surveillance cameras, human recordings, and the Hockey Fight, SOHAS, and WVD datasets collected across various platforms. Additionally, a camera module integrated with a Raspberry Pi captures a live video feed, which is sent to the ML model for processing. The resulting system demonstrates improved performance in terms of computational resource efficiency and accuracy.
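To make the described classification pipeline concrete (a separable 3D convolutional feature extractor feeding a bidirectional LSTM for temporal processing), the following is a minimal PyTorch-style sketch. All layer widths, depths, and class counts here are illustrative assumptions, not the paper's actual configuration; `SeparableConv3D` and `ViolenceClassifier` are hypothetical names introduced for this example.

```python
import torch
import torch.nn as nn

class SeparableConv3D(nn.Module):
    """Factorized 3D convolution: a spatial (1,k,k) conv followed by a
    temporal (k,1,1) conv. Sketch only; sizes are illustrative."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        p = k // 2
        self.spatial = nn.Conv3d(in_ch, out_ch, (1, k, k), padding=(0, p, p))
        self.temporal = nn.Conv3d(out_ch, out_ch, (k, 1, 1), padding=(p, 0, 0))
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.temporal(self.act(self.spatial(x))))

class ViolenceClassifier(nn.Module):
    """Separable-3D-conv feature extractor + bidirectional LSTM head,
    mirroring the abstract's architecture at a toy scale."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            SeparableConv3D(3, 16),
            nn.MaxPool3d((1, 2, 2)),
            SeparableConv3D(16, 32),
            # Keep the time axis, collapse the spatial axes to 1x1.
            nn.AdaptiveAvgPool3d((None, 1, 1)),
        )
        self.lstm = nn.LSTM(32, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, num_classes)

    def forward(self, clip):  # clip: (batch, channels, frames, H, W)
        f = self.features(clip)                        # (B, 32, T, 1, 1)
        f = f.squeeze(-1).squeeze(-1).transpose(1, 2)  # (B, T, 32)
        out, _ = self.lstm(f)                          # (B, T, 128)
        return self.fc(out[:, -1])                     # class logits

model = ViolenceClassifier(num_classes=4)
logits = model(torch.randn(2, 3, 16, 64, 64))  # 2 clips of 16 frames
print(logits.shape)  # torch.Size([2, 4])
```

In a deployment like the one described, frames captured by the Raspberry Pi camera module would be batched into fixed-length clips before being passed to `forward`; the binary detection model would use a plain 3D CNN in place of this multi-class head.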