Analyzing Domestic Violence through Exploratory Data Analysis and Explainable Ensemble Learning Insights

📅 2024-03-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Domestic violence research largely overlooks male victims. This study presents what it describes as the first systematic investigation of male domestic violence (MDV) in Bangladesh, based on survey data from nine major cities, and identifies a high prevalence of verbal abuse, financial dependency, and familial and socio-economic factors as key patterns. Method: To address a 5:1 class imbalance and poor model interpretability, the study proposes a stacking ensemble with ANN and CatBoost as base classifiers and logistic regression as the meta-model, complemented by interpretability analysis using SHAP and LIME. Contribution/Results: Evaluated against 10 traditional ML models, 3 deep learning models, and hybrid ensemble baselines, the proposed model achieves 95% accuracy and 99.29% AUC. Paired t-tests over 10-fold cross-validation with Bonferroni correction (alpha = 0.0036) support its advantage over all comparators. The result is an interpretable, high-accuracy risk identification model for MDV that can inform evidence-based intervention design.
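As a rough illustration of the stacking design named in the summary, the sketch below wires two base classifiers into a logistic regression meta-model with scikit-learn. Several things here are assumptions and not the paper's setup: the data is synthetic (generated with roughly a 5:1 class ratio), `MLPClassifier` stands in for the ANN, `GradientBoostingClassifier` stands in for CatBoost so the example needs only scikit-learn, and all hyperparameters are arbitrary. No rebalancing step is shown because the source does not say which one was used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with roughly a 5:1 class ratio (NOT the survey data).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    weights=[5 / 6, 1 / 6], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners. Assumption: a small MLP plays the role of the ANN, and
# scikit-learn's gradient boosting is a stand-in for CatBoost.
base_learners = [
    ("ann", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    )),
    ("gbdt", GradientBoostingClassifier(random_state=0)),
]

# The meta-model is trained on out-of-fold probabilities from the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

proba = stack.predict_proba(X_test)[:, 1]
accuracy = accuracy_score(y_test, stack.predict(X_test))
auc = roc_auc_score(y_test, proba)
print(f"accuracy={accuracy:.3f}  auc={auc:.3f}")
```

Training the meta-model on out-of-fold predictions (`cv=5`) keeps it from seeing base-learner outputs on data those learners were fitted to, which is the usual reason to prefer stacking over simply averaging the two models.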

📝 Abstract

Domestic violence is commonly viewed as a gendered issue that primarily affects women, which tends to leave male victims largely overlooked. This study explores male domestic violence (MDV) for the first time, highlighting the factors that influence it and tackling the challenges posed by a significant 5:1 class imbalance and a scarcity of data. We collected data from nine major cities in Bangladesh and conducted exploratory data analysis (EDA) to understand the underlying dynamics. EDA revealed patterns such as the high prevalence of verbal abuse, the influence of financial dependency, and the role of familial and socio-economic factors in MDV. To predict and analyze MDV, we implemented 10 traditional machine learning (ML) models, three deep learning models, and two ensemble models (a stacking ensemble and a hybrid approach). We propose a stacking ensemble model with ANN and CatBoost as base classifiers and Logistic Regression as the meta-model, which demonstrated the best performance, achieving 95% accuracy, a 99.29% AUC, and balanced metrics across evaluation criteria. Model-specific feature importance analysis of the base classifiers identified the key features influencing their individual decisions. The model-agnostic explainable AI techniques SHAP and LIME provided local and global insights into the decision-making process of the proposed model, enhancing transparency and interpretability. Additionally, statistical validation using paired t-tests over 10-fold cross-validation with Bonferroni correction (alpha = 0.0036) confirmed the superior performance of our proposed model over the alternatives.
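The statistical validation step can be sketched with the standard library alone. The example below computes a Bonferroni-corrected threshold and a paired t-test over per-fold scores. The family-wise alpha of 0.05 and the count of 14 comparisons (15 models minus the proposed one) are assumptions that happen to reproduce the reported 0.0036; the abstract does not state them. The fold accuracies are made-up illustrative numbers, not results from the paper, and the p-value is obtained by numerically integrating the Student-t density.

```python
import math
from statistics import mean, stdev


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return family_alpha / n_comparisons


def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired samples, e.g. per-fold scores of two models."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def student_t_pdf(x, df):
    """Density of the Student-t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = student_t_pdf(0.0, df) + student_t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


# Assumed: family-wise alpha 0.05 spread over 14 pairwise comparisons.
alpha = bonferroni_alpha(0.05, 14)

# Hypothetical 10-fold accuracies for the proposed model and one baseline.
proposed = [0.95, 0.94, 0.96, 0.95, 0.93, 0.96, 0.95, 0.94, 0.96, 0.95]
baseline = [0.90, 0.91, 0.89, 0.92, 0.90, 0.91, 0.90, 0.89, 0.92, 0.90]

t_stat = paired_t_statistic(proposed, baseline)
p_value = two_sided_p_value(t_stat, df=len(proposed) - 1)
print(f"alpha={alpha:.4f}  t={t_stat:.2f}  significant={p_value < alpha}")
```

A comparison counts as significant only if its p-value falls below the corrected threshold, so the bar rises with every additional baseline the proposed model is tested against.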

Problem

Research questions and friction points this paper is trying to address.

Domestic Violence

Data Imbalance

Model Transparency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Male Victims of Domestic Violence

Machine Learning and Deep Learning Models

Interpretability and Transparency in Predictive Models
