Analyzing Domestic Violence through Exploratory Data Analysis and Explainable Ensemble Learning Insights

📅 2024-03-22
🤖 AI Summary
This study presents what the authors describe as the first investigation of male-directed domestic violence (MDV), using survey data from nine major cities in Bangladesh. Exploratory analysis points to a high prevalence of verbal abuse, financial dependency, and familial and socio-economic factors as key influences. Method: To address a 5:1 class imbalance and limited model interpretability, we propose a stacking ensemble for MDV detection with ANN and CatBoost as base classifiers and logistic regression as the meta-model, explained with SHAP and LIME. Contribution/Results: Evaluated against 10 traditional ML models, 3 deep learning models, and a hybrid ensemble, our model achieves 95% accuracy and 99.29% AUC; paired t-tests over 10-fold cross-validation with Bonferroni correction (alpha = 0.0036) support its advantage over every alternative. The result is an interpretable, high-accuracy risk identification tool for MDV that can inform evidence-based intervention design.
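
The stacking design named in the summary (ANN and CatBoost as base classifiers, logistic regression as the meta-model) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic, `MLPClassifier` stands in for the ANN, and `GradientBoostingClassifier` stands in for CatBoost so the sketch needs only scikit-learn. The layer sizes, sample counts, and other hyperparameters are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the survey data (NOT the paper's dataset):
# roughly five samples of one class per sample of the other (5:1).
X, y = make_classification(
    n_samples=1200, n_features=12, n_informative=6,
    weights=[5 / 6, 1 / 6], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

base_learners = [
    # Small feed-forward network standing in for the paper's ANN (assumption).
    ("ann", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0),
    )),
    # Stand-in for CatBoost; catboost.CatBoostClassifier could be used here
    # instead if the catboost package is installed.
    ("gbdt", GradientBoostingClassifier(random_state=0)),
]

# The meta-model is trained on out-of-fold probabilities from the base
# learners, which keeps it from fitting to their training-set predictions.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)

proba = stack.predict_proba(X_test)[:, 1]
pred = stack.predict(X_test)
accuracy = accuracy_score(y_test, pred)
auc = roc_auc_score(y_test, proba)
print(f"accuracy={accuracy:.3f}  AUC={auc:.3f}")
```

The scores printed here describe the synthetic data only and say nothing about the paper's reported 95% accuracy or 99.29% AUC.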

📝 Abstract
Domestic violence is commonly viewed as a gendered issue that primarily affects women, which leaves male victims largely overlooked. This study explores male domestic violence (MDV) for the first time, highlighting the factors that influence it and tackling the challenges posed by a 5:1 class imbalance and limited data. We collected data from nine major cities in Bangladesh and conducted exploratory data analysis (EDA) to understand the underlying dynamics. EDA revealed patterns such as the high prevalence of verbal abuse, the influence of financial dependency, and the role of familial and socio-economic factors in MDV. To predict and analyze MDV, we implemented 10 traditional machine learning (ML) models, three deep learning models, and two ensemble models, including stacking and hybrid approaches. We propose a stacking ensemble model with ANN and CatBoost as base classifiers and Logistic Regression as the meta-model; it demonstrated the best performance, achieving 95% accuracy, a 99.29% AUC, and balanced metrics across evaluation criteria. Model-specific feature importance analysis of the base classifiers identified the key features influencing their individual decisions. The model-agnostic explainable AI techniques SHAP and LIME provided local and global insights into the decision-making of the proposed model, enhancing transparency and interpretability. Additionally, statistical validation using paired t-tests with 10-fold cross-validation and Bonferroni correction (alpha = 0.0036) confirmed the superior performance of our proposed model over the alternatives.
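
The statistical validation step can be illustrated with a short, standard-library-only sketch. It assumes the corrected threshold comes from dividing a conventional 0.05 level by 14 pairwise comparisons (the 15 models listed above minus the proposed one); that reading is consistent with the reported alpha of 0.0036 but is not stated in the abstract. The per-fold accuracies are hypothetical placeholders, not results from the paper, and the helper names are illustrative.

```python
import math
from statistics import mean, stdev


def t_two_sided_p(t_stat, df, steps=20000):
    """Two-sided p-value for Student's t, via Simpson integration of the pdf."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_mass = 2 * (h / 3) * total  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_mass)


def paired_t_test(scores_a, scores_b):
    """Paired t-test on matched per-fold scores; returns (t, p)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, t_two_sided_p(t_stat, df=n - 1)


# Bonferroni correction: family-wise level divided by the number of tests.
n_comparisons = 14            # assumption: 15 models minus the proposed one
alpha = 0.05 / n_comparisons  # about 0.00357, which rounds to 0.0036

# Hypothetical 10-fold accuracies (placeholders, NOT the paper's numbers).
proposed = [0.95, 0.94, 0.96, 0.95, 0.93, 0.96, 0.95, 0.94, 0.96, 0.95]
baseline = [0.90, 0.91, 0.92, 0.90, 0.89, 0.92, 0.91, 0.90, 0.91, 0.90]

t_stat, p_value = paired_t_test(proposed, baseline)
significant = p_value < alpha
print(f"alpha={alpha:.4f}  t={t_stat:.2f}  significant={significant}")
```

The test is paired because both models are scored on the same ten folds, so fold-to-fold difficulty cancels out in the differences. With SciPy available, `scipy.stats.ttest_rel` would replace the hand-written p-value calculation.
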
Problem

Research questions and friction points this paper is trying to address.

Domestic Violence
Data Imbalance
Model Transparency
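
The data imbalance point can be made concrete with a small calculation. At a 5:1 class ratio, a classifier that always predicts the majority class already reaches about 83% accuracy while detecting none of the minority class, which is why balanced metrics matter alongside raw accuracy. The function name and counts below are illustrative, and the abstract does not say which class is the majority.

```python
def majority_baseline(n_majority, n_minority):
    """Metrics for a classifier that always predicts the majority class."""
    total = n_majority + n_minority
    accuracy = n_majority / total
    minority_recall = 0.0  # no minority sample is ever predicted
    balanced_accuracy = (1.0 + minority_recall) / 2  # mean of per-class recall
    return accuracy, minority_recall, balanced_accuracy


# Illustrative counts at the reported 5:1 ratio (not the paper's sample sizes).
accuracy, minority_recall, balanced_accuracy = majority_baseline(500, 100)
print(f"accuracy={accuracy:.3f}  balanced_accuracy={balanced_accuracy:.3f}")
```
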
Innovation

Methods, ideas, or system contributions that make the work stand out.

Male Victims of Domestic Violence
Machine Learning and Deep Learning Models
Interpretability and Transparency in Predictive Models
Md Abrar Jahin
Center on Knowledge Graphs, Information Sciences Institute, University of Southern California
Deep Learning, Quantum Machine Learning, Geometric Deep Learning, Trustworthy AI
Saleh Akram Naife
Department of Industrial Engineering and Management, Khulna University of Engineering and Technology (KUET), Khulna, 9203, Bangladesh
Fatema Tuj Johora Lima
Department of Political Studies, Shahjalal University of Science and Technology, Sylhet, 3114, Bangladesh
M. F. Mridha
Department of Computer Science, American International University-Bangladesh, Dhaka, 1229, Bangladesh
Jungpil Shin
Department of Computer Science and Engineering, The University of Aizu, Aizuwakamatsu, 965-8580, Japan