Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models

📅 2025-03-26

📈 Citations: 0

✨ Influential: 0

career value

194K/year

🤖 AI Summary

User behavior prediction in dialogue texts suffers from low-quality crowdsourced annotations and limited performance of underlying NLP tasks. Method: We propose the Metadata-Sensitive Weighted-Encoding Ensemble Model (MSWEEM), the first framework to systematically model annotator metadata—such as fatigue and response speed—to quantify individual reliability, and integrate it into a weighted encoding ensemble architecture. Specifically, we transform annotator metadata—including educational background and response patterns—into learnable weights, enabling dynamic, annotation-quality-aware feature encoding and ensemble learning. Contribution/Results: On held-out test sets, MSWEEM achieves a 14% accuracy gain; cross-dataset generalization improves by 12%. Empirical analysis confirms that fatigue and response speed are highly discriminative metadata signals, and annotators with higher education exhibit both efficiency and consistency. This work establishes a novel paradigm for robust behavioral modeling under low-quality annotation conditions.

Technology Category

Application Category

📝 Abstract

Supervised machine-learning models often underperform in predicting user behaviors from conversational text, hindered by poor crowdsourced label quality and low NLP task accuracy. We introduce the Metadata-Sensitive Weighted-Encoding Ensemble Model (MSWEEM), which integrates annotator meta-features like fatigue and speeding. First, our results show MSWEEM outperforms standard ensembles by 14% on held-out data and 12% on an alternative dataset. Second, we find that incorporating signals of annotator behavior, such as speed and fatigue, significantly boosts model performance. Third, we find that annotators with higher qualifications, such as Master's, deliver more consistent and faster annotations. Given the increasing uncertainty over annotation quality, our experiments show that understanding annotator patterns is crucial for enhancing model accuracy in user behavior prediction.

Problem

Research questions and friction points this paper is trying to address.

Improving user behavior prediction from conversational text

Addressing poor label quality in crowdsourced annotations

Enhancing model accuracy using annotator metadata

Innovation

Methods, ideas, or system contributions that make the work stand out.

MSWEEM integrates annotator meta-features

Annotator behavior signals boost performance

Higher qualifications ensure consistent annotations

🔎 Similar Papers

Prompt Selection Matters: Enhancing Text Annotations for Social Sciences with Large Language Models