Machines in the Margins: A Systematic Review of Automated Content Generation for Wikipedia

📅 2025-09-26

🤖 AI Summary

This study addresses a longstanding research gap in automated content generation for Wikipedia: prior work predominantly focuses on deployed bots while neglecting the broader academic proposals for systems that were never deployed. We conduct a systematic literature review (SLR), among the first of its kind, of automation methods not yet implemented in production, covering 102 peer-reviewed studies from 2010 to 2023. Our analysis integrates advances in natural language generation, information extraction, and machine learning to characterize method types, model architectures, data sources, and evaluation paradigms. As a key contribution, we present a comprehensive technology map for Wikipedia-oriented automated content generation, identifying critical gaps in interpretability, editor collaboration, and quality assessment. Furthermore, we propose a theoretical framework for AI-augmented crowdsourced content production, offering empirically grounded insights for CSCW, user-generated content research, and AI-supported community collaboration.

📝 Abstract

Wikipedia is among the largest examples of collective intelligence on the Web, with over 61 million articles covering over 320 languages. Although edited and maintained by an active workforce of human volunteers, Wikipedia relies heavily on automated bots to fill gaps in that workforce. Alongside administrative and governance tasks, these bots also play a role in generating content, although to date such agents represent the smallest proportion of bots. While there has been considerable analysis of bots and their activity in Wikipedia, such work captures only automated agents that have been actively deployed to Wikipedia and fails to capture the methods proposed in the wider literature for generating Wikipedia content. In this paper, we conduct a systematic literature review to explore how researchers have operationalised and evaluated automated content-generation agents for Wikipedia. We identify the scope of these generation methods, the techniques and models used, the source content used for generation, and the evaluation methodologies that support generation processes. We also explore the implications of our findings for CSCW, User Generated Content and Wikipedia, as well as research directions for future development. To the best of our knowledge, we are among the first to review the potential contributions of this understudied form of AI support for the Wikipedia community beyond the implementation of bots.

Problem

Research questions and friction points this paper is trying to address.

Systematically reviewing automated Wikipedia content generation methods

Identifying the techniques and evaluation methodologies used for automatically generated Wikipedia content

Exploring AI support beyond existing Wikipedia bot implementations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic review of automated content generation methods

Analysis of AI techniques for Wikipedia article creation

Evaluation of automated content generation processes
