Characterizing Model Collapse in Large Language Models Using Semantic Networks and Next-Token Probability

📅 2024-10-16

📈 Citations: 0

✨ Influential: 0

career value

205K/year

🤖 AI Summary

This work addresses “model collapse” in large language models (LLMs) during self-generated data fine-tuning—a phenomenon characterized by progressive degradation in both textual diversity and generation quality. We propose the first quantitative framework integrating semantic networks with next-token probability modeling to precisely characterize repetition and diversity loss. Additionally, we introduce a novel evaluation paradigm for collapse severity that operates across datasets and systematically varies the proportion of synthetic tokens. Experiments on three diverse text corpora demonstrate a strong correlation between collapse severity and synthetic token ratio, revealing pronounced domain-specific collapse patterns. Our study establishes a reproducible, fine-grained evaluation metric suite for model collapse, offering both theoretical foundations and empirical tools to support sustainable self-evolution of generative AI systems.

Technology Category

Application Category

📝 Abstract

As synthetic content increasingly infiltrates the web, generative AI models may experience an autophagy process, where they are fine-tuned using their own outputs. This autophagy could lead to a phenomenon known as model collapse, which entails a degradation in the performance and diversity of generative AI models over successive generations. Recent studies have explored the emergence of model collapse across various generative AI models and types of data. However, the current characterizations of model collapse tend to be simplistic and lack comprehensive evaluation. In this article, we conduct a thorough investigation of model collapse across three text datasets, utilizing semantic networks to analyze text repetitiveness and diversity, while employing next-token probabilities to quantify the loss of diversity. We also examine how the proportions of synthetic tokens affect the severity of model collapse and perform cross-dataset evaluations to identify domain-specific variations. By proposing metrics and strategies for a more detailed assessment of model collapse, our study provides new insights for the development of robust generative AI systems.

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

Content Degeneration

Self-improvement Limitations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Model Collapse

Generative AI Evaluation

Synthetic Vocabulary Impact

🔎 Similar Papers

Collapsed Language Models Promote Fairness