ZeroDL: Zero-shot Distribution Learning for Text Clustering via Large Language Models

📅 2024-06-19
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) often cannot be given a complete task description through prompting alone, which limits their performance on unsupervised text clustering. To address this, the authors propose ZeroDL, a zero-shot distribution learning framework that requires no labeled data, human-crafted demonstrations, or model fine-tuning. ZeroDL proceeds in three stages: (1) open-ended zero-shot inference, in which the LLM freely describes the target texts; (2) aggregation of those open-ended outputs into meta-information such as candidate class labels; and (3) incorporation of the aggregated meta-information into the actual clustering task. By letting the LLM itself characterize the data before clustering it, ZeroDL contextualizes the task toward that specific model. Experiments on standard text clustering benchmarks demonstrate the effectiveness of this contextualization.

📝 Abstract
Recent advancements in large language models (LLMs) have brought significant progress in solving NLP tasks. Notably, in-context learning (ICL) is the key mechanism enabling LLMs to understand specific tasks and grasp their nuances. In this paper, we propose a simple yet effective method to contextualize a task toward a specific LLM by (1) observing how a given LLM describes all or part of the target dataset, i.e., open-ended zero-shot inference, (2) aggregating the LLM's open-ended inference results, and (3) incorporating the aggregated meta-information into the actual task. We show the effectiveness of this approach on text clustering tasks, and highlight the importance of contextualization through examples of the above procedure.
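The three-step procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the sample size, and the `llm` callable are all assumptions, and any chat-completion API could be plugged in as `llm`.

```python
from collections import Counter
from typing import Callable, List


def aggregate_meta_info(descriptions: List[str], top_k: int = 20) -> List[str]:
    """Step (2): collapse open-ended LLM outputs into candidate labels
    by normalizing them and keeping the most frequent ones."""
    counts = Counter(d.strip().lower() for d in descriptions)
    return [label for label, _ in counts.most_common(top_k)]


def build_contextualized_prompt(texts: List[str],
                                llm: Callable[[str], str],
                                sample_size: int = 100) -> str:
    # Step (1): open-ended zero-shot inference -- ask the LLM to describe
    # each text freely, with no predefined label set.
    descriptions = [
        llm(f"Describe the topic of this text in a few words:\n{t}")
        for t in texts[:sample_size]
    ]
    # Step (2): aggregate the open-ended outputs into meta-information.
    labels = aggregate_meta_info(descriptions)
    # Step (3): incorporate the aggregated meta-information into the
    # prompt for the actual clustering task.
    return ("Assign each text to one of these categories "
            "(derived from the data itself): " + ", ".join(labels))
```

In practice the returned prompt would be sent back to the same LLM (or paired with an embedding-based clusterer) to perform the final assignment.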
Problem

Research questions and friction points this paper is trying to address.

Enabling LLMs to perform tasks not fully describable in prompts
Improving text clustering via zero-shot distribution learning
Generating class labels to enhance LLM task understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes open-ended zero-shot inference
Aggregates inference results effectively
Incorporates meta-information for tasks
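Open-ended generations rarely agree verbatim ("sport" vs. "sports"), so the aggregation step typically needs some fuzzy merging. A minimal sketch using the standard-library `difflib`; the similarity threshold and normalization are assumptions for illustration, not settings from the paper:

```python
import difflib
from typing import Dict, List


def merge_similar_labels(labels: List[str], threshold: float = 0.8) -> Dict[str, int]:
    """Greedily merge near-duplicate free-form labels into canonical
    keys, counting how many raw generations map to each one."""
    canonical: Dict[str, int] = {}
    for raw in labels:
        label = raw.strip().lower()
        # Reuse an existing canonical label if one is similar enough.
        match = difflib.get_close_matches(label, list(canonical),
                                          n=1, cutoff=threshold)
        key = match[0] if match else label
        canonical[key] = canonical.get(key, 0) + 1
    return canonical
```

The resulting counts give a ranked set of candidate class labels to feed into the downstream clustering prompt.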