🤖 AI Summary
This work addresses the challenge of automatically discovering interpretable, discriminative global features from unstructured text by proposing a dataset-level prompt optimization approach. It extends prompt learning beyond the instance level to the dataset level for the first time, employing a multi-agent collaborative framework that iteratively generates feature definitions, extracts feature values, and jointly optimizes a shared prompt using feedback from both downstream classification performance and interpretability. Experiments across multiple text classification tasks show that the method automatically produces high-quality, human-understandable feature sets that significantly improve model performance, confirming its effectiveness and generalizability.
📝 Abstract
Feature extraction from unstructured text is a critical step in many downstream classification pipelines, yet current approaches largely rely on hand-crafted prompts or fixed feature schemas. We formulate feature discovery as a dataset-level prompt optimization problem: given a labelled text corpus, the goal is to induce a global set of interpretable and discriminative feature definitions whose realizations optimize a downstream supervised learning objective. To this end, we propose a multi-agent prompt optimization framework in which language-model agents jointly propose feature definitions, extract feature values, and evaluate feature quality using dataset-level performance and interpretability feedback. Instruction prompts are iteratively refined based on this structured feedback, enabling optimization over prompts that induce shared feature sets rather than per-example predictions. This formulation departs from prior prompt optimization methods that rely on per-sample supervision and provides a principled mechanism for automatic feature discovery from unstructured text.
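The propose–extract–evaluate loop described above can be sketched in miniature. This is an illustrative toy, not the paper's implementation: the LLM agents are mocked (the "proposer" enumerates candidate keyword-based feature definitions, the "extractor" realizes them as binary feature values, and the "evaluator" scores the shared feature set by downstream classification accuracy on a tiny labelled corpus). All names (`propose`, `extract`, `evaluate`, `optimize`) are hypothetical.

```python
# Hypothetical sketch of dataset-level feature discovery via search over
# shared feature definitions. Real agents would be LLM calls; here they
# are deterministic stand-ins so the loop structure is visible.
from itertools import combinations

# Toy labelled corpus (1 = positive, 0 = negative).
CORPUS = [
    ("great plot and acting", 1), ("wonderful, great fun", 1),
    ("boring and terrible", 0), ("terrible acting, boring plot", 0),
]

def propose(vocab, k=2):
    """Mock proposer agent: candidate feature sets are k-subsets of keywords."""
    return list(combinations(sorted(vocab), k))

def extract(texts, feature_set):
    """Mock extractor agent: binary 'keyword present' feature values."""
    return [[int(w in t.split()) for w in feature_set] for t in texts]

def evaluate(X, y):
    """Evaluator: accuracy of a nearest-centroid classifier on the features."""
    cents = {}
    for lbl in set(y):
        rows = [x for x, l in zip(X, y) if l == lbl]
        cents[lbl] = [sum(c) / len(rows) for c in zip(*rows)]
    def pred(x):
        return min(cents, key=lambda l: sum((a - b) ** 2
                                            for a, b in zip(x, cents[l])))
    return sum(pred(x) == l for x, l in zip(X, y)) / len(y)

def optimize(corpus):
    """One refinement round: keep the feature set with the best dataset-level
    feedback signal (here, classification accuracy alone)."""
    texts, labels = zip(*corpus)
    vocab = {w for t in texts for w in t.split()}
    best, best_score = None, -1.0
    for fs in propose(vocab):
        score = evaluate(extract(texts, fs), list(labels))
        if score > best_score:
            best, best_score = fs, score
    return best, best_score

features, acc = optimize(CORPUS)
print(features, acc)
```

In the actual framework the feedback would combine performance with an interpretability signal and drive iterative rewriting of the shared instruction prompt, rather than exhaustive search; the sketch only shows the dataset-level (as opposed to per-example) structure of the objective.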