Using Artificial Intuition in Distinct, Minimalist Classification of Scientific Abstracts for Management of Technology Portfolios

📅 2025-08-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Classifying scientific abstracts is difficult because the text is sparse, offers few contextual cues, and tends to receive overlapping labels; existing approaches also often rely on semi-supervised settings. This paper proposes an "Artificial Intuition" framework that uses large language models (LLMs) to generate semantic metadata and to construct a mutually exclusive, interpretable label taxonomy, without depending on semi-supervised learning. The method pairs a minimalist classification strategy with clustering analysis to automate the annotation of scientific abstracts. Labels are built from U.S. National Science Foundation grant abstracts and then applied to abstracts from the Chinese National Natural Science Foundation to examine funding trends, demonstrating feasibility for technology scouting and R&D portfolio management. The core contribution is the integration of LLM-driven metadata generation with expert-informed label design, which yields distinct, interpretable labels for abstract classification.

📝 Abstract
Classification of scientific abstracts is useful for strategic activities but challenging to automate because the sparse text provides few contextual clues. Metadata associated with the scientific publication can be used to improve performance but still often requires a semi-supervised setting. Moreover, such schemes may generate labels that lack distinction -- namely, they overlap and thus do not uniquely define the abstract. In contrast, experts label and sort these texts with ease. Here we describe an application of a process we call artificial intuition to replicate the expert's approach, using a Large Language Model (LLM) to generate metadata. We use publicly available abstracts from the United States National Science Foundation to create a set of labels, and then we test this on a set of abstracts from the Chinese National Natural Science Foundation to examine funding trends. We demonstrate the feasibility of this method for research portfolio management, technology scouting, and other strategic activities.
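As a rough illustration of the workflow the abstract describes (generate metadata, assign one distinct label per abstract, then tally labels to look at funding trends), here is a minimal Python sketch. It is not the authors' implementation. The label set `LABELS`, the keyword lists, the `"unlabeled"` fallback, and the sample records are all hypothetical, and `generate_metadata` is a plain tokenizer standing in for the LLM metadata step.

```python
from collections import Counter

# Hypothetical label set. In the described workflow, labels would be derived
# from NSF abstracts; these keywords are invented for illustration only.
LABELS = {
    "materials": {"alloy", "polymer", "crystal"},
    "computing": {"algorithm", "network", "learning"},
    "biology": {"protein", "cell", "genome"},
}


def generate_metadata(abstract: str) -> set[str]:
    """Stand-in for the LLM metadata step: returns lowercase word tokens."""
    return {word.strip(".,;:()").lower() for word in abstract.split()}


def classify(abstract: str) -> str:
    """Assign exactly one label: the one whose keywords overlap the metadata most."""
    metadata = generate_metadata(abstract)
    scores = {label: len(metadata & keywords) for label, keywords in LABELS.items()}
    best = max(scores, key=scores.get)
    # Abstracts that match no keyword get a single fallback label.
    return best if scores[best] > 0 else "unlabeled"


def label_counts_by_year(records):
    """Tally assigned labels per year to expose shifts in funding emphasis."""
    counts = {}
    for year, abstract in records:
        counts.setdefault(year, Counter())[classify(abstract)] += 1
    return counts


# Hypothetical (year, abstract) records.
records = [
    (2020, "A new polymer alloy with improved crystal structure."),
    (2021, "A learning algorithm for network routing."),
    (2021, "Protein folding in the cell."),
]
trends = label_counts_by_year(records)
```

Because `classify` returns a single label per abstract, the labels cannot overlap by construction, which is the "distinct" property the abstract emphasizes. A real system would replace the tokenizer with LLM-generated metadata.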
Problem

Research questions and friction points this paper is trying to address.

Classifying sparse scientific abstracts with few contextual clues
Overcoming label overlap in semi-supervised classification methods
Automating expert-like text classification for technology portfolio management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using an LLM to generate metadata
Artificial intuition replicates expert labeling
Minimalist classification for technology portfolios
Prateek Ranka
Information Sciences Institute, Viterbi School of Engineering, University of Southern California
Fred Morstatter
University of Southern California, Information Sciences Institute
Social Media Mining, Data Science, Data Mining, Machine Learning
Alexandra Graddy-Reed
Sol Price School of Public Policy, University of Southern California
Andrea Belz
Information Sciences Institute, Viterbi School of Engineering, University of Southern California