🤖 AI Summary
To address the performance degradation and high computational overhead of large language models (LLMs) in human value identification for long texts, this paper proposes a two-stage collaborative framework combining local model guidance with online LLM calibration. Methodologically, it introduces (1) a fine-tunable lightweight value detector, trained via explanation-driven learning and value-semantic-guided active sampling, that generates high-information-density prompts; and (2) a prompt refinement and compression mechanism that substantially reduces the input token count. Experiments show that the approach cuts token consumption to one-sixth of that required by direct LLM invocation, while consistently outperforming both BERT-based baselines and end-to-end LLM approaches in accuracy. It achieves state-of-the-art results across multiple value identification benchmarks, demonstrating an effective balance of efficiency and fidelity for fine-grained value detection in long textual inputs.
📝 Abstract
The rapid evolution of large language models (LLMs) has revolutionized various fields, including the identification and discovery of human values within text data. While traditional NLP models, such as BERT, have been employed for this task, their ability to represent textual data is significantly outperformed by emerging LLMs like GPTs. However, the performance of online LLMs often degrades when handling the long contexts required for value identification, which also incurs substantial computational costs. To address these challenges, we propose EAVIT, an efficient and accurate framework for human value identification that combines the strengths of locally fine-tunable and online black-box LLMs. Our framework employs a value detector - a small, local language model - to generate initial value estimations. These estimations are then used to construct concise input prompts for online LLMs, enabling accurate final value identification. To train the value detector, we introduce explanation-based training and data generation techniques specifically tailored for value identification, alongside sampling strategies to optimize the brevity of LLM input prompts. Our approach reduces the number of input tokens to as little as one-sixth of those required when directly querying online LLMs, while consistently outperforming traditional NLP methods and other LLM-based strategies.
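The two-stage flow the abstract describes can be sketched as follows. This is a minimal illustrative mock, not the paper's implementation: the keyword-based detector stands in for the fine-tuned local model, the toy value taxonomy and all function names are assumptions, and the online LLM call is omitted (the sketch stops at the compressed prompt that would be sent to it).

```python
# Hypothetical sketch of the EAVIT two-stage pipeline: a cheap local
# detector proposes candidate values, then a short prompt is built for
# the online LLM instead of sending the full long text.

def local_value_detector(text):
    """Stage 1: score each candidate value locally.
    A keyword heuristic stands in for the fine-tuned value detector."""
    keywords = {
        "security": ["safe", "protect"],
        "achievement": ["success", "win"],
        "benevolence": ["help", "care"],
        "tradition": ["custom", "ritual"],
    }
    lower = text.lower()
    return {v: sum(lower.count(k) for k in ks) for v, ks in keywords.items()}

def build_concise_prompt(text, scores, top_k=2, max_chars=120):
    """Stage 2 input: keep only the top-k candidate values plus a short
    excerpt, so the online LLM receives far fewer tokens."""
    candidates = sorted(scores, key=scores.get, reverse=True)[:top_k]
    excerpt = text[:max_chars]
    return (f"Candidate values: {', '.join(candidates)}.\n"
            f"Excerpt: {excerpt}\n"
            "Confirm which candidate values the text expresses.")

long_text = ("We must protect our community and keep everyone safe, "
             "helping neighbors and caring for the elderly. ") * 20
scores = local_value_detector(long_text)
prompt = build_concise_prompt(long_text, scores)
```

In this toy run the compressed prompt is well under one-sixth the length of the original text, mirroring the token savings the paper reports, while still surfacing the detector's top candidates ("security" and "benevolence") for the online LLM to confirm.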