BACH-V: Bridging Abstract and Concrete Human-Values in Large Language Models

📅 2026-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether large language models genuinely comprehend abstract values or merely treat them as statistical patterns. To this end, the authors propose a three-tiered abstraction–concreteness framework (A–A, A–C, C–C) and conduct cross-level transfer experiments across six open-source models and ten value dimensions, integrating probing and representation intervention techniques. The work provides the first systematic evidence that abstract values function within models as stable internal anchors rather than malleable activations. Crucially, interventions at the abstract layer causally influence concrete behavioral decisions without altering the model’s abstract interpretation of those values. These findings reveal a structured bridging mechanism between abstract moral reasoning and actionable outputs, offering a theoretical foundation for interpretable and generalizable value alignment in artificial intelligence systems.
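The cross-level transfer result described above can be illustrated with a toy sketch. This is not the paper's code: it assumes a purely synthetic activation space in which a single hypothetical "value direction" is shared between abstract descriptions and concrete narratives, then trains a linear probe on the abstract side only and tests it on the concrete side. All names (`value_dir`, `make_activations`, dimensions, strengths) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # hypothetical hidden-state dimensionality

# Assumption: one shared latent direction encodes the value at both
# abstraction levels (this is the premise the paper's probes test).
value_dir = rng.normal(size=d)
value_dir /= np.linalg.norm(value_dir)

def make_activations(n, present):
    """Synthetic hidden states; `present` adds the shared value direction."""
    x = rng.normal(size=(n, d))
    if present:
        x += 3.0 * value_dir
    return x

# Train a linear probe on ABSTRACT value descriptions only (A-level).
X_train = np.vstack([make_activations(200, True), make_activations(200, False)])
y_train = np.array([1] * 200 + [0] * 200)

# Logistic regression via plain gradient descent (no ML library needed).
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X_train @ w + b)))
    grad = p - y_train
    w -= 0.1 * X_train.T @ grad / len(y_train)
    b -= 0.1 * grad.mean()

# Evaluate transfer on CONCRETE event narratives (C-level) without retraining.
X_test = np.vstack([make_activations(100, True), make_activations(100, False)])
y_test = np.array([1] * 100 + [0] * 100)
pred = (X_test @ w + b) > 0
acc = (pred == y_test.astype(bool)).mean()
print(f"cross-level probe accuracy: {acc:.2f}")
```

High transfer accuracy here only follows because the sketch builds the shared direction in by construction; the paper's contribution is showing empirically that real LLM activations behave this way.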

📝 Abstract
Do large language models (LLMs) genuinely understand abstract concepts, or merely manipulate them as statistical patterns? We introduce an abstraction-grounding framework that decomposes conceptual understanding into three capacities: interpretation of abstract concepts (Abstract-Abstract, A-A), grounding of abstractions in concrete events (Abstract-Concrete, A-C), and application of abstract principles to regulate concrete decisions (Concrete-Concrete, C-C). Using human values as a testbed, given their semantic richness and centrality to alignment, we employ probing (detecting value traces in internal activations) and steering (modifying representations to shift behavior). Across six open-source LLMs and ten value dimensions, probing shows that diagnostic probes trained solely on abstract value descriptions reliably detect the same values in concrete event narratives and decision reasoning, demonstrating cross-level transfer. Steering reveals an asymmetry: intervening on value representations causally shifts concrete judgments and decisions (A-C, C-C), yet leaves abstract interpretations unchanged (A-A), suggesting that encoded abstract values function as stable anchors rather than malleable activations. These findings indicate that LLMs maintain structured value representations that bridge abstraction and action, providing a mechanistic and operational foundation for building value-driven autonomous AI systems with more transparent, generalizable alignment and control.
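The steering intervention described in the abstract can likewise be sketched in miniature. This is not the authors' implementation: it assumes a value direction is available (e.g., as a difference of mean activations between value-laden and neutral prompts, a common activation-steering recipe) and shows, with a synthetic hidden state and a stand-in linear decision head, how adding that direction shifts a downstream decision score. `value_dir`, `decision_head`, and `alpha` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 64  # hypothetical hidden-state dimensionality

# Assumed value direction (in practice: extracted from contrastive prompts).
value_dir = rng.normal(size=d)
value_dir /= np.linalg.norm(value_dir)

# Stand-in for a frozen downstream decision head reading the hidden state.
decision_head = rng.normal(size=d)

def decide(h):
    """Toy scalar decision score for a concrete scenario."""
    return float(h @ decision_head)

h = rng.normal(size=d)            # hidden state for some concrete scenario
alpha = 3.0                       # steering strength (hypothetical)
h_steered = h + alpha * value_dir # add the value direction to the activation

print(f"score before steering: {decide(h):+.3f}")
print(f"score after  steering: {decide(h_steered):+.3f}")
# By linearity, the shift is exactly alpha * (value_dir @ decision_head):
# steering moves the concrete decision in proportion to how strongly the
# value direction projects onto the decision head.
```

The paper's reported asymmetry, that such interventions move concrete judgments (A-C, C-C) while leaving abstract interpretations (A-A) unchanged, is an empirical finding about real models and is not reproduced by this linear toy.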
Problem

Research questions and friction points this paper is trying to address.

abstract concepts
large language models
human values
conceptual understanding
alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

abstraction-grounding
value alignment
probing and steering
large language models
structured representations
Junyu Zhang
State Key Laboratory of General Artificial Intelligence, BIGAI; Shandong University
Yipeng Kang
BIGAI
Natural language processing
Jiong Guo
Shandong University
Algorithms and Complexity
Jiayu Zhan
Peking University
Visual cognition; neuroscience
Junqi Wang
State Key Laboratory of General Artificial Intelligence, BIGAI