Generative causal testing to bridge data-driven models and scientific theories in language neuroscience

📅 2024-10-01
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This study investigates which stimulus features drive language selectivity across brain regions in LLM-predicted BOLD fMRI responses, with the aim of generating testable neuroscientific explanations. To this end, we propose the Generative Causal Testing (GCT) framework, which repurposes LLMs from predictive tools into hypothesis generators: controllable text generation is used to formulate concise, falsifiable hypotheses about neural language selectivity, which are then validated in closed-loop follow-up fMRI experiments. The approach integrates fMRI encoding models, causal intervention design, controllable LLM generation, and interpretable analysis of neural representations. Key contributions include: (1) accurate, empirically verifiable causal explanations at both the voxel and ROI level; (2) discovery of fine-grained functional subdivisions ("microROIs") within prefrontal cortex and their precise linguistic selectivities; and (3) empirical confirmation that explanation accuracy correlates strongly with the performance and stability of the underlying predictive models, establishing a bidirectional bridge between data-driven modeling and formal neurocognitive theory.

📝 Abstract
Representations from large language models are highly effective at predicting BOLD fMRI responses to language stimuli. However, these representations are largely opaque: it is unclear what features of the language stimulus drive the response in each brain area. We present generative causal testing (GCT), a framework for generating concise explanations of language selectivity in the brain from predictive models and then testing those explanations in follow-up experiments using LLM-generated stimuli. This approach is successful at explaining selectivity both in individual voxels and cortical regions of interest (ROIs), including newly identified microROIs in prefrontal cortex. We show that explanatory accuracy is closely related to the predictive power and stability of the underlying predictive models. Finally, we show that GCT can dissect fine-grained differences between brain areas with similar functional selectivity. These results demonstrate that LLMs can be used to bridge the widening gap between data-driven models and formal scientific theories.
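The encoding-model component described in the abstract, predicting voxel responses from LLM-derived stimulus features, is commonly implemented as ridge regression. A minimal sketch with synthetic data (not the paper's actual pipeline; feature dimensions and voxel counts here are arbitrary):

```python
import numpy as np

def fit_encoding_model(X, Y, alpha=1.0):
    """Ridge regression from stimulus features X (timepoints x features)
    to voxel responses Y (timepoints x voxels); returns weights W."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

# Synthetic demo: stand-in "LLM embeddings" and "BOLD responses".
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))                 # hypothetical features
W_true = rng.standard_normal((16, 8))              # hypothetical true mapping
Y = X @ W_true + 0.1 * rng.standard_normal((200, 8))

W = fit_encoding_model(X, Y, alpha=0.1)
pred = X @ W
# Voxelwise correlation between predicted and actual responses,
# the usual figure of merit for encoding models.
r = np.array([np.corrcoef(pred[:, v], Y[:, v])[0, 1] for v in range(Y.shape[1])])
```

In practice such models are fit per voxel on held-out data; high prediction correlations are what make a voxel a candidate for explanation.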
Problem

Research questions and friction points this paper is trying to address.

LLM representations predict BOLD responses well, but which stimulus features drive each brain area's response remains opaque.
Explanations derived from predictive models are rarely tested causally in follow-up experiments.
A widening gap separates data-driven models from formal scientific theories.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative causal testing (GCT) derives concise, falsifiable explanations of brain language selectivity from predictive models.
LLM-generated stimuli causally test those explanations in follow-up fMRI experiments.
GCT dissects fine-grained differences between brain areas with similar functional selectivity.
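The explain-generate-verify loop in the bullets above can be caricatured with a linear encoding model and synthetic "stimuli". Everything here is hypothetical (the real pipeline generates text with an LLM and measures fMRI responses); the point is only the logic of the causal test:

```python
import numpy as np

rng = np.random.default_rng(1)

def predicted_response(w, features):
    # Linear encoding-model prediction for one voxel.
    return features @ w

# Hypothetical voxel tuned to feature dimension 0
# (standing in for a selectivity like "social language").
w_voxel = np.zeros(8)
w_voxel[0] = 1.0

# Stage 1 (hypothesis generation) is an LLM call in the paper;
# here we simply posit the explanation "selective for dimension 0".
# Stage 2: generate explanation-matched vs. baseline stimuli and
# check that matched stimuli drive a larger predicted response.
matched = rng.standard_normal((50, 8)) + np.array([2.0] + [0.0] * 7)
baseline = rng.standard_normal((50, 8))

effect = (predicted_response(w_voxel, matched).mean()
          - predicted_response(w_voxel, baseline).mean())
```

A positive effect for matched stimuli is what would count as confirming the explanation; in the paper this comparison is run on measured responses, not just model predictions.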
Richard Antonello
Postdoctoral Scholar, Columbia University
Natural Language Processing · Computational Neuroscience · Neuroscience of Language
Chandan Singh
Senior researcher, Microsoft research
🔍 Interpretability · 🤖 Foundation models · 🧠 Neuroscience · 🌳 Transparent models · 💊 Healthcare
Shailee Jain
Neurosurgery Department, University of California, San Francisco, CA, USA.
Aliyah R. Hsu
EECS Department, University of California, Berkeley, CA, USA.
Sihang Guo
Jianfeng Gao
Microsoft Research, Redmond, WA, USA.
Bin Yu
EECS Department, Statistics Department, and Center for Computational Biology, University of California, Berkeley, CA, USA.
Alexander Huth
Assistant Professor, Statistics & Neuroscience, UC Berkeley
fMRI · computational neuroscience