AI Summary
Traditional fMRI neural encoding models flatten brain volumes into one-dimensional vectors, discarding spatial structure and anatomical priors, while being constrained by fixed voxel grids, thus failing to capture local response smoothness and cross-subject consistency. To address this, we propose the Neural Response Function (NRF), the first method to model brain responses as a continuous implicit function defined in the standardized MNI anatomical space. NRF jointly takes natural images and 3D coordinates (x, y, z) as input, enabling continuous response prediction at arbitrary spatial locations. By operating in continuous coordinate space, NRF eliminates voxel-grid constraints, inherently supports cross-subject alignment and multi-resolution querying, and efficiently encodes local smoothness via implicit representation. Experiments demonstrate that NRF significantly outperforms baseline models on both within-subject encoding and cross-subject transfer tasks, achieving comparable performance with only 1/10 to 1/100 of the training data required by conventional approaches.
Abstract
Neural encoding models aim to predict fMRI-measured brain responses to natural images. fMRI data is acquired as a 3D volume of voxels, where each voxel has a defined spatial location in the brain. However, conventional encoding models often flatten this volume into a 1D vector and treat voxel responses as independent outputs. Flattening removes spatial context, discards anatomical information, and ties each model to a subject-specific voxel grid. We introduce the Neural Response Function (NRF), a framework that models fMRI activity as a continuous function over anatomical space rather than a flat vector of voxels. NRF represents brain activity as a continuous implicit function: given an image and a spatial coordinate (x, y, z) in standardized MNI space, the model predicts the response at that location. This formulation decouples predictions from the training grid, supports querying at arbitrary spatial resolutions, and enables resolution-agnostic analyses. By grounding the model in anatomical space, NRF exploits two key properties of brain responses: (1) local smoothness: neighboring voxels exhibit similar response patterns, so modeling responses continuously captures these correlations and improves data efficiency; and (2) cross-subject alignment: MNI coordinates unify data across individuals, allowing a model pretrained on one subject to be fine-tuned on new subjects. In experiments, NRF outperformed baseline models in both intrasubject encoding and cross-subject adaptation, achieving high performance while reducing the amount of training data needed by orders of magnitude. To our knowledge, NRF is the first anatomically aware encoding model to move beyond flattened voxels, learning a continuous mapping from images to brain responses in 3D space.
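The interface described above, in which an image and a coordinate (x, y, z) go in and a predicted response comes out, can be illustrated with a small sketch. This is not the authors' architecture, which the abstract does not specify. The Fourier-feature coordinate encoding, the two-layer MLP, the random (untrained) weights, the class name `ToyResponseField`, and the rescaling of MNI coordinates to [-1, 1] are all assumptions made for illustration.

```python
import numpy as np

def fourier_features(coords, num_freqs=4):
    """Encode (N, 3) coordinates as sin/cos features.
    Assumed encoding for illustration; not taken from the paper."""
    freqs = 2.0 ** np.arange(num_freqs)                 # (F,)
    angles = coords[:, :, None] * freqs                 # (N, 3, F)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(coords.shape[0], -1)           # (N, 6F)

class ToyResponseField:
    """Hypothetical implicit function f(image_embedding, xyz) -> response.
    Weights are random, so outputs are meaningless until trained."""

    def __init__(self, image_dim, num_freqs=4, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = image_dim + 6 * num_freqs
        self.num_freqs = num_freqs
        self.w1 = rng.normal(0.0, 1.0 / np.sqrt(in_dim), (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1))
        self.b2 = np.zeros(1)

    def __call__(self, image_embedding, coords):
        coords = np.asarray(coords, dtype=float)         # (N, 3), assumed in [-1, 1]
        pos = fourier_features(coords, self.num_freqs)
        # Repeat the single image embedding for every queried location.
        img = np.broadcast_to(image_embedding, (coords.shape[0], image_embedding.shape[0]))
        x = np.concatenate([img, pos], axis=1)
        h = np.maximum(x @ self.w1 + self.b1, 0.0)       # ReLU hidden layer
        return (h @ self.w2 + self.b2)[:, 0]             # one response per location

def make_grid(step):
    """Regular grid over [-1, 1]^3 with the given spacing, as (N, 3) points."""
    axis = np.arange(-1.0, 1.0 + 1e-9, step)
    return np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

model = ToyResponseField(image_dim=8)
image_embedding = np.random.default_rng(1).normal(size=8)  # stand-in for image features

coarse = model(image_embedding, make_grid(1.0))   # 3 x 3 x 3 query locations
fine = model(image_embedding, make_grid(0.5))     # 5 x 5 x 5 query locations
```

Because the model is a function of coordinates rather than of voxel indices, the same parameters serve both grids, and a location present in both (such as the origin) receives the same prediction regardless of which grid it was queried through.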