Time-Dependent VAE for Building Latent Representations from Visual Neural Activity with Complex Dynamics

📅 2024-08-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the challenge of extracting interpretable, low-dimensional latent representations from high-dimensional, time-varying neural activity in the visual cortex, specifically disentangling stimulus-driven content information from endogenous, state-dependent neural dynamics. To this end, the authors propose TiDeSPL-VAE (Time-Dependent Split VAE), a sequential, temporally conditioned variational autoencoder for neural data: a dual-path architecture explicitly separates content latents (encoding natural visual inputs) from style latents (capturing internal state evolution), augmented by state factors that model temporal dependencies in neural activity. Integrating variational inference, temporally conditioned distribution modeling, and self-supervised contrastive learning, the model achieves the best naturalistic scene/movie decoding performance among the compared approaches on mouse visual cortical recordings, enhances the sensitivity of latent representations to visual semantics, and improves the interpretability of the underlying neural dynamics.
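For intuition, here is a minimal PyTorch sketch of how such a dual-path split might look: one shared backbone feeding two separate Gaussian posterior heads, one for content and one for style. All module names, layer sizes, and dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SplitEncoder(nn.Module):
    """Dual-path encoder sketch: maps a population response x_t to two
    separate Gaussian posteriors, one for content latents (stimulus-driven)
    and one for style latents (internal state). Sizes are illustrative."""

    def __init__(self, n_neurons: int = 1000, d_content: int = 16,
                 d_style: int = 16, d_hidden: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(n_neurons, d_hidden), nn.ReLU())
        # Separate heads route stimulus-driven and state-dependent
        # information through distinct latent paths.
        self.content_head = nn.Linear(d_hidden, 2 * d_content)  # mu, logvar
        self.style_head = nn.Linear(d_hidden, 2 * d_style)      # mu, logvar

    def forward(self, x: torch.Tensor):
        h = self.backbone(x)
        c_mu, c_logvar = self.content_head(h).chunk(2, dim=-1)
        s_mu, s_logvar = self.style_head(h).chunk(2, dim=-1)
        return (c_mu, c_logvar), (s_mu, s_logvar)

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # Standard VAE reparameterization: z = mu + sigma * eps, eps ~ N(0, I).
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
```

In such a design, a decoder would reconstruct the neural response from the concatenated content and style samples, with the contrastive term applied to the content path only.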

📝 Abstract
Seeking high-quality representations with latent variable models (LVMs) to reveal the intrinsic correlation between neural activity and behavior or sensory stimuli has attracted much interest. Most work has focused on analyzing motor neural activity, which drives clear behavioral traces, and has modeled neural temporal relationships in ways that do not match natural conditions. For studies of visual brain regions, naturalistic visual stimuli are high-dimensional and time-dependent, so neural activity exhibits intricate dynamics. To cope with such conditions, we propose Time-Dependent Split VAE (TiDeSPL-VAE), a sequential LVM that decomposes visual neural activity into two latent representations while accounting for time dependence. We specify content latent representations corresponding to the component of neural activity driven by the current visual stimulus, and style latent representations corresponding to the neural dynamics influenced by the organism's internal state. To generate the two latent representations progressively over time, we introduce state factors to construct time-dependent conditional distributions and apply self-supervised contrastive learning to shape them. By this means, TiDeSPL-VAE can effectively analyze complex visual neural activity and model temporal relationships in a natural way. We compare our model with alternative approaches on synthetic data and on neural data from the mouse visual cortex. The results show that our model not only yields the best decoding performance on naturalistic scenes/movies but also extracts explicit neural dynamics, demonstrating that it builds latent representations more relevant to the visual stimuli.
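To illustrate what a state factor can do here, the sketch below shows one plausible way to build a time-dependent conditional prior p(z_t | z_{<t}): a recurrent state summarizes past latent samples and emits the prior parameters for the next step. The class name, the GRU choice, and the layer sizes are assumptions for exposition, not the paper's code.

```python
import torch
import torch.nn as nn

class StateFactorPrior(nn.Module):
    """Sketch of a time-dependent conditional prior p(z_t | z_{<t}).
    A recurrent state factor h_t summarizes past latent samples and emits
    the Gaussian prior parameters for the next time step."""

    def __init__(self, d_latent: int = 16, d_state: int = 64):
        super().__init__()
        self.rnn = nn.GRUCell(d_latent, d_state)
        self.prior_head = nn.Linear(d_state, 2 * d_latent)  # mu, logvar

    def forward(self, z_seq: torch.Tensor):
        # z_seq: (T, B, d_latent), posterior samples ordered in time.
        T, B, _ = z_seq.shape
        h = z_seq.new_zeros(B, self.rnn.hidden_size)
        mus, logvars = [], []
        for t in range(T):
            # The prior for step t conditions only on the past, via h.
            mu, logvar = self.prior_head(h).chunk(2, dim=-1)
            mus.append(mu)
            logvars.append(logvar)
            h = self.rnn(z_seq[t], h)  # fold the current latent into the state
        return torch.stack(mus), torch.stack(logvars)
```

Under this scheme, the KL term of the ELBO compares each step's posterior against a learned, history-dependent prior rather than a static N(0, I), which is what lets the latents evolve over time.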
Problem

Research questions and friction points this paper is trying to address.

Modeling time-evolving latent representations of visual neural activity
Decoding naturalistic visual stimuli from mouse cortical recordings
Capturing temporal dynamics in neural-behavioral relationships
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sequential latent variable model for time-evolving neural representations
Decomposes neural activity into dual-part latent representations
Uses contrastive learning to shape stimulus-aligned latent embeddings (see the sketch below)
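As a concrete, hypothetical instance of that last point, the snippet below implements a generic InfoNCE loss that pulls each content latent toward features of its concurrent stimulus; the paper's exact contrastive objective may differ, and the function name and temperature value are assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(content_z: torch.Tensor, stim_feat: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """Generic InfoNCE loss pairing each content latent with features of
    its concurrent stimulus; matched pairs share a batch index."""
    z = F.normalize(content_z, dim=-1)   # (B, D) unit-norm latents
    s = F.normalize(stim_feat, dim=-1)   # (B, D) unit-norm stimulus features
    logits = z @ s.t() / temperature     # (B, B) cosine similarities
    labels = torch.arange(z.size(0), device=z.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)
```

Applying such a loss to the content path alone is one way to push stimulus information into the content latents while leaving the style latents free to absorb state-dependent dynamics.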
Liwei Huang
School of Computer Science, Peking University, China
Zhengyu Ma
Peng Cheng Laboratory, China
Liutao Yu
Peng Cheng Laboratory, China
Huihui Zhou
Peng Cheng Laboratory, China
Yonghong Tian
School of Computer Science, Peking University, China