SAE-RNA: A Sparse Autoencoder Model for Interpreting RNA Language Model Representations

πŸ“… 2025-10-03
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Current RNA language models (e.g., RiNALMo) lack transparency in how they encode mRNA versus non-coding RNA (ncRNA) family information, and no systematic interpretability framework exists for dissecting their learned representations. Method: We propose SAE-RNAβ€”the first interpretability method applying sparse autoencoders (SAEs) to discover biologically meaningful concepts in pre-trained RNA models without retraining; it maps frozen embeddings to interpretable biological features via alignment with authoritative sequence annotations. Results: Applying SAE-RNA to RiNALMo, we systematically decode and visualize latent representations, identifying neuron-level concepts strongly associated with canonical ncRNA families (e.g., snoRNAs, miRNAs). This enables fine-grained, cross-RNA-type functional comparison. Our work reveals how ncRNA families are encoded in large RNA models and establishes a new paradigm for RNA model interpretability, providing reliable, hypothesis-generating computational insights into RNA biology.
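The core mechanism described above can be sketched as a sparse autoencoder over frozen embeddings. The following is a minimal illustrative sketch, not the authors' implementation: the dimensions, initialization, and L1 coefficient are assumptions, and random vectors stand in for RiNALMo activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: d_model-sized token embeddings expanded into an
# overcomplete dictionary of d_latent candidate "concepts".
d_model, d_latent, n_tokens = 64, 256, 100

# Random stand-in for frozen RNA language model embeddings.
X = rng.normal(size=(n_tokens, d_model))

# SAE parameters (randomly initialized here; in practice trained to
# minimize reconstruction error plus an L1 sparsity penalty).
W_enc = rng.normal(size=(d_model, d_latent)) * 0.1
b_enc = np.zeros(d_latent)
W_dec = rng.normal(size=(d_latent, d_model)) * 0.1
b_dec = np.zeros(d_model)

def encode(x):
    # ReLU keeps only positively activated latents, yielding a sparse code.
    return np.maximum(0.0, x @ W_enc + b_enc)

def decode(z):
    return z @ W_dec + b_dec

Z = encode(X)          # sparse concept activations, shape (n_tokens, d_latent)
X_hat = decode(Z)      # reconstruction of the frozen embeddings

# Objective the SAE would be trained on (l1 coefficient is an assumption).
l1 = 1e-3
loss = np.mean((X - X_hat) ** 2) + l1 * np.abs(Z).mean()
```

Because the base model stays frozen, only the small encoder/decoder pair is trained, which is what makes the approach practical without end-to-end retraining.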

πŸ“ Abstract
Deep learning, particularly with the advancement of Large Language Models, has transformed biomolecular modeling, with protein advances (e.g., ESM) inspiring emerging RNA language models such as RiNALMo. Yet how and what these RNA language models internally encode about messenger RNA (mRNA) or non-coding RNA (ncRNA) families remains unclear. We present SAE-RNA, an interpretability model that analyzes RiNALMo representations and maps them to known human-level biological features. Our work frames RNA interpretability as concept discovery in pretrained embeddings, without end-to-end retraining, and provides practical tools to probe what RNA LMs may encode about ncRNA families. The model can be extended to close comparisons between RNA groups, supporting hypothesis generation about previously unrecognized relationships.
Problem

Research questions and friction points this paper is trying to address.

Interpreting internal representations of RNA language models
Mapping model embeddings to human-level biological RNA features
Discovering relationships between RNA groups without retraining models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse autoencoder interprets RNA language model representations
Maps RNA model embeddings to human-level biological features
Enables concept discovery without end-to-end retraining
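Once sparse latent codes exist, the mapping to human-level features amounts to asking which latents fire preferentially on sequences from a given ncRNA family. A minimal sketch of one such scoring scheme follows; the enrichment measure (mean activation inside minus outside a family), the family labels, and all dimensions are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: sparse SAE codes for 300 sequences over 128 latents,
# plus ncRNA family labels (e.g., drawn from Rfam-style annotations).
n_seq, n_latents = 300, 128
Z = np.maximum(0.0, rng.normal(size=(n_seq, n_latents)) - 1.0)  # sparse codes
families = rng.choice(["snoRNA", "miRNA", "tRNA"], size=n_seq)

def family_enrichment(Z, families, family):
    """Per-latent difference in mean activation inside vs. outside a family."""
    mask = families == family
    return Z[mask].mean(axis=0) - Z[~mask].mean(axis=0)

scores = family_enrichment(Z, families, "snoRNA")
top = np.argsort(scores)[::-1][:5]  # latents most associated with snoRNAs
```

Ranking latents this way yields candidate neuron-level concepts per family, which can then be inspected or compared across RNA groups to generate hypotheses.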
πŸ”Ž Similar Papers
No similar papers found.
Taehan Kim
University of California, Berkeley
Artificial Intelligence · Computational Biology · AI4Science · CS Education
Sangdae Nam
Department of Development Engineering, University of California, Berkeley