🤖 AI Summary
Current RNA language models (e.g., RiNALMo) lack transparency in how they encode mRNA versus non-coding RNA (ncRNA) family information, and no systematic interpretability framework exists for dissecting their learned representations.
Method: We propose SAE-RNA, the first interpretability method applying sparse autoencoders (SAEs) to discover biologically meaningful concepts in pre-trained RNA models without retraining; it maps frozen embeddings to interpretable biological features via alignment with authoritative sequence annotations.
Results: Applying SAE-RNA to RiNALMo, we systematically decode and visualize latent representations, identifying neuron-level concepts strongly associated with canonical ncRNA families (e.g., snoRNAs, miRNAs). This enables fine-grained, cross-RNA-type functional comparison. Our work reveals how ncRNA families are encoded in large RNA models and establishes a new paradigm for RNA model interpretability, providing reliable, hypothesis-generating computational insights into RNA biology.
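The core idea above, training a sparse autoencoder on frozen model embeddings so that individual latent units can be aligned with biological annotations, can be sketched minimally. The snippet below is an illustrative NumPy sketch, not the paper's implementation: all dimensions, parameter names, and the random stand-in for RiNALMo embeddings are assumptions, and in practice the SAE weights would be learned by minimizing the loss shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: model embeddings (d_model) mapped to an
# overcomplete sparse latent space (d_latent). Values are illustrative.
d_model, d_latent, n_tokens = 64, 256, 10

# Stand-in for frozen per-token embeddings (in practice, RiNALMo activations).
X = rng.normal(size=(n_tokens, d_model))

# SAE parameters (randomly initialized here; normally learned).
W_enc = rng.normal(scale=0.1, size=(d_model, d_latent))
b_enc = np.zeros(d_latent)
W_dec = rng.normal(scale=0.1, size=(d_latent, d_model))
b_dec = np.zeros(d_model)

def sae_forward(X):
    """Encode embeddings to non-negative sparse latents, then reconstruct."""
    z = np.maximum(X @ W_enc + b_enc, 0.0)  # ReLU yields sparse, non-negative codes
    x_hat = z @ W_dec + b_dec               # linear decoder reconstructs embeddings
    return z, x_hat

z, x_hat = sae_forward(X)

# Standard SAE objective: reconstruction error plus an L1 sparsity penalty.
l1_coeff = 1e-3
loss = np.mean((X - x_hat) ** 2) + l1_coeff * np.abs(z).mean()
print(z.shape, x_hat.shape)
```

Once trained, each latent unit's activation pattern across annotated sequences (e.g., Rfam family labels) can be inspected to find units that fire selectively for particular ncRNA families; that alignment step is what turns the sparse codes into candidate "concepts."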
📄 Abstract
Deep learning, particularly with the advancement of Large Language Models, has transformed biomolecular modeling, with protein advances (e.g., ESM) inspiring emerging RNA language models such as RiNALMo. Yet how and what these RNA language models internally encode about messenger RNA (mRNA) or non-coding RNA (ncRNA) families remains unclear. We present SAE-RNA, an interpretability method that analyzes RiNALMo representations and maps them to known, human-level biological features. Our work frames RNA interpretability as concept discovery in pretrained embeddings, without end-to-end retraining, and provides practical tools to probe what RNA LMs may encode about ncRNA families. The method can be extended to close comparisons between RNA groups, supporting hypothesis generation about previously unrecognized relationships.