SLIM: Sparse Latent Steering for Interpretable and Property-Directed LLM-Based Molecular Editing

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the difficulty of precise property control in large language model (LLM)-based molecular editing, where property-relevant information is entangled across dense hidden states. The authors propose SLIM, a framework that introduces a sparse latent space into LLM-driven molecular editing. A sparse autoencoder with learnable importance gates decomposes the editor's hidden states into sparse, property-aligned features, enabling plug-and-play, property-guided editing without modifying model parameters. The same sparse basis supports interpretable analysis of editing behavior. On the MolEditRL benchmark, across four model architectures and eight molecular properties, SLIM shows consistent gains over baselines, with improvements of up to 42.4 points.

📝 Abstract

Large language models possess strong chemical reasoning capabilities, making them effective molecular editors. However, property-relevant information is implicitly entangled across their dense hidden states, providing no explicit handle for property control: a substantial fraction of edits fail to improve or even degrade target properties. To address these issues, we propose SLIM (Sparse Latent Interpretable Molecular editing), a plug-and-play framework that decomposes the editor's hidden states into sparse, property-aligned features via a Sparse Autoencoder with learnable importance gates. Steering in this sparse feature space precisely activates property-relevant dimensions, improving editing success rate without modifying model parameters. The same sparse basis further supports interpretable analysis of editing behavior. Experiments on the MolEditRL benchmark across four model architectures and eight molecular properties show consistent gains over baselines, with improvements of up to 42.4 points.
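The steering idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the gate form, the steering rule, or any dimensions, so the sigmoid gate, the additive steering step, the random (untrained) weights, and all names and sizes below (`gate_logits`, `steer`, `feature_ids`, `alpha`, `d_model`, `d_sparse`) are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: editor hidden width and sparse dictionary width.
d_model, d_sparse = 16, 64

# Randomly initialised SAE parameters; a real SAE would be trained on
# the editor's hidden states. These values are placeholders.
W_enc = rng.normal(scale=0.1, size=(d_model, d_sparse))
b_enc = np.zeros(d_sparse)
W_dec = rng.normal(scale=0.1, size=(d_sparse, d_model))
b_dec = np.zeros(d_model)

# Assumed gate form: one learnable logit per sparse feature.
gate_logits = rng.normal(size=d_sparse)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def encode(h):
    """Map a hidden state to non-negative features scaled by the gates."""
    pre = (h - b_dec) @ W_enc + b_enc
    return np.maximum(pre, 0.0) * sigmoid(gate_logits)


def decode(z):
    """Map sparse features back to the hidden-state space."""
    return z @ W_dec + b_dec


def steer(h, feature_ids, alpha):
    """Boost chosen sparse features and apply the decoded change to h.

    Only the decoded difference is added, so the SAE's reconstruction
    error is not injected into the hidden state.
    """
    z = encode(h)
    z_steered = z.copy()
    z_steered[..., feature_ids] += alpha
    return h + decode(z_steered) - decode(z)


h = rng.normal(size=d_model)       # stand-in for one editor hidden state
feature_ids = np.array([3, 17])    # hypothetical property-aligned features
h_steered = steer(h, feature_ids, alpha=2.0)
```

Under these assumptions the model's weights are never touched: the edit is a shift of the hidden state along the decoder directions of the selected features, which is one way to read the abstract's "without modifying model parameters".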

Problem

Research questions and friction points this paper is trying to address.

molecular editing

property control

latent representation

interpretability

large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

sparse latent steering

interpretable molecular editing

property-directed editing