🤖 AI Summary
This work addresses discrete biological sequence optimization under strict syntactic constraints, where existing methods struggle to balance controllable editing with long-range planning. The authors propose a post-training framework that, for the first time, integrates Levenshtein-aligned shortest edit trajectories with group-based policy optimization, guiding large language models to generate verifiable sequences of atomic edits (insertions, deletions, and substitutions) and thereby internalize a denoising editing process. On protein sequence optimization, the approach raises success rates from 42% to 89% and novelty from 47% to 97%, consistently outperforming current baselines in efficacy, controllability, and edit coherence.
📝 Abstract
Discrete biological sequence optimization requires iterative refinement under strict syntactic constraints. Diffusion models offer progressive refinement but do not naturally expose controllable discrete edit operations, while autoregressive LLMs often lack explicit long-horizon planning for constrained edits. We propose STRIDE (Sequence Trajectory Refinement via Internalized Denoising Emulation), a post-training framework that trains an LLM to emit executable trajectories of atomic edits (INSERT/DELETE/REPLACE) as a verifiable reasoning trace for variable-length refinement. STRIDE combines supervised fine-tuning on Levenshtein-aligned shortest edit demonstrations with group-based policy optimization to align edit trajectories with task rewards while preserving coherent editing behavior. Across protein fluorescence and instruction-conditioned molecular optimization, STRIDE improves variable-length protein editing success from 42% to 89% while increasing novelty from 47% to 97%, and yields stronger validity and controllability compared to diverse baselines. The code is published at https://github.com/daiheng-zhang/STRIDE.
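To make the core data construction concrete, the sketch below shows one way to derive a Levenshtein-aligned shortest edit trajectory between a source and target sequence as atomic INSERT/DELETE/REPLACE operations, and to replay it for verification. This is an illustrative reconstruction, not the authors' implementation; the function names and the edit-tuple format are assumptions.

```python
def shortest_edit_script(src: str, dst: str):
    """Levenshtein DP plus backtrace, yielding atomic edits.

    Each edit is (op, position, char), with positions indexing the
    original source. Edits are ordered right-to-left, so replaying
    them in order never invalidates a later position.
    """
    n, m = len(src), len(dst)
    # dp[i][j] = edit distance between src[:i] and dst[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if src[i - 1] == dst[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # delete src[i-1]
                           dp[i][j - 1] + 1,          # insert dst[j-1]
                           dp[i - 1][j - 1] + cost)   # match / replace

    # Backtrace from (n, m) to (0, 0), collecting edits.
    edits, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and src[i - 1] == dst[j - 1]
                and dp[i][j] == dp[i - 1][j - 1]):
            i, j = i - 1, j - 1                       # match: no edit
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            edits.append(("REPLACE", i - 1, dst[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            edits.append(("DELETE", i - 1, src[i - 1]))
            i -= 1
        else:
            edits.append(("INSERT", i, dst[j - 1]))   # insert before index i
            j -= 1
    return edits


def apply_edits(src: str, edits):
    """Replay an edit trajectory (in the right-to-left order produced
    above) and return the edited sequence, enabling verification."""
    s = list(src)
    for op, pos, ch in edits:
        if op == "REPLACE":
            s[pos] = ch
        elif op == "DELETE":
            del s[pos]
        else:  # INSERT
            s.insert(pos, ch)
    return "".join(s)
```

In this setup the trajectory itself is the verifiable reasoning trace: a model's emitted edit list can be replayed with `apply_edits` and checked against the target (or scored by a task reward), and its length can be compared against the Levenshtein distance to test for shortest-path coherence.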