🤖 AI Summary
This work addresses two challenges in single-cell perturbation modeling: semantic heterogeneity in metadata and biological distribution shifts. The authors propose HarmonyCell, an end-to-end agent framework that performs semantic alignment and adaptive architecture synthesis automatically, without human intervention. HarmonyCell uses an LLM-driven semantic unifier to harmonize heterogeneous metadata, and an adaptive Monte Carlo Tree Search over a hierarchical action space to construct model architectures tailored to distribution shifts. In the reported experiments, HarmonyCell attains a 95% valid execution rate on heterogeneous datasets, where general agents reach 0%, and matches or surpasses expert-designed models in out-of-distribution evaluations, improving the scalability and automation of cross-dataset virtual cell modeling.
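To make the idea of harmonizing heterogeneous metadata concrete, the following is a minimal sketch of mapping dataset-specific column names onto a shared schema. It is not HarmonyCell's method: the framework uses an LLM for this step, whereas the sketch substitutes a fixed synonym table. The canonical field names, the synonym lists, and the function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical canonical fields and synonyms; the source does not list the real schema.
CANONICAL_SYNONYMS: Dict[str, List[str]] = {
    "perturbation": ["perturbation", "condition", "treatment", "guide_id", "drug"],
    "cell_type": ["cell_type", "celltype", "cell_line"],
    "dose": ["dose", "dosage", "concentration"],
}


def normalize(name: str) -> str:
    """Lowercase a column name and unify separators so synonyms can be matched."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def unify_columns(columns: List[str]) -> Dict[str, str]:
    """Map each dataset-specific column name to a canonical field name.

    Columns without a known synonym are omitted from the result.
    """
    lookup = {syn: canon for canon, syns in CANONICAL_SYNONYMS.items() for syn in syns}
    mapping: Dict[str, str] = {}
    for col in columns:
        canon = lookup.get(normalize(col))
        if canon is not None:
            mapping[col] = canon
    return mapping


# Two datasets that encode the same concepts under different column names.
dataset_a = ["Condition", "cell-line", "n_genes"]
dataset_b = ["guide_id", "CellType", "Dosage"]
print(unify_columns(dataset_a))
print(unify_columns(dataset_b))
```

A static table like this breaks as soon as a dataset uses a name outside the synonym lists, which is the kind of brittleness an LLM-driven unifier is meant to avoid.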
📝 Abstract
Single-cell perturbation studies face two heterogeneity bottlenecks: (i) semantic heterogeneity, in which identical biological concepts are encoded under incompatible metadata schemas across datasets; and (ii) statistical heterogeneity, in which distribution shifts arising from biological variation demand dataset-specific inductive biases. We propose HarmonyCell, an end-to-end agent framework that resolves each challenge through a dedicated mechanism: an LLM-driven Semantic Unifier autonomously maps disparate metadata into a canonical interface without manual intervention, and an adaptive Monte Carlo Tree Search engine operates over a hierarchical action space to synthesize architectures with optimal statistical inductive biases for distribution shifts. Evaluated across diverse perturbation tasks under both semantic and distribution shifts, HarmonyCell achieves a 95% valid execution rate on heterogeneous input datasets (versus 0% for general agents) while matching or even exceeding expert-designed baselines in rigorous out-of-distribution evaluations. This dual-track orchestration enables scalable automatic virtual cell modeling without dataset-specific engineering.
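As an illustration of tree search over a hierarchical action space, the sketch below runs a plain UCT-style Monte Carlo Tree Search on a toy two-level space (model family, then latent size). It makes several assumptions that do not come from the abstract: the action names, the `toy_reward` function standing in for a validation score, the exploration constant, and a deterministic rollout used in place of the usual random rollout. The adaptive variant described above is not specified here, so this shows only the generic select, expand, evaluate, and backpropagate loop.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical two-level action space: level 0 picks a model family,
# level 1 picks a latent size. None of these names come from the paper.
LEVELS: List[List[str]] = [["mlp", "vae", "transformer"], ["latent32", "latent128"]]


def toy_reward(path: Tuple[str, ...]) -> float:
    """Stand-in for the validation score of the architecture described by `path`."""
    scores = {"mlp": 0.2, "vae": 0.7, "transformer": 0.4,
              "latent32": 0.0, "latent128": 0.3}
    return sum(scores[a] for a in path)


class Node:
    def __init__(self, path: Tuple[str, ...] = (), parent: Optional["Node"] = None):
        self.path = path
        self.parent = parent
        self.children: Dict[str, "Node"] = {}
        self.visits = 0
        self.total = 0.0

    def is_terminal(self) -> bool:
        return len(self.path) == len(LEVELS)

    def untried(self) -> List[str]:
        return [a for a in LEVELS[len(self.path)] if a not in self.children]


def ucb1(child: Node, parent_visits: int, c: float = 0.5) -> float:
    """Mean reward plus an exploration bonus that shrinks with more visits."""
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(iterations: int = 500) -> Tuple[str, ...]:
    root = Node()
    for _ in range(iterations):
        node = root
        # Selection: descend by UCB1 while the current node is fully expanded.
        while not node.is_terminal() and not node.untried():
            node = max(node.children.values(), key=lambda ch: ucb1(ch, node.visits))
        # Expansion: add one untried action at this level.
        if not node.is_terminal():
            action = node.untried()[0]
            child = Node(node.path + (action,), node)
            node.children[action] = child
            node = child
        # Rollout (simplified): complete the path with the first option per level.
        rest = tuple(LEVELS[i][0] for i in range(len(node.path), len(LEVELS)))
        reward = toy_reward(node.path + rest)
        # Backpropagation: update statistics from the node up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    # Read out the most-visited path from the root.
    node, best = root, ()
    while node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        best = node.path
    return best


print(search())
```

In a real setting the reward would come from training and evaluating the candidate architecture on held-out perturbations, which is far more expensive than the lookup used here, so the number of iterations would be correspondingly smaller.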