Notable papers include 'Block-Biased Mamba for Long-Range Sequence Processing', 'Emoji Attack: Enhancing Jailbreak Attacks Against Judge LLM Detection', 'Tuning Frequency Bias of State Space Models', and 'HOPE for a Robust Parameterization of Long-memory State Space Models'.
Serving as Area Chair for NeurIPS 2025, ICML 2025, and ICLR 2025.
Co-organizing the Deep Learning for Science Summer School.
Co-organizing the Berkeley Lab AI for Science Summit (BLASS 24).
Background
Currently a Research Scientist at Lawrence Berkeley National Laboratory.
Leads the Deep Learning Group at the International Computer Science Institute (ICSI), an affiliated institute of UC Berkeley.
Broadly interested in understanding how deep learning systems work and improving their robustness, interpretability, and efficiency.
Applies a scientific approach to studying neural networks, using dynamical systems theory to explain phenomena such as vanishing and exploding gradients.
Currently working on large-scale generative diffusion models for spatio-temporal forecasting in earth science and fluid dynamics.
Exploring how foundation models can integrate reasoning and multimodal information to improve predictions.
Increasingly focused on AI safety, particularly on understanding and mitigating vulnerabilities of large language models, such as jailbreaking and backdoor attacks.