Academic Achievements
Has published papers at leading machine learning conferences, including ICLR and NeurIPS. Publications include:
- Learn-by-interact: Synthesize Large-scale Agent Data with Trajectories by Interacting with Environments
- Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
- Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models
- From Few to Many: Enhancing Many-Shot In-Context Learning with Optimized Example Selection and Expansion
- CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
- SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging
- Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
- BRIGHT: a realistic Benchmark for ReasonInG-Heavy reTrieval
- Teach Better or Show Smarter? On Instructions and Exemplars in Automatic Prompt Optimization
- Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
- Chain of Agents: Large Language Models Collaborating on Long-Context Tasks
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL
- Capabilities of Gemini Models in Medicine
Research Experience
Senior research scientist at Google Cloud AI Research, working on research projects that include evaluating language models on text-to-SQL workflows and synthesizing large-scale agent data.
Education
Ph.D. in Computational Neuroscience from Columbia University in 2019.
Background
Research interests include large language models for code generation, agents, and factuality. Currently a senior research scientist at Google Cloud AI Research.