Publications
1. DataVinci: Learning syntactic and semantic string repairs (SIGMOD 2025).
2. Evaluating agent-based program repair at Google (ICSE SEIP 2025).
3. Encoding spreadsheets for large language models (EMNLP 2024).
4. An empirical study of validating synthetic data for formula generation (NAACL Findings 2025).
5. Solving data-centric tasks using large language models (NAACL 2024).
6. CodeFusion: A pre-trained diffusion model for code generation (EMNLP 2023).
7. Cornet: Learning table formatting rules by example (VLDB 2023).
8. Co-audit: Tools to help humans double-check AI-generated content (PLATEAU 2024).
9. EmFore: Online learning of email folder classification rules (CIKM 2023).
10. FLAME: A small language model for spreadsheet formulas (AAAI 2024).
11. FlashFill++: Scaling programming by example by cutting to the chase (POPL 2023).
12. FormaT5: Abstention and examples for conditional table formatting with natural language (VLDB 2024).
Research Experience
1. Staff Software Engineer, Google DevAI team: building machine learning systems to enhance developer productivity.
2. Senior Researcher, Microsoft PROSE team: developed advanced program synthesis technologies.
Education
PhD in Computer Science, MIT (advisor: Martin Rinard)
MS in Computer Science, NYU (worked with Dennis Shasha)
BA in Economics, University of Pennsylvania
Originally from Costa Rica.
Background
Research Interests: the intersection of machine learning and software engineering.
Brief: Currently a staff software engineer on Google's DevAI team, building machine learning systems that make Google developers more productive. Previously a senior researcher on the PROSE team at Microsoft, developing state-of-the-art program synthesis technologies to make software development more accessible, productive, and fun.