Published 'Boa: A Language and Infrastructure for Analyzing Ultra-Large-Scale Software Repositories' at ICSE 2013
Received Distinguished Paper Award at MSR 2021 for 'Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data'
Received Best Paper Award at MODULARITY 2015 for 'Modular Reasoning in the Presence of Event Subtyping'
Multiple recent publications in top venues (ICSE, MSR, EMSE 2025) on topics including LLM-based code issue localization, Python test datasets, and the use of type inference in Kotlin
Advising Ph.D. and Master's students, including Fatemeh Raei Dehaghi (Ph.D.) and Idriss Abdelmadjid (Master's)
Serving on program committees and as co-chair for major conferences including ICSE 2026, MSR 2026, FSE 2026, and ICSME 2026
Background
Research interests focus on software engineering and programming languages
Applies data analytics techniques (e.g., mining software repositories) to study how developers use programming language features
Believes future programming languages can be better designed by leveraging empirical data
Works on ARG-V and PAClab, tools that automatically generate benchmark programs for program analysis researchers
Lead researcher and engineer for the Boa language and infrastructure, a virtual lab for data-intensive research on ultra-large-scale open-source software repositories