- Wizard of Shopping: Target-Oriented E-commerce Dialogue Generation with Decision Tree Branching, published in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025).
- Approximate Vector Set Search: A Bio-Inspired Approach for High-Dimensional Spaces, published in Proceedings of the 2025 IEEE 41st International Conference on Data Engineering (ICDE 2025).
- Joinable Search Over Multi-Source Spatial Datasets: Overlap, Coverage, and Efficiency, published in Proceedings of the 2025 IEEE 41st International Conference on Data Engineering (ICDE 2025).
- Identifying High Consideration E-Commerce Search Queries, published in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 Industry Track).
- Unbiased Learning-to-Rank Needs Unconfounded Propensity Estimation, published in Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).
- Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data, published in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024).
- Training-free Optimization of Generative Recommender Systems using Large Language Model Optimizers, published in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024).
- InstructPTS: Instruction-Tuning LLMs for Product Title Summarization, published in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Industry Track).
- MultiCoNER v2: A Large Multilingual Dataset for Fine-grained and Noisy Named Entity Recognition, published in Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings).
Research Experience
- Amazon, Seattle, WA, USA (remote): Applied Scientist Intern (May 2021 - September 2021), proposed a reinforcement learning method for conversational question answering; the work was accepted to the EMNLP 2022 Industry Track.
- Zhuiyi Technology, Shenzhen, Guangdong, China: Applied Science Intern (May 2019 - September 2019), led a team working on the SuperGLUE benchmark tasks; our submission ranked 2nd and was covered by China Daily.
- Bloomberg, Princeton, NJ, USA: Applied Science Intern (June 2016 - August 2016), analyzed Bloomberg Data License usage patterns.
Education
- Lehigh University, Bethlehem, PA, USA: Ph.D. in Computer Science (2015-2022), Advisor: Prof. Brian D. Davison
- École Supérieure d’Ingénieurs Léonard de Vinci, Paris, France: Exchange Program (2014-2015)
- Nanjing University of Aeronautics and Astronautics, Nanjing, China: B.E. in Computer Science (2011-2015), GPA: 4.2/5.0, Ranking: 1/94
Background
Research interests include data mining, machine learning, natural language processing, and information retrieval. Currently working as an applied scientist at Amazon.
Miscellany
The name 'Zhiyu' comes from a line in an ancient Chinese poem (好雨知时节), which means "a good rain knows the best time to fall."