- Enhancing the reasoning capabilities of vision-language models through reinforcement learning with various reward signals.
- Building autonomous agents that can perceive, reason, and act in complex environments (e.g., real websites) using multimodal inputs.
Education
PhD student at the Halıcıoğlu Data Science Institute (HDSI) at UC San Diego, advised by Professor Zhiting Hu. Bachelor's degree (with Honors) from Zhejiang University.
Background
Research Interests: vision-language models and agentic systems. Work includes Text Alignment, an efficient but versatile model for many NLP tasks, and AlignScore, a metric for evaluating the factual consistency of generative language models. Also interested in making classes more accessible to students through LMTutor, which lets them ask academic or logistical questions. Current research focuses on enhancing the reasoning capabilities of vision-language models through reinforcement learning with various reward signals, and on building autonomous agents that can perceive, reason, and act in complex environments (e.g., real websites) using multimodal inputs.