BLIP3o-NEXT: Next Frontier of Native Image Generation
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
Quantifying uncertainty in answers from any language model and enhancing their trustworthiness
InstructZero: Efficient instruction optimization for black-box large language models
Does your graph need a confidence boost? Convergent boosted smoothing on graphs with tabular node features
Research Experience
Currently on the industry job market.
Education
Pursuing a Ph.D. in Computer Science at the University of Maryland, College Park, advised by Prof. Tianyi Zhou and Prof. Tom Goldstein.
Background
Fourth-year Computer Science Ph.D. student at the University of Maryland, College Park, interested in multimodal and large language models, including visual perception, visual generation, and reasoning for both MLLMs and LLMs.