Third-year PhD student at Cornell University working on computer vision and machine learning
Currently focused on multi-modal foundation models for visual understanding
Building models that understand and reason about temporal events by combining visual, textual, and temporal information
Experienced in developing large-scale multi-modal systems and working with unstructured data
Interested in interdisciplinary applications; collaborates with researchers in archaeology, grape pathology, and other domains to tackle real-world challenges