- BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing
- ε-VAE: Denoising as Visual Decoding
- Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
- SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting
- MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark
- Test-time Adaptation in the Dynamic World with Compound Domain Knowledge Management
- ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
- Mask-guided Matting in the Wild
- Bidirectional Domain Mixup for Domain Adaptive Semantic Segmentation
- Learning Classifiers of Prototypes and Reciprocal Points for Universal Domain Adaptation
- Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection
Awards: Microsoft Research Asia PhD Fellowship, Google Research PhD Fellowship
Research Experience
Interned at Adobe Research (San Jose, CA) and Meta AI (Menlo Park, CA). Previously a Faculty Fellow in Computer Science at NYU Courant, hosted by Prof. Saining Xie.
Education
Ph.D. and M.S. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST), advised by Prof. In So Kweon; B.S. degree in electrical and computer engineering from Seoul National University (SNU).
Background
Research interests: designing effective visual models to better understand the world and developing efficient learning frameworks to utilize data at scale with minimal human supervision. Currently a Senior Research Scientist at Google DeepMind.