Published a paper titled 'STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving' at NeurIPS 2025 (Datasets & Benchmarks track).
Published multiple papers at CVPR 2025, including 'LiSu: A Dataset and Method for LiDAR Surface Normal Estimation' and 'GBlobs: Explicit Local Structure via Gaussian Blobs for Improved Cross-Domain LiDAR-based 3D Object Detection'.
Published a paper titled 'Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation' at ICLR 2025.
Published multiple papers at CVPR 2024, including 'Generating Enhanced Negatives for Training Language-Based Object Detectors', 'Taming Self-Training for Open-Vocabulary Object Detection', and 'Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement'.
Published a paper titled 'Exploring Question Decomposition for Zero-Shot VQA' at NeurIPS 2023.
Published multiple papers at ICCV 2023, including 'OmniLabel: A Challenging Benchmark for Language-Based Object Detection' and 'Efficient Controllable Multi-Task Architectures'.
Published multiple papers at CVPR 2023, including 'Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!' and 'NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization'.
Published multiple papers at ECCV 2022, including 'Exploiting Unlabeled Data with Vision and Language Models for Object Detection' and 'Single-Stream Multi-Level Alignment for Vision-Language Pretraining'.
Published multiple papers at CVPR 2022, including 'MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation' and 'Controllable Dynamic Multi-Task Architectures'.