Saksham Suri

Google Scholar ID: NrHYGZ8AAAAJ

Research Scientist, Meta Reality Labs

Computer Vision, Machine Learning, Deep Learning

Citations & Impact

All-time citations: 386

Contact

Email: sakshams@cs.umd.edu

Publications

7 items

Small Vision-Language Models are Smart Compressors for Long Video Understanding (2026)

Efficient Universal Perception Encoder (2026)

Going Down Memory Lane: Scaling Tokens for Video Stream Understanding with Dynamic KV-Cache Memory (2026)

UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders (2026)

VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice (arXiv.org, 2026)

EdgeTAM: On-Device Track Anything Model (2025)

LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (International Conference on Learning Representations, 2024)

Resume (English only)

Academic Achievements

Publications:
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR, 2025)
MAPS: Memory Augmented Panoptic Segmentation (Under Review)
UVIS: Unsupervised Video Instance Segmentation (CVPR Workshop, 2024)
Gen2Det: Generate to Detect (Synthetic Data for Computer Vision Workshop @ CVPR 2024)
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors (Under Submission)
GRIT: GAN Residuals for Image-to-Image Translation (WACV, 2024)
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization (WACV, 2024)
SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining (ICCV, 2023)

Research Experience

Research Scientist at Meta Reality Labs, focusing on efficient foundation models; interned with Meta and Amazon during his Ph.D.

Education

Ph.D. in Computer Science from the University of Maryland, College Park, advised by Prof. Abhinav Shrivastava; B.S. in Computer Science and Engineering from IIIT Delhi, where he worked at the IAB Lab and Precog.

Background

Research Interests: Solving problems with less supervision, using uncurated as well as synthetic data. Recent work focuses on improving recognition using generation, especially with diffusion models as sources of synthetic data.
