
Skanda Koppula

Google Scholar ID: hlC76YUAAAAJ

Google DeepMind

Embedded Systems, Computer Vision, Speech Recognition, Bioinformatics


Citations & Impact

Citations (all-time): 2,454



Publications

6 items

A Mixed Diet Makes DINO An Omnivorous Vision Encoder

2026


Efficiently Reconstructing Dynamic Scenes One D4RT at a Time

2025


SciVid: Cross-Domain Evaluation of Video Models in Scientific Applications

2025


TAPNext: Tracking Any Point (TAP) as Next Token Prediction

2025


Scaling 4D Representations

arXiv.org · 2024


A Simple Recipe for Contrastively Pre-Training Video-First Encoders Beyond 16 Frames

Computer Vision and Pattern Recognition · 2023


Resume

Academic Achievements

Publications include TAPNext: Tracking Any Point (TAP) as Next Token Prediction; SciVid: Cross-Domain Evaluation of Video Models in Scientific Applications; TAPVid-3D: A Benchmark for Tracking Any Point in 3D; and Scaling 4D Representations.

Research Experience

Worked with NVIDIA's Autonomous Driving Team on new versions of PilotNet. In 2017, worked with the Google Acoustic Modeling research team under Prof. Khe Chai Sim. Previously interned at Yahoo and Square.

Education

Fulbright researcher at ETH Zürich in 2019, working with Prof. Onur Mutlu. Completed an MEng at MIT in 2018, advised by Prof. Anantha Chandrakasan and Dr. Jim Glass. Received a B.S. in Computer Science from MIT in 2016.

Background

Currently a research engineer at Google DeepMind and part of the PRISM vision research group at University College London. Works on dynamic 3D vision, visual representation learning, and video understanding. Research background includes computer vision, natural language understanding, computer security, and computer architecture.

Miscellany

Loves painting and building racecars.
