Academic Achievements
Published multiple papers at top-tier conferences including CVPR, ECCV, BMVC, ACM Multimedia, and WACV
BMVC 2022: Introduced the first weakly-supervised fingerspelling recognition method for British Sign Language and released a new benchmark dataset
ECCV 2022: Proposed scalable methods to densify automatic annotations in sign language videos
CVPR 2022: Developed a sub-word-level lip reading model with visual attention, greatly reducing word error rates
BMVC 2021: Proposed a transformer-based architecture for visual keyword spotting
WACV 2021: Introduced a novel audio-visual speech enhancement paradigm robust to visual corruptions
ACM Multimedia 2020 (Oral): Proposed a high-accuracy speech-to-lip generation architecture for in-the-wild scenarios
CVPR 2020: Achieved realistic speech synthesis from silent lip movements for a single speaker
ACM Multimedia 2019 (Oral): Proposed a 'face-to-face translation' pipeline that translates talking-face videos across languages while preserving pose and background
Research Experience
Conducting doctoral research at the Visual Geometry Group (VGG), University of Oxford, focusing on weakly-supervised vision-language tasks
Proposed a novel visual backbone for lip region tracking, significantly reducing word error rates in lip reading
Developed scalable methods to increase automatic annotation density in sign language videos (from 670K to 5M confident annotations)
Designed a novel architecture for accurate audio-driven lip-sync for any identity in the wild
Built an end-to-end system for lip-to-speech synthesis that preserves individual speaking styles