Scholar
Jack Urbanek
Google Scholar ID: 8LN5NoQAAAAJ
DatologyAI
Artificial Intelligence
Citations & Impact
All-time
Citations: 3,194
H-index: 16
i10-index: 18
Publications: 20
Co-authors: 0
Contact
No contact links provided.
Publications
5 items
The Finetuner's Fallacy: When to Pretrain with Your Finetuning Data (2026). Cited by: 0
ÜberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset (2026). Cited by: 0
DatBench: Discriminative, Faithful, and Efficient VLM Evaluations (arXiv.org, 2026). Cited by: 1
Luxical: High-Speed Lexical-Dense Text Embeddings (2025). Cited by: 0
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining (2025). Cited by: 0