Scholar

Shane Bergsma

Google Scholar ID: nrE1OroAAAAJ

Cerebras Systems

Machine Learning, Artificial Intelligence, Natural Language Processing


Citations & Impact

All-time

Citations: 734

H-index: 14

i10-index: 25

Publications: 20

Co-authors: 61



Publications

6 items

PTPP-Aware Adaptation Scaling Laws: Predicting Domain-Adaptation Performance at Unseen Pre-Training Budgets

2025

Cited

0

Scaling with Collapse: Efficient and Predictable Training of LLM Families

2025

Cited

0

Predicting Training Re-evaluation Curves Enables Effective Data Curriculums for LLMs

2025

Cited

0

Power Lines: Scaling Laws for Weight Decay and Batch Size in LLM Pre-training

2025

Cited

0

Don't be lazy: CompleteP enables compute-efficient deep transformers

2025

Cited

0

Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs

2025

Cited

1


Co-authors

61 total

Professor of Computer Science, Johns Hopkins University

Professor of Computing Science, University of Alberta

Benjamin Van Durme

Johns Hopkins University / Microsoft

Grzegorz Kondrak

Professor of Computing Science, University of Alberta

Johns Hopkins University