Scholar

Jakub Krajewski

Google Scholar ID: v5mZs1kAAAAJ

PhD Student, University of Warsaw, IDEAS NCBR

Large Language Models, Mixture of Experts, conditional computation, machine learning

Citations & Impact

All-time

Citations

171

H-index

3

i10-index

2

Publications

7

Co-authors

5

Publications

6 items

Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training

2025

Cited

0

$μ$-Parametrization for Mixture of Experts

2025

Cited

0

Decoupled Relative Learning Rate Schedules

2025

Cited

0

Projected Compression: Trainable Projection for Efficient Transformer Compression

2025

Cited

0

Scaling Fine-Grained MoE Beyond 50B Parameters: Empirical Evaluation and Practical Insights

2025

Cited

0

Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient

2025

Cited

0

Co-authors

5 total

Jan Ludziejewski

Sebastian Jaszczur

Anthropic (past: IDEAS, University of Warsaw)

Google, University of Warsaw, Polish Academy of Sciences

Daniel Korzekwa

Szymon Antoniak