- Sovereign Large Language Models: Advantages, Strategy and Regulations (2025)
- Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains (2024)
- Setting up the Data Printer with Improved English to Ukrainian Machine Translation (2024)
- Unsupervised Data Validation Methods for Efficient Model Training (2023)
- Open Source Projects:
  - UAlpaca: first Ukrainian instruction-tuned language models and datasets
  - Crimean Tatar Text-to-Speech experiment
  - Ukrainian Question Answering with BERT
  - Ukrainian Text-to-Speech model
  - Ukrainian Speech-to-Text model
Research Experience
- Research focus: closing the gap between high-resource and low-resource languages in natural language processing.
- Key research projects:
  - Sovereign Large Language Models: Advantages, Strategy and Regulations
  - Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains
  - Setting up the Data Printer with Improved English to Ukrainian Machine Translation
  - Unsupervised Data Validation Methods for Efficient Model Training
Education
PhD student in Computer Science at the Ukrainian Catholic University (UCU)
Background
PhD student in Computer Science, developing unsupervised and semi-supervised methods for training high-quality machine learning models from minimal data. The goal is to bring state-of-the-art performance to mid- and low-resource languages across three modalities: text, speech, and vision.