- Sovereign Large Language Models: Advantages, Strategy and Regulations (2025)
- Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains (2024)
- Setting up the Data Printer with Improved English to Ukrainian Machine Translation (2024)
- Unsupervised Data Validation Methods for Efficient Model Training (2023)
- Open Source Projects:
  - UAlpaca: first Ukrainian instruction-tuned language models and datasets
  - Crimean Tatar Text-to-Speech experiment
  - Ukrainian Question Answering with BERT
  - Ukrainian Text-to-Speech model
  - Ukrainian Speech-to-Text model
Research Experience
- Research focus: closing the gap between high-resource and low-resource languages in natural language processing.
- Key research projects:
  - Sovereign Large Language Models: Advantages, Strategy and Regulations
  - Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains
  - Setting up the Data Printer with Improved English to Ukrainian Machine Translation
  - Unsupervised Data Validation Methods for Efficient Model Training
Education
PhD student in Computer Science at the Ukrainian Catholic University (UCU)
Background
PhD student in Computer Science, developing unsupervised and semi-supervised methods for training high-quality machine learning models from minimal data. The goal is to bring state-of-the-art performance to mid- and low-resource languages across three modalities: text, speech, and vision.