Jan Hajič jr.

Google Scholar ID: 1CWcwO0AAAAJ
Institute of Formal and Applied Linguistics, Charles University in Prague
Natural Language Processing, Artificial Intelligence, Structural Bioinformatics
Citations & Impact
All-time
  • Citations: 797
  • H-index: 11
  • i10-index: 12
  • Publications: 20
  • Co-authors: 4 (list available)
Resume (English only)
Academic Achievements
  • Co-authored the foundational paper 'Understanding Optical Music Recognition' (OMR).
  • Created and released the MUSCIMA++ dataset, widely used in OMR research.
  • Recent work focuses on practical recognition of pianoform notation and domain adaptation techniques like data synthesis for manuscript recognition.
  • Pioneers digital analysis of Gregorian Chant, examining which music theories suit its eight modes and using bioinformatics methods to trace melodic development.
  • Co-authored an ISMIR 2025 paper on a computational model of saxophone playing difficulty.
  • Produced a technical report for the GAUK 1444217 project.
Research Experience
  • 2023–2027: Principal Investigator of the OmniOMR project (NAKI III programme, Ministry of Culture of the Czech Republic), focusing on optical music recognition for digital libraries.
  • 2023–2029: Co-Investigator and leader of the Chant Analytics team in the SSHRC-funded 'Digital Analysis of Chant Transmission' project (Canada).
  • 2023–2024: Principal Investigator of the 'Genome of Melody' project (funded by the John Templeton Foundation via the Cultural Evolution Society), applying phylogenetics to study the evolution of Gregorian Chant melodies.
  • 2017–2019: Principal Investigator of the GAUK project 'Multimodal Optical Music Recognition' (GAUK 1444217).
  • 2017–2018: Co-Investigator of 'Convolutional Neural Networks for Optical Music Recognition' (GAUK 170217).
  • 2015–2016: Principal Investigator of 'rRNA Secondary Structure Prediction' (GAUK 550214), which led to the rPredictor database.