🤖 AI Summary
This work addresses the challenging problem of automatically transcribing MIDI music—especially non-guitar-native pieces (e.g., orchestral excerpts)—into playable guitar tablature. The core difficulty lies in assigning each note to an appropriate string and fret position while jointly optimizing playability, finger movement continuity, and polyphonic voice coordination. Methodologically, we propose an end-to-end machine learning framework that encodes domain-specific constraints—including fretboard geometry, standard tuning (EADGBE), and musical semantics (e.g., intervallic relationships and harmonic context)—into rich feature representations. To enhance generalization to unfamiliar repertoire, we introduce a guitar-specific data augmentation strategy grounded in fingering feasibility and idiomatic phrasing. Experiments demonstrate that our approach outperforms baseline methods across monophonic and polyphonic scenarios, yielding tablatures that better conform to human playing conventions and exhibit strong robustness in cross-genre transcription tasks.
📝 Abstract
Guitar tablature transcription consists in deducing the string and fret on which each note should be played so as to reproduce the original musical part. This assignment should yield playable string-fret combinations throughout the entire track and, in general, preserve parsimonious motion between successive combinations. Throughout the history of guitar playing, specific chord fingerings have been developed across different musical styles that facilitate common idiomatic voicings and the motion between them. This paper presents a method for assigning guitar tablature notation to a given MIDI-based musical part (possibly consisting of multiple polyphonic tracks); no information about guitar-idiomatic expressive characteristics (e.g., bending) is involved. The proposed strategy is based on machine learning and requires only a basic assumption about how far the fingers can stretch on a fretboard; only standard 6-string guitar tuning is examined. The method also addresses the transcription of pieces that were not meant to be played, or could not possibly be played, on a guitar (e.g., a symphonic orchestra part), employing a rudimentary scheme for augmenting the musical information and training/testing the system on artificial data. The results highlight what the system can achieve when trained on the initial and the augmented dataset, showing that training with augmented data improves performance even in simple, e.g. monophonic, cases. The results also reveal weaknesses and lead to useful conclusions about possible improvements.
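To make the core combinatorial problem concrete, the sketch below enumerates the candidate string-fret pairs for a MIDI pitch under standard tuning (EADGBE) and measures the fret span that a finger-stretch assumption would bound. This is an illustrative toy, not the paper's model; the function names, the 20-fret limit, and the span measure are assumptions for the example.

```python
# Illustrative sketch only: candidate string/fret positions for a MIDI note
# under standard 6-string tuning. The 20-fret limit and the span measure
# are assumptions made for this example, not the paper's actual parameters.

STANDARD_TUNING = [40, 45, 50, 55, 59, 64]  # open-string MIDI pitches: E2 A2 D3 G3 B3 E4
NUM_FRETS = 20  # assumed fretboard length

def candidate_positions(midi_pitch):
    """Return all playable (string, fret) pairs for a pitch; string 0 = low E."""
    positions = []
    for string, open_pitch in enumerate(STANDARD_TUNING):
        fret = midi_pitch - open_pitch
        if 0 <= fret <= NUM_FRETS:
            positions.append((string, fret))
    return positions

def fret_span(positions):
    """Maximum fret stretch among fretted notes (open strings ignored) --
    the quantity a hand-stretch assumption would cap."""
    fretted = [fret for _, fret in positions if fret > 0]
    return max(fretted) - min(fretted) if fretted else 0

# Middle C (MIDI 60) is playable on five of the six strings:
print(candidate_positions(60))  # [(0, 20), (1, 15), (2, 10), (3, 5), (4, 1)]
```

A transcription system must pick one pair per note from such candidate sets while keeping each simultaneous combination within the stretch bound and the motion between successive combinations small, which is what makes the assignment a joint optimization rather than a per-note lookup.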