Towards Generalizability to Tone and Content Variations in the Transcription of Amplifier Rendered Electric Guitar Audio

📅 2025-04-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Electric guitar audio transcription faces two key challenges: (1) the scarcity of diverse datasets covering multiple amplifier-cabinet configurations, and (2) strong nonlinear distortions introduced by the tone chain (amplifier, speaker cabinet, effect pedals), which are tightly coupled with playing content. To address these, we propose the Tone-informed Transformer (TIT), with three core contributions: (1) EGDB-PG, the first publicly available dataset for generalized transcription, encompassing 12 mainstream amplifier-cabinet combinations; (2) a learnable tone embedding module that explicitly models timbral characteristics and disentangles them from note-level content; and (3) tone and content augmentation applied alongside audio normalization. Ablation studies and cross-amplifier benchmarking demonstrate TIT's superior generalization: it achieves a 9.2% average accuracy gain over state-of-the-art models under multi-tone settings.

📝 Abstract

Transcribing electric guitar recordings is challenging due to the scarcity of diverse datasets and the complex tone-related variations introduced by amplifiers, cabinets, and effect pedals. To address these issues, we introduce EGDB-PG, a novel dataset designed to capture a wide range of tone-related characteristics across various amplifier-cabinet configurations. In addition, we propose the Tone-informed Transformer (TIT), a Transformer-based transcription model enhanced with a tone embedding mechanism that leverages learned representations to improve the model's adaptability to tone-related nuances. Experiments demonstrate that TIT, trained on EGDB-PG, outperforms existing baselines across diverse amplifier types, with transcription accuracy improvements driven by the dataset's diversity and the tone embedding technique. Through detailed benchmarking and ablation studies, we evaluate the impact of tone augmentation, content augmentation, audio normalization, and tone embedding on transcription performance. This work advances electric guitar transcription by overcoming limitations in dataset diversity and tone modeling, providing a robust foundation for future research.
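The abstract does not say how the tone embedding enters the model, so the following is only a minimal sketch of one common option: a lookup table with one vector per amplifier-cabinet tone, added to every frame feature before the Transformer layers. The additive conditioning, the names `make_tone_table` and `condition_on_tone`, the number of tones, and the feature dimension are all assumptions for illustration, not the authors' implementation. In a real model the table would be trained rather than left at its random initialisation.

```python
import random


def make_tone_table(num_tones, dim, seed=0):
    """Hypothetical lookup table: one randomly initialised vector per tone."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_tones)]


def condition_on_tone(frames, tone_table, tone_id):
    """Add the selected tone vector to every frame feature vector (assumed scheme)."""
    tone_vec = tone_table[tone_id]
    for frame in frames:
        if len(frame) != len(tone_vec):
            raise ValueError("frame and tone embedding dimensions must match")
    return [[x + t for x, t in zip(frame, tone_vec)] for frame in frames]


# Two toy frames with four features each, standing in for spectrogram frames.
frames = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]]
tone_table = make_tone_table(num_tones=3, dim=4)
conditioned = condition_on_tone(frames, tone_table, tone_id=1)
```

Because the same vector is added to every frame, the offset between frames is unchanged; only a tone-dependent shift is applied, which is what lets the downstream layers see which tone chain produced the audio.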

Problem

Research questions and friction points this paper is trying to address.

Tone variations from amplifiers, cabinets, and effect pedals make electric guitar transcription difficult

Diverse datasets covering multiple amplifier-cabinet configurations are scarce

Transcription models adapt poorly to tone-related nuances

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces EGDB-PG dataset for diverse tone variations

Proposes Tone-informed Transformer with tone embeddings

Combines dataset diversity and tone modeling enhancements
