Timed text extraction from Taiwanese Kua-á-hì TV series

📅 2026-01-01
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the longstanding reliance on manual annotation for extracting subtitles and sung lyrics from low-quality televised Taiwanese opera (Gezai opera) recordings, a bottleneck that has hindered related research. The authors propose a two-stage approach: first, automatic localization of lyric segments is achieved by integrating OCR-driven subtitle segmentation with speech-and-music activity detection (SMAD); second, an interactive real-time OCR correction system refines extraction accuracy and ensures precise temporal alignment. This work presents the first integration of OCR and SMAD for traditional Chinese opera audiovisual analysis and constructs the first Gezai opera lyric dataset with accurate time stamps, thereby enabling downstream music information retrieval tasks such as lyric recognition and melody-based search.

📝 Abstract
Taiwanese opera (Kua-á-hì), a major local theatrical tradition, underwent extensive television adaptation, notably by pioneers such as Iûnn Lē-hua. These videos, while potentially valuable for in-depth studies of Taiwanese opera, are often of low quality and require substantial manual effort during data preparation. To streamline this process, we developed an interactive system for real-time OCR correction and a two-step approach that integrates OCR-driven segmentation with Speech and Music Activity Detection (SMAD) to efficiently identify vocal segments from archival episodes with high precision. The resulting dataset, consisting of vocal segments and their corresponding lyrics, can potentially support various MIR tasks such as lyrics identification and tune retrieval. Code is available at https://github.com/z-huang/ocr-subtitle-editor .
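The two-step idea in the abstract (subtitle segments from OCR, then filtering by music activity) can be illustrated with a small interval-overlap sketch. This is not the authors' implementation: the function name `select_sung_segments`, the `(start, end, text)` tuple layout, the assumption that SMAD output arrives as non-overlapping music intervals in seconds, and the `min_ratio` threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]          # (start_sec, end_sec)
Subtitle = Tuple[float, float, str]     # (start_sec, end_sec, ocr_text)


def overlap(a: Interval, b: Interval) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def select_sung_segments(
    subtitles: List[Subtitle],
    music: List[Interval],
    min_ratio: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> List[Subtitle]:
    """Keep subtitle segments that are mostly covered by music activity.

    Assumes the music intervals do not overlap each other, so their
    overlaps with one subtitle can simply be summed.
    """
    selected = []
    for start, end, text in subtitles:
        duration = end - start
        if duration <= 0:
            continue  # skip degenerate OCR segments
        covered = sum(overlap((start, end), m) for m in music)
        if covered / duration >= min_ratio:
            selected.append((start, end, text))
    return selected


# Toy data: three OCR subtitle segments and two detected music regions.
subtitles = [(0.0, 4.0, "a"), (5.0, 9.0, "b"), (10.0, 12.0, "c")]
music = [(4.5, 9.5), (11.5, 13.0)]
sung = select_sung_segments(subtitles, music)
```

With this toy input only the second segment is retained, since it lies entirely inside a music region while the third is covered for only a quarter of its duration. A ratio threshold rather than a strict containment test is one possible design choice that tolerates small boundary errors in either detector.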
Problem

Research questions and friction points this paper is trying to address.

timed text extraction
Taiwanese opera
Kua-á-hì
low-quality video
manual data preparation
Innovation

Methods, ideas, or system contributions that make the work stand out.

OCR correction
Speech and Music Activity Detection
two-step segmentation
Taiwanese opera
lyrics alignment
Tzu-Hung Huang
Academia Sinica, Taiwan; National Taiwan University, Taiwan
Yun-En Tsai
Academia Sinica, Taiwan
Yun-Ning Hung
Music AI, USA
Chih-Wei Wu
Audio Algorithms, Netflix, Inc
Music information retrieval, audio signal processing, sound analysis and synthesis
I-Chieh Wei
University of Auckland, New Zealand
Li Su
Institute of Information Science, Academia Sinica
Music information retrieval, signal processing, machine learning, computational musicology