🤖 AI Summary
This paper addresses hosts in e-commerce live streaming who evade regulatory oversight and spread false medical and health claims by using homophonic morphs: words that sound identical to a restricted term but differ in meaning. We introduce Live Auditory Morph Resolution (LiveAMR), the first task dedicated to detecting such illicit morphs in automatic speech recognition (ASR) transcripts. Methodologically, we formulate morph detection as a text-to-text generation problem, combining LLM-augmented data synthesis, joint rule-based and model-based detection, and a tailored ASR post-processing mechanism. Our contributions are threefold: (1) a formal definition of the LiveAMR task; (2) the first large-scale LiveAMR benchmark dataset, comprising 86,790 annotated samples; and (3) improved morph detection performance, with experiments showing that morph resolution strengthens compliance auditing of live-stream content.
📝 Abstract
E-commerce live streaming in China, particularly on platforms like Douyin, has become a major sales channel, but hosts often use morphs to evade scrutiny and engage in false advertising. This study introduces the Live Auditory Morph Resolution (LiveAMR) task to detect such violations. Unlike previous morph research focused on text-based evasion in social media and underground industries, LiveAMR targets pronunciation-based evasion in health and medical live streams. We constructed the first LiveAMR dataset with 86,790 samples and developed a method to transform the task into a text-to-text generation problem. By leveraging large language models (LLMs) to generate additional training data, we improved performance and demonstrated that morph resolution significantly enhances live streaming regulation.
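To make the text-to-text framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes each sample pairs an ASR transcript (source) with the same transcript after morphs are resolved (target). The lexicon, the English homophone pairs, and every name in the code (`TOY_MORPH_LEXICON`, `Seq2SeqExample`, `resolve_with_rules`, `make_example`, `contains_morph`) are hypothetical and chosen only for illustration; the actual LiveAMR data format is not described here.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: morph as heard in the ASR transcript -> intended word.
# Illustrative English homophones only; not taken from the LiveAMR dataset.
TOY_MORPH_LEXICON = {
    "soar throat": "sore throat",
    "flew": "flu",
}


@dataclass
class Seq2SeqExample:
    source: str  # ASR transcript, possibly containing morphs
    target: str  # same transcript with known morphs resolved


def resolve_with_rules(transcript: str, lexicon: dict) -> str:
    """Rule-based baseline: replace known morphs, longest entries first."""
    resolved = transcript
    for morph in sorted(lexicon, key=len, reverse=True):
        resolved = resolved.replace(morph, lexicon[morph])
    return resolved


def make_example(transcript: str, lexicon: dict) -> Seq2SeqExample:
    """Build one (source, target) pair for a text-to-text model."""
    return Seq2SeqExample(transcript, resolve_with_rules(transcript, lexicon))


def contains_morph(example: Seq2SeqExample) -> bool:
    """A transcript is flagged when resolution changes it."""
    return example.source != example.target


if __name__ == "__main__":
    ex = make_example("this tea cures the flew and a soar throat", TOY_MORPH_LEXICON)
    print(ex.source)
    print(ex.target)
    print(contains_morph(ex))
```

Under this framing, detection reduces to comparing source and target: a transcript is flagged when the resolved text differs from the input. A context-free lexicon like the one above cannot distinguish a morph from a legitimate use of the same word (for example, "the bird flew away"), which is the kind of case a trained generation model would need to handle.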