Chinese Morph Resolution in E-commerce Live Streaming Scenarios

📅 2025-12-29

🤖 AI Summary
This paper addresses a compliance problem in e-commerce live streaming: hosts evade regulatory oversight and spread false medical and health claims by using homophonic morphs, that is, words that are phonetically identical but semantically distinct. We introduce Live Auditory Morph Resolution (LiveAMR), the first task dedicated to detecting such illicit morphs in automatic speech recognition (ASR) transcripts. Methodologically, we formulate morph detection as a text-to-text generation problem and combine it with LLM-augmented data synthesis, joint rule-based and model-based detection, and a tailored ASR post-processing mechanism. Our contributions are threefold: (1) a formal definition of the LiveAMR task; (2) the first large-scale LiveAMR benchmark dataset, comprising 86,790 annotated samples; and (3) substantial improvements in morph detection accuracy. These results empirically validate LiveAMR's effectiveness for compliance auditing of live-stream content and provide a deployable technical foundation for real-time regulatory enforcement.
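
To make the idea of a homophonic morph concrete, the following is a minimal sketch of a rule-based check, assuming a toneless pinyin comparison. The pronunciation table `TOY_PINYIN`, the term list `REGULATED_TERMS`, and the example morph are all hypothetical illustrations and are not taken from the paper or its dataset; the paper's actual rules are not described here.

```python
# Toy character-to-pinyin table (toneless). Hypothetical and far from complete;
# a real system would rely on a full pronunciation lexicon.
TOY_PINYIN = {
    "减": "jian", "肥": "fei",
    "简": "jian", "非": "fei",
    "这": "zhe", "款": "kuan", "茶": "cha", "能": "neng",
}

# Hypothetical list of regulated terms (here: "weight loss").
REGULATED_TERMS = ["减肥"]


def to_pinyin(text):
    """Map each character to toneless pinyin; unknown characters map to themselves."""
    return [TOY_PINYIN.get(ch, ch) for ch in text]


def find_homophone_morphs(transcript, terms=REGULATED_TERMS):
    """Return (start, surface, term) for spans that sound like a regulated term
    but are written with different characters."""
    hits = []
    syllables = to_pinyin(transcript)
    for term in terms:
        target = to_pinyin(term)
        n = len(target)
        for i in range(len(syllables) - n + 1):
            surface = transcript[i:i + n]
            # Same pronunciation, different characters: a candidate morph.
            if syllables[i:i + n] == target and surface != term:
                hits.append((i, surface, term))
    return hits


print(find_homophone_morphs("这款茶能简非"))
```

A pure pronunciation match like this flags every homophone, including innocent ones, which is one plausible reason to pair rules with a learned model as the summary describes.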

📝 Abstract
E-commerce live streaming in China, particularly on platforms like Douyin, has become a major sales channel, but hosts often use morphs to evade scrutiny and engage in false advertising. This study introduces the Live Auditory Morph Resolution (LiveAMR) task to detect such violations. Unlike previous morph research focused on text-based evasion in social media and underground industries, LiveAMR targets pronunciation-based evasion in health and medical live streams. We constructed the first LiveAMR dataset with 86,790 samples and developed a method to transform the task into a text-to-text generation problem. By leveraging large language models (LLMs) to generate additional training data, we improved performance and demonstrated that morph resolution significantly enhances live streaming regulation.
Problem

Research questions and friction points this paper is trying to address.

Hosts in e-commerce live streams evade scrutiny through pronunciation-based morphs
Morphs are used for false advertising in health and medical broadcasts
Prior morph research targets text-based evasion, so live streaming regulation lacks morph resolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Defines LiveAMR, a task for detecting pronunciation-based morph evasion in live streams
Transforms morph resolution into a text-to-text generation task
Uses LLMs to generate data for improved detection performance
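
The text-to-text formulation listed above can be sketched as follows. This is only an illustration of how annotated transcripts might be serialized into source/target strings for a sequence-to-sequence model; the `morph -> original` target format, the separators, the `none` marker, and the function names are assumptions, since the summary does not specify the paper's actual output format.

```python
# Assumed serialization conventions (hypothetical, not from the paper).
PAIR_SEP = ";"
ARROW = "->"
NO_MORPH = "none"


def build_example(transcript, morphs):
    """Turn a transcript and its (morph, original) annotations into a source/target pair."""
    if morphs:
        target = f" {PAIR_SEP} ".join(f"{m} {ARROW} {o}" for m, o in morphs)
    else:
        target = NO_MORPH  # explicit marker so the model learns to output "nothing found"
    return {"source": transcript, "target": target}


def parse_target(target):
    """Invert build_example: recover (morph, original) pairs from generated text."""
    target = target.strip()
    if not target or target == NO_MORPH:
        return []
    pairs = []
    for chunk in target.split(PAIR_SEP):
        if ARROW not in chunk:
            continue  # skip malformed chunks produced by the generator
        morph, original = chunk.split(ARROW, 1)
        pairs.append((morph.strip(), original.strip()))
    return pairs


example = build_example("这款茶能简非", [("简非", "减肥")])
print(example["target"])
print(parse_target(example["target"]))
```

Keeping the target a flat string means the same format can serve both human-annotated samples and LLM-generated ones, and `parse_target` tolerates malformed generations instead of raising.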