AI Summary
This work addresses the challenges of generating Chinese stand-up comedy: heavy reliance on cultural context, precise timing control, a lack of performance cues, and the need for multi-step implicit reasoning. These issues are compounded by existing humor datasets, which are poorly suited to long-form content creation. To overcome these limitations, the authors propose a multi-agent collaborative generation framework built on AutoGen that combines retrieval-augmented generation (RAG) with a fine-tuned, specialized joke generator (JokeWriter). Through iterative planning and cooperative optimization, the system turns a user-provided everyday topic into a structurally coherent, stage-ready Chinese stand-up routine of 3-5 minutes, accompanied by a narrated video. The approach is designed to mitigate the mismatch between available data and task requirements and to improve the cultural relevance, humorous coherence, and performability of the generated material.
Abstract
Chinese stand-up comedy generation goes beyond plain text generation, requiring culturally grounded humor, precise timing, stage-performance cues, and implicit multi-step reasoning. Moreover, commonly used Chinese humor datasets are often better suited to humor understanding and evaluation than to long-form stand-up generation, so direct supervision on them is misaligned with the target task. To address these challenges, we present OpenMic, an end-to-end multi-agent system built on AutoGen that transforms a user-provided life topic into a 3-5 minute Chinese stand-up performance and further produces a narrated comedy video. OpenMic orchestrates multiple specialized agents in a multi-round iterative planning loop to jointly optimize humor, timing, and performability. To mitigate the dataset-task mismatch, we use retrieval-augmented generation (RAG) for material grounding and idea expansion, and we fine-tune a dedicated JokeWriter to better internalize stand-up-specific setup-punchline structures and long-range callbacks.
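The multi-round loop described above can be illustrated with a toy sketch. This is not OpenMic's implementation and does not use AutoGen; it is plain Python, and every name in it (`Draft`, `retrieve`, `planner`, `joke_writer`, `critic`, `run_show`) is a hypothetical stand-in. Retrieval is approximated by word overlap, the writer emits placeholder text instead of calling a fine-tuned model, and the critic applies two simple rules (enough beats, a callback present) in place of real humor or timing judgments.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Draft:
    """Shared state passed between the (hypothetical) agents."""
    topic: str
    outline: List[str] = field(default_factory=list)
    jokes: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)


def retrieve(topic: str, corpus: List[str], k: int = 1) -> List[str]:
    """Toy stand-in for RAG: rank snippets by word overlap with the topic."""
    topic_words = set(topic.lower().split())
    ranked = sorted(
        corpus,
        key=lambda s: len(topic_words & set(s.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def planner(draft: Draft, material: List[str]) -> None:
    """Build an outline: an opening, one bit per retrieved snippet, a callback."""
    draft.outline = (
        [f"Opening on {draft.topic}"]
        + [f"Bit inspired by: {m}" for m in material]
        + ["Callback to opening"]
    )


def joke_writer(draft: Draft) -> None:
    """Placeholder for a fine-tuned writer: one setup/punchline slot per beat."""
    draft.jokes = [f"[setup] {beat} [punchline] TODO" for beat in draft.outline]


def critic(draft: Draft, min_beats: int = 4) -> List[str]:
    """Return revision notes; an empty list means the draft is accepted."""
    notes = []
    if len(draft.jokes) < min_beats:
        notes.append(f"too short: {len(draft.jokes)} beats, need {min_beats}")
    if not any("callback" in j.lower() for j in draft.jokes):
        notes.append("missing callback")
    return notes


def run_show(topic: str, corpus: List[str], max_rounds: int = 3) -> Tuple[Draft, int]:
    """Iterate plan -> write -> critique until accepted or rounds run out."""
    draft = Draft(topic)
    k = 1
    for round_no in range(1, max_rounds + 1):
        planner(draft, retrieve(topic, corpus, k=k))
        joke_writer(draft)
        draft.notes = critic(draft)
        if not draft.notes:
            return draft, round_no
        k += 1  # retrieve more material for the next round
    return draft, max_rounds


if __name__ == "__main__":
    corpus = [
        "subway commute at rush hour",
        "ordering takeout late at night",
        "family group chat",
    ]
    final, rounds = run_show("subway commute", corpus)
    print(f"accepted after {rounds} round(s) with {len(final.jokes)} beats")
```

The point of the sketch is the control flow: each round retrieves material, replans, rewrites, and stops only when the critic returns no notes. In a real system each function would be an LLM-backed agent.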