AI Summary
This work addresses a limitation of current large language models (LLMs) in humor generation: they typically neglect dynamic community feedback and rely on static prompts for evaluation. We propose the first framework that integrates community-level discourse into LLM-based comedy creation by constructing a multi-agent sandbox environment that builds a "social memory." This memory captures, filters, and stores user comments and audience reactions, which are subsequently retrieved to condition stand-up comedy generation. Evaluated through 250 A/B tests and expert assessments across 15 dimensions, our approach significantly outperforms baselines with a 75.6% win rate, demonstrating notable improvements in key aspects such as Craft/Clarity (+0.440) and Social Response (+0.422), thereby moving beyond the conventional static prompting paradigm.
Abstract
Prior work has explored multi-turn interaction and feedback for LLM writing, but evaluations still largely center on prompts and localized feedback, leaving persistent public reception in online communities underexamined. We test whether broadcast community discussion improves stand-up comedy writing in a controlled multi-agent sandbox: in the discussion condition, critic and audience threads are recorded, filtered, stored as social memory, and later retrieved to condition subsequent generations, whereas the baseline omits discussion. Across 50 rounds (250 paired monologues) judged by five expert annotators using A/B preference and a 15-item rubric, discussion wins 75.6% of instances and improves Craft/Clarity (Δ = 0.440) and Social Response (Δ = 0.422), with occasional increases in aggressive humor.
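The record → filter → store → retrieve pipeline described above can be illustrated with a minimal sketch. This is a hypothetical implementation, not the paper's code: the `Comment`, `SocialMemory`, and `condition_prompt` names, the score-threshold filter, and the top-k retrieval are all illustrative assumptions about one plausible way to realize such a social memory.

```python
from dataclasses import dataclass, field

@dataclass
class Comment:
    text: str
    score: float  # hypothetical relevance/approval score (e.g., upvotes)

@dataclass
class SocialMemory:
    """Illustrative social memory: filters incoming community comments
    and retrieves the strongest ones to condition the next generation."""
    min_score: float = 1.0
    entries: list = field(default_factory=list)

    def record(self, comments):
        # Filter step: keep only comments above a score threshold.
        self.entries.extend(c for c in comments if c.score >= self.min_score)

    def retrieve(self, k=3):
        # Retrieval step: return the k highest-scoring stored comments.
        return sorted(self.entries, key=lambda c: c.score, reverse=True)[:k]

def condition_prompt(base_prompt, memory, k=3):
    # Conditioning step: prepend retrieved feedback to the writing prompt.
    feedback = "\n".join(f"- {c.text}" for c in memory.retrieve(k))
    return f"{base_prompt}\nAudience feedback from prior rounds:\n{feedback}"

mem = SocialMemory(min_score=2.0)
mem.record([
    Comment("Loved the callback to the opener", 5.0),
    Comment("meh", 0.5),  # filtered out: below threshold
    Comment("Pacing dragged in the middle", 3.0),
])
print(condition_prompt("Write a 2-minute stand-up monologue.", mem, k=2))
```

In the actual framework, the filtering and retrieval would be performed over multi-agent critic and audience threads rather than a flat comment list, but the conditioning pattern is the same: persistent, filtered reception is re-injected into later generations instead of discarding it after each round.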