AI Summary
This work addresses a limitation of existing recommender system simulators: their user profiles rely on handcrafted designs and therefore scale and generalize poorly. The authors propose APG4RecSim, a framework that focuses explicitly on the profile generation module within recommendation simulation. Leveraging large language models under minimal supervision, APG4RecSim automatically constructs task-aware, realistic, and coherent user profiles through task-oriented prompting, behavioral modeling, and distribution alignment techniques. The generated profiles keep simulation performance stable across datasets and different LLMs and are resilient to popularity and position biases. Experiments on three benchmark datasets show the best overall performance across discrimination, ranking, and rating tasks, with up to a 7% improvement in nDCG@10 and an 8% reduction in Jensen-Shannon divergence (JSD) of rating distributions compared to existing profile-generation baselines.
Abstract
Large Language Model (LLM)-based agent simulation has emerged as a promising approach to meet the increasing demand for real-time and rigorous evaluation in modern recommender systems. A typical LLM-driven simulation framework comprises three essential components: the profile module, memory module, and action module. However, existing studies have primarily concentrated on enhancing the memory and action modules, with limited attention to profile generation, which plays a pivotal role in ensuring realistic agent behaviours and aligning simulated interactions with real user dynamics. Moreover, the scarcity of datasets specifically designed for recommendation simulations has led to heavy reliance on manually crafted profiles, significantly limiting the scalability and generalisability of simulation frameworks across different datasets. To address these challenges, this work proposes an Automated Profile Generation Framework for Recommendation Simulation, APG4RecSim, that constructs realistic, coherent, and robust user profiles with minimal supervision. Extensive experiments on three benchmark datasets demonstrate that APG4RecSim achieves the best overall performance on discrimination, ranking, and rating tasks, improving ranking quality by up to 7% in nDCG@10 and reducing rating distribution divergence by 8% in JSD compared to existing profile-generation baselines. Beyond overall performance gains, our results show that profiles generated by APG4RecSim are resilient to popularity- and position-induced biases and maintain stable performance across datasets and different LLMs.
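For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how nDCG@10 and the Jensen-Shannon divergence between rating distributions are commonly computed. It is not the authors' evaluation code. The function names, the base-2 logarithm, and the choice to derive the ideal ranking from the same relevance list are assumptions made for illustration.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions (rank 1 has discount log2(2) = 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """nDCG@k for one ranked list.

    Assumption: the ideal ordering is obtained by sorting the same list,
    i.e. the list contains every candidate item's relevance.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1]).

    p and q are histograms over the same rating bins, e.g. counts of 1-5 stars;
    they are normalised to probability distributions before comparison.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    p_sum, q_sum = sum(p), sum(q)
    if p_sum <= 0 or q_sum <= 0:
        raise ValueError("each histogram must have positive total mass")
    p_norm = [x / p_sum for x in p]
    q_norm = [x / q_sum for x in q]
    mix = [(a + b) / 2 for a, b in zip(p_norm, q_norm)]

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a_i = 0 contribute nothing; b_i > 0 whenever a_i > 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p_norm, mix) + 0.5 * kl(q_norm, mix)
```

Under these conventions, a higher nDCG@10 means the simulated agent ranks relevant items closer to the top, and a lower JSD means the simulated rating histogram lies closer to the real one.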