🤖 AI Summary
Traditional recommender systems lack fine-grained, interpretable user intent modeling, hindering explainability and adaptability in Web-platform recommendation. Method: This work introduces the first LLM-agent-based joint evaluation framework for user behavior modeling and recommendation. It constructs a multi-source heterogeneous dataset integrating Yelp, Amazon, and Goodreads, and designs a dynamic interactive simulation environment. A novel dual-track evaluation protocol (“user modeling + recommendation”) is proposed, incorporating prompt engineering, lightweight fine-tuning, and online evaluation. Contribution/Results: We release an open, reproducible benchmark platform, addressing the standardization gap in LLM-driven recommendation evaluation. The competition attracted 295 teams worldwide with over 1,400 submissions. During the Development Phase, performance improved by 21.9% (Track 1) and 20.3% (Track 2); in the Final Phase, the gains were 9.1% and 15.9%, respectively.
📝 Abstract
The AgentSociety Challenge is the first competition at the Web Conference to explore the potential of Large Language Model (LLM) agents in modeling user behavior and enhancing recommender systems on web platforms. The Challenge consists of two tracks: the User Modeling Track and the Recommendation Track. Participants are tasked with developing innovative LLM agents using a combined dataset from Yelp, Amazon, and Goodreads, along with an interactive environment simulator. The Challenge attracted 295 teams from around the world and received over 1,400 submissions over the course of 37 official competition days. Participants achieved performance improvements of 21.9% and 20.3% for Track 1 and Track 2 in the Development Phase, and 9.1% and 15.9% in the Final Phase. This paper describes the design of the Challenge in detail, analyzes the outcomes, and highlights the most successful LLM agent designs. To support further research and development, we have open-sourced the benchmark environment at https://tsinghua-fib-lab.github.io/AgentSocietyChallenge.