Evaluating Multi-Turn Bargain Skills in LLM-Based Seller Agent

📅 2025-09-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing seller agents in online second-hand markets struggle to track buyers' cumulative intent over extended negotiations, and no evaluation framework exists for multi-turn bargaining in e-commerce settings. Method: (1) The authors release a large-scale e-commerce bargaining benchmark comprising 622 categories, 9,892 products, and 3,014 negotiation tasks; (2) they propose a turn-level evaluation framework grounded in Theory of Mind (ToM); and (3) they design an automated pipeline that extracts reliable buyer intent from large volumes of dialogue data. Contribution/Results: The work departs from conventional coarse-grained evaluation, which relies solely on final transaction outcomes, and instead measures at each turn whether the agent extracts and tracks buyer intent. This gives a finer-grained view of seller agents' long-horizon intent modeling, an ability the authors tie directly to bargaining effectiveness.

📝 Abstract

In online second-hand marketplaces, multi-turn bargaining is a crucial part of seller-buyer interactions. Large Language Models (LLMs) can act as seller agents, negotiating with buyers on behalf of sellers under given business constraints. A critical ability for such agents is to track and accurately interpret cumulative buyer intents across long negotiations, which directly impacts bargaining effectiveness. We introduce a multi-turn evaluation framework for measuring the bargaining ability of seller agents in e-commerce dialogues. The framework tests whether an agent can extract and track buyer intents. Our contributions are: (1) a large-scale e-commerce bargaining benchmark spanning 622 categories, 9,892 products, and 3,014 tasks; (2) a turn-level evaluation framework grounded in Theory of Mind (ToM) with annotated buyer intents, moving beyond outcome-only metrics; and (3) an automated pipeline that extracts reliable intent from massive dialogue data.
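To make the contrast between outcome-only and turn-level evaluation concrete, here is a minimal sketch of how per-turn intent tracking could be scored. It is not the paper's metric: the `Turn` structure, the intent labels (`ask_discount`, `ask_shipping`, `counter_offer`), and the choice of exact-match accuracy plus micro-F1 are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One dialogue turn (hypothetical structure, not from the paper)."""
    gold: frozenset  # annotated cumulative buyer intents at this turn
    pred: frozenset  # intents the seller agent reports at this turn


def turn_level_scores(turns):
    """Return (exact-match accuracy, micro-F1) over all turns.

    Both metrics are illustrative assumptions; the benchmark may score
    intent tracking differently.
    """
    if not turns:
        return 0.0, 0.0
    # Fraction of turns where the predicted intent set matches exactly.
    exact = sum(t.gold == t.pred for t in turns) / len(turns)
    # Micro-averaged precision/recall over individual intent labels.
    tp = sum(len(t.gold & t.pred) for t in turns)
    n_pred = sum(len(t.pred) for t in turns)
    n_gold = sum(len(t.gold) for t in turns)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return exact, f1


# Toy dialogue: the agent misses a newly raised intent at turn 2,
# then recovers at turn 3. An outcome-only metric would not see this.
dialogue = [
    Turn(frozenset({"ask_discount"}), frozenset({"ask_discount"})),
    Turn(frozenset({"ask_discount", "ask_shipping"}),
         frozenset({"ask_discount"})),
    Turn(frozenset({"ask_discount", "ask_shipping", "counter_offer"}),
         frozenset({"ask_discount", "ask_shipping", "counter_offer"})),
]
exact, f1 = turn_level_scores(dialogue)
print(f"exact-match accuracy: {exact:.3f}, micro-F1: {f1:.3f}")
```

Because intents are cumulative, a missed intent at one turn can carry forward into later turns, which is why scoring each turn separately exposes errors that a final deal price would hide.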

Problem

Research questions and friction points this paper is trying to address.

Evaluating LLM-based seller agents' multi-turn bargaining skills

Tracking cumulative buyer intents in long negotiations

Developing an evaluation framework with Theory of Mind metrics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-turn evaluation framework for bargaining skills

Turn-level evaluation grounded in Theory of Mind (ToM), with annotated buyer intents

Automated pipeline extracting buyer intents from dialogues

🔎 Similar Papers

Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues