🤖 AI Summary
Large language models (LLMs) exhibit weak structured algorithmic reasoning in matching markets -- particularly in stable-matching generation, instability detection, and preference-based repair -- because they fail to identify blocking pairs and to execute combinatorial algorithms iteratively over large-scale ranked preferences. Method: We introduce the first benchmark for ranked-preference reasoning, covering a hierarchy of algorithmic tasks, and evaluate LLMs under both standard inference and LoRA fine-tuning. Contribution/Results: Top-tier LLMs consistently fail on large-market instances; LoRA improves performance only at small scale, exposing a fundamental bottleneck in algorithmic reasoning over long contexts. This work is the first to systematically reveal LLMs' structural limitations in preference-driven combinatorial reasoning, providing both a rigorous evaluation framework and critical implications for deploying trustworthy AI in resource allocation and other mission-critical domains.
📝 Abstract
The rise of Large Language Models (LLMs) has driven progress in reasoning tasks -- from program synthesis to scientific hypothesis generation -- yet their ability to handle ranked preferences and structured algorithms in combinatorial domains remains underexplored. We study matching markets, a core framework behind applications like resource allocation and ride-sharing, which require reconciling individual ranked preferences to ensure stable outcomes. We evaluate several state-of-the-art models on a hierarchy of preference-based reasoning tasks -- ranging from stable-matching generation to instability detection, instability resolution, and fine-grained preference queries -- to systematically expose their logical and algorithmic limitations in handling ranked inputs. Surprisingly, even top-performing models with advanced reasoning struggle to resolve instability in large markets, often failing to identify blocking pairs or to execute algorithms iteratively. We further show that parameter-efficient fine-tuning (LoRA) significantly improves performance in small markets but fails to yield similar gains on large instances, suggesting the need for more sophisticated strategies to improve LLMs' reasoning over longer-context inputs.
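For readers unfamiliar with the tasks involved, the two core operations the abstract refers to can be sketched in a few lines of Python. This is a minimal illustration of classical Gale-Shapley deferred acceptance and blocking-pair detection, not the paper's benchmark code; all names and the toy market below are invented for the example.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching {proposer: receiver} via deferred acceptance."""
    # rank[r][p] = position of proposer p in receiver r's list (lower = better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next index each p proposes to
    engaged = {}                                  # receiver -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:    # r prefers p: bump the incumbent
            free.append(engaged[r])
            engaged[r] = p
        else:
            free.append(p)                        # r rejects p; p proposes again later
    return {p: r for r, p in engaged.items()}

def blocking_pairs(matching, proposer_prefs, receiver_prefs):
    """List every (p, r) pair that would jointly defect from `matching`."""
    partner_of = {r: p for p, r in matching.items()}
    pairs = []
    for p, prefs in proposer_prefs.items():
        for r in prefs[:prefs.index(matching[p])]:   # receivers p prefers to its match
            r_prefs = receiver_prefs[r]
            if r_prefs.index(p) < r_prefs.index(partner_of[r]):
                pairs.append((p, r))
    return pairs

# Toy 3x3 market (invented for illustration).
proposer_prefs = {"a": ["x", "y", "z"], "b": ["y", "x", "z"], "c": ["x", "z", "y"]}
receiver_prefs = {"x": ["b", "a", "c"], "y": ["a", "b", "c"], "z": ["c", "a", "b"]}
stable = gale_shapley(proposer_prefs, receiver_prefs)
```

A matching is stable exactly when `blocking_pairs` returns an empty list; the benchmark's "instability detection" and "instability resolution" tasks amount to finding such pairs and eliminating them, which is what the evaluated LLMs struggle to do at scale.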