Unlocking Large Language Model's Planning Capabilities with Maximum Diversity Fine-tuning

📅 2024-06-15
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Large language models (LLMs) exhibit strong performance on planning tasks with abundant prior data but generalize poorly in low-data regimes—such as blocks-world reasoning and advanced itinerary planning—where task structure is critical yet underrepresented in training. Method: We propose CMDS-g, the first method to explicitly encode the graph-structured semantics of planning tasks into the language embedding space. It integrates clustering-driven maximum-diversity sampling with task-instance-aware structured fine-tuning. Contribution/Results: Compared to random sampling and the language-only variant CMDS-l, CMDS-g achieves comparable or superior planning performance across multi-scale and multi-benchmark evaluations using only one-tenth of the data. This substantially reduces the economic, temporal, and computational costs of fine-tuning while significantly improving few-shot generalization and cross-domain transferability.

📝 Abstract
Large language models (LLMs) have demonstrated impressive task-solving capabilities through prompting techniques and system designs, including solving planning tasks (e.g., math proofs, basic travel planning) when sufficient data is available online and used during pre-training. However, for planning tasks with limited prior data (e.g., blocks world, advanced travel planning), the performance of LLMs, including proprietary models like GPT and Gemini, is poor. This paper investigates the impact of fine-tuning on the planning capabilities of LLMs, revealing that LLMs can achieve strong performance in planning through substantial (tens of thousands of specific examples) fine-tuning. Yet, this process incurs high economic, time, and computational costs for each planning problem variation. To address this, we propose Clustering-Based Maximum Diversity Sampling (CMDS), which selects diverse and representative data to enhance sample efficiency and the model's generalization capability. Extensive evaluations demonstrate that CMDS-l, a baseline method combining CMDS with language embeddings, outperforms random sampling. Furthermore, we introduce a novel algorithm, CMDS-g, which encodes planning task instances with their graph representations into the embedding space. Empirical results show that CMDS-g consistently outperforms baseline methods across various scales and multiple benchmark domains.
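The abstract describes CMDS as selecting diverse, representative fine-tuning examples from an embedding space. The paper does not give implementation details here, but one plausible reading—partition the embeddings with k-means, then pick points greedily so each new example is maximally far from everything already chosen—can be sketched as follows (all function and parameter names are illustrative, not from the paper):

```python
import numpy as np

def cmds_sample(embeddings, k, n_clusters=4, iters=10, seed=0):
    """Hypothetical sketch of clustering-based maximum diversity sampling:
    cluster the embeddings with Lloyd's k-means, then cycle over clusters,
    each time taking the point farthest (max-min distance) from the set
    already selected."""
    rng = np.random.default_rng(seed)
    X = np.asarray(embeddings, dtype=float)

    # --- Lloyd's k-means on the embedding vectors ---
    centroids = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)
        for c in range(n_clusters):
            members = X[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)

    # --- greedy max-min (farthest-point) selection, round-robin over clusters ---
    chosen = [int(np.argmin(np.linalg.norm(X - centroids[0], axis=1)))]
    while len(chosen) < k:
        c = len(chosen) % n_clusters
        pool = np.flatnonzero((labels == c) & ~np.isin(np.arange(len(X)), chosen))
        if len(pool) == 0:  # cluster exhausted: fall back to all unchosen points
            pool = np.flatnonzero(~np.isin(np.arange(len(X)), chosen))
        # distance from each candidate to its nearest already-chosen point
        d = np.min(np.linalg.norm(X[pool][:, None] - X[chosen][None], axis=2), axis=1)
        chosen.append(int(pool[np.argmax(d)]))
    return chosen
```

The selected indices would then point at the training examples kept for fine-tuning; with language embeddings this corresponds to the CMDS-l variant, with graph-derived embeddings to CMDS-g.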
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM planning with limited prior data
Reducing high costs of fine-tuning for planning tasks
Improving sample efficiency and generalization via diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Maximum Diversity Fine-tuning enhances LLM planning
Clustering-Based Maximum Diversity Sampling improves efficiency
Graph representations in CMDS-g boost performance
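CMDS-g's distinguishing step is embedding each planning-task instance via its graph representation before sampling. The paper's actual encoder is not specified in this summary, so the toy sketch below uses simple structural features (a degree histogram plus leading adjacency eigenvalues) purely to illustrate the idea of mapping an instance graph to a fixed-size vector:

```python
import numpy as np

def graph_embedding(adjacency, dims=8):
    """Toy structural embedding of a planning-instance graph (a hypothetical
    stand-in for CMDS-g's encoder): concatenate a normalized degree histogram
    with the top eigenvalues of the (symmetric) adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    # histogram of node degrees, normalized to sum to 1
    hist, _ = np.histogram(degrees, bins=dims // 2, range=(0, max(degrees.max(), 1)))
    hist = hist / max(hist.sum(), 1)
    # leading eigenvalues of A, zero-padded to a fixed length
    eig = np.sort(np.linalg.eigvalsh(A))[::-1]
    spec = np.zeros(dims // 2)
    spec[: min(len(eig), dims // 2)] = eig[: dims // 2]
    return np.concatenate([hist, spec])
```

Vectors produced this way could be fed to the same diversity sampler used for language embeddings, which is the structural point the Innovation bullets make: the sampling machinery is shared, only the embedding space changes.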