Efficient Sequential Decision Making with Large Language Models

📅 2024-06-17
🏛️ Conference on Empirical Methods in Natural Language Processing
📈 Citations: 3
Influential: 0
🤖 AI Summary
To address the high computational overhead of fine-tuning and the weak results of prompt engineering when applying large language models (LLMs) to sequential decision-making tasks, this paper proposes a lightweight LLM-based decision framework grounded in online model selection. The method requires no gradient-based updates; instead, it maintains a pool of decision agents, including LLM agents, and selects one at each decision step via an online model selection algorithm. Crucially, model selection is formulated as an online decision problem that jointly optimizes statistical performance and inference efficiency, so the expensive LLM agents are invoked only when they are expected to help. Evaluated on a large-scale Amazon dataset, the approach achieves more than a 6x performance gain over baselines while calling an LLM in only 1.5% of the time steps, outperforming both conventional decision-making algorithms and vanilla LLM agents.
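The selection idea above can be sketched as a cost-aware multi-armed bandit over a pool of agents. The toy example below is a minimal illustration, not the paper's actual algorithm: the agent pool, reward means, and per-call cost are all illustrative assumptions. It runs UCB1 on cost-adjusted rewards, so the expensive "LLM agent" is queried only while its estimated net value justifies the cost, yielding a low LLM invocation rate.

```python
import math
import random

random.seed(0)

# Hypothetical pool: agents 0 and 1 are cheap baseline policies,
# agent 2 stands in for an expensive LLM agent. All numbers are
# illustrative assumptions, not values from the paper.
N_AGENTS = 3
LLM_AGENT = 2
COSTS = [0.0, 0.0, 0.1]   # assumed per-call inference cost of each agent
MEANS = [0.55, 0.50, 0.60]  # assumed expected reward of each agent
T = 5000                  # decision horizon

def get_reward(agent: int) -> float:
    # Toy Bernoulli rewards for simulation purposes.
    return 1.0 if random.random() < MEANS[agent] else 0.0

counts = [0] * N_AGENTS   # times each agent was selected
values = [0.0] * N_AGENTS  # running mean of cost-adjusted reward
llm_calls = 0

for t in range(1, T + 1):
    if t <= N_AGENTS:
        agent = t - 1  # play each agent once to initialize estimates
    else:
        # UCB1 index on cost-adjusted reward: exploit the best net-value
        # agent while still exploring under-sampled ones.
        agent = max(
            range(N_AGENTS),
            key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]),
        )
    if agent == LLM_AGENT:
        llm_calls += 1  # in a real system this is the actual LLM query
    net = get_reward(agent) - COSTS[agent]
    counts[agent] += 1
    values[agent] += (net - values[agent]) / counts[agent]

print(f"LLM call rate: {llm_calls / T:.1%}")
```

Because the LLM agent's cost-adjusted mean (0.60 − 0.10 = 0.50) is below the best cheap agent's (0.55), the selector converges toward the cheap policy and the LLM call rate stays small, mirroring the paper's goal of few LLM calls during the decision process.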

📝 Abstract
This paper focuses on extending the success of large language models (LLMs) to sequential decision making. Existing efforts either (i) re-train or finetune LLMs for decision making, or (ii) design prompts for pretrained LLMs. The former approach suffers from the computational burden of gradient updates, and the latter approach does not show promising results. In this paper, we propose a new approach that leverages online model selection algorithms to efficiently incorporate LLM agents into sequential decision making. Statistically, our approach significantly outperforms both traditional decision making algorithms and vanilla LLM agents. Computationally, our approach avoids the need for expensive gradient updates of LLMs, and throughout the decision making process, it requires only a small number of LLM calls. We conduct extensive experiments to verify the effectiveness of our proposed approach. As an example, on a large-scale Amazon dataset, our approach achieves more than a 6x performance gain over baselines while calling LLMs in only 1.5% of the time steps.
Problem

Research questions and friction points this paper is trying to address.

Extending LLMs to sequential decision making efficiently
Overcoming computational burden of retraining or finetuning LLMs
Reducing LLM calls while improving decision-making performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses online model selection algorithms
Avoids expensive LLM gradient updates
Minimizes LLM calls during decisions