Ask, Clarify, Optimize: Human-LLM Agent Collaboration for Smarter Inventory Control

📅 2025-12-31

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Small and medium-sized enterprises often lack the technical expertise to apply advanced inventory optimization methods, and using a large language model (LLM) as an end-to-end solver incurs a "hallucination tax" that degrades performance. The authors propose a hybrid agent framework that strictly decouples semantic reasoning from mathematical computation: the LLM serves solely as a natural-language interface for parameter extraction and result interpretation, while a rigorous operations research optimization algorithm makes the inventory decisions in the backend. To evaluate the system, they introduce the "Human Imitator", a fine-tuned digital twin of a boundedly rational manager that enables scalable, reproducible stress testing. Experiments show a 32.1% reduction in total inventory cost relative to an interactive baseline that uses GPT-4o as an end-to-end solver. Supplying GPT-4o with perfect ground-truth information alone does not improve its performance, which indicates that the bottleneck is computational rather than informational.
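
The decoupling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not say which inventory model or solver the backend uses, so the single-period newsvendor rule with normally distributed demand stands in here as an assumption. The names `InventoryParams`, `extract_params`, and `solve_order_quantity` are hypothetical, and the dictionary passed to `extract_params` stands in for whatever structured output the LLM interface would produce.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass(frozen=True)
class InventoryParams:
    """Parameters the language layer is assumed to elicit from the manager."""
    mean_demand: float    # expected demand per period (units)
    std_demand: float     # standard deviation of demand (units)
    underage_cost: float  # cost per unit of unmet demand
    overage_cost: float   # cost per unit left over


def extract_params(llm_output: dict) -> InventoryParams:
    """Validate structured output from the (hypothetical) LLM interface.

    The language model only fills in these fields; it never computes
    an order quantity itself.
    """
    params = InventoryParams(
        mean_demand=float(llm_output["mean_demand"]),
        std_demand=float(llm_output["std_demand"]),
        underage_cost=float(llm_output["underage_cost"]),
        overage_cost=float(llm_output["overage_cost"]),
    )
    if min(params.std_demand, params.underage_cost, params.overage_cost) <= 0:
        raise ValueError("standard deviation and unit costs must be positive")
    return params


def solve_order_quantity(p: InventoryParams) -> float:
    """Deterministic solver step (assumed model: newsvendor critical ratio)."""
    critical_ratio = p.underage_cost / (p.underage_cost + p.overage_cost)
    return NormalDist(p.mean_demand, p.std_demand).inv_cdf(critical_ratio)


if __name__ == "__main__":
    # Stand-in for parameters parsed from a manager's free-text description.
    parsed = {"mean_demand": 100, "std_demand": 20,
              "underage_cost": 9, "overage_cost": 1}
    print(round(solve_order_quantity(extract_params(parsed)), 1))
```

The point of the split is that a misunderstanding in the dialogue can only corrupt the inputs, which can be validated and shown back to the user, while the arithmetic itself stays exact.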

📝 Abstract

Inventory management remains a challenge for many small and medium-sized businesses that lack the expertise to deploy advanced optimization methods. This paper investigates whether Large Language Models (LLMs) can help bridge this gap. We show that employing LLMs as direct, end-to-end solvers incurs a significant "hallucination tax": a performance gap arising from the model's inability to perform grounded stochastic reasoning. To address this, we propose a hybrid agentic framework that strictly decouples semantic reasoning from mathematical calculation. In this architecture, the LLM functions as an intelligent interface, eliciting parameters from natural language and interpreting results while automatically calling rigorous algorithms to build the optimization engine. To evaluate this interactive system against the ambiguity and inconsistency of real-world managerial dialogue, we introduce the Human Imitator, a fine-tuned "digital twin" of a boundedly rational manager that enables scalable, reproducible stress-testing. Our empirical analysis reveals that the hybrid agentic framework reduces total inventory costs by 32.1% relative to an interactive baseline using GPT-4o as an end-to-end solver. Moreover, we find that providing perfect ground-truth information alone is insufficient to improve GPT-4o's performance, confirming that the bottleneck is fundamentally computational rather than informational. Our results position LLMs not as replacements for operations research, but as natural-language interfaces that make rigorous, solver-based policies accessible to non-experts.
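
To make the idea of a performance gap concrete, the sketch below measures how much more an arbitrary order quantity costs in expectation than the solver's optimum. It rests on assumptions that do not come from the paper: a single-period newsvendor model with normally distributed demand, and the hypothetical function names `expected_cost`, `optimal_order`, and `relative_cost_gap`. The abstract does not specify the cost model behind the reported 32.1% figure, so nothing here reproduces that number.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def expected_cost(q, mu, sigma, cu, co):
    """Expected one-period cost of ordering q when demand ~ Normal(mu, sigma).

    cu is the cost per unit short, co the cost per unit left over.
    """
    z = (q - mu) / sigma
    # Standard normal loss function: E[(Z - z)^+] = pdf(z) - z * (1 - cdf(z))
    loss = _STANDARD_NORMAL.pdf(z) - z * (1.0 - _STANDARD_NORMAL.cdf(z))
    expected_shortage = sigma * loss
    # E[(q - D)^+] = (q - mu) + E[(D - q)^+]
    expected_leftover = (q - mu) + expected_shortage
    return cu * expected_shortage + co * expected_leftover


def optimal_order(mu, sigma, cu, co):
    """Newsvendor optimum: the cu / (cu + co) quantile of demand."""
    return NormalDist(mu, sigma).inv_cdf(cu / (cu + co))


def relative_cost_gap(q_guess, mu, sigma, cu, co):
    """Extra expected cost of q_guess relative to the optimal order."""
    best = expected_cost(optimal_order(mu, sigma, cu, co), mu, sigma, cu, co)
    return (expected_cost(q_guess, mu, sigma, cu, co) - best) / best


if __name__ == "__main__":
    # A guess that simply orders mean demand, ignoring the cost asymmetry.
    print(f"{relative_cost_gap(100, mu=100, sigma=20, cu=9, co=1):.1%}")
```

In this toy setting the gap is zero only at the solver's quantity and grows as a guess drifts away from it, which is the sense in which a plausible-sounding but ungrounded order carries a cost.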

Problem

Research questions and friction points this paper is trying to address.

inventory management

large language models

optimization

small and medium-sized businesses

stochastic reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

hybrid agentic framework

Large Language Models (LLMs)

inventory optimization