🤖 AI Summary
This study addresses the poor generalizability and limited interpretability of conventional methods in high-dimensional multi-criteria decision-making (MCDM). We propose the first general large language model (LLM) framework designed specifically for MCDM. Our method combines LoRA-based parameter-efficient fine-tuning with chain-of-thought prompting and few-shot learning to systematically evaluate, and raise to human-expert level, the performance of mainstream LLMs on MCDM tasks. We construct the first MCDM benchmark dataset annotated by domain experts and conduct comprehensive multi-model comparative experiments. After fine-tuning, all evaluated models reach around 95% accuracy across three canonical MCDM task categories; their performance converges and matches human expert proficiency. These results address long-standing decision bottlenecks in high-dimensional, complex scenarios, establishing a scalable and interpretable paradigm for scientific decision-making across disciplines.
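The LoRA fine-tuning mentioned above rests on one idea: freeze the pretrained weight matrix W and train only a low-rank update ΔW = (α/r)·AB, which shrinks the trainable parameter count from d_in·d_out to r·(d_in + d_out). A minimal NumPy sketch of the forward pass (illustrative only; the `lora_forward` helper, shapes, and hyperparameters are hypothetical, not the paper's implementation):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=4):
    """Forward pass through a linear layer with a LoRA adapter.

    y = x @ (W + (alpha / r) * A @ B)

    W: frozen pretrained weight, shape (d_in, d_out)
    A: trainable down-projection, shape (d_in, r)
    B: trainable up-projection, shape (r, d_out)
    Only A and B receive gradients during fine-tuning.
    """
    scaling = alpha / r
    return x @ W + scaling * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 32, 4
W = rng.normal(size=(d_in, d_out))       # frozen pretrained weight
A = rng.normal(size=(d_in, r)) * 0.01    # trainable, small random init
B = np.zeros((r, d_out))                 # trainable, zero init
x = rng.normal(size=(1, d_in))

y = lora_forward(x, W, A, B)
# B is initialised to zero, so the adapter starts as a no-op and
# the adapted output equals the frozen model's output.
assert np.allclose(y, x @ W)
```

The zero-initialised B is the standard trick that lets fine-tuning start exactly from the pretrained model and only gradually depart from it as A and B are updated.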
📝 Abstract
Multi-Criteria Decision Making (MCDM) is widely applied across fields, combining quantitative and qualitative analyses over multiple levels and attributes to support decision makers in reaching scientific, rational decisions in complex scenarios. However, traditional MCDM methods face bottlenecks in high-dimensional problems. Large Language Models (LLMs) achieve impressive performance on many complex tasks, yet little work has evaluated LLMs on specific MCDM problems with the help of human domain experts. We therefore explore the capability of LLMs by proposing an LLM-based evaluation framework that automatically handles general, complex MCDM problems. Within this framework, we assess several typical open-source models, as well as commercial models such as Claude and ChatGPT, on three important applications; these models achieve only around 60% accuracy against the evaluation ground truth. With Chain-of-Thought or few-shot prompting, accuracy rises to around 70%, depending heavily on the model. To further improve performance, we employ a LoRA-based fine-tuning technique. The experimental results show that accuracy on the different applications improves significantly, to around 95%, with only trivial differences between models, indicating that LoRA-based fine-tuned LLMs exhibit significant and stable advantages on MCDM tasks and can provide human-expert-level solutions to a wide range of MCDM challenges.
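The Chain-of-Thought and few-shot prompting stage described in the abstract can be illustrated with a simple prompt builder. This is a hedged sketch: the task wording, criteria names, and the `build_mcdm_prompt` helper are hypothetical placeholders, not the paper's actual prompts.

```python
def build_mcdm_prompt(alternatives, criteria, examples=(), chain_of_thought=True):
    """Assemble a prompt asking an LLM to rank alternatives against criteria.

    examples: optional worked examples prepended for few-shot prompting,
              each a dict with 'problem' and 'answer' keys.
    chain_of_thought: if True, append a step-by-step reasoning instruction.
    """
    parts = ["You are an expert in multi-criteria decision making (MCDM)."]
    for ex in examples:  # few-shot: show worked examples before the real task
        parts.append(f"Example problem: {ex['problem']}")
        parts.append(f"Example answer: {ex['answer']}")
    parts.append(
        "Rank the following alternatives: " + ", ".join(alternatives)
        + " against the criteria: " + ", ".join(criteria) + "."
    )
    if chain_of_thought:
        parts.append("Let's think step by step before giving the final ranking.")
    return "\n".join(parts)

prompt = build_mcdm_prompt(
    alternatives=["Plan A", "Plan B"],
    criteria=["cost", "risk", "benefit"],
    examples=[{"problem": "Rank X vs Y on cost alone.", "answer": "X > Y"}],
)
print(prompt)
```

Combining both techniques in one prompt, as here, matches the abstract's setup in which either strategy alone lifts accuracy from roughly 60% to roughly 70%, with the gain varying by model.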