A Comparative Study of Task Adaptation Techniques of Large Language Models for Identifying Sustainable Development Goals

📅 2025-06-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Manual tracking of Sustainable Development Goal (SDG)-related texts is hindered by their large scale and semantic complexity. Method: This study systematically evaluates the adaptation efficacy of mainstream large language models (LLMs) for single-label, multi-class SDG text classification, employing prompt engineering, zero-shot and few-shot learning, and lightweight fine-tuning across both open-source and commercial models. Contribution/Results: Optimized small-scale open-source LLMs (specifically Phi-3 and Qwen2) achieve classification accuracy comparable to or exceeding that of GPT-4, with up to a 12.3% absolute improvement in SDG label accuracy. These findings challenge the prevailing “bigger is better” assumption in LLM deployment and empirically validate lightweight adaptation as a viable strategy. The approach delivers an efficient, deployable AI solution for SDG monitoring in resource-constrained settings, balancing performance, computational efficiency, and accessibility.
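To make the accuracy comparison concrete, here is a minimal sketch of how single-label accuracy and an absolute gain between two models could be computed. It assumes that an "absolute improvement" means a difference in percentage points. The function names and toy labels are hypothetical and do not come from the paper.

```python
def accuracy(predicted, gold):
    """Fraction of texts whose predicted SDG label matches the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def absolute_gain_points(acc_a, acc_b):
    """Absolute difference between two accuracies, in percentage points."""
    return (acc_a - acc_b) * 100


# Toy labels for illustration only (SDG numbers 1..17), not data from the study.
gold = [1, 13, 4, 7, 13]
small_model = [1, 13, 4, 7, 2]
large_model = [1, 13, 5, 3, 2]

gain = absolute_gain_points(accuracy(small_model, gold),
                            accuracy(large_model, gold))
print(f"absolute gain: {gain:.1f} percentage points")
```

A relative improvement would instead divide the difference by the baseline accuracy, so the two figures should not be compared directly.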

📝 Abstract

In 2015, the United Nations adopted 17 Sustainable Development Goals (SDGs) aimed at creating a more sustainable and improved future by 2030. However, tracking progress toward these goals is difficult because of the extensive scale and complexity of the data involved. Text classification models have become vital tools in this area, automating the analysis of vast amounts of text from a variety of sources. Large language models (LLMs) have also recently proven indispensable for many natural language processing tasks, including text classification, thanks to their ability to recognize complex linguistic patterns and semantics. This study analyzes various proprietary and open-source LLMs for a single-label, multi-class text classification task focused on the SDGs. It also evaluates the effectiveness of task adaptation techniques within this domain: the in-context learning approaches Zero-Shot and Few-Shot Learning, as well as Fine-Tuning. The results reveal that smaller models, when optimized through prompt engineering, can perform on par with larger models like OpenAI's GPT (Generative Pre-trained Transformer).
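The difference between Zero-Shot and Few-Shot Learning in this setting can be sketched as prompt construction: the same instruction is sent with no labeled examples or with a handful of them. The sketch below is illustrative only. The prompt wording, the function names `build_prompt` and `parse_label`, and the example texts are assumptions, not the prompts used in the paper. No model is called here, so the output of `build_prompt` would still have to be passed to an LLM of choice.

```python
import re

# The 17 SDGs are identified here simply by their numbers, 1 through 17.
SDG_LABELS = list(range(1, 18))


def build_prompt(text, examples=()):
    """Build a zero-shot prompt (no examples) or a few-shot prompt (with examples).

    `examples` is a sequence of (text, sdg_number) pairs shown before the query.
    """
    lines = [
        "Classify the passage into exactly one Sustainable Development Goal.",
        "Answer with a single number from 1 to 17.",
        "",
    ]
    for example_text, example_label in examples:
        lines += [f"Text: {example_text}", f"SDG: {example_label}", ""]
    lines += [f"Text: {text}", "SDG:"]
    return "\n".join(lines)


def parse_label(reply):
    """Return the first valid SDG number found in a model reply, or None."""
    for match in re.findall(r"\d+", reply):
        if int(match) in SDG_LABELS:
            return int(match)
    return None


# Hypothetical labeled examples for the few-shot variant.
demo_examples = [
    ("Expanding rural access to solar electricity.", 7),
    ("Improving literacy rates in primary schools.", 4),
]
zero_shot = build_prompt("Cutting national greenhouse gas emissions.")
few_shot = build_prompt("Cutting national greenhouse gas emissions.", demo_examples)
```

Restricting the answer to a single number keeps the task single-label and makes replies easy to parse, while `parse_label` returning `None` flags replies that contain no valid label.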

Problem

Research questions and friction points this paper is trying to address.

Identifying Sustainable Development Goals in text data

Comparing task adaptation techniques for LLMs

Evaluating performance of small vs large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates task adaptation techniques for LLMs

Compares Zero-Shot and Few-Shot Learning methods

Optimizes smaller models via prompt engineering
