🤖 AI Summary
To address the dual challenges of severe hallucination in complex scientific reasoning and inefficient tool invocation (e.g., over-reliance on high-cost tools) in large language models (LLMs), this paper proposes the Adapting While Learning (AWL) framework. The method employs a two-component fine-tuning strategy: first, World Knowledge Learning (WKL) internalizes scientific knowledge from tool-generated solutions; second, Tool Usage Adaptation (TUA) teaches fine-grained, difficulty-aware tool-usage decisions based on the WKL-trained model's accuracy, mimicking how human experts assess a problem before choosing a strategy. The approach combines difficulty-aware modeling, multi-domain scientific tool orchestration, and parameter-efficient fine-tuning. Evaluated on six benchmarks in climate science, epidemiology, and mathematics, it improves answer accuracy by 28.27% and tool-invocation accuracy by 13.76% over an 8B baseline, and outperforms GPT-4 and Claude-3.5 on four custom scientific datasets.
📝 Abstract
Large Language Models (LLMs) demonstrate promising capabilities in solving simple scientific problems but, even with domain-specific fine-tuning, often produce hallucinations on complex ones. While integrating LLMs with tools can mitigate this reliability issue, models fine-tuned solely on tool usage often over-rely on tools, incurring unnecessary costs from resource-intensive scientific tools even for simpler problems. Inspired by how human experts assess a problem's complexity before choosing a solution strategy, we propose a novel two-component fine-tuning method, Adapting While Learning (AWL). In the first component, World Knowledge Learning (WKL), LLMs internalize scientific knowledge by learning from tool-generated solutions. In the second component, Tool Usage Adaptation (TUA), we classify questions as easy or hard based on the WKL-trained model's accuracy and train it to maintain direct reasoning on easy problems while switching to tools for hard ones. We validate our method on 6 scientific benchmark datasets in climate science, epidemiology, and mathematics. Compared to the base 8B model, our trained models achieve 28.27% higher answer accuracy and 13.76% better tool usage accuracy, even surpassing state-of-the-art models including GPT-4 and Claude-3.5 on 4 custom-created datasets.
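The difficulty-aware labeling step in TUA can be sketched in a few lines: questions the WKL-trained model already answers reliably are labeled easy (keep direct reasoning), and the rest are labeled hard (train the model to invoke a tool). This is a minimal illustration, not the paper's implementation; the accuracy threshold, field names, and `label_questions` helper are assumptions for the sketch.

```python
# Hypothetical sketch of TUA's easy/hard question labeling.
# eval_results would come from sampling the WKL-trained model k times
# per question and measuring its answer accuracy; the 0.5 threshold
# is illustrative, not taken from the paper.

def label_questions(eval_results, accuracy_threshold=0.5):
    """Build fine-tuning targets from per-question accuracy estimates.

    eval_results: list of {"question": str, "accuracy": float} dicts.
    Returns examples whose target is direct reasoning for easy
    questions and a tool invocation for hard ones.
    """
    training_examples = []
    for result in eval_results:
        if result["accuracy"] >= accuracy_threshold:
            target = "direct_answer"  # easy: the model answers on its own
        else:
            target = "tool_call"      # hard: the model switches to a tool
        training_examples.append(
            {"question": result["question"], "target": target}
        )
    return training_examples

examples = label_questions([
    {"question": "Q1 (simple)", "accuracy": 0.9},
    {"question": "Q2 (complex)", "accuracy": 0.1},
])
```

Under this split, the second fine-tuning stage sees a mixed dataset: easy questions paired with direct solutions, hard questions paired with tool-calling traces, which is what discourages over-reliance on expensive tools for simple problems.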