Locally-Deployed Chain-of-Thought (CoT) Reasoning Model in Chemical Engineering: Starting from 30 Experimental Data

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
In chemical engineering, molecular property prediction suffers from low accuracy and poor robustness under few-shot conditions (only 30 experimental data points). Method: This study proposes an ML-LLM-CoT hybrid chain-of-thought (CoT) reasoning paradigm that integrates traditional surrogate models, such as Gaussian processes and random forests, with lightweight large language models (LLMs, e.g., DeepSeek-R1:14B and Qwen2:7B) inside a CoT framework. The approach relies on local Ollama deployment and a hierarchical CoT construction strategy. Contribution/Results: This is the first work to enable synergistic reasoning between machine-learning models and LLMs within CoT, significantly reducing predictive uncertainty. Evaluated on solubility prediction for 20 structurally diverse molecules, the method reduces the number of high-deviation samples (>100% prediction deviation) from 7 (Gaussian model) to 4, lowers the mean absolute deviation, cuts rethinking iterations by roughly 90% (from 34 to 4), and substantially improves the solubility-judgment success rate.
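
The LLM-CoT branch (local DeepSeek-r1:14b / Qwen2:7b served through Ollama) can be illustrated with the Ollama Python client. The sketch below is an assumption-laden illustration rather than the authors' code: the prompt wording, function name, and SMILES example are placeholders.

```python
# Minimal sketch of querying a locally deployed reasoning model via Ollama.
# Assumes `ollama pull deepseek-r1:14b` has been run; the prompt is illustrative,
# not the paper's actual chain-of-thought template.
import ollama

def cot_solubility_estimate(smiles: str, model: str = "deepseek-r1:14b") -> str:
    """Ask the local model for a step-by-step (CoT) solubility estimate."""
    prompt = (
        "You are a chemical engineering assistant.\n"
        f"Molecule (SMILES): {smiles}\n"
        "Reason step by step about polarity, hydrogen bonding, and molecular size, "
        "then state an aqueous solubility estimate in log(mol/L)."
    )
    response = ollama.chat(model=model, messages=[{"role": "user", "content": prompt}])
    return response["message"]["content"]  # reasoning trace plus final estimate

if __name__ == "__main__":
    print(cot_solubility_estimate("CCO"))  # ethanol, as a quick smoke test
```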

📝 Abstract
In the field of chemical engineering, traditional data-processing and prediction methods face significant challenges, and machine-learning models and large language models (LLMs) each have their own limitations. This paper explores the application of Chain-of-Thought (CoT) reasoning models in chemical engineering, starting from only 30 experimental data points. By integrating traditional surrogate models such as Gaussian processes and random forests with powerful LLMs such as DeepSeek-R1, a hierarchical architecture is proposed. Two CoT-building methods are studied: Large Language Model-Chain of Thought (LLM-CoT) and Machine Learning-Large Language Model-Chain of Thought (ML-LLM-CoT). LLM-CoT combines the locally deployed models DeepSeek-r1:14b and Qwen2:7b via Ollama; ML-LLM-CoT additionally integrates a pre-trained Gaussian-process ML model into the LLM-based CoT framework. Our results show that ML-LLM-CoT is more efficient during CoT construction: it has only 2 points that require rethinking and 4 rethinking iterations in total, whereas LLM-CoT has 5 points that require rethinking and 34 rethinking iterations. In predicting the solubility of 20 molecules with dissimilar structures, the number of molecules with a prediction deviation above 100% is 7 for the Gaussian model, 6 for LLM-CoT, and 4 for ML-LLM-CoT. These results indicate that ML-LLM-CoT performs better at limiting the number of high-deviation molecules, reducing the average deviation, and achieving a higher success rate in solubility judgment, providing a more reliable approach for chemical engineering and molecular property prediction. This study moves beyond the limitations of traditional methods and offers new solutions for rapid property prediction and process optimization in chemical engineering.
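
To make the ML-LLM-CoT coupling concrete, here is a minimal sketch, assuming scikit-learn for the Gaussian-process surrogate and the Ollama Python client for the local LLM. The descriptors, kernel choice, prompt wording, and function names are illustrative placeholders, not the paper's actual implementation.

```python
# Hedged sketch of the ML-LLM-CoT idea: a Gaussian process trained on the small
# experimental set supplies a prior prediction (with uncertainty), which the local
# LLM is then asked to critique and refine in its chain of thought.
import numpy as np
import ollama
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_surrogate(X_train: np.ndarray, y_train: np.ndarray) -> GaussianProcessRegressor:
    """Fit a GP on ~30 experimental points (molecular descriptors -> log solubility)."""
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(X_train, y_train)
    return gp

def ml_llm_cot_predict(gp: GaussianProcessRegressor, x_new: np.ndarray, smiles: str,
                       model: str = "deepseek-r1:14b") -> str:
    """Inject the surrogate's estimate into a CoT prompt for the local LLM."""
    mean, std = gp.predict(x_new.reshape(1, -1), return_std=True)
    prompt = (
        "A Gaussian process trained on 30 experimental points predicts a solubility of "
        f"{mean[0]:.2f} +/- {std[0]:.2f} log(mol/L) for the molecule {smiles}.\n"
        "Reason step by step about whether this value is chemically plausible; "
        "if it is not, propose a corrected value and justify the revision."
    )
    reply = ollama.chat(model=model, messages=[{"role": "user", "content": prompt}])
    return reply["message"]["content"]  # chain of thought plus the refined estimate
```

Feeding the surrogate's mean and standard deviation into the prompt is one plausible way to realize the synergistic reasoning described above; the paper's hierarchical CoT construction and rethinking loop are not reproduced in this sketch.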
Problem

Research questions and friction points this paper is trying to address.

Improving chemical engineering prediction accuracy
Integrating CoT reasoning with machine learning
Optimizing molecular solubility prediction methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local Chain-of-Thought reasoning model
Integration of Gaussian processes with LLMs
ML-LLM-CoT enhances prediction accuracy
Tianhang Zhou
Assistant Professor, China University of Petroleum (Beijing), Dr. rer. nat. with distinction.
Multiscale Simulation · Machine Learning · Energy Dissipation
Yingchun Niu
State Key Laboratory of Heavy Oil Processing, College of Carbon Neutrality Future Technology, China University of Petroleum (Beijing), Beijing 102249, China
Xingying Lan
State Key Laboratory of Heavy Oil Processing, College of Carbon Neutrality Future Technology, China University of Petroleum (Beijing), Beijing 102249, China
Chunming Xu
State Key Laboratory of Heavy Oil Processing, College of Carbon Neutrality Future Technology, China University of Petroleum (Beijing), Beijing 102249, China