AI Summary
This work addresses the lack of effective training mechanisms for enhancing critical thinking in large language models. The authors propose the first general-purpose automated argument reconstruction engine (GAAR) to generate Arguinas, a high-quality synthetic dataset, which is then used for supervised fine-tuning. This approach provides the first empirical validation that argument reconstruction serves as an effective supervisory signal for improving models' critical thinking capabilities. Experimental results across seven standard critical thinking benchmarks demonstrate that models trained on Arguinas significantly outperform baseline approaches, substantiating the effectiveness and novelty of the proposed framework.
Abstract
To think critically about arguments, human learners are trained to identify, reconstruct, and evaluate them. Argument reconstruction is especially important because it makes an argument's underlying inferences explicit. However, it remains unclear whether LLMs can likewise enhance their critical thinking ability by learning to reconstruct arguments. To address this question, we introduce a holistic framework with three contributions. We (1) propose an engine that automatically reconstructs arbitrary arguments (GAAR), (2) synthesize a new high-quality argument reconstruction dataset (Arguinas) using the GAAR engine, and (3) investigate whether learning argument reconstruction benefits downstream critical thinking tasks. Our experimental results show that, across seven critical thinking tasks, models trained on argument reconstruction outperform models that are not, with the largest performance gains observed when training on the proposed Arguinas dataset. The source code and dataset will be made publicly available.