DESIGNER: Design-Logic-Guided Multidisciplinary Data Synthesis for LLM Reasoning

📅 2025-08-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing reasoning datasets suffer from narrow disciplinary coverage and insufficient structural depth, limiting large language models' (LLMs) ability to perform cross-disciplinary, multi-step reasoning. Method: We introduce the concept of "design logic" to emulate human propositional thinking, using LLMs to reverse-engineer over 120,000 generalizable logical patterns from real-world questions; these patterns are then matched with raw multidisciplinary texts (books and web pages) to build an automated, scalable framework for synthesizing reasoning data. Contribution/Results: We release two high-quality datasets spanning 75 disciplines—DLR-Book (3.04M instances) and DLR-Web (1.66M instances). Experiments demonstrate substantial performance gains for the Qwen3 series on complex cross-disciplinary reasoning tasks. This work establishes a novel paradigm and benchmark resource for evaluating and enhancing LLM reasoning capabilities.

📝 Abstract
Large language models (LLMs) have achieved remarkable success in many natural language tasks but still struggle with complex, multi-step reasoning, particularly across diverse disciplines. Existing reasoning datasets often either lack disciplinary breadth or the structural depth necessary to elicit robust reasoning behaviors. We propose DESIGNER: a DESIGN-logic-guidEd Reasoning data synthesis pipeline that leverages naturally available, extensive raw documents (book corpus and web corpus) to generate multidisciplinary challenging questions. A core innovation of our approach is the introduction of a Design Logic concept, which mimics the question-creation process of human educators. We use LLMs to reverse-engineer and abstract over 120,000 design logics from existing questions across various disciplines. By matching these design logics with disciplinary source materials, we are able to create reasoning questions that far surpass the difficulty and diversity of existing datasets. Based on this pipeline, we synthesized two large-scale reasoning datasets that span 75 disciplines: Design-Logic-Reasoning-Book (DLR-Book), containing 3.04 million challenging questions synthesized from the book corpus, and Design-Logic-Reasoning-Web (DLR-Web), with 1.66 million challenging questions from the web corpus. Our data analysis demonstrates that the questions synthesized by our method exhibit substantially greater difficulty and diversity than those in the baseline datasets. We validate the effectiveness of these datasets by conducting SFT experiments on the Qwen3-8B-Base and Qwen3-4B-Base models. The results show that our dataset significantly outperforms existing multidisciplinary datasets of the same volume. Training with the full datasets further enables the models to surpass the multidisciplinary reasoning performance of the official Qwen3-8B and Qwen3-4B models.
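The three-stage pipeline the abstract describes (reverse-engineer design logics from existing questions, match them to raw documents, instantiate new questions) can be sketched as follows. This is a minimal illustration with a stubbed `llm` function standing in for real model calls; the function names, prompt wording, and matching heuristic are assumptions for exposition, not the paper's actual implementation.

```python
def llm(prompt: str) -> str:
    """Stub standing in for a real LLM API call (illustrative only)."""
    if prompt.startswith("Reverse-engineer"):
        return "Ask the reader to apply a stated principle to a novel scenario."
    return "Q: Using the passage's description of osmosis, predict what happens to a cell in salt water."

def extract_design_logic(question: str) -> str:
    # Stage 1: abstract a reusable question-creation pattern from an existing question.
    return llm(f"Reverse-engineer the design logic behind this question: {question}")

def match_logic(logics: list[str], document: str) -> str:
    # Stage 2: select a design logic suited to the source document.
    # Trivial first-match here; the paper matches logics to disciplinary material.
    return logics[0]

def synthesize_question(logic: str, document: str) -> str:
    # Stage 3: instantiate the chosen design logic against the raw text.
    return llm(f"Using this design logic: {logic}\nWrite a challenging question about: {document}")

doc = "A biology passage explaining osmosis across cell membranes."
logics = [extract_design_logic("Why does a cell placed in salt water shrink?")]
question = synthesize_question(match_logic(logics, doc), doc)
print(question)
```

At scale, stage 1 runs once over the seed question pool to build a logic bank, while stages 2 and 3 stream over the book and web corpora to produce the DLR datasets.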
Problem

Research questions and friction points this paper is trying to address.

Enhance LLM reasoning across diverse disciplines
Generate challenging multidisciplinary questions by matching design logics to raw corpora
Build large-scale datasets that surpass existing ones in difficulty and diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

"Design logic" abstraction mimics human educators' question-creation process
LLMs reverse-engineer over 120,000 design logics from existing questions
Synthesized multidisciplinary questions exceed baseline datasets in difficulty and diversity
👥 Authors
Weize Liu (Alibaba Group)
Yongchi Zhao (Alibaba Group)
Yijia Luo (Alibaba Group)
Mingyu Xu (ByteDance)
Jiaheng Liu (Nanjing University)
Yanan Li (Alibaba Group)
Xiguo Hu (Alibaba Group)
Yuchi Xu (Alibaba Group)
Wenbo Su (Alibaba Group)
Bo Zheng (Alibaba Group)

🏷️ Tags: large language model, machine learning