🤖 AI Summary
First-generation large language models (2020–2023) suffer from knowledge staleness, shallow reasoning, and static cognitive modeling, limiting their capacity for deep human-AI collaboration. To address these limitations, we propose the "Cognition Engineering" framework, which establishes test-time scaling as a new paradigm for mind-level interaction, overcoming the constraints of architectures fixed at training time. Our approach integrates chain-of-thought prompting, self-consistency verification, dynamic reweighting of reasoning paths, adaptive computational allocation, and differentiable search to transform LLMs from knowledge-retrieval systems into programmable, reasoning-aware cognitive engines. We open-source pedagogical tutorials, efficient implementations, and a continuously updated literature repository, substantially lowering the barrier to entry for cognition engineering. This work catalyzes the transition of generative AI into its second phase (commencing in 2024), characterized by language-driven, engineering-grade cognitive process design.
📝 Abstract
The first generation of Large Language Models - what might be called "Act I" of generative AI (2020-2023) - achieved remarkable success through massive parameter and data scaling, yet exhibited fundamental limitations in knowledge latency, shallow reasoning, and constrained cognitive processes. During this era, prompt engineering emerged as our primary interface with AI, enabling dialogue-level communication through natural language. We now witness the emergence of "Act II" (2024-present), where models are transitioning from knowledge-retrieval systems (in latent space) to thought-construction engines through test-time scaling techniques. This new paradigm establishes a mind-level connection with AI through language-based thoughts. In this paper, we clarify the conceptual foundations of cognition engineering and explain why this moment is critical for its development. We systematically break down these test-time scaling approaches through comprehensive tutorials and optimized implementations, democratizing access to cognition engineering and enabling every practitioner to participate in AI's second act. We provide a regularly updated collection of papers on test-time scaling in the GitHub Repository: https://github.com/GAIR-NLP/cognition-engineering
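To make the test-time scaling idea concrete, here is a minimal sketch of one of the techniques the abstract names: self-consistency, which samples several reasoning paths for the same question and takes a majority vote over their final answers. The `stub_sampler` below is a hypothetical stand-in for repeated high-temperature LLM calls; it is not part of the paper's released implementation.

```python
from collections import Counter
import itertools

def self_consistency(sample_fn, question, n_samples=8):
    """Self-consistency: sample several independent reasoning paths for the
    same question and return the majority-vote final answer."""
    answers = [sample_fn(question) for _ in range(n_samples)]
    answer, _count = Counter(answers).most_common(1)[0]
    return answer

# Hypothetical stub sampler: a real system would call an LLM with
# temperature > 0 and extract the final answer from each sampled chain.
_canned = itertools.cycle(["42", "42", "41", "42"])

def stub_sampler(question):
    return next(_canned)

result = self_consistency(stub_sampler, "What is 6 x 7?")  # majority answer: "42"
```

The design point is that scaling computation at inference time (more sampled paths) trades extra compute for more reliable answers, without any change to the model's weights.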