🤖 AI Summary
Current AI models achieve only 40–55% accuracy on the Abstraction and Reasoning Corpus (ARC), substantially underperforming human-level abstract reasoning. Method: We propose GIFARC—a novel, analogy-guided synthetic dataset constructed from GIF animations—designed to generate ARC-style tasks by explicitly modeling everyday analogical relationships. GIFARC encodes human intuitive analogies as structured priors, shifting the reasoning paradigm from pattern matching to analogy-driven inference. It integrates large language models (LLMs) and vision-language models (VLMs) to automatically synthesize tasks with ground-truth analogy mappings. Contribution/Results: Experiments demonstrate that training on GIFARC significantly improves model performance on ARC, enhancing both solution efficiency and alignment with human reasoning patterns. Critically, it induces LLMs to adopt analogy-dominated reasoning paths, thereby narrowing the gap between AI and human capabilities in abstract reasoning.
📝 Abstract
The Abstraction and Reasoning Corpus (ARC) poses a stringent test of general AI capabilities, requiring solvers to infer abstract patterns from only a handful of examples. Despite substantial progress in deep learning, state-of-the-art models still achieve accuracy rates of merely 40-55% on the 2024 ARC Competition, indicative of a significant gap between their performance and human-level reasoning. In this work, we seek to bridge that gap by introducing an analogy-inspired ARC dataset, GIFARC. Leveraging large language models (LLMs) and vision-language models (VLMs), we synthesize new ARC-style tasks from a variety of GIF images that contain analogies. Each new task is paired with a ground-truth analogy, providing an explicit mapping between visual transformations and everyday concepts. By embedding robust, human-intuitive analogies into ARC-style tasks, GIFARC guides AI agents to consider the task analogically before engaging in brute-force pattern search, thus efficiently reducing problem complexity and building more concise, human-understandable solutions. We empirically validate that guiding LLMs with the analogical approach of GIFARC shifts their task-solving strategies toward the analogical reasoning characteristic of humans.
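To make the dataset structure concrete, the sketch below shows one plausible shape for a GIFARC-style task record: demonstration grid pairs plus the ground-truth analogy annotation the abstract describes. All names (`GifarcTask`, `gravity_fill`) and the toy "falling objects" rule are illustrative assumptions, not the paper's actual schema or transformations.

```python
from dataclasses import dataclass

@dataclass
class GifarcTask:
    """Hypothetical record for one GIFARC task (field names are illustrative)."""
    train: list        # demonstration (input_grid, output_grid) pairs
    test_input: list   # grid the solver must transform
    analogy: str       # ground-truth everyday analogy for the transformation

def gravity_fill(grid):
    """Toy transformation matching a 'falling objects' analogy:
    nonzero cells drop to the bottom of their column."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for c in range(w):
        col = [grid[r][c] for r in range(h) if grid[r][c] != 0]
        for i, v in enumerate(col):  # stack fallen cells from the bottom up
            out[h - len(col) + i][c] = v
    return out

demo_in = [[2, 0, 0],
           [0, 3, 0],
           [0, 0, 0]]
task = GifarcTask(
    train=[(demo_in, gravity_fill(demo_in))],
    test_input=[[0, 5, 0], [0, 0, 0], [0, 0, 0]],
    analogy="objects falling under gravity",
)
print(task.train[0][1])  # → [[0, 0, 0], [0, 0, 0], [2, 3, 0]]
```

The key point the abstract makes is that the `analogy` field accompanies every synthesized task, so a model can be prompted to reason about "what everyday process does this look like?" before searching for a pixel-level rule.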