Bongards at the Boundary of Perception and Reasoning: Programs or Language?

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses Bongard problems, a challenging class of tasks that require inducing abstract rules from novel visual contexts. It proposes a neurosymbolic approach in which large language models generate parameterized programmatic representations of candidate rules, and Bayesian optimization then fits the rule parameters automatically. The framework is described as the first to combine program synthesis with Bayesian optimization for Bongard problems, bridging perceptual processing and symbolic reasoning. Reported experiments show that the method classifies images accurately when the rule is known and can also infer previously unseen Bongard rules from scratch, indicating strong visual abstraction and generalization.

📝 Abstract

Vision-Language Models (VLMs) have made great strides in everyday visual tasks, such as captioning a natural image, or answering commonsense questions about such images. But humans possess the puzzling ability to deploy their visual reasoning abilities in radically new situations, a skill rigorously tested by the classic set of visual reasoning challenges known as the Bongard problems. We present a neurosymbolic approach to solving these problems: given a hypothesized solution rule for a Bongard problem, we leverage LLMs to generate parameterized programmatic representations for the rule and perform parameter fitting using Bayesian optimization. We evaluate our method on classifying Bongard problem images given the ground truth rule, as well as on solving the problems from scratch.
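The pipeline described in the abstract (a parameterized rule program whose parameters are fitted by Bayesian optimization) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the rule `rule_is_large` is a hand-written stand-in for a program an LLM might generate, each image is reduced to a single hypothetical `area` feature, and the optimizer is a small pure-Python Gaussian-process loop with an upper-confidence-bound acquisition over a fixed grid of candidate thresholds.

```python
import math

# Hypothetical stand-in for an LLM-generated rule program: "shapes are large".
# Each toy "image" is reduced to one assumed feature, its relative area in [0, 1].
def rule_is_large(image, theta):
    return image["area"] > theta

LEFT = [{"area": a} for a in (0.80, 0.85, 0.90, 0.95)]   # panels that satisfy the rule
RIGHT = [{"area": a} for a in (0.05, 0.10, 0.15, 0.20)]  # panels that violate it

def accuracy(theta):
    """Fraction of the eight toy panels classified correctly at threshold theta."""
    hits = sum(rule_is_large(img, theta) for img in LEFT)
    hits += sum(not rule_is_large(img, theta) for img in RIGHT)
    return hits / (len(LEFT) + len(RIGHT))

def rbf(a, b, length=0.2):
    """Squared-exponential kernel with unit prior variance."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, solve(K, ys)))
    var = 1.0 - sum(k * w for k, w in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 0.0))

def fit_threshold(n_iter=8, kappa=2.0):
    """Fit theta by maximizing accuracy with a GP-UCB loop over a candidate grid."""
    grid = [i / 20 for i in range(21)]
    xs = [0.0, 1.0]                      # two initial evaluations at the extremes
    ys = [accuracy(x) for x in xs]
    for _ in range(n_iter):
        def ucb(x):
            mean, std = gp_posterior(xs, ys, x)
            return mean + kappa * std    # favor high predicted accuracy or high uncertainty
        x_next = max(grid, key=ucb)
        xs.append(x_next)
        ys.append(accuracy(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]

best_theta, best_acc = fit_threshold()
print(best_theta, best_acc)
```

In this toy setting any threshold between the largest right-side area and the smallest left-side area separates the two sides, so the search only has to land in that interval. A real system would replace the `area` feature with perceptual primitives extracted from the images and would typically use an established Bayesian optimization library rather than this hand-rolled loop.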

Problem

Research questions and friction points this paper is trying to address.

Bongard problems

visual reasoning

abstract rule induction

vision-language models

generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

neurosymbolic

Bongard problems

vision-language models