🤖 AI Summary
This study investigates whether vision-language models can reason about the implicit inductive strength conveyed by quantified expressions such as "all bears," "bears" (generic), and "some bears," mirroring the patterns observed in human children. By replicating the classic experimental paradigm of Gelman et al. (2002) and integrating category recognition tasks with pretests for quantifier sensitivity and post-hoc representational analyses, this work provides the first evidence within a general-purpose statistical learning model that linguistic form exerts semantic constraints on inductive reasoning. The results demonstrate that the model recapitulates human-like inductive preferences at the behavioral level (all > generics > some) and organizes its internal representations according to inductive strength rather than surface syntactic structure, thereby revealing a computational mechanism through which language shapes induction.
📝 Abstract
Language places subtle constraints on how we make inductive inferences. Developmental evidence from Gelman et al. (2002) shows that children (4 years and older) differentiate among generic statements ("Bears are daxable"), universally quantified NPs ("All bears are daxable"), and indefinite plural NPs ("Some bears are daxable") when extending novel properties to a specific category member (all > generics > some), suggesting that they represent these types of propositions differently. We test whether these subtle differences arise in general-purpose statistical learners such as vision-language models by replicating the original experiment. After tasking the models with a series of precondition tests (robust identification of categories in images and sensitivity to "all" and "some"), followed by the original experiment, we find behavioral alignment between models and humans. Post-hoc analyses of their representations reveal that these differences are organized by inductive constraints rather than surface-form differences.