🤖 AI Summary
This study investigates whether culinary recipes from around the world follow statistical regularities like those observed in natural languages. The authors build a large corpus of traditional recipes spanning global cuisines and use a state-of-the-art named entity recognition algorithm to annotate each recipe with structured components such as ingredients, cooking techniques, and utensils. Applying methods from statistical linguistics, they find that ingredient usage shows Zipf-like rank-frequency scaling, that culinary diversity grows sublinearly with corpus size in accordance with Heaps' law, and that recipe complexity follows Menzerath-Altmann-type relations, while macronutrient concentrations across recipes display a log-normal signature. Minimal generative models based on preferential reuse, constrained sampling, and incremental modification reproduce these patterns, suggesting generic processes that shape the combinatorial structure of recipes across cultures.
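For orientation, the regularities named in this summary are commonly written in the textbook forms below. The symbols ($f$, $r$, $V$, $N$, $\alpha$, $\beta$, $a$, $b$, $c$, $\mu$, $\sigma$) and the exact parametrization are assumptions made here for illustration, not the authors' notation, and sign conventions for the Menzerath-Altmann exponents vary between sources.

```latex
% Zipf's law: frequency f of the item at rank r
f(r) \propto r^{-\alpha}, \qquad \alpha > 0

% Heaps' law: number of distinct items V after N observed tokens
V(N) \propto N^{\beta}, \qquad 0 < \beta < 1

% Menzerath-Altmann law: mean constituent size y of a construct with x constituents
y(x) = a\, x^{b}\, e^{-c x}

% Log-normal density for a positive quantity x (e.g. a nutrient concentration)
p(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\!\left( -\frac{(\ln x - \mu)^{2}}{2\sigma^{2}} \right)
```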
📝 Abstract
Cooking is a cultural expression of human creativity that transcends geography and time through the orchestration of ingredients and techniques, much like languages do through words and syntax. Yet, beneath the apparent diversity of culinary traditions, whether recipes obey statistical laws comparable to those of other symbolic systems remains unknown. Here we analyze a large corpus of traditional recipes spanning global cuisines, annotated using a state-of-the-art named entity recognition algorithm into ingredients, cooking techniques, utensils, and other culinary attributes. We find that ingredient usage exhibits Zipf-like rank-frequency scaling, that culinary diversity grows sublinearly with corpus size in accordance with Heaps' law, and that recipe complexity follows Menzerath-Altmann-type relations between the number and average information of constituent units. Consistent with observations in packaged foods, macronutrient concentrations across recipes also display a log-normal signature. Minimal generative models based on preferential reuse, constrained sampling, and incremental modification recapitulate these regularities, suggesting generic processes that shape recipe architecture across cultures. Together, these findings establish recipes as a compositional symbolic system in which complex structure emerges from simple, constrained generative processes.
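To make the idea of a minimal generative model concrete, here is a rough Python sketch of two of the three mechanisms the abstract names: preferential reuse and constrained sampling (incremental modification is left out). It is not the authors' model. The innovation rule is a Pitman-Yor-style rule swapped in for illustration, and every name and parameter (`simulate_recipes`, `discount`, `theta`, the recipe size of 8, the corpus of 2000 recipes) is a hypothetical choice. The script then fits log-log slopes to the rank-frequency curve and to the vocabulary-growth curve, which is one simple way to look for Zipf-like and Heaps-like behavior.

```python
import math
import random
from collections import Counter


def simulate_recipes(n_recipes=2000, size=8, discount=0.5, theta=1.0, seed=0):
    """Toy preferential-reuse process; every parameter here is a made-up assumption."""
    rng = random.Random(seed)
    tokens = []        # one entry per past use, so a uniform draw is usage-proportional
    counts = Counter()
    recipes = []
    for _ in range(n_recipes):
        recipe = set()
        while len(recipe) < size:
            n, k = len(tokens), len(counts)
            p_new = (theta + discount * k) / (theta + n)
            # Innovate with probability p_new, or when nothing unused is left to reuse.
            if len(recipe) >= k or rng.random() < p_new:
                ingredient = k  # next unused integer id
            else:
                # Reuse: redraw until the pick is not already in this recipe
                # (the sampling constraint) and passes the discount filter.
                while True:
                    ingredient = rng.choice(tokens)
                    accept = 1.0 - discount / counts[ingredient]
                    if ingredient not in recipe and rng.random() < accept:
                        break
            recipe.add(ingredient)
            tokens.append(ingredient)
            counts[ingredient] += 1
        recipes.append(sorted(recipe))
    return recipes


def loglog_slope(xs, ys):
    """Ordinary least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


def rank_frequency(recipes):
    """Ingredient usage counts sorted from most to least used."""
    usage = Counter(i for recipe in recipes for i in recipe)
    return sorted(usage.values(), reverse=True)


def vocabulary_growth(recipes):
    """(tokens seen, distinct ingredients seen) after each recipe."""
    seen, growth, n_tokens = set(), [], 0
    for recipe in recipes:
        n_tokens += len(recipe)
        seen.update(recipe)
        growth.append((n_tokens, len(seen)))
    return growth


recipes = simulate_recipes()
freqs = rank_frequency(recipes)
growth = vocabulary_growth(recipes)
zipf_slope = loglog_slope(range(1, len(freqs) + 1), freqs)
heaps_slope = loglog_slope([n for n, _ in growth], [v for _, v in growth])
print(f"rank-frequency slope: {zipf_slope:.2f}")
print(f"vocabulary-growth slope: {heaps_slope:.2f}")
```

A negative rank-frequency slope and a vocabulary-growth slope between 0 and 1 would be the qualitative signatures of Zipf-like and Heaps-like scaling in this toy corpus. A least-squares fit on log-log axes is only a quick diagnostic; a careful analysis would use a proper power-law fitting procedure.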