🤖 AI Summary
Many sequential sampling models (SSMs) have intractable likelihoods, which complicates Bayesian inference when heterogeneous data sources, such as choice outcomes and response times, are combined. To address this, the authors propose MOBOLFI, a framework that integrates multi-objective Bayesian optimization into likelihood-free inference (LFI). MOBOLFI models a multi-dimensional discrepancy with one component per data source, which yields approximations to the individual source likelihoods as well as to the joint likelihood. This makes it possible to detect conflicting information across sources and to assess how much each source contributes to estimating individual parameters. On a simple synthetic example and on real-world data about the preferences of ride-hailing drivers in Singapore for renting electric vehicles, MOBOLFI is compared with the use of a single discrepancy and shown to have advantages. Although the focus is on SSMs, the approach also applies to the likelihood-free calibration of other models using multi-source data.
📝 Abstract
Scientifically motivated statistical models can sometimes be defined by a generative process for simulating synthetic data. Models specified this way can have likelihoods that are intractable, and this is the case for many sequential sampling models (SSMs) widely used in psychology and consumer behavior modelling. Researchers have developed likelihood-free inference (LFI) methods to make Bayesian inferences on parameters in models with intractable likelihoods. Extending a popular approach to simulation-efficient LFI for single-source data, we propose Multi-objective Bayesian Optimization for Likelihood-Free Inference (MOBOLFI) to estimate the parameters of SSMs calibrated using multi-source data, such as response times and choice outcomes. MOBOLFI models a multi-dimensional discrepancy between observed and simulated data, using one discrepancy for each data source. Multi-objective Bayesian optimization is then used to ensure simulation-efficient approximation of the SSM likelihood. The use of a multivariate discrepancy allows for approximations to individual data source likelihoods in addition to the joint likelihood, enabling both the detection of conflicting information and a deeper understanding of the importance of different data sources in estimating individual SSM parameters. We illustrate the advantages of our approach over the use of a single discrepancy in a simple synthetic data example and in an SSM example with real-world data on the preferences of ride-hailing drivers in Singapore for renting electric vehicles. Although we focus on applications to SSMs, our approach applies to the likelihood-free calibration of other models using multi-source data.
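The idea of keeping one discrepancy per data source, rather than collapsing everything into a single number, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy simulator `simulate` (not a real SSM), the choice of sample means as summary statistics, and the use of random prior draws in place of multi-objective Bayesian optimization. It only shows how a two-component discrepancy vector is formed and how the non-dominated (Pareto-optimal) candidate parameters can be identified.

```python
import numpy as np

def simulate(theta, n, rng):
    """Hypothetical two-source simulator (a toy stand-in, not an SSM)."""
    drift, threshold = theta
    # Source 1: response times, longer for high thresholds and weak drift.
    rt = threshold / (abs(drift) + 0.1) + rng.exponential(0.2, size=n)
    # Source 2: binary choices, with P(choice = 1) increasing in drift.
    p = 1.0 / (1.0 + np.exp(-2.0 * drift * threshold))
    choice = rng.random(n) < p
    return rt, choice

def discrepancy_vector(theta, observed, n, rng):
    """Return one discrepancy per data source: (response time, choice)."""
    rt_obs, choice_obs = observed
    rt_sim, choice_sim = simulate(theta, n, rng)
    # Assumed summaries: absolute difference of sample means per source.
    d_rt = abs(rt_sim.mean() - rt_obs.mean())
    d_choice = abs(choice_sim.mean() - choice_obs.mean())
    return np.array([d_rt, d_choice])

def pareto_mask(points):
    """Boolean mask of non-dominated rows (smaller is better everywhere)."""
    mask = np.ones(points.shape[0], dtype=bool)
    for i in range(points.shape[0]):
        no_worse = np.all(points <= points[i], axis=1)
        strictly_better = np.any(points < points[i], axis=1)
        mask[i] = not np.any(no_worse & strictly_better)
    return mask

rng = np.random.default_rng(0)
observed = simulate((1.0, 1.5), 500, rng)  # pretend these are the real data

# Random prior draws stand in for the Bayesian optimization acquisition step.
candidates = np.column_stack(
    [rng.uniform(0.2, 2.0, 200), rng.uniform(0.5, 3.0, 200)]
)
discrepancies = np.array(
    [discrepancy_vector(t, observed, 500, rng) for t in candidates]
)
front = pareto_mask(discrepancies)
print(f"{front.sum()} of {len(candidates)} candidates are non-dominated")
```

Candidates on the front trade off fit to response times against fit to choices. If parameters that fit one source well consistently fit the other poorly, that is the kind of conflict between sources that a single scalar discrepancy would hide.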