🤖 AI Summary
Generalized linear mixed-effects models (GLMMs) tie the natural parameter of the response distribution to a linear function of the covariates, so they cannot capture nonlinear covariate effects. This work proposes the neural generalized mixed-effects model (NGMM), which replaces that linear function with neural networks. To fit the model, the authors introduce an efficient optimization procedure that maximizes an approximate marginal likelihood and is differentiable with respect to the network parameters, and they show that the approximation error decays at a Gaussian-tail rate in a user-chosen parameter. On synthetic data, NGMM improves over GLMMs when covariate-response relationships are nonlinear; on real-world datasets, it outperforms prior methods. An analysis of a large student-proficiency dataset shows how NGMM can be extended to more complex latent-variable models.
📝 Abstract
Generalized linear mixed-effects models (GLMMs) are widely used to analyze grouped and hierarchical data. In a GLMM, each response is assumed to follow an exponential-family distribution whose natural parameter is given by a linear function of observed covariates and a latent group-specific random effect. Since exact marginalization over the random effects is typically intractable, model parameters are estimated by maximizing an approximate marginal likelihood. In this paper, we replace the linear function with neural networks. The result is a more flexible model, the neural generalized mixed-effects model (NGMM), which captures complex relationships between covariates and responses. To fit NGMM to data, we introduce an efficient optimization procedure that maximizes the approximate marginal likelihood and is differentiable with respect to network parameters. We show that the approximation error of our objective decays at a Gaussian-tail rate in a user-chosen parameter. On synthetic data, NGMM improves over GLMMs when covariate-response relationships are nonlinear, and on real-world datasets it outperforms prior methods. Finally, we analyze a large dataset of student proficiency to demonstrate how NGMM can be extended to more complex latent-variable models.
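To make the idea of an approximate marginal likelihood concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes a Bernoulli response with a logit link, one scalar random intercept per group with a N(0, sigma^2) prior, and a small hand-written one-hidden-layer network in place of the linear predictor. The integral over the random effect is approximated with a plain trapezoid rule on a truncated grid, which stands in for whatever approximation the authors actually use. All names and parameters (`tiny_net`, `group_log_marginal`, `n_grid`, `width`) are hypothetical.

```python
import math

def tiny_net(x, params):
    """Hypothetical one-hidden-layer tanh network: scalar covariate -> scalar."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def group_log_marginal(xs, ys, params, sigma, n_grid=201, width=8.0):
    """Approximate the log of the integral over b of
    prod_j Bernoulli(y_j | sigmoid(f(x_j) + b)) * Normal(b; 0, sigma^2)
    for a single group, using the trapezoid rule on
    [-width * sigma, width * sigma] (an assumed, illustrative scheme)."""
    etas = [tiny_net(x, params) for x in xs]  # network output per observation
    step = 2.0 * width * sigma / (n_grid - 1)
    log_norm = -math.log(sigma * math.sqrt(2.0 * math.pi))
    log_terms = []
    for k in range(n_grid):
        b = -width * sigma + k * step
        log_prior = log_norm - 0.5 * (b / sigma) ** 2
        log_lik = sum(
            log_sigmoid(eta + b) if y == 1 else log_sigmoid(-(eta + b))
            for eta, y in zip(etas, ys)
        )
        # Trapezoid weights: half weight at the two endpoints.
        weight = 0.5 * step if k in (0, n_grid - 1) else step
        log_terms.append(math.log(weight) + log_prior + log_lik)
    # Log-sum-exp keeps the sum stable for groups with many observations.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def approx_log_marginal(groups, params, sigma, **kwargs):
    """Sum the per-group approximations; groups is a list of (xs, ys) pairs."""
    return sum(group_log_marginal(xs, ys, params, sigma, **kwargs)
               for xs, ys in groups)

# Toy usage with made-up data: two groups, binary responses.
params = ([1.0, -0.5], [0.0, 0.2], [0.8, 0.3], -0.1)
groups = [([0.1, 0.7, 1.5], [0, 1, 1]), ([-0.4, 0.9], [0, 0])]
objective = approx_log_marginal(groups, params, sigma=1.0)
```

Because the approximation is a finite sum of smooth terms, writing the same computation in an automatic-differentiation framework would give gradients with respect to both the network parameters and `sigma`, which is the property the abstract relies on for fitting.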