🤖 AI Summary
This paper addresses the challenge of balancing statistical efficiency and economic interpretability when constructing factors from firm characteristics. It proposes an economically grounded, data-driven factor extraction framework in three steps: first, grouping firm characteristics by economic category based on financial theory; second, applying data-driven clustering to identify homogeneous subgroups within each category; and third, applying Instrumented Principal Component Analysis (IPCA; Kelly et al., 2019) so that each subgroup yields one representative, interpretable factor. The resulting factors are transparent, parsimonious, and traceable to the characteristics that form them, and the approach addresses standard IPCA's lack of explicit economic structure. Using the 94 firm characteristics from Gu et al. (2020), the empirical analysis finds that the factors match or exceed standard IPCA in pricing performance and outperform benchmark models in out-of-sample tests. The results support the value of combining economic constraints with statistical optimization.
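For reference, the restricted IPCA specification of Kelly et al. (2019) can be written as below, together with one possible way to express the "one factor per group" idea. The group sets $G_k$ and the zero restriction on $\Gamma_\beta$ are assumptions made here for illustration; the summary does not state the exact form used in the paper.

$$
r_{i,t+1} = z_{i,t}^{\top} \Gamma_\beta f_{t+1} + \varepsilon_{i,t+1},
\qquad z_{i,t} \in \mathbb{R}^{L},\quad \Gamma_\beta \in \mathbb{R}^{L \times K},\quad f_{t+1} \in \mathbb{R}^{K}
$$

$$
\text{(assumed grouping restriction)} \qquad (\Gamma_\beta)_{l,k} = 0 \ \text{ for all } l \notin G_k, \qquad k = 1, \dots, K
$$

Here $z_{i,t}$ collects the $L$ characteristics of firm $i$ at time $t$, and $G_1, \dots, G_K$ is a hypothetical partition of the characteristics into groups. Under this reading, factor $k$ loads only on the characteristics in its own group, which is what would make each factor traceable to a named set of characteristics.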
📝 Abstract
We develop a new framework for constructing factors from firm characteristics that balances statistical efficiency and economic interpretability. Instead of using all characteristics equally, our method groups related characteristics and derives one factor per group. The grouping combines economic intuition with data-driven clustering. Applied to the IPCA model by Kelly et al. (2019), our approach yields economically meaningful factors that match or exceed standard IPCA in pricing performance. Using 94 characteristics from Gu et al. (2020), we show that our parsimonious, transparent factors outperform benchmarks in out-of-sample tests, demonstrating the value of embedding economic structure into statistical modeling.
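The grouping-then-extraction pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the clustering rule (single linkage on a correlation threshold), the use of a first principal component as a simple stand-in for the IPCA estimation step, the category assignments, and all function names and toy numbers are assumptions introduced only for illustration. The inputs are taken to be return series of characteristic-managed portfolios, one per characteristic.

```python
from math import sqrt

def cov(x, y):
    """Sample covariance of two equally long return series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def corr(x, y):
    """Pearson correlation (assumes neither series is constant)."""
    return cov(x, y) / sqrt(cov(x, x) * cov(y, y))

def cluster(returns, names, threshold):
    """Step 2 (assumed rule): link two characteristics when their
    correlation is at least `threshold`; clusters are the linked components."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if corr(returns[a], returns[b]) >= threshold:
                parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())

def first_pc_factor(returns, names, iters=100):
    """Step 3 stand-in (NOT IPCA): leading principal component of the
    cluster's covariance matrix, found by power iteration.
    Assumes the series are not constant, so the norm below is nonzero."""
    c = [[cov(returns[a], returns[b]) for b in names] for a in names]
    w = [1.0] * len(names)
    for _ in range(iters):
        w = [sum(cij * wj for cij, wj in zip(row, w)) for row in c]
        norm = sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    n_periods = len(returns[names[0]])
    factor = [sum(wi * returns[n][t] for wi, n in zip(w, names))
              for t in range(n_periods)]
    return w, factor

def build_factors(returns, categories, threshold=0.7):
    """Step 1 is the `categories` mapping; steps 2 and 3 run inside each category."""
    factors = {}
    for category, names in categories.items():
        for k, members in enumerate(cluster(returns, names, threshold)):
            w, f = first_pc_factor(returns, members)
            factors[f"{category}_{k}"] = {"members": members, "weights": w, "factor": f}
    return factors

# Toy managed-portfolio returns (made-up numbers, six periods).
toy_returns = {
    "bm":     [0.010, 0.020, -0.010, 0.030, 0.000, -0.020],
    "ep":     [0.012, 0.018, -0.008, 0.027, 0.001, -0.021],
    "mom12m": [0.020, -0.010, 0.010, -0.020, 0.030, 0.000],
    "mom1m":  [-0.010, -0.010, 0.020, 0.010, 0.000, 0.020],
}
# Hypothetical economic categories; the paper's actual grouping may differ.
toy_categories = {"value": ["bm", "ep"], "momentum": ["mom12m", "mom1m"]}

factors = build_factors(toy_returns, toy_categories)
for name, info in factors.items():
    print(name, info["members"])
```

In this toy setup the two value characteristics are highly correlated and merge into one cluster, while the two momentum series are only weakly related and stay separate, so three factors result. Each factor records which characteristics it was built from, which is the traceability property the abstract emphasizes.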