🤖 AI Summary
This study investigates whether large language models (LLMs) can accurately forecast the direction of future corporate earnings—upward or downward—solely from standardized, anonymized financial statements, without domain-specific knowledge or narrative context. Using GPT-4 in a zero-shot setting, we design a two-stage task: financial metric extraction followed by binary directional classification, augmented with interpretable, economically grounded reasoning chains. Our empirical analysis provides the first evidence that LLMs significantly outperform professional human analysts—especially in cases where analyst accuracy is low—and that this superiority stems not from memorization of training data but from generating forward-looking insights aligned with economic principles. Moreover, LLM predictions achieve accuracy comparable to state-of-the-art task-specific machine learning models; a trading strategy built upon these predictions delivers superior Sharpe ratio and alpha, confirming their practical efficacy in real-world investment decision-making.
📝 Abstract
We investigate whether large language models (LLMs) can successfully perform financial statement analysis in a way similar to a professional human analyst. We provide standardized and anonymous financial statements to GPT4 and instruct the model to analyze them to determine the direction of firms' future earnings. Even without narrative or industry-specific information, the LLM outperforms financial analysts in its ability to predict earnings changes directionally. The LLM exhibits a relative advantage over human analysts in situations when the analysts tend to struggle. Furthermore, we find that the prediction accuracy of the LLM is on par with a narrowly trained state-of-the-art ML model. LLM prediction does not stem from its training memory. Instead, we find that the LLM generates useful narrative insights about a company's future performance. Lastly, our trading strategies based on GPT's predictions yield a higher Sharpe ratio and alphas than strategies based on other models. Our results suggest that LLMs may take a central role in analysis and decision-making.