🤖 AI Summary
This study addresses sentence-level stance detection toward three financial targets—debt, earnings per share (EPS), and sales—in financial texts. To overcome the scarcity of labeled data, we propose a large language model (LLM)-driven approach that requires no large-scale manual annotation. We construct the first fine-grained stance detection corpus specifically for these three key financial metrics, with initial labels generated by ChatGPT-o3-pro and rigorously validated by domain experts. We systematically evaluate zero-shot, few-shot, and chain-of-thought (CoT) prompting strategies. Results show that few-shot + CoT prompting significantly outperforms supervised baselines and demonstrates strong generalization across two distinct financial text genres: SEC annual reports and earnings call transcripts. Moreover, we identify notable genre-specific effects on stance classification performance. This work constitutes the first empirical validation of LLMs’ effectiveness and practicality for low-resource financial stance analysis, establishing a novel paradigm for fine-grained, interpretable financial semantic analysis.
📝 Abstract
Financial narratives from U.S. Securities and Exchange Commission (SEC) filings and quarterly earnings call transcripts (ECTs) are essential reading for investors, auditors, and regulators. However, their length, financial jargon, and nuanced language make fine-grained analysis difficult. Prior sentiment analysis in the financial domain has relied on large, expensive labeled datasets, making sentence-level stance detection toward specific financial targets challenging. In this work, we introduce a sentence-level corpus for stance detection focused on three core financial metrics: debt, earnings per share (EPS), and sales. The sentences were extracted from Form 10-K annual reports and ECTs, and labeled for stance (positive, negative, or neutral) by the ChatGPT-o3-pro model under rigorous human validation. Using this corpus, we conduct a systematic evaluation of modern large language models (LLMs) under zero-shot, few-shot, and chain-of-thought (CoT) prompting strategies. Our results show that few-shot prompting with CoT outperforms supervised baselines, and that LLM performance varies between the SEC and ECT datasets. These findings highlight the practical viability of LLMs for target-specific stance detection in the financial domain without requiring extensive labeled data.
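To make the evaluated prompting setup concrete, the sketch below shows what a few-shot + CoT prompt for target-specific stance detection might look like. The instruction wording, the in-context examples, and the `build_prompt` helper are all illustrative assumptions; the paper does not specify its exact prompt templates.

```python
# Hypothetical few-shot + chain-of-thought (CoT) prompt construction for
# target-specific financial stance detection. The examples and wording are
# illustrative, not the authors' actual templates.

FEW_SHOT_EXAMPLES = [
    {
        "sentence": "We reduced our long-term debt by $1.2 billion this quarter.",
        "target": "debt",
        "reasoning": "Paying down debt is framed favorably, so the stance toward debt is positive.",
        "stance": "positive",
    },
    {
        "sentence": "Sales were flat year over year across all segments.",
        "target": "sales",
        "reasoning": "The sentence reports no change and offers no evaluation, so the stance is neutral.",
        "stance": "neutral",
    },
]

def build_prompt(sentence: str, target: str) -> str:
    """Assemble a few-shot + CoT prompt: instruction, worked examples, then the query."""
    lines = [
        "Classify the stance (positive, negative, or neutral) expressed toward "
        "the given financial target. Think step by step before answering.",
        "",
    ]
    for ex in FEW_SHOT_EXAMPLES:
        lines += [
            f"Sentence: {ex['sentence']}",
            f"Target: {ex['target']}",
            f"Reasoning: {ex['reasoning']}",
            f"Stance: {ex['stance']}",
            "",
        ]
    # End with an open "Reasoning:" cue so the model produces its CoT
    # before committing to a stance label.
    lines += [f"Sentence: {sentence}", f"Target: {target}", "Reasoning:"]
    return "\n".join(lines)

prompt = build_prompt("EPS declined 12% due to margin pressure.", "EPS")
```

In the zero-shot variant, `FEW_SHOT_EXAMPLES` would simply be empty; dropping the trailing "Reasoning:" cue would remove the CoT component.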