🤖 AI Summary
This study addresses the linguistic modeling and quantitative evaluation of “professionalism” in expert financial-domain queries. Method: We propose the first interpretable linguistic feature framework, integrating structural and pragmatic features (including discourse regulators, prefatory expressions, and request types), and use it to construct a manually annotated dataset augmented with large language model (LLM)-generated samples. A classifier built on this framework is evaluated against Gemini-2.0 and SVM baselines. Contribution/Results: Our approach significantly improves accuracy in identifying expert-authored queries. Key findings show that professionalism rests on a linguistic style shared across authors, making it a learnable construct that transfers across contexts. The work provides theoretical foundations and methodological tools for human-AI collaborative question answering in high-stakes domains, professional competency assessment, and domain-specific alignment of large language models.
📝 Abstract
Professionalism is a crucial yet underexplored dimension of expert communication, particularly in high-stakes domains like finance. This paper investigates how linguistic features can be leveraged to model and evaluate professionalism in expert questioning. We introduce a novel annotation framework to quantify structural and pragmatic elements in financial analyst questions, such as discourse regulators, prefaces, and request types. Using both human-authored and large language model (LLM)-generated questions, we construct two datasets: one annotated for perceived professionalism and one labeled by question origin. We show that the same linguistic features correlate strongly with both human judgments and authorship origin, suggesting a shared stylistic foundation. Furthermore, a classifier trained solely on these interpretable features outperforms Gemini-2.0 and SVM baselines in distinguishing expert-authored questions. Our findings demonstrate that professionalism is a learnable, domain-general construct that can be captured through linguistically grounded modeling.
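To make the idea of an interpretable-feature classifier concrete, the sketch below shows one minimal way such a pipeline could look: count a few surface cues (prefaces, discourse regulators, request markers) and fit a tiny logistic regression on toy examples. This is an illustration only, not the paper's implementation; the cue lists, feature set, and training data are all invented for demonstration.

```python
# Illustrative sketch (NOT the paper's actual feature set or model):
# score the "professionalism" of a question from a handful of
# interpretable linguistic features via a tiny logistic regression.
import math
import re

# Hypothetical cue lexicons; the paper's actual annotation scheme differs.
PREFACES = ("thanks for", "congrats on", "just to follow up")
DISCOURSE_REGULATORS = ("first", "second", "finally", "to clarify")
REQUEST_MARKERS = ("could you", "can you walk", "how should we think")

def features(question: str) -> list:
    """Map a question to a small, human-readable feature vector."""
    q = question.lower()
    return [
        sum(q.count(p) for p in PREFACES),             # prefatory expressions
        sum(q.count(d) for d in DISCOURSE_REGULATORS), # discourse regulators
        sum(q.count(r) for r in REQUEST_MARKERS),      # request-type cues
        len(re.findall(r"\?", q)),                     # number of sub-questions
        len(q.split()) / 50.0,                         # normalized length
    ]

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=200, lr=0.5):
    """Plain per-sample gradient descent for logistic regression."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy training data: 1 = expert-style analyst question, 0 = casual question.
train_qs = [
    "Thanks for the color. First, could you walk us through margin guidance? "
    "Second, how should we think about capex?",
    "Congrats on the quarter. To clarify, can you walk through the FX assumptions?",
    "so when moon? stock go up?",
    "why did it drop lol",
]
labels = [1, 1, 0, 0]
w, b = train([features(q) for q in train_qs], labels)

def professionalism(q: str) -> float:
    """Probability-like professionalism score in [0, 1]."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features(q))) + b)
```

Because every feature is a named count, a prediction can be traced directly back to the cues that produced it, which is the interpretability property the framework is built around.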