🤖 AI Summary
Problem: In NLP, core concepts such as “interpretability,” “bias,” and “reasoning” suffer from ambiguous definitions and inconsistent operationalization, undermining the conceptual grounding of dataset construction, evaluation-metric design, and claims made about systems. Method: The paper introduces the first pedagogical framework for metascientific conceptual reflection in NLP, combining an interdisciplinary reading seminar, conceptual-critique workshops, operationalization case studies, and collaborative concept mapping to rigorously distinguish conceptualization (what a construct means) from operationalization (how it is measured). Contribution/Results: The framework cultivates researchers’ ability to define abstract constructs precisely and delineate their boundaries. It has generated multiple reflective practice reports, fostered community consensus on conceptual validity and the appropriateness of evaluation metrics, and strengthened researchers’ metascientific awareness and methodological reflexivity.
📝 Abstract
NLP researchers regularly invoke abstract concepts like "interpretability," "bias," "reasoning," and "stereotypes" without defining them. Each subfield has a shared understanding, or conceptualization, of what these terms mean and how we should treat them, and this shared understanding is the basis on which operational decisions are made: datasets are built to evaluate these concepts, metrics are proposed to quantify them, and claims are made about systems. But what do these terms mean, what should they mean, and how should we measure them? I outline a seminar I created for students to explore these questions of conceptualization and operationalization, with an interdisciplinary reading list and an emphasis on discussion and critique.