AI Summary
This study investigates the extent to which higher education curricula embed 21st-century core competencies and how well academic programs align with societal demands. To this end, we construct a dataset of 7,600 human-annotated samples and introduce a "Curricular Chain-of-Thought" (Curricular CoT) prompting strategy that strengthens large language models' reasoning in educational contexts, mitigates keyword-matching biases, and improves the detection of subtle pedagogical evidence in lengthy texts. Experimental results show that detailed descriptions of teaching activities are the most informative curriculum source; that open-source models perform comparably to closed-source counterparts on coarse-grained competency-mapping tasks; and that, although Curricular CoT yields a modest but statistically significant performance gain, models still fall substantially short of human-level proficiency in fine-grained pedagogical reasoning.
Abstract
The growing emphasis on 21st-century competencies in postsecondary education, intensified by the transformative impact of generative AI, underscores the need to evaluate how these competencies are embedded in curricula and how effectively academic programs align with evolving workforce and societal demands. Curricular Analytics, particularly recent generative AI-powered approaches, offers a promising data-driven pathway. However, analyzing 21st-century competencies requires pedagogical reasoning beyond surface-level information retrieval, and the capabilities of large language models (LLMs) in this context remain underexplored. In this study, we extend prior curricular analytics research by examining a broader range of curriculum documents, competency frameworks, and models. Using 7,600 manually annotated curriculum-competency alignment scores, we assess the informativeness of different curriculum sources, benchmark general-purpose LLMs on curriculum-to-competency mapping, and analyze their error patterns. We further introduce a reasoning-based prompting strategy, Curricular CoT, to strengthen LLMs' pedagogical reasoning. Our results show that detailed instructional activity descriptions are the most informative type of curriculum document for competency analytics. Open-weight LLMs achieve accuracy comparable to proprietary models on coarse-grained tasks, demonstrating their scalability and cost-effectiveness for institutional use. However, no model reaches human-level precision in fine-grained pedagogical reasoning. Our proposed Curricular CoT yields modest improvements by reducing bias toward instructional keywords and improving the detection of nuanced pedagogical evidence in long texts. Together, these findings highlight the untapped potential of institutional curriculum documents and provide an empirical foundation for advancing AI-driven curricular analytics.
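To make the idea of a reasoning-based prompt concrete, the sketch below shows one plausible shape a Curricular CoT prompt could take: the model is asked to walk through pedagogical evidence step by step before emitting an alignment score. The abstract does not specify the actual template, so the step wording, the 0-3 score scale, and the function name `build_curricular_cot_prompt` are all illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch of a "Curricular CoT"-style prompt builder.
# The step names, score scale, and phrasing are illustrative assumptions;
# the paper's actual prompt template is not given in the abstract.

def build_curricular_cot_prompt(curriculum_document: str, competency: str) -> str:
    """Assemble a chain-of-thought prompt that asks an LLM to reason through
    pedagogical evidence before scoring curriculum-competency alignment."""
    steps = [
        "1. Summarize the instructional activities described in the document.",
        "2. Cite concrete pedagogical evidence for the competency, "
        "not just keyword matches.",
        "3. Judge whether that evidence reflects genuine skill development.",
        "4. Output an alignment score from 0 (absent) to 3 (central focus).",
    ]
    return (
        "You are analyzing a university curriculum document.\n\n"
        f"Curriculum document:\n{curriculum_document}\n\n"
        f"Target competency: {competency}\n\n"
        "Reason step by step:\n" + "\n".join(steps)
    )

# Usage example with a made-up course excerpt and competency label.
prompt = build_curricular_cot_prompt(
    "Students design, execute, and present a group data-analysis project, "
    "with peer feedback after each milestone.",
    "Collaboration",
)
```

The resulting string would then be sent to whichever open-weight or proprietary model is being benchmarked; separating reasoning steps from the final score is what distinguishes this style of prompt from plain keyword-oriented classification.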