🤖 AI Summary
This work proposes automated classification approaches for aligning computer science course materials with the ACM/IEEE curriculum guidelines. It explores two kinds of techniques: traditional natural language processing tools (parsing, tagging, and text embeddings) and large language models. Manual evaluation of such alignment is time-consuming and cognitively demanding, taking about a day of work per course according to the authors' preliminary work. Evaluated on a corpus of pedagogical materials, the techniques classify documents meaningfully without manual effort, which could accelerate curriculum audits. By automating the mapping between course content and internationally recognized CS education standards, the approach aims to offer a scalable way to support quality assurance in computing education.
📝 Abstract
Professional societies often publish curriculum guidelines to help programs align their content with international standards. In Computer Science, the primary standard is published by ACM and IEEE and provides detailed guidelines for what should and could be included in a Computer Science program. While these guidelines are very helpful, it remains difficult for program administrators to assess how much of them a CS program covers. This is due in particular to the extensiveness of the guidelines, which contain thousands of individual items. As such, it is time-consuming and cognitively demanding to audit every course and confidently mark everything that is actually covered. Our preliminary work indicated that it takes about a day of work per course. In this work, we propose using Natural Language Processing techniques to accelerate the process. We explore two kinds of techniques: the first relies on traditional tools for parsing, tagging, and embeddings, while the second leverages Large Language Models. We evaluate the application of these techniques to classify a corpus of pedagogical materials and show that we can meaningfully classify documents automatically.
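As a rough illustration of the embedding-style route mentioned in the abstract, the sketch below matches a course document to the closest guideline item by cosine similarity over bag-of-words vectors. It is not the authors' pipeline: the item names and descriptions in `GUIDELINE_ITEMS` are hypothetical placeholders rather than text from the ACM/IEEE guidelines, and plain word counts stand in for whatever embeddings the paper actually uses.

```python
import math
import re
from collections import Counter

# Hypothetical guideline items for illustration only; these descriptions
# are NOT taken from the ACM/IEEE curriculum guidelines.
GUIDELINE_ITEMS = {
    "algorithms": "sorting searching graph algorithms complexity analysis",
    "operating_systems": "processes threads scheduling memory management file systems",
    "databases": "relational model sql queries transactions indexing",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(text):
    """Bag-of-words vector: a simple stand-in for a learned text embedding."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(document, items=GUIDELINE_ITEMS):
    """Return the best-matching item name and the score for every item."""
    doc_vec = vectorize(document)
    scores = {name: cosine(doc_vec, vectorize(desc)) for name, desc in items.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    label, scores = classify("Lecture 5: threads, scheduling, and memory management")
    print(label, scores)
```

In a realistic setting the count vectors would likely be replaced by dense sentence embeddings, and a similarity threshold would decide whether a document covers an item at all, since a single course document can touch many guideline items or none.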