Integrating Large Language Models in Software Engineering Education: A Pilot Study through GitHub Repositories Mining

📅 2025-09-05

🤖 AI Summary
This study addresses the responsible integration of large language models (LLMs) into software engineering education. In a pilot repository mining study, we analyzed README files and issue discussions from 400 GitHub projects, checking for the motivators and demotivators previously synthesized in a literature review. The analysis yields an early empirical validation of the motivations for and risks associated with LLM adoption in teaching contexts. The most frequent motivators are engagement and motivation (227 hits), software engineering process understanding (133 hits), and programming assistance and debugging support (97 hits); the most frequent demotivators are plagiarism and IP concerns (385 hits), security, privacy and data integrity (87 hits), and over-reliance on AI in learning (39 hits). Notably, some demotivators posited in the literature, such as challenges in evaluating learning outcomes and difficulty in curriculum redesign, recorded no hits. By highlighting gaps between research and practice, this work lays the foundation for a comprehensive framework to guide responsible LLM integration in software engineering education.

📝 Abstract
Context: Large Language Models (LLMs) such as ChatGPT are increasingly adopted in software engineering (SE) education, offering both opportunities and challenges. Their adoption requires systematic investigation to ensure responsible integration into curricula. Objective: This doctoral research aims to develop a validated framework for integrating LLMs into SE education through a multi-phase process that includes taxonomy development, empirical investigation, and case studies. This paper presents the first empirical step. Method: We conducted a pilot repository mining study of 400 GitHub projects, analyzing README files and issue discussions to identify the presence of the motivators and demotivators previously synthesized in our literature review [8]. Results: Motivators such as engagement and motivation (227 hits), software engineering process understanding (133 hits), and programming assistance and debugging support (97 hits) were strongly represented. Demotivators, including plagiarism and IP concerns (385 hits), security, privacy and data integrity (87 hits), and over-reliance on AI in learning (39 hits), also appeared prominently. In contrast, demotivators such as challenges in evaluating learning outcomes and difficulty in curriculum redesign recorded no hits across the repositories. Conclusion: The study provides early empirical validation of the motivator and demotivator taxonomies with respect to their themes, highlights gaps between research and practice, and lays the foundation for developing a comprehensive framework to guide the responsible adoption of LLMs in SE education.
Problem

Research questions and friction points this paper is trying to address.

Developing a framework for integrating LLMs into software engineering education
Identifying motivators and demotivators of LLM adoption through GitHub mining
Ensuring responsible integration of AI tools in educational curricula
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mining GitHub repositories for empirical data
Analyzing README files and issue discussions
Developing a validated framework for LLM integration
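
The hit counts reported in the abstract come from checking README and issue text against themes from the taxonomy. The page does not describe the matching procedure, so the following is only a minimal sketch of one plausible approach, assuming simple case-insensitive keyword matching and counting every keyword occurrence as one hit. The keyword lists and the function name `count_theme_hits` are hypothetical, chosen for illustration, and are not taken from the paper.

```python
import re
from collections import Counter

# Hypothetical keyword lists per theme (assumption: the actual search
# terms used in the study are not given on this page).
THEME_KEYWORDS = {
    "engagement and motivation": ["engagement", "motivation"],
    "plagiarism and IP concerns": ["plagiarism", "copyright"],
    "security, privacy and data integrity": ["security", "privacy"],
}


def count_theme_hits(documents, theme_keywords=THEME_KEYWORDS):
    """Count keyword occurrences per theme across README/issue texts.

    Assumption: one "hit" is one whole-word, case-insensitive keyword
    occurrence. The study may define a hit differently.
    """
    # Build one whole-word, case-insensitive pattern per theme.
    patterns = {
        theme: re.compile(
            r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
            re.IGNORECASE,
        )
        for theme, keywords in theme_keywords.items()
    }
    # Start every theme at zero so themes with no hits still appear.
    hits = Counter({theme: 0 for theme in theme_keywords})
    for text in documents:
        for theme, pattern in patterns.items():
            hits[theme] += len(pattern.findall(text))
    return hits


if __name__ == "__main__":
    # Made-up sample texts standing in for README and issue content.
    sample_docs = [
        "We use ChatGPT to improve student engagement in the lab.",
        "Issue: possible plagiarism in submitted code; privacy of logs?",
    ]
    for theme, count in count_theme_hits(sample_docs).items():
        print(f"{theme}: {count}")
```

Themes are initialized to zero so that categories with no matches, like the demotivators that recorded no hits in the study, still show up in the output rather than silently disappearing.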