🤖 AI Summary
This study investigates whether large language models (LLMs) can help bridge environmental knowledge gaps among undergraduate students. To this end, we systematically evaluate GPT-4, Gemini 1.5, Claude 3 Sonnet, and Llama 2 using the standardized Environmental Knowledge Test (EKT-19) and domain-specific questions, in what is presented as the first multi-model benchmark assessment for environmental education. Evaluation criteria include knowledge coverage, answer accuracy, and pedagogical appropriateness. Results indicate that mainstream LLMs possess robust and broad foundational knowledge in environmental science, making them suitable for supporting introductory instruction; however, persistent factual inaccuracies and contextual misapplications mean that expert verification remains necessary. Our key contribution is this dedicated benchmark framework, which empirically delineates the capabilities and limitations of AI-assisted teaching. This work provides evidence-based guidance and methodological foundations for designing and deploying educational AI tools.
📝 Abstract
This research investigates the potential of Artificial Intelligence (AI) models to bridge the knowledge gap in environmental education among university students. Focusing on prominent large language models (LLMs) such as GPT-3.5, GPT-4, GPT-4o, Gemini, Claude Sonnet, and Llama 2, the study assesses how effectively they convey environmental concepts and, consequently, facilitate environmental education. The investigation employs a standardized tool, the Environmental Knowledge Test (EKT-19), supplemented by targeted questions, to compare the environmental knowledge of university students with the responses generated by the AI models. The results suggest that, while AI models possess a vast, readily accessible, and valid knowledge base with the potential to empower both students and academic staff, a human specialist in environmental sciences may still be necessary to validate the accuracy of the information provided.