Rethinking AI Evaluation in Education: The TEACH-AI Framework and Benchmark for Generative AI Assistants

📅 2025-11-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current AI-in-education evaluations over-rely on technical metrics (e.g., accuracy), neglecting learner agency, situated learning processes, and ethical dimensions. To address this gap, we propose TEACH-AI, a socio-technical assessment framework derived from an extensive literature review and synthesis. TEACH-AI introduces ten cross-cutting dimensions spanning pedagogy, stakeholder engagement, ethics, and quantifiable indicators, moving beyond purely technical evaluation. It emphasizes value alignment, co-design, and contextual adaptation, and is accompanied by a reusable toolkit supporting iterative development and empirical validation of generative AI teaching assistants in authentic classroom settings. The framework provides designers, developers, and policymakers with a theoretically grounded yet practically actionable assessment guide. By integrating educational theory, multi-stakeholder requirements, and measurable outcomes, TEACH-AI advances responsible, trustworthy, and sustainable AI deployment in education.

📝 Abstract

As generative artificial intelligence (AI) continues to transform education, most existing AI evaluations rely primarily on technical performance metrics such as accuracy or task efficiency while overlooking human identity, learner agency, contextual learning processes, and ethical considerations. In this paper, we present TEACH-AI (Trustworthy and Effective AI Classroom Heuristics), a domain-independent, pedagogically grounded, and stakeholder-aligned framework with measurable indicators and a practical toolkit for guiding the design, development, and evaluation of generative AI systems in educational contexts. Built on an extensive literature review and synthesis, the ten-component assessment framework and toolkit checklist provide a foundation for scalable, value-aligned AI evaluation in education. TEACH-AI rethinks "evaluation" through sociotechnical, educational, theoretical, and applied lenses, engaging designers, developers, researchers, and policymakers across AI and education. Our work invites the community to reconsider what constitutes "effective" AI in education and to design model evaluation approaches that promote co-creation, inclusivity, and long-term human, social, and educational impact.

Problem

Research questions and friction points this paper is trying to address.

Addresses the lack of human-centered metrics in evaluations of AI in education

Introduces a framework for assessing generative AI's educational impact

Promotes inclusive, ethical AI design for learning contexts

Innovation

Methods, ideas, or system contributions that make the work stand out.

Framework with measurable indicators and toolkit

Ten-component assessment for scalable AI evaluation

Sociotechnical lenses to rethink educational AI evaluation

🔎 Similar Papers

Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach