Evaluating Large Language Models on Computer Science University Exams in Data Structures

📅 2026-04-25
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study evaluates how well large language models solve university-level data structures exam questions. The authors construct a new benchmark dataset of closed and multiple-choice questions drawn from the Data Structures course at Tel Aviv University and assess four models on it: GPT-4o, Claude 3.5, Mathstral 7B, and LLaMA 3 8B. The findings show the current performance and limitations of these models on core computer science problems and offer insight into the capabilities of large language models in CS education.

📝 Abstract
We present a comprehensive evaluation of Large Language Models (LLMs) on Computer Science (CS) Data Structures examination questions. Our work introduces a new benchmark dataset comprising exam questions from Tel Aviv University (TAU), curated to assess LLMs' abilities in handling closed and multiple-choice questions. We evaluated two popular LLMs, OpenAI's GPT-4o and Anthropic's Claude 3.5, alongside two smaller LLMs, Mathstral 7B and LLaMA 3 8B, on the TAU exams benchmark. Our findings provide insight into the current capabilities of LLMs in CS education.

Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Computer Science Education
Data Structures
Exam Evaluation
Benchmark Dataset

Innovation

Methods, ideas, or system contributions that make the work stand out.

benchmark dataset
Large Language Models
Computer Science education
Data Structures
exam evaluation

Edan Gabay
Blavatnik School of Computer Science and AI, Tel Aviv University
Yael Maoz
Blavatnik School of Computer Science and AI, Tel Aviv University
Jonathan Stahl
Blavatnik School of Computer Science and AI, Tel Aviv University
Naama Maoz
Blavatnik School of Computer Science and AI, Tel Aviv University
Abdo Amer
Blavatnik School of Computer Science and AI, Tel Aviv University
Orr Eilat
Blavatnik School of Computer Science and AI, Tel Aviv University
Hanoch Levy
Blavatnik School of Computer Science and AI, Tel Aviv University
Michal Kleinbort
Blavatnik School of Computer Science and AI, Tel Aviv University
Amir Rubinstein
Blavatnik School of Computer Science and AI, Tel Aviv University
Adi Haviv
Tel Aviv University
Natural Language Processing, Machine Learning, Artificial Intelligence