EduRABSA: An Education Review Dataset for Aspect-based Sentiment Analysis Tasks

📅 2025-08-23
🤖 AI Summary
The education domain lacks high-quality, publicly available aspect-based sentiment analysis (ABSA) datasets, which hinders fine-grained opinion mining from student feedback such as course, teaching staff, and university reviews. To address this, we introduce EduRABSA, the first public, annotated ABSA dataset of English-language education reviews. It covers these three review subject types and all main ABSA tasks, including the under-explored extraction of implicit aspects and implicit opinions. We also share ASQE-DPT, an offline, lightweight, installation-free manual annotation tool that generates labelled datasets for multiple ABSA tasks from a single-task annotation. The dataset, the annotation tool, and the scripts and statistics for dataset processing and sampling are publicly available on GitHub. Together, these resources remove the dataset barrier for education review analysis and support transparency and reproducibility in ABSA research for this under-resourced domain.

📝 Abstract
Every year, most educational institutions seek and receive an enormous volume of text feedback from students on courses, teaching, and overall experience. Yet, turning this raw feedback into useful insights is far from straightforward. It has been a long-standing challenge to adopt automatic opinion mining solutions for such education review text data due to the content complexity and low-granularity reporting requirements. Aspect-based Sentiment Analysis (ABSA) offers a promising solution with its rich, sub-sentence-level opinion mining capabilities. However, existing ABSA research and resources are very heavily focused on the commercial domain. In education, they are scarce and hard to develop due to limited public datasets and strict data protection. A high-quality, annotated dataset is urgently needed to advance research in this under-resourced area. In this work, we present EduRABSA (Education Review ABSA), the first public, annotated ABSA education review dataset that covers three review subject types (course, teaching staff, university) in the English language and all main ABSA tasks, including the under-explored implicit aspect and implicit opinion extraction. We also share ASQE-DPT (Data Processing Tool), an offline, lightweight, installation-free manual data annotation tool that generates labelled datasets for comprehensive ABSA tasks from a single-task annotation. Together, these resources contribute to the ABSA community and education domain by removing the dataset barrier, supporting research transparency and reproducibility, and enabling the creation and sharing of further resources. The dataset, annotation tool, and scripts and statistics for dataset processing and sampling are available at https://github.com/yhua219/edurabsa_dataset_and_annotation_tool.
Problem

Research questions and friction points this paper is trying to address.

Lack of annotated datasets for education review sentiment analysis
Difficulty in mining fine-grained, sub-sentence-level opinions from complex educational feedback
Scarcity of ABSA resources focused on education domain
Innovation

Methods, ideas, or system contributions that make the work stand out.

First public annotated ABSA education dataset
Offline lightweight manual annotation tool
Supports implicit aspect and opinion extraction
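
The idea of generating labels for several ABSA tasks from one annotation pass can be illustrated with a minimal sketch. It assumes that ASQE denotes quadruple-level annotation (aspect term, aspect category, opinion term, sentiment) and that implicit terms are marked with a `"null"` placeholder; neither detail is confirmed by this summary. The `Quad` type, the `derive_tasks` function, the task abbreviations, the category names, and the example review are all hypothetical and are not taken from ASQE-DPT or the EduRABSA dataset.

```python
from typing import NamedTuple

IMPLICIT = "null"  # assumed placeholder for an implicit aspect or opinion


class Quad(NamedTuple):
    """One hypothetical quadruple annotation for a review sentence."""
    aspect: str     # aspect term as written in the text, or IMPLICIT
    category: str   # aspect category (illustrative names only)
    opinion: str    # opinion term as written in the text, or IMPLICIT
    sentiment: str  # "positive", "negative", or "neutral"


def derive_tasks(quads):
    """Project quadruples onto label sets for simpler ABSA tasks."""
    return {
        # Aspect term extraction: explicit aspect terms only.
        "ATE": sorted({q.aspect for q in quads if q.aspect != IMPLICIT}),
        # Opinion term extraction: explicit opinion terms only.
        "OTE": sorted({q.opinion for q in quads if q.opinion != IMPLICIT}),
        # Aspect-opinion pairs, keeping implicit entries.
        "AOPE": [(q.aspect, q.opinion) for q in quads],
        # Aspect-opinion-sentiment triplets.
        "ASTE": [(q.aspect, q.opinion, q.sentiment) for q in quads],
        # Category-level sentiment, independent of the terms.
        "ACSA": sorted({(q.category, q.sentiment) for q in quads}),
    }


# Made-up review used only for illustration.
review = "The lectures were engaging, but I barely slept during exam week."
quads = [
    Quad("lectures", "Course - Content", "engaging", "positive"),
    Quad(IMPLICIT, "Course - Workload", IMPLICIT, "negative"),
]
labels = derive_tasks(quads)
```

Because every simpler task is a projection of the same quadruples, the derived label sets stay mutually consistent, which is the kind of reproducibility benefit the abstract attributes to single-task annotation.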
Yan Cathy Hua
School of Computer Science, University of Auckland, New Zealand
Paul Denny
Professor, University of Auckland
Educational technology, Computer Science Education
Jörg Wicker
School of Computer Science, University of Auckland, New Zealand
Katerina Taskova
School of Computer Science, University of Auckland, New Zealand