🤖 AI Summary
This study addresses the lack of standardized, longitudinal assessment of programming proficiency in open-source Python development. We propose the first automated method to adapt the Common European Framework of Reference for Languages (CEFR) to Python code proficiency evaluation. Our approach parses GitHub commit histories and leverages the pycefr static analysis engine alongside syntactic and semantic feature extraction to map code snippets automatically to CEFR levels (A1–C2). Temporal modeling and interactive web-based visualization enable dynamic tracking of individual developers’ and project-level proficiency evolution. Key contributions are: (1) establishing the first CEFR-informed paradigm for code competency assessment; and (2) releasing an open-source, reproducible end-to-end analysis pipeline and demonstration system. Empirical evaluation across major Python repositories confirms the method’s validity, interpretability, and practical utility in quantifying skill progression over time.
📝 Abstract
Assessing developer proficiency in open-source software (OSS) projects is essential for understanding project dynamics, especially for identifying contributor expertise. This paper presents PyGress, a web-based tool designed to automatically evaluate and visualize Python code proficiency using pycefr, a Python code proficiency analyzer. Given a GitHub repository link, the system extracts commit histories, analyzes source code proficiency across CEFR-aligned levels (A1 to C2), and generates visual summaries of individual and project-wide proficiency. PyGress visualizes the proficiency distribution of each contributor and tracks the progression of project code proficiency over time, offering an interactive way to explore contributor coding levels in Python OSS repositories. A video demonstration of PyGress can be found at https://youtu.be/hxoeK-ggcWk, and the source code of the tool is publicly available at https://github.com/MUICT-SERU/PyGress.
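To make the core idea concrete, the sketch below shows one way a CEFR-style classifier for Python snippets could work: parse the code with the standard `ast` module and grade the snippet by the most advanced construct it contains. The `CONSTRUCT_LEVELS` mapping here is a hypothetical, simplified illustration; pycefr's actual rule set is far larger and its level assignments differ.

```python
import ast

# Hypothetical construct-to-CEFR mapping, loosely inspired by the idea of
# grading language constructs; NOT pycefr's actual rule set.
CONSTRUCT_LEVELS = {
    ast.Assign: "A1",
    ast.For: "A2",
    ast.FunctionDef: "B1",
    ast.Lambda: "B2",
    ast.ListComp: "B2",
    ast.ClassDef: "C1",
    ast.AsyncFunctionDef: "C2",
}
ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]

def snippet_level(source: str) -> str:
    """Return the highest CEFR level among the constructs in the snippet."""
    tree = ast.parse(source)
    best = "A1"
    for node in ast.walk(tree):
        level = CONSTRUCT_LEVELS.get(type(node))
        if level and ORDER.index(level) > ORDER.index(best):
            best = level
    return best

print(snippet_level("x = 1"))                                     # → A1
print(snippet_level("def f(xs):\n    return [x*x for x in xs]"))  # → B2
```

Run per file at each commit and aggregated per author, such a classifier yields the level distributions and temporal progression curves that PyGress visualizes.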