🤖 AI Summary
Prior work lacks a systematic, empirically grounded definition of code comment quality and practical AI-driven strategies for improving it. Method: This study introduces a multi-dimensional comment quality assessment framework, derived through grounded-theory analysis of an empirical software engineering study, that encompasses axes such as accuracy, readability, and informativeness. It proposes a large language model (LLM)-based comment rewriting approach built on GPT-4o and applies knowledge distillation to compress it into a lightweight, locally deployable model, balancing performance with data privacy and custody. Contribution/Results: Experiments demonstrate statistically significant improvements in comment quality across all dimensions. All datasets, source code, and models are publicly released under open-source licenses to ensure reproducibility and transparency.
📝 Abstract
This paper describes an approach to improving code comments along several quality axes by rewriting them with customized Artificial Intelligence (AI)-based tools. We first conduct an empirical study, followed by a grounded-theory qualitative analysis, to determine which quality axes to target. We then propose a procedure that uses a Large Language Model (LLM) to rewrite existing code comments along those axes. We implement the procedure with GPT-4o and distil the results into a smaller model that can be run in-house, so users maintain custody of their data. We evaluate both the GPT-4o-based approach and the distilled model, showing that our procedure improves code comments along the targeted quality axes. We release all data and source code in an online repository for reproducibility.