🤖 AI Summary
This study addresses the underexplored issue of gender bias in AI-powered code generation tools (CGTs). Using a mixed-subjects experimental design, we systematically investigate how gender moderates CGT usage across three dimensions: task performance (completion time, code correctness), subjective cognitive load (measured via the NASA-TLX scale), and fine-grained interaction behaviors (captured via screen recording and behavioral log analysis). As this is a study proposal, no results are reported yet; we hypothesize that gender differences may emerge, for example in cognitive load during complex programming tasks, in tool-reliance patterns, and in the efficiency of specific interaction pathways. To our knowledge, this is the first empirical study to rigorously examine fairness and inclusivity in CGT usage from a gender perspective. The anticipated findings would provide data-driven insights for improving prompt engineering, feedback mechanisms, and UI/UX adaptation, ultimately advancing equitable, human-AI collaborative software development practices.
📝 Abstract
**Context:** The increasing reliance on Code Generation Tools (CGTs), such as Windsurf and GitHub Copilot, is revamping programming workflows and raising critical questions about fairness and inclusivity. While CGTs offer potential productivity enhancements, their effectiveness across diverse user groups has not been sufficiently investigated.
**Objectives:** We hypothesize that developers' interactions with CGTs vary by gender, influencing task outcomes and cognitive load, as prior research suggests that gender differences can affect technology use and cognitive processing.
**Methods:** The study will employ a mixed-subjects design with 54 participants, evenly divided by gender. Participants will complete two programming tasks (medium to hard difficulty), one with only CGT assistance and one with only internet access. Task orders and conditions will be counterbalanced to mitigate order effects. Data collection will include cognitive-load surveys, screen recordings, and task-performance metrics such as completion time, code correctness, and CGT interaction behaviors. Statistical analyses will be conducted to identify statistically significant differences in CGT usage.
**Expected Contributions:** Our work can uncover gender differences in CGT interaction and performance among developers. Our findings can inform future CGT designs and help address usability issues and potential disparities in interaction patterns across diverse user groups.
**Conclusion:** While results are not yet available, our proposal lays the groundwork for advancing fairness, accountability, transparency, and ethics (FATE) in CGT design. The outcomes are anticipated to contribute to inclusive AI practices and equitable tool development for all users.
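For illustration, the counterbalancing scheme described in Methods (two tasks, two conditions, orders crossed to mitigate order effects) could be sketched as follows. The task labels, group construction, and cyclic assignment here are hypothetical assumptions for exposition, not the study's actual protocol.

```python
from itertools import product

# Illustrative labels (assumed; the study's real task names are not specified).
TASKS = ("TaskA", "TaskB")
CONDITIONS = ("CGT-only", "Internet-only")


def counterbalanced_groups():
    """Build all 4 (task order x condition order) pairings.

    Each group is a per-participant plan: a tuple of
    (task, condition) steps performed in sequence.
    """
    groups = []
    for task_order, cond_order in product(
        [TASKS, TASKS[::-1]], [CONDITIONS, CONDITIONS[::-1]]
    ):
        groups.append(tuple(zip(task_order, cond_order)))
    return groups


def assign(participants_by_gender):
    """Cycle participants through the 4 groups within each gender stratum,
    so counterbalancing is balanced per gender (with 27 per gender, group
    sizes within a stratum differ by at most one)."""
    groups = counterbalanced_groups()
    assignment = {}
    for _, participants in participants_by_gender.items():
        for i, pid in enumerate(participants):
            assignment[pid] = groups[i % len(groups)]
    return assignment
```

Stratifying the rotation by gender keeps the order/condition pairings evenly spread across both groups, so order effects are not confounded with the gender comparison of interest.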