🤖 AI Summary
Machine learning (ML) notebooks often exhibit low code quality: violations of Python coding conventions, poor structural organization, and ML-specific defects such as non-reproducibility and API misuse. Existing static analysis tools typically address only one of these levels and struggle to capture ML-specific semantics.
Method: We propose Vespucci Linter, a metamodel-based, multi-level static analysis tool for ML notebooks. Built on the Moose platform, it represents notebook structure and Python code entities in a single metamodel, which enables context-aware analysis across levels.
Contribution/Results: Vespucci Linter covers three analysis levels (general Python coding practices, notebook organization, and ML-specific semantics) through 22 linting rules derived from the literature. Applied to 5,000 Kaggle notebooks, it found violations at all three levels, which supports the multi-level approach and indicates the tool's potential to improve notebook quality and reliability.
📝 Abstract
Machine Learning (ML) code, particularly within notebooks, often exhibits lower quality compared to traditional software. Bad practices arise at three distinct levels: general Python coding conventions, the organizational structure of the notebook itself, and ML-specific aspects such as reproducibility and correct API usage. However, existing analysis tools typically focus on only one of these levels and struggle to capture ML-specific semantics, limiting their ability to detect issues. This paper introduces Vespucci Linter, a static analysis tool with multi-level capabilities, built on Moose and designed to address this challenge. Leveraging a metamodeling approach that unifies the notebook's structural elements with Python code entities, our linter enables a more contextualized analysis to identify issues across all three levels. We implemented 22 linting rules derived from the literature and applied our tool to a corpus of 5,000 notebooks from the Kaggle platform. The results reveal violations at all levels, validating the relevance of our multi-level approach and demonstrating Vespucci Linter's potential to improve the quality and reliability of ML development in notebook environments.
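To make the three levels concrete, here is a minimal Python sketch with one check per level over a notebook's JSON structure. It is not Vespucci Linter's implementation (which is built on Moose), and the three rules shown are hypothetical illustrations, not necessarily among the tool's 22 rules. The function name `lint_notebook` and the finding format are assumptions made for this example.

```python
import ast


def lint_notebook(nb: dict) -> list[tuple[str, int, str]]:
    """Return (level, cell_index, message) findings for a notebook dict.

    `nb` is assumed to follow the Jupyter notebook JSON layout:
    a "cells" list whose items carry "cell_type", "source",
    and (for code cells) "execution_count".
    """
    findings = []
    code_cells = [
        (i, cell)
        for i, cell in enumerate(nb.get("cells", []))
        if cell.get("cell_type") == "code"
    ]

    # Notebook level (hypothetical rule): execution counts should not
    # decrease from top to bottom, otherwise cells were run out of order.
    highest_seen = 0
    for i, cell in code_cells:
        count = cell.get("execution_count")
        if count is None:
            continue  # cell was never executed
        if count < highest_seen:
            findings.append(("notebook", i, "cell executed out of order"))
        highest_seen = max(highest_seen, count)

    for i, cell in code_cells:
        source = cell.get("source", "")
        if isinstance(source, list):  # notebooks may store source as lines
            source = "".join(source)
        try:
            tree = ast.parse(source)
        except SyntaxError:
            continue  # e.g. IPython magics that are not plain Python

        for node in ast.walk(tree):
            # Python level (hypothetical rule): avoid wildcard imports.
            if isinstance(node, ast.ImportFrom) and any(
                alias.name == "*" for alias in node.names
            ):
                findings.append(("python", i, "wildcard import"))

            # ML level (hypothetical rule): a data split without a fixed
            # random_state is not reproducible.
            if isinstance(node, ast.Call):
                func = node.func
                if isinstance(func, ast.Attribute):
                    name = func.attr
                else:
                    name = getattr(func, "id", None)
                has_seed = any(kw.arg == "random_state" for kw in node.keywords)
                if name == "train_test_split" and not has_seed:
                    findings.append(
                        ("ml", i, "train_test_split without random_state")
                    )
    return findings


# A tiny in-memory notebook that triggers one finding per level.
example = {
    "cells": [
        {"cell_type": "code", "execution_count": 2,
         "source": "from numpy import *\n"},
        {"cell_type": "markdown", "source": "# Split"},
        {"cell_type": "code", "execution_count": 1,
         "source": [
             "from sklearn.model_selection import train_test_split\n",
             "parts = train_test_split(data, test_size=0.2)\n",
         ]},
    ]
}

for level, cell_index, message in lint_notebook(example):
    print(f"[{level}] cell {cell_index}: {message}")
```

The notebook-level check needs cell metadata while the other two need the parsed code, which hints at why a representation combining both, as the abstract describes, allows more contextualized rules than a plain Python linter run on exported code.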