🤖 AI Summary
This study addresses insufficient color contrast on web pages, an accessibility barrier for users with low vision. Using Common Crawl's WARC archives rather than a live crawl, the authors run a static CSS analysis of homepages drawn from the 500 most frequently crawled registered domains (240 homepages analyzed), computing foreground/background contrast ratios and comparing them against the WCAG 2.1/2.2 Level AA threshold of 4.5:1 for normal text. Of 4,327 unique color pairings, 1,771 (40.9%) fall below this threshold. The median per-site pass rate is 62.7%, and only 20.4% of sites are fully compliant across all detected pairings. The results indicate that inadequate color contrast remains widespread on prominent websites, and the archive-based approach makes the audit reproducible without placing any load on the audited servers.
📝 Abstract
We present a large-scale automated audit of WCAG 2.1/2.2 Level AA colour contrast compliance across the 500 most frequently crawled registered domains in Common Crawl's CC-MAIN-2026-08 (February 2026) crawl archive. Rather than conducting a live crawl, we sourced all page content from Common Crawl's open WARC archives, ensuring reproducibility and eliminating any load on target web servers. Our static CSS analysis of 240 homepages identified 4,327 unique foreground/background colour pairings, of which 1,771 (40.9%) failed to meet the 4.5:1 contrast ratio threshold for normal text. The median per-site pass rate was 62.7%, with 20.4% of sites achieving full compliance across all detected colour pairings. These findings suggest that colour contrast remains a widespread accessibility barrier on the most prominent websites, with significant variation across domain categories.
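For readers unfamiliar with the 4.5:1 threshold, the check applied to each colour pairing can be sketched as follows. This is a minimal illustration of the WCAG 2.x relative luminance and contrast ratio formulas, not the authors' pipeline: the function names (`relative_luminance`, `contrast_ratio`, `passes_aa_normal_text`) and the restriction to six-digit hex colours are assumptions made for the example.

```python
# Illustrative sketch of the WCAG 2.x contrast check (not the authors' code).
# Function names and the hex-only input format are assumptions for this example.

AA_NORMAL_TEXT = 4.5  # WCAG Level AA threshold for normal-size text


def _linearize(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel_8bit / 255.0
    if c <= 0.03928:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a '#rrggbb' colour (0.0 for black, 1.0 for white)."""
    h = hex_colour.lstrip("#")
    if len(h) != 6:
        raise ValueError(f"expected a six-digit hex colour, got {hex_colour!r}")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio (L_lighter + 0.05) / (L_darker + 0.05), ranging from 1 to 21."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa_normal_text(foreground: str, background: str) -> bool:
    """True if the pairing meets the 4.5:1 Level AA threshold for normal text."""
    return contrast_ratio(foreground, background) >= AA_NORMAL_TEXT


if __name__ == "__main__":
    # Two near-identical greys on white fall on opposite sides of the threshold.
    for fg in ("#767676", "#777777"):
        ratio = contrast_ratio(fg, "#ffffff")
        print(fg, round(ratio, 2), passes_aa_normal_text(fg, "#ffffff"))
```

The two greys in the usage example differ by a single step per channel, yet `#767676` on white passes (about 4.54:1) while `#777777` fails (about 4.48:1). A static audit of this kind also has to decide which foreground and background declarations actually apply to the same element, which this sketch does not attempt.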