🤖 AI Summary
This study examines the quality and privacy of programming-related search results on mainstream search engines. It reports a relatively large-scale quantitative comparison of Google Search, Microsoft Bing, and Apple Search, conducted in October 2023 with 1,467 coding-related queries; according to the authors, Apple Search had not been examined in previous research. Search quality is assessed with two independent metrics: the average rank of the first Stack Overflow result, used as a proxy for the quality of coding advice under the assumption that Stack Overflow gives the best advice, and the number of trackers on the first search result, used as a measure of privacy. The results suggest that Bing performs best on both metrics: the privacy of its search results is higher than on Google and Apple, and the quality of coding advice, as measured by the Stack Overflow ranking, is also highest on Bing. The work aims to inform how search quality can be assessed and to motivate follow-up research on the topic.
📝 Abstract
Although it is currently being challenged by ChatGPT and other large language models (LLMs), Google Search remains one of the primary means for many individuals to find information on the internet. Interestingly, the way that we retrieve information on the web has hardly changed since Google was established in 1998, raising concerns about Google's dominance in search and the lack of competition. If the market for search were sufficiently competitive, we should probably see a steady increase in search quality over time, as well as alternatives to Google's approach to search. However, hardly any research has so far examined search quality, a key facet of a competitive market, and even less has examined it over time. In this report, we conducted a relatively large-scale quantitative comparison of search quality for 1,467 search queries relating to coding advice in October 2023. We focus on coding advice because general search quality is difficult to study; our aim is to learn more about the assessment of search quality and to motivate follow-up research into this important topic. We evaluate the search quality of Google Search, Microsoft Bing, and Apple Search, with a special emphasis on Apple Search, a widely used search engine that has never been explored in previous research. To assess search quality, we use two independent metrics: 1) the number of trackers on the first search result, as a measure of privacy in web search, and 2) the average rank of the first Stack Overflow search result, under the assumption that Stack Overflow gives the best coding advice. Our results suggest that the privacy of search results is higher on Bing than on Google and Apple. Similarly, the quality of coding advice, as measured by the average rank of the first Stack Overflow result, was highest on Bing.
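To make the second metric concrete, the following is a minimal Python sketch of how an average rank of the first Stack Overflow result might be computed for one search engine. It is an illustration under stated assumptions, not the authors' implementation: the function names, the example URLs, and the choice to skip queries that return no Stack Overflow result are all hypothetical, since the abstract does not specify these details.

```python
from statistics import mean
from urllib.parse import urlparse


def first_stack_overflow_rank(result_urls):
    """Return the 1-based rank of the first stackoverflow.com result, or None."""
    for rank, url in enumerate(result_urls, start=1):
        host = (urlparse(url).hostname or "").lower()
        if host == "stackoverflow.com" or host.endswith(".stackoverflow.com"):
            return rank
    return None


def average_first_rank(results_per_query):
    """Average the first Stack Overflow rank over queries that have one.

    Assumption: queries with no Stack Overflow result are skipped, because
    the abstract does not say how such queries are treated.
    """
    ranks = [first_stack_overflow_rank(urls) for urls in results_per_query]
    found = [r for r in ranks if r is not None]
    return mean(found) if found else None


# Hypothetical ordered result lists for two queries on one engine.
example = [
    ["https://docs.python.org/3/", "https://stackoverflow.com/questions/1"],
    ["https://stackoverflow.com/questions/2", "https://example.com/"],
]
print(average_first_rank(example))
```

Under this sketch, a lower average means Stack Overflow tends to appear nearer the top of the results. How missing results are handled (skipped here) can change the comparison between engines, so that choice would need to be stated alongside any reported numbers.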