🤖 AI Summary
This study addresses the lack of transparency and robustness in the Kandidattest, a widely used Danish voting advice application whose algorithm exhibits significant sensitivity to minor parameter adjustments, potentially misleading voters through unstable candidate recommendations. For the first time, this work conducts a fine-grained robustness audit of a real-world Nordic voting advice system by simulating questionnaire responses to systematically evaluate output stability under variations in question weights and counts. Integrating algorithmic auditing, synthetic data generation, and sensitivity analysis, the research quantifies the system's dependence on input and configuration changes, revealing pronounced instability. These findings raise serious concerns about the reliability of such tools as democratic decision-support instruments.
📝 Abstract
Voting Advice Applications (VAAs) are tools designed to help voters compare political candidates on policy preferences ahead of elections. VAAs are popular in European countries and in other countries with multi-party democratic systems. Through a freedom of information request, we gained access to the inner workings of a popular Danish VAA called the Kandidattest, which is run by a major Danish news outlet and has been used for general, municipal, and European elections. Users and politicians from every political party answer the same online questionnaire and are matched based on the agreement percentage computed from their answers. VAAs play a significant role in elections: 45% of surveyed voters reported following the tool's recommendations in the most recent Danish general election. Nevertheless, the inner workings of VAAs have not been thoroughly evaluated. We conduct an algorithmic audit of the Kandidattest's robustness, using simulated responses to investigate the tool's brittleness with respect to minor adjustments of the algorithm's weights and changes in the number of questions in the questionnaire. We find that the algorithm is not robust enough for users to trust the agreement percentages it outputs: small changes to the algorithm can lead to different results, potentially affecting election outcomes.
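To make the weight-sensitivity concern concrete, the following sketch shows a generic weighted agreement-percentage match of the kind VAAs commonly use. The scoring formula, Likert scale, and weights here are illustrative assumptions, not the Kandidattest's actual algorithm; the point is only that a small change in per-question weights can flip which candidate ranks first.

```python
def agreement_percentage(voter, candidate, weights=None):
    """Hypothetical weighted agreement score in [0, 100].

    voter, candidate: lists of answers on a 1-5 Likert scale.
    weights: optional per-question weights (uniform by default).
    NOTE: an illustrative formula, not the Kandidattest's.
    """
    n = len(voter)
    if weights is None:
        weights = [1.0] * n
    max_dist = 4  # largest possible gap on a 1-5 scale
    score = sum(w * (1 - abs(v - c) / max_dist)
                for v, c, w in zip(voter, candidate, weights))
    return 100 * score / sum(weights)


voter = [1, 5, 3, 2]
cand_a = [1, 4, 3, 5]  # matches exactly on Q1, far off on Q4
cand_b = [2, 5, 2, 3]  # mildly off everywhere

# Uniform weights: candidate B ranks ahead of A.
print(agreement_percentage(voter, cand_a))  # → 75.0
print(agreement_percentage(voter, cand_b))  # → 81.25

# Upweighting a single question (Q1) reverses the ranking.
w = [3, 1, 1, 1]
print(round(agreement_percentage(voter, cand_a, w), 1))  # → 83.3
print(round(agreement_percentage(voter, cand_b, w), 1))  # → 79.2
```

Even in this toy setting, tripling one question's weight reorders the top recommendation, which is precisely the kind of instability the audit probes at scale with simulated responses.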