🤖 AI Summary
This study conducts a third-party fairness audit of LinkedIn’s Talent Search ranking system, focusing on gender- and race-based ranking bias. Prior work predominantly assesses static exposure and neglects temporal dynamics; to address this limitation, we propose a two-dimensional fairness framework covering “static exposure” and “dynamic stability.” We quantify exposure inequality using deviation from group proportions and MinSkew, and evaluate stability via ranking volatility and positional persistence. Using large-scale, real-world Talent Search results collected across occupational queries, together with proxy-based inference of sensitive attributes, we perform an external, black-box audit. Empirical results reveal significant underrepresentation of women and racial minorities at top-ranking positions, coupled with higher ranking instability and shorter positional retention, indicating systemic disadvantages in ranking durability. To our knowledge, this is the first fairness evaluation of recruitment platforms to explicitly incorporate temporal dimensions, offering a novel paradigm and an empirical foundation for sustainable algorithmic fairness governance in hiring systems.
📝 Abstract
We conduct an independent, third-party audit of LinkedIn's Talent Search ranking system for bias, focusing on potential ranking bias across two attributes: gender and race. To do so, we first construct a dataset of rankings produced by the system, collecting extensive Talent Search results across a diverse set of occupational queries. We then develop a robust labeling pipeline that infers the two demographic attributes of interest for the returned users. To evaluate potential biases in the collected dataset of real-world rankings, we use two exposure disparity metrics: deviation from group proportions and MinSkew. Our analysis reveals an under-representation of minority groups in early ranks across many queries. We further examine potential causes of this disparity and discuss why it may be difficult or, in some cases, impossible to fully eliminate among the early ranks of queries. Beyond static metrics, we also investigate subgroup fairness over time, highlighting temporal disparities in exposure and retention, which are often more difficult to audit in practice. In employer recruiting platforms such as LinkedIn Talent Search, a candidate's persistence in the ranking over multiple days can directly affect the probability that the candidate is selected for opportunities. Our analysis reveals demographic disparities in this temporal stability, with some groups experiencing greater volatility in their ranked positions than others. We contextualize all our findings alongside LinkedIn's published self-audits of its Talent Search system and reflect on the methodological constraints of a black-box external evaluation, including limited observability and noisy demographic inference.
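As a rough illustration of the two exposure metrics named above, the following minimal sketch computes deviation from group proportions and MinSkew for the top k positions of a single ranking. It assumes the common definition of skew as the log ratio between a group's share in the top k and its share in a reference distribution, with MinSkew taken as the minimum over groups. The function names, the toy ranking, and the choice of the full result list as the reference distribution are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def group_proportions(labels):
    """Return each group's share of the given list of group labels."""
    counts = Counter(labels)
    total = len(labels)
    return {group: count / total for group, count in counts.items()}


def proportion_deviation(ranking, k, baseline):
    """Top-k share minus reference share, per group (negative = under-represented)."""
    top = group_proportions(ranking[:k])
    return {group: top.get(group, 0.0) - share for group, share in baseline.items()}


def min_skew(ranking, k, baseline, eps=1e-9):
    """Minimum over groups of ln(top-k share / reference share).

    eps is a small smoothing constant (an assumption of this sketch) that
    avoids log(0) when a group is absent from the top k.
    """
    top = group_proportions(ranking[:k])
    return min(
        math.log((top.get(group, 0.0) + eps) / (share + eps))
        for group, share in baseline.items()
    )


# Hypothetical ranking of inferred group labels, best position first.
ranking = ["M", "M", "M", "F", "M", "F", "F", "M", "F", "F"]
# Assumed reference distribution: group shares over the whole result list.
baseline = group_proportions(ranking)

deviation = proportion_deviation(ranking, k=4, baseline=baseline)
skew = min_skew(ranking, k=4, baseline=baseline)
print(deviation, skew)
```

In this toy ranking each group makes up half of the full list, but group "F" holds only one of the top four positions, so its deviation is negative and MinSkew falls below zero. A MinSkew of zero would mean that no group is under-represented in the top k relative to the reference distribution.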