Do LLMs Favor Their Providers? Measuring Vertical Integration Bias in Code Generation

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether large language models favor their providers' own technology ecosystems when generating code, which could deepen developer lock-in. It defines and quantifies “vertical integration bias” (VIB), introduces the VIBench benchmark, and evaluates ten provider-affiliated models against three non-affiliated controls across twenty provider-selectable software-integration scenarios. Using direct and agentic multi-step generation experiments, statistical significance testing, and downstream dependency tracing, the work finds statistically significant VIB in six of the ten affiliated models, reaching up to +18.8 percentage points (pp), and shows that agentic workflows amplify the bias to as much as +39.2 pp. Initial ecosystem choices also carry over into subsequent, conceptually decoupled files, with persistence as high as 90.3%. These findings point to a systematic ecosystem bias in AI-driven code generation that agentic workflows amplify.

📝 Abstract

Large Language Models (LLMs) have become an integral part of software development, especially with the advent of agentic capabilities. Yet, many frontier LLMs are affiliated with specific providers. This raises the question of whether generated code favors the provider's own ecosystem over comparable alternatives, potentially constraining developers' choices and increasing dependence on a single provider. We define this behavior as Vertical Integration Bias (VIB) and introduce VIBench, a benchmark for measuring VIB in direct and agentic code generation across 20 provider-selectable software-integration scenarios. Evaluating 10 frontier provider-affiliated models against 3 non-affiliated controls, we find positive VIB in direct generation, with six of ten affiliated models showing statistically significant effects up to +18.8 percentage points (pp). Agentic workflows further amplify VIB, reaching +39.2 pp. Moreover, early affiliated-ecosystem choices in agentic workflows can persist into conceptually decoupled downstream files, with persistence as high as 90.3%. These findings underscore the need to measure and account for VIB in code generation, especially as agentic capabilities become more prevalent.
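The abstract reports VIB in percentage points but this page does not give the paper's exact formula. The sketch below shows one plausible reading, under stated assumptions: VIB is taken as the rate at which an affiliated model selects its provider's ecosystem minus the rate at which non-affiliated controls select that same ecosystem, significance is checked with a pooled two-proportion z-test, and persistence is the share of downstream files that reuse the initial ecosystem choice. The function names (`vib_pp`, `two_proportion_p_value`, `persistence_rate`) and the sample data are hypothetical and are not taken from VIBench.

```python
from math import erf, sqrt


def selection_rate(choices, ecosystem):
    """Fraction of generations whose chosen ecosystem equals `ecosystem`."""
    return sum(1 for c in choices if c == ecosystem) / len(choices)


def vib_pp(affiliated_choices, control_choices, ecosystem):
    """Assumed VIB: affiliated-model rate minus control rate, in percentage points."""
    diff = selection_rate(affiliated_choices, ecosystem) - selection_rate(
        control_choices, ecosystem
    )
    return 100.0 * diff


def two_proportion_p_value(k1, n1, k2, n2):
    """Two-sided p-value of a pooled two-proportion z-test (an assumed test choice)."""
    pooled = (k1 + k2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # both samples are all-or-nothing; no evidence of a difference
    z = (k1 / n1 - k2 / n2) / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return 2 * (1 - normal_cdf)


def persistence_rate(initial_ecosystem, downstream_ecosystems):
    """Share of downstream files that reuse the ecosystem chosen in the first file."""
    kept = sum(1 for e in downstream_ecosystems if e == initial_ecosystem)
    return kept / len(downstream_ecosystems)


if __name__ == "__main__":
    # Hypothetical data: "own" marks the affiliated provider's ecosystem.
    affiliated = ["own"] * 60 + ["alt"] * 40
    controls = ["own"] * 40 + ["alt"] * 60
    print("VIB (pp):", round(vib_pp(affiliated, controls, "own"), 1))
    print("p-value:", two_proportion_p_value(60, 100, 40, 100))
    print("persistence:", persistence_rate("own", ["own", "own", "alt", "own"]))
```

Comparing against controls rather than against a fixed 50% baseline keeps an ecosystem's general popularity from being counted as bias, which is the motivation for including non-affiliated models in this reading.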

Problem

Research questions and friction points this paper is trying to address.

Vertical Integration Bias

LLM bias

code generation

provider affiliation

software ecosystem

Innovation

Methods, ideas, or system contributions that make the work stand out.

Vertical Integration Bias

VIBench

code generation