🤖 AI Summary
This work demonstrates that structured output formats—such as JSON and XML—significantly degrade the accuracy of open-source large language models on reasoning and writing tasks, primarily because format instructions interfere with the models’ natural reasoning processes. To address this “format tax,” the authors propose a general strategy that decouples reasoning from formatting, either by first generating content freely and then reformatting it, or by integrating an expanded chain-of-thought approach within a single generation pass. Experiments across six open-source models, four structured formats, and diverse task types show that this method substantially recovers lost performance and narrows the gap with closed-source counterparts. Notably, state-of-the-art closed-source models exhibit minimal sensitivity to such formatting constraints.
📝 Abstract
Asking a large language model to respond in JSON should be a formatting choice, not a capability tax. Yet we find that structured output requirements -- JSON, XML, LaTeX, Markdown -- substantially degrade reasoning and writing performance across open-weight models. The research response has focused on constrained decoding, but sampling bias accounts for only a fraction of the degradation. The dominant cost enters at the prompt: format-requesting instructions alone cause most of the accuracy loss, before any decoder constraint is applied. This diagnosis points to a simple principle: decouple reasoning from formatting. Whether by generating freeform first and reformatting in a second pass, or by enabling extended thinking within a single generation, separating the two concerns substantially recovers lost accuracy. Across six open-weight models, four API models, four formats, and tasks spanning math, science, logic, and writing, decoupling recovers most lost accuracy. Notably, most recent closed-weight models show little to no format tax, suggesting the problem is not inherent to structured generation but a gap that current open-weight models have yet to close. Code is available at https://github.com/ivnle/the-format-tax.