🤖 AI Summary
This work proposes CatRAG, a novel debiasing framework for large language models (LLMs) that addresses fairness and credibility concerns arising from demographic, gender, and geographic biases in high-stakes scenarios. CatRAG introduces category-theoretic functors into LLM debiasing for the first time, integrating retrieval-augmented generation (RAG) with structure-preserving embedding-space projections. This approach enables systematic, cross-stage debiasing that suppresses bias directions while preserving task-relevant semantics. Evaluated on the BBQ benchmark, CatRAG achieves up to a 40% improvement in accuracy over baseline models and reduces bias scores to near zero, substantially outperforming existing debiasing methods.
📝 Abstract
Large Language Models (LLMs) are deployed in high-stakes settings but can exhibit demographic, gender, and geographic biases that undermine fairness and trust. Prior debiasing methods, including embedding-space projections, prompt-based steering, and causal interventions, often act at a single stage of the pipeline, resulting in incomplete mitigation and brittle utility trade-offs under distribution shifts. We propose CatRAG Debiasing, a dual-pronged framework that integrates a functor-based projection with Retrieval-Augmented Generation (RAG)-guided structural debiasing. The functor component leverages category-theoretic structure to induce a principled, structure-preserving projection that suppresses bias-associated directions in the embedding space while retaining task-relevant semantics. On the Bias Benchmark for Question Answering (BBQ) across three open-source LLMs (Meta Llama-3, OpenAI GPT-OSS, and Google Gemma-3), CatRAG achieves state-of-the-art results, improving accuracy by up to 40% over the corresponding base models and by more than 10% over prior debiasing methods, while reducing bias scores to near zero (from 60% for the base models) across gender, nationality, race, and intersectional subgroups.
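The abstract does not spell out the projection itself; as a rough illustration only, the generic idea of suppressing a bias-associated direction in an embedding space can be sketched as a linear orthogonal projection. Note this is a minimal stand-in, not the paper's actual category-theoretic functor, and `debias_projection` is a hypothetical name:

```python
import numpy as np

def debias_projection(embedding: np.ndarray, bias_direction: np.ndarray) -> np.ndarray:
    """Remove the component of `embedding` along `bias_direction`.

    A simple linear debiasing step: subtract the projection of the
    embedding onto the (normalized) bias direction, leaving the
    component orthogonal to it. CatRAG's functor-based projection is
    more elaborate, but shares this goal of suppressing bias-associated
    directions while keeping the remaining semantics intact.
    """
    b = bias_direction / np.linalg.norm(bias_direction)  # unit bias direction
    return embedding - np.dot(embedding, b) * b          # orthogonal complement

# Toy example with the bias direction along the first axis:
e = np.array([0.6, 0.8, 0.0])
b = np.array([1.0, 0.0, 0.0])
e_debiased = debias_projection(e, b)
print(np.dot(e_debiased, b))  # ~0.0: no component left along the bias axis
```

After the projection, the embedding's inner product with the bias direction is (numerically) zero, while components orthogonal to that direction are untouched.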