🤖 AI Summary
This work addresses the opacity of internal representations in large language models by modeling token embeddings as geometric trajectories. Inspired by the Kakeya conjecture, the authors introduce a "stickiness" constraint to enhance the geometric structure of these representations. They propose two differentiable regularizers, KT-CW and KT-Attn, which jointly incorporate geometric isotropy and attention diversity into interpretability-aware training, a combination the authors present as novel. Experiments on Gemma-3 and Llama-3-8B show that the approach improves geometric desiderata and reduces certain fairness biases while preserving task accuracy, with the most pronounced benefits in medium-scale models.
📝 Abstract
Large language models (LLMs) demonstrate strong performance but often lack transparency. We introduce GeoLAN, a training framework that treats token representations as geometric trajectories and applies stickiness conditions inspired by recent developments related to the Kakeya conjecture. We develop two differentiable regularizers, Katz-Tao Convex Wolff (KT-CW) and Katz-Tao Attention (KT-Attn), that promote isotropy and encourage diverse attention. Our experiments with Gemma-3 (1B, 4B, 12B) and Llama-3-8B show that GeoLAN frequently maintains task accuracy while improving geometric metrics and reducing certain fairness biases. These benefits are most significant in mid-sized models. Our findings reveal scale-dependent trade-offs between geometric precision and performance, suggesting that geometry-aware training is a promising approach to enhancing mechanistic interpretability.
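The abstract does not give the formulas for KT-CW or KT-Attn. As a rough, hedged illustration of the *kind* of objectives being described, the sketch below shows a generic isotropy penalty (spectral flatness of the embedding covariance) and a generic attention-diversity penalty (negative attention entropy). Both function names and forms are hypothetical stand-ins, not the paper's actual regularizers:

```python
import numpy as np

def isotropy_penalty(E):
    """Generic isotropy regularizer (illustrative stand-in for KT-CW).

    E: (n_tokens, d) embedding matrix. Penalizes anisotropy by measuring
    how far the normalized eigenvalue spectrum of the embedding covariance
    deviates from the uniform spectrum (perfect isotropy -> penalty ~ 0).
    """
    E = E - E.mean(axis=0, keepdims=True)        # center the embeddings
    cov = E.T @ E / len(E)                       # (d, d) covariance
    eig = np.linalg.eigvalsh(cov)                # ascending eigenvalues
    eig = eig / eig.sum()                        # normalized spectrum
    uniform = np.full_like(eig, 1.0 / len(eig))
    return float(np.sum((eig - uniform) ** 2))

def attention_diversity_penalty(A, eps=1e-9):
    """Generic attention-diversity regularizer (illustrative stand-in for KT-Attn).

    A: (n_heads, n_queries, n_keys) attention weights, rows summing to 1.
    Returns negative mean entropy, so minimizing it encourages each query
    to spread attention over many keys rather than collapsing to one.
    """
    ent = -np.sum(A * np.log(A + eps), axis=-1)  # per-query entropy
    return float(-ent.mean())
```

In a training loop these terms would be added to the task loss with small weights; the actual KT-CW/KT-Attn constructions, per the abstract, are derived from Katz-Tao-style stickiness conditions rather than these simple spectral and entropy penalties.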