Rethinking the Harmonic Loss via Non-Euclidean Distance Layers

📅 2026-03-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limitations of cross-entropy loss—such as poor interpretability, unbounded weights, and low training efficiency—and extends beyond existing harmonic loss formulations, which are confined to Euclidean distance and lack systematic evaluation of computational efficiency and sustainability. For the first time, we integrate multiple non-Euclidean distance metrics, including cosine, Bray-Curtis, and Mahalanobis distances, into the harmonic loss framework to construct distance-customized loss functions, evaluating them end-to-end in both vision and language models. Experiments demonstrate that, in vision tasks, cosine-based harmonic loss significantly improves accuracy while reducing carbon emissions; in language models, it enhances gradient stability and representation structure, achieving lower training emissions than both cross-entropy and Euclidean harmonic losses. These findings reveal critical trade-offs among performance, interpretability, and environmental sustainability across different distance metrics.

📝 Abstract
Cross-entropy loss has long been the standard choice for training deep neural networks, yet it suffers from interpretability limitations, unbounded weight growth, and inefficiencies that can contribute to costly training dynamics. The harmonic loss is a distance-based alternative grounded in Euclidean geometry that improves interpretability and mitigates phenomena such as grokking, or delayed generalization on the test set. However, the study of harmonic loss remains narrow: only Euclidean distance has been explored, and no systematic evaluation of computational efficiency or sustainability has been conducted. We extend harmonic loss by systematically investigating a broad spectrum of distance metrics as replacements for the Euclidean distance. We comprehensively evaluate distance-tailored harmonic losses on both vision backbones and large language models. Our analysis is framed around a three-way evaluation of model performance, interpretability, and sustainability. On vision tasks, cosine distances provide the most favorable trade-off, consistently improving accuracy while lowering carbon emissions, whereas Bray-Curtis and Mahalanobis distances further enhance interpretability at varying efficiency costs. On language models, cosine-based harmonic losses improve gradient and learning stability, strengthen representation structure, and reduce emissions relative to cross-entropy and Euclidean heads. Our code is available at: https://anonymous.4open.science/r/rethinking-harmonic-loss-5BAB/.
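The distance-based head described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the standard harmonic-loss form in which each class has a learnable center, class probabilities are proportional to an inverse power of the distance to that center, and the Euclidean distance can be swapped for cosine or Bray-Curtis; the exponent `n` and the epsilon smoothing are assumptions.

```python
import numpy as np

def harmonic_probs(x, centers, distance="euclidean", n=2.0, eps=1e-12):
    """Harmonic-loss class probabilities: p_i proportional to d(x, w_i)^(-n).

    x:        feature vector, shape (d,)
    centers:  per-class weight vectors, shape (num_classes, d)
    distance: "euclidean", "cosine", or "braycurtis"
    """
    if distance == "euclidean":
        d = np.linalg.norm(centers - x, axis=1)
    elif distance == "cosine":
        # Cosine distance = 1 - cosine similarity.
        sims = centers @ x / (
            np.linalg.norm(centers, axis=1) * np.linalg.norm(x) + eps
        )
        d = 1.0 - sims
    elif distance == "braycurtis":
        # Bray-Curtis dissimilarity: sum|u - v| / sum|u + v|.
        d = np.abs(centers - x).sum(axis=1) / (
            np.abs(centers + x).sum(axis=1) + eps
        )
    else:
        raise ValueError(f"unknown distance: {distance}")
    inv = 1.0 / (d + eps) ** n  # smaller distance -> larger probability
    return inv / inv.sum()

def harmonic_loss(x, centers, label, **kwargs):
    """Negative log-probability of the true class under the harmonic head."""
    return -np.log(harmonic_probs(x, centers, **kwargs)[label])
```

Replacing the distance function changes only `d`; the inverse-power normalization that makes the head interpretable (probability mass concentrates on the nearest class center) is shared across all variants.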
Problem

Research questions and friction points this paper is trying to address.

harmonic loss
distance metrics
computational efficiency
sustainability
interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

harmonic loss
non-Euclidean distance
sustainability
interpretability
large language models
Maxwell Miller-Golub
American University
Kamil Faber
AGH University of Science and Technology
Marcin Pietron
AGH University of Science and Technology
Panpan Zheng
Xinjiang University
Pasquale Minervini
University of Edinburgh, Miniml.AI, ELLIS Scholar
Generative AI, Machine Learning, Natural Language Processing, Machine Reasoning
Roberto Corizzo
American University
Data Mining, Big Data, Continual Learning, Machine Learning