🤖 AI Summary
This work addresses a flaw in existing Lorentz linear layers: they cause the hyperbolic norms of outputs to grow only logarithmically during training, degrading the model's ability to capture the hierarchical structures that hyperbolic geometry is meant to represent. The authors propose a new Lorentz linear layer grounded in the geometrically principled distance-to-hyperplane formulation, which, unlike prior approaches, builds a meaningful distance measure directly into the layer. They prove that this design restores the expected linear growth of output norms with the number of optimization steps. Combined with a dedicated Lorentz activation function and a gradient caching strategy, the resulting hyperbolic neural network fully respects hyperbolic geometry while substantially narrowing the efficiency and performance gap to Euclidean models.
📝 Abstract
Hyperbolic space is quickly gaining traction as a promising geometry for hierarchical and robust representation learning. A core open challenge is the development of a mathematical formulation of hyperbolic neural networks that is both efficient and captures the key properties of hyperbolic space. The Lorentz model of hyperbolic space has been shown to enable both fast forward and backward propagation. However, we prove that, with the current formulation of Lorentz linear layers, the hyperbolic norms of the outputs scale logarithmically with the number of gradient descent steps, nullifying the key advantage of hyperbolic geometry. We propose a new Lorentz linear layer grounded in the well-known "distance-to-hyperplane" formulation. We prove that our formulation results in the usual linear scaling of output hyperbolic norms with respect to the number of gradient descent steps. Our new formulation, together with further algorithmic efficiencies through Lorentzian activation functions and a new caching strategy, results in neural networks that fully abide by hyperbolic geometry while simultaneously bridging the computation gap to Euclidean neural networks. Code available at: https://github.com/robertdvdk/hyperbolic-fully-connected.
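To make the abstract's terminology concrete, the sketch below implements the standard Lorentz-model ingredients it refers to: the Lorentz (Minkowski) inner product, lifting a Euclidean vector onto the hyperboloid, and the classical distance from a point to a hyperplane with a spacelike normal. This is a minimal illustration of the well-known formulas for curvature -1, not the paper's actual linear layer; the function names are ours.

```python
import numpy as np

def lorentz_inner(u, v):
    # Lorentz (Minkowski) inner product: <u, v>_L = -u0*v0 + sum_i ui*vi
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def lift_to_hyperboloid(x_space):
    # Lift a Euclidean vector onto the hyperboloid
    # {x : <x, x>_L = -1, x0 > 0} (curvature -1 convention).
    x0 = np.sqrt(1.0 + np.dot(x_space, x_space))
    return np.concatenate(([x0], x_space))

def dist_to_hyperplane(x, w):
    # Signed hyperbolic distance from a point x on the hyperboloid to the
    # hyperplane {y : <w, y>_L = 0}, valid when <w, w>_L > 0 (spacelike w):
    #   d(x, H_w) = arcsinh( <w, x>_L / sqrt(<w, w>_L) )
    return np.arcsinh(lorentz_inner(w, x) / np.sqrt(lorentz_inner(w, w)))
```

A Lorentz linear layer in the distance-to-hyperplane spirit evaluates such distances against a set of learned hyperplanes (one per output unit), so each output coordinate has a direct geometric meaning rather than being an arbitrary matrix product.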