🤖 AI Summary
Vector database upgrades that replace the embedding model incur prohibitive computational overhead and service disruption, since every stored vector must be re-encoded and the approximate nearest neighbor (ANN) index rebuilt. This paper proposes Drift-Adapter, a lightweight, learnable space-alignment method that inserts a parameterized adapter layer (orthogonal Procrustes alignment, low-rank affine transformation, or a compact residual MLP) between the old and new embedding spaces. Trained on only a small set of paired old–new embeddings, the adapter maps queries online without rebuilding the index. Its core contribution is enabling *hot embedding model upgrades*: query-time mapping with near-zero downtime. Experiments on 1M-item text and image corpora show that Drift-Adapter recovers 95–99% of the recall achieved by full re-encoding, adds less than 10 μs of query latency, and reduces recomputation cost by over two orders of magnitude; the paper also analyzes scalability to billion-item systems.
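The orthogonal Procrustes variant of the adapter has a closed-form solution: given a small sample of paired old/new embeddings, the orthogonal matrix minimizing the Frobenius distance between the mapped new vectors and the old vectors comes from an SVD of their cross-covariance. A minimal sketch (the paper's exact training details are not given here; the data and variable names below are illustrative):

```python
import numpy as np

def fit_procrustes_adapter(new_emb: np.ndarray, old_emb: np.ndarray) -> np.ndarray:
    """Solve min_W ||new_emb @ W - old_emb||_F over orthogonal W.

    Classic orthogonal Procrustes: W = U V^T where U S V^T = new_emb^T @ old_emb.
    """
    u, _, vt = np.linalg.svd(new_emb.T @ old_emb)
    return u @ vt

# Toy paired sample: simulate "model drift" as a random rotation plus noise.
rng = np.random.default_rng(0)
old = rng.standard_normal((512, 64))          # legacy-model embeddings
r, _ = np.linalg.qr(rng.standard_normal((64, 64)))
new = old @ r.T + 0.01 * rng.standard_normal((512, 64))  # new-model embeddings

w = fit_procrustes_adapter(new, old)
mapped = new @ w                              # new-space vectors mapped into old space
err = np.linalg.norm(mapped - old) / np.linalg.norm(old)
```

Because `W` is a single d×d orthogonal matrix, applying the adapter at query time is one matrix-vector product, which is consistent with the sub-10 μs latency figure reported above.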
📝 Abstract
Upgrading embedding models in production vector databases typically requires re-encoding the entire corpus and rebuilding the Approximate Nearest Neighbor (ANN) index, leading to significant operational disruption and computational cost. This paper presents Drift-Adapter, a lightweight, learnable transformation layer designed to bridge embedding spaces between model versions. By mapping new queries into the legacy embedding space, Drift-Adapter enables the continued use of the existing ANN index, effectively deferring full re-computation. We systematically evaluate three adapter parameterizations: Orthogonal Procrustes, Low-Rank Affine, and a compact Residual MLP, trained on a small sample of paired old and new embeddings. Experiments on MTEB text corpora and a CLIP image model upgrade (1M items) show that Drift-Adapter recovers 95–99% of the retrieval recall (Recall@10, MRR) of a full re-embedding, adding less than 10 microseconds of query latency. Compared to operational strategies like full re-indexing or dual-index serving, Drift-Adapter reduces recompute costs by over 100× and facilitates upgrades with near-zero operational interruption. We analyze robustness to varied model drift, training data size, scalability to billion-item systems, and the impact of design choices like diagonal scaling, demonstrating Drift-Adapter's viability as a pragmatic solution for agile model deployment.
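The serving path described in the abstract (encode the query with the new model, map it into the legacy space, then search the unchanged index) can be sketched as follows. The brute-force cosine search below is a stand-in for the production ANN index, and the identity adapter `W` is a placeholder for a fitted transformation; both are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def search_legacy_index(index_vectors: np.ndarray, query_old_space: np.ndarray, k: int = 10):
    """Brute-force cosine top-k over legacy corpus vectors (ANN stand-in)."""
    sims = index_vectors @ query_old_space / (
        np.linalg.norm(index_vectors, axis=1) * np.linalg.norm(query_old_space) + 1e-9
    )
    return np.argsort(-sims)[:k]

rng = np.random.default_rng(1)
d = 64
corpus_old = rng.standard_normal((1000, d))   # legacy embeddings already in the index
W = np.eye(d)                                 # placeholder for a fitted adapter matrix

# Simulate a new-model query vector close to item 42, map it, and search.
new_query = corpus_old[42] + 0.01 * rng.standard_normal(d)
hits = search_legacy_index(corpus_old, new_query @ W, k=10)
```

The key operational point is that `corpus_old` and its index are never touched: only queries pass through the adapter, which is what defers the full re-computation.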