🤖 AI Summary
This work addresses the challenge of balancing speed and accuracy in multilingual transliteration with non-autoregressive models, which often suffer from poor length control and hallucination. Focusing on Indic-language transliteration, the authors propose NADIR, an architecture that integrates a Differential Transformer with a Mixture-of-Experts mechanism to model complex character mappings without sequential dependencies. The proposed method achieves over 13× faster inference than the autoregressive baseline and a mean character error rate of 15.78% (versus 14.44% for the AR model and 21.88% for a standard NAR equivalent), while reducing repetition, substitution, omission, and insertion errors by 49.53%, 24.45%, 32.92%, and 16.87%, respectively — a strong balance between efficiency and accuracy.
📝 Abstract
In this work, we argue that not all sequence-to-sequence tasks require the strong inductive biases of autoregressive (AR) models. Tasks like multilingual transliteration, code refactoring, grammatical correction, or text normalization often rely on local dependencies, where the full modeling capacity of AR models can be overkill, creating a trade-off between their high accuracy and high inference latency. While non-autoregressive (NAR) models offer speed, they typically suffer from hallucinations and poor length control. To explore this trade-off, we focus on the multilingual transliteration task in Indic languages and introduce NADIR, a novel NAR architecture designed to strike a balance between speed and accuracy. NADIR integrates a Differential Transformer and a Mixture-of-Experts mechanism, enabling it to robustly model complex character mappings without sequential dependencies. NADIR achieves over a 13x speed-up compared to the state-of-the-art AR baseline. It maintains a competitive mean Character Error Rate of 15.78%, compared to 14.44% for the AR model and 21.88% for a standard NAR equivalent. Importantly, NADIR reduces Repetition errors by 49.53%, Substitution errors by 24.45%, Omission errors by 32.92%, and Insertion errors by 16.87%. This work provides a practical blueprint for building fast and reliable NAR systems, effectively bridging the gap between AR accuracy and the demands of real-time, large-scale deployment.
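The abstract names differential attention (from the Differential Transformer) as a core component. The paper itself does not give code, but the published formulation subtracts one softmax attention map from another, scaled by a learned weight λ, to cancel common-mode attention noise — the kind of noise linked to the repetition and hallucination errors NAR decoders are prone to. A minimal single-head NumPy sketch of that standard formulation (all names, shapes, and the fixed λ here are illustrative, not taken from NADIR):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(X, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Single-head differential attention (illustrative sketch).

    Two independent query/key projections produce two softmax
    attention maps; their weighted difference (A1 - lam * A2)
    cancels attention noise common to both maps before the
    values are aggregated.
    """
    d = Wq1.shape[1]  # head dimension, for the usual 1/sqrt(d) scaling
    A1 = softmax((X @ Wq1) @ (X @ Wk1).T / np.sqrt(d))
    A2 = softmax((X @ Wq2) @ (X @ Wk2).T / np.sqrt(d))
    return (A1 - lam * A2) @ (X @ Wv)
```

In the published Differential Transformer, λ is a learned per-head parameter rather than the fixed constant used here, and the mechanism is applied multi-headed with normalization; this sketch only shows the core subtraction.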