🤖 AI Summary
Indoor AR on smartphones suffers from inaccurate localization and fragile pose estimation because GPS is unavailable indoors and existing vision-based methods rely on artificial markers. To address this, we propose a lightweight, markerless indoor AR framework compatible with general-purpose mobile devices. Our approach fuses Wi-Fi fingerprinting, IMU measurements, and real-time visual feature matching. We design a markerless pose estimation algorithm that integrates an optimized Perspective-n-Point (PnP) solver with multi-source Kalman filtering, overcoming the limitations of purely vision-based or single-source Wi-Fi localization. Experimental evaluation on a 0.5-meter grid yields an average positioning error of 0.61–0.81 meters and a visual feature matching accuracy of 77%–82%, outperforming approaches based purely on images or Wi-Fi signals. The framework improves both the practicality and the user experience of indoor AR applications.
📝 Abstract
As a novel way of presenting information, augmented reality (AR) enables people to interact with the physical world directly and intuitively. While some mobile AR products are implemented with specialized hardware at high cost, software approaches to AR on mobile platforms (such as smartphones and tablet PCs) are still far from practical use. GPS-based mobile AR systems usually perform poorly indoors because of inaccurate positioning. Previous vision-based pose estimation methods need to continuously track predefined markers within a short distance, which greatly degrades the user experience. This paper first conducts a comprehensive study of state-of-the-art AR and localization systems on mobile platforms. We then propose an effective indoor mobile AR framework, in which a fusion-based localization method and a new pose estimation implementation are developed to increase the overall matching rate and thus improve AR display accuracy. Experiments show that our framework outperforms approaches based purely on images or Wi-Fi signals. We achieve low average error distances (0.61–0.81 m) and accurate matching rates (77%–82%) when the average sampling grid length is set to 0.5 m.
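To make the Wi-Fi side of the localization concrete, the following is a minimal sketch of fingerprint matching on a sampling grid. It is not the paper's method: the abstract does not specify the matching or fusion algorithm, so this assumes a generic weighted k-nearest-neighbor (WKNN) match in signal space. The function name `wknn_locate`, the fingerprint layout, and the RSSI values are all hypothetical and chosen only for illustration.

```python
import math


def wknn_locate(fingerprints, observed, k=3, eps=1e-6):
    """Estimate a 2D position by weighted k-nearest-neighbor matching.

    fingerprints: list of ((x, y), [rssi, ...]) pairs sampled on a grid
                  (hypothetical layout, one RSSI value per access point).
    observed:     list of RSSI values measured at the unknown position,
                  in the same access-point order.
    """
    # Rank the grid points by Euclidean distance in signal space.
    scored = [(math.dist(rssi, observed), pos) for pos, rssi in fingerprints]
    scored.sort(key=lambda item: item[0])
    nearest = scored[:k]

    # Closer fingerprints get larger weights; eps avoids division by zero.
    weights = [1.0 / (dist + eps) for dist, _ in nearest]
    total = sum(weights)

    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return x, y


# Assumed example data: four grid points spaced 0.5 m apart, two access points.
GRID = [
    ((0.0, 0.0), [-40.0, -70.0]),
    ((0.5, 0.0), [-70.0, -40.0]),
    ((0.0, 0.5), [-50.0, -80.0]),
    ((0.5, 0.5), [-80.0, -50.0]),
]

estimate = wknn_locate(GRID, [-55.0, -55.0], k=2)
```

In a fused system, an estimate like this would typically serve as a coarse position prior that narrows the set of candidate images for visual matching; how the paper combines the two sources is not stated in the abstract.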