📝 Abstract
This paper describes AIDEN, an artificial intelligence-based assistant application developed during 2023 and 2024 to improve the quality of life of visually impaired individuals. Visually impaired people face challenges in identifying objects, reading text, and navigating unfamiliar environments, which can limit their independence and reduce their quality of life. Although solutions such as Braille, audio books, and screen readers exist, they are not effective in every situation. AIDEN leverages state-of-the-art machine learning models to identify and describe objects, read text, and answer questions about the environment: specifically, You Only Look Once (YOLO) architectures for object detection and a Large Language and Vision Assistant (LLaVA) for visual question answering. The system also incorporates several interaction methods that give the user convenient, modality-appropriate access to textual and visual information. AIDEN aims to enhance user autonomy and access to information; user feedback supports an improved perception of daily usability.
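The abstract describes a pipeline in which detector output is turned into spoken, natural-language feedback. As an illustration only (the paper does not publish its implementation, and the `Detection` type and `describe_scene` helper below are hypothetical), one minimal sketch of that last step converts YOLO-style boxes into a sentence a screen-free user can act on, grouping objects by rough horizontal position:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One YOLO-style detection: class label, confidence, pixel box."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def describe_scene(detections: List[Detection], frame_width: int,
                   min_confidence: float = 0.5) -> str:
    """Turn raw detector output into a short spoken description.

    Objects are binned into left / ahead / right thirds of the frame
    so the sentence is useful without a screen.
    """
    kept = [d for d in detections if d.confidence >= min_confidence]
    if not kept:
        return "No objects detected."
    parts = []
    for d in kept:
        cx = (d.box[0] + d.box[2]) / 2  # horizontal box center
        if cx < frame_width / 3:
            side = "on your left"
        elif cx > 2 * frame_width / 3:
            side = "on your right"
        else:
            side = "ahead of you"
        parts.append(f"a {d.label} {side}")
    return "I can see " + ", ".join(parts) + "."


if __name__ == "__main__":
    dets = [Detection("chair", 0.91, (40, 200, 180, 400)),
            Detection("door", 0.84, (500, 50, 630, 460))]
    print(describe_scene(dets, frame_width=640))
    # → "I can see a chair on your left, a door on your right."
```

In a full system this string would feed a text-to-speech engine, and the same detections could be folded into the prompt sent to the vision-language assistant for follow-up questions.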