🤖 AI Summary
This study addresses the limited practical usability of Natural Language Interfaces for Databases (NLIDBs) in real-world business analytics. We propose a user-centered hybrid evaluation framework that compares SQL-LLM, an advanced NL2SQL system, against Snowflake, a conventional SQL platform, on authentic analytical query tasks. Using quantitative task performance metrics and behavioral analysis, we find that SQL-LLM improves both task performance and user experience: query completion time decreases by 10–30%, accuracy improves from 50% to 75%, error recovery is faster, queries are reformulated less often, reported frustration is lower, and users increasingly adopt structured query strategies. This work provides the first systematic empirical evidence of NLIDBs’ usability advantages in error recovery, strategy development, and user experience. It underscores that usability must be prioritized alongside technical accuracy in NLIDB design. Our findings establish an evidence-driven usability evaluation paradigm and offer actionable guidelines for developing human-centered NLIDBs.
📝 Abstract
Natural Language Interfaces for Databases (NLIDBs) aim to make database querying accessible by allowing users to ask questions in everyday language rather than writing formal SQL queries. Despite significant advances in translation accuracy, critical usability challenges, such as user frustration, query refinement strategies, and error recovery, remain underexplored. To investigate these usability dimensions, we conducted a mixed-method user study comparing SQL-LLM, a state-of-the-art NL2SQL system, with Snowflake, a traditional SQL analytics platform. Our controlled evaluation involved 20 participants, each completing 12 realistic database queries. Results show that SQL-LLM significantly reduced query completion times by 10 to 30 percent (mean: 418 s vs. 629 s, p = 0.036) and improved overall accuracy from 50 to 75 percent (p = 0.002). Additionally, participants using SQL-LLM reformulated their queries less often, recovered from errors 30 to 40 seconds faster, and reported lower frustration levels than participants using Snowflake. Behavioral analysis revealed that SQL-LLM encouraged structured, schema-first querying strategies, which enhanced user confidence and efficiency, particularly for complex queries. These findings underscore the practical significance of well-designed, user-friendly NLIDBs in business analytics settings and emphasize the critical role of usability alongside technical accuracy in real-world deployments.