Conversational Assistants to support Heart Failure Patients: comparing a Neurosymbolic Architecture with ChatGPT

2025-04-24
AI Summary
This study addresses dietary sodium counseling for heart failure patients, a high-stakes clinical task requiring accuracy, interpretability, and reliability. Method: We conducted the first controlled, task-oriented comparison between a neurosymbolic dialogue assistant and a generative large language model (ChatGPT) in real-world clinical settings. The neurosymbolic system integrates a rule engine, an embedded clinical knowledge graph, and a fine-tuned lightweight language model, augmented with speech interaction and a curated dietary knowledge base; the ChatGPT API served as the baseline. Results: The neurosymbolic system achieved significantly higher accuracy and task completion rate (+23%), produced more concise responses, and ensured greater controllability and transparency. While ChatGPT exhibited marginally fewer speech recognition errors and required fewer clarifications, patient preference showed no statistically significant difference. Contribution: We propose a lightweight, controllable neurosymbolic dialogue paradigm tailored for health counseling and empirically demonstrate its superiority over purely generative approaches in safety-critical, reliability-demanding medical Q&A.
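To make the neurosymbolic idea concrete, here is a minimal sketch in Python. It is not the authors' system. It only illustrates the division of labor the summary describes: a language-understanding step extracts the food being asked about, and a symbolic step reads the answer from a curated knowledge base rather than generating it. The names `SODIUM_MG`, `understand`, `respond`, and `answer` are hypothetical, the substring matcher is a stand-in for a learned model, and the sodium values are illustrative placeholders, not nutritional data.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical curated knowledge base: sodium in mg per serving.
# The numbers are illustrative placeholders, NOT real nutritional data.
SODIUM_MG = {
    "chicken soup": 800,
    "white bread": 150,
    "apple": 1,
}


@dataclass
class Query:
    food: Optional[str]  # None when no known food was recognized


def understand(utterance: str) -> Query:
    """Stand-in for the neural understanding step (an assumption):
    a plain substring match against the known food names."""
    text = re.sub(r"[^a-z ]", " ", utterance.lower())
    text = " ".join(text.split())
    # Try longer names first so multi-word foods win over shorter ones.
    for food in sorted(SODIUM_MG, key=len, reverse=True):
        if food in text:
            return Query(food=food)
    return Query(food=None)


def respond(query: Query) -> str:
    """Symbolic step: the value is looked up, never generated."""
    if query.food is None:
        # Unknown food: ask for clarification instead of guessing.
        return "Which food would you like to know about?"
    mg = SODIUM_MG[query.food]
    return f"One serving of {query.food} has about {mg} mg of sodium."


def answer(utterance: str) -> str:
    return respond(understand(utterance))
```

Because every number in a reply comes from the lookup table, a wrong answer can be traced to a specific entry or to a recognition failure. That traceability is one plausible reading of the controllability and transparency the summary attributes to the neurosymbolic system.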

Abstract
Conversational assistants are becoming more and more popular, including in healthcare, partly because of the availability and capabilities of Large Language Models. There is a need for controlled, probing evaluations with real stakeholders which can highlight advantages and disadvantages of more traditional architectures and those based on generative AI. We present a within-group user study to compare two versions of a conversational assistant that allows heart failure patients to ask about salt content in food. One version of the system was developed in-house with a neurosymbolic architecture, and one is based on ChatGPT. The evaluation shows that the in-house system is more accurate, completes more tasks and is less verbose than the one based on ChatGPT; on the other hand, the one based on ChatGPT makes fewer speech errors and requires fewer clarifications to complete the task. Patients show no preference for one over the other.
Problem

Research questions and friction points this paper is trying to address.

Comparing neurosymbolic and ChatGPT-based assistants for heart failure patients
Evaluating accuracy, task completion, and verbosity in healthcare chatbots
Assessing patient preference between different conversational assistant architectures
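The three measures named above (accuracy, task completion, verbosity) can be computed per system from logged interactions. The sketch below assumes a hypothetical log format: the `Turn` record, its field names, and the word-count proxy for verbosity are illustrative choices and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    system: str      # e.g. "neurosymbolic" or "chatgpt" (labels are assumptions)
    correct: bool    # did the reply state the right salt content?
    completed: bool  # did the patient's task finish successfully?
    response: str    # text of the system's reply


def summarize(turns: List[Turn], system: str) -> Dict[str, float]:
    """Aggregate accuracy, completion rate, and mean reply length
    (in whitespace-separated words) for one system."""
    rows = [t for t in turns if t.system == system]
    if not rows:
        raise ValueError(f"no turns logged for system {system!r}")
    n = len(rows)
    return {
        "accuracy": sum(t.correct for t in rows) / n,
        "completion_rate": sum(t.completed for t in rows) / n,
        "mean_words": sum(len(t.response.split()) for t in rows) / n,
    }
```

In a within-group design, each patient uses both systems, so these summaries would normally be computed per patient and compared pairwise rather than pooled as they are here.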
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neurosymbolic architecture for accurate responses
ChatGPT-based system reduces speech errors
User study compares both systems' performance
Authors

Anuja Tayal, Department of Computer Science
Devika Salunke, Department of Biomedical and Health Information Sciences
Barbara Di Eugenio, Professor, University of Illinois Chicago (Natural Language Processing, Human Computer Interaction, Educational Technology, NLP for healthcare)
Paula Allen-Meares, Department of Medicine
E. P. Abril, Department of Communications
Olga Garcia, Department of Medicine
Carolyn Dickens, Department of Medicine
Andrew Boyd, Department of Biomedical and Health Information Sciences