Conversational Assistants to support Heart Failure Patients: comparing a Neurosymbolic Architecture with ChatGPT

📅 2025-04-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses dietary sodium counseling for heart failure patients, a high-stakes clinical task requiring accuracy, interpretability, and reliability.

Method: We conducted the first controlled, task-oriented comparison between a neurosymbolic dialogue assistant and a generative large language model (ChatGPT) in real-world clinical settings. The neurosymbolic system integrates a rule engine, an embedded clinical knowledge graph, and a fine-tuned lightweight language model, augmented with speech interaction and a curated dietary knowledge base; the ChatGPT API served as the baseline.

Results: The neurosymbolic system achieved significantly higher accuracy and task completion rate (+23%), produced more concise responses, and ensured greater controllability and transparency. ChatGPT exhibited marginally fewer speech recognition errors and required fewer clarifications, and patient preference showed no statistically significant difference between the two systems.

Contribution: We propose a lightweight, controllable neurosymbolic dialogue paradigm tailored for health counseling and empirically show that it outperforms a purely generative approach on accuracy and task completion in safety-critical, reliability-demanding medical Q&A.

📝 Abstract

Conversational assistants are becoming more and more popular, including in healthcare, partly because of the availability and capabilities of Large Language Models. There is a need for controlled, probing evaluations with real stakeholders which can highlight advantages and disadvantages of more traditional architectures and those based on generative AI. We present a within-group user study to compare two versions of a conversational assistant that allows heart failure patients to ask about salt content in food. One version of the system was developed in-house with a neurosymbolic architecture, and one is based on ChatGPT. The evaluation shows that the in-house system is more accurate, completes more tasks and is less verbose than the one based on ChatGPT; on the other hand, the one based on ChatGPT makes fewer speech errors and requires fewer clarifications to complete the task. Patients show no preference for one over the other.
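In a within-group design like the one described above, every participant uses both systems, so per-participant scores can be compared as pairs. The sketch below shows one simple way such a paired comparison could be run, using an exact two-sided sign test. It is an illustration only: the abstract does not say which statistical test the authors used, the function name `paired_sign_test` is hypothetical, and the numbers are made-up placeholders, not data from the study.

```python
from math import comb


def paired_sign_test(scores_a, scores_b):
    """Exact two-sided sign test on paired per-participant scores.

    Returns (wins_a, wins_b, p_value). Tied pairs are dropped,
    as is conventional for the sign test.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("each participant needs one score per system")

    wins_a = sum(a > b for a, b in zip(scores_a, scores_b))
    wins_b = sum(b > a for a, b in zip(scores_a, scores_b))
    n = wins_a + wins_b  # number of non-tied pairs
    if n == 0:
        return wins_a, wins_b, 1.0

    # Under the null hypothesis each non-tied pair favors either system
    # with probability 0.5, so the smaller win count is Binomial(n, 0.5).
    k = min(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2 * tail)


# Hypothetical placeholder data (tasks completed per participant);
# these are NOT results from the study.
neurosymbolic_completed = [5, 4, 5, 3, 5, 4]
chatgpt_completed = [4, 4, 3, 3, 4, 2]

print(paired_sign_test(neurosymbolic_completed, chatgpt_completed))
# prints (4, 0, 0.125)
```

With the placeholder data, four participants complete more tasks with the first system, two are tied, and none favor the second, which gives a two-sided p-value of 0.125. The sign test was chosen here only because it needs no distributional assumptions and fits in a few lines.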

Problem

Research questions and friction points this paper is trying to address.

Comparing neurosymbolic and ChatGPT-based assistants for heart failure patients

Evaluating accuracy, task completion, and verbosity in healthcare chatbots

Assessing patient preference between different conversational assistant architectures

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neurosymbolic architecture for accurate responses

ChatGPT-based system reduces speech errors

User study compares both systems' performance
