SenseCF: LLM-Prompted Counterfactuals for Intervention and Sensor Data Augmentation

📅 2025-07-07

🤖 AI Summary

This study addresses three challenges in physiological signal prediction: counterfactual explanations (CFs) are hard to generate, they often have limited clinical feasibility, and training data is scarce. We propose a large language model (LLM)-based prompting framework for CF generation, evaluated in zero-shot and three-shot settings. Unlike conventional methods (e.g., DiCE), our approach uses structured prompts so that the generated CFs can serve both as feasible interventions and as augmentation data for stress and heart disease prediction. With GPT-4o-mini, the generated CFs reach validity of up to 0.99 and plausibility of up to 99%. When used as augmented training data, these CFs improve classifier accuracy by 5% on average, with the largest benefit in low-data regimes. Our core contribution is a systematic integration of LLM-driven CF generation into clinical physiological modeling that balances interpretability, robustness, and clinical practicality.

📝 Abstract

Counterfactual explanations (CFs) offer human-centric insights into machine learning predictions by highlighting minimal changes required to alter an outcome. Therefore, CFs can be used as (i) interventions for abnormality prevention and (ii) augmented data for training robust models. In this work, we explore large language models (LLMs), specifically GPT-4o-mini, for generating CFs in zero-shot and three-shot settings. We evaluate our approach on two datasets: the AI-Readi flagship dataset for stress prediction and a public dataset for heart disease detection. Compared to traditional methods such as DiCE, CFNOW, and NICE, our few-shot LLM-based approach achieves high plausibility (up to 99%), strong validity (up to 0.99), and competitive sparsity. Moreover, using LLM-generated CFs as augmented samples improves downstream classifier performance (an average accuracy gain of 5%), especially in low-data regimes. This demonstrates the potential of prompt-based generative techniques to enhance explainability and robustness in clinical and physiological prediction tasks. Code base: github.com/anonymous/SenseCF.
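The zero-shot and three-shot settings mentioned in the abstract can be pictured with a small prompt builder. This is a minimal sketch, not the paper's actual prompt: the function name `build_cf_prompt`, the instruction wording, and the feature names (`heart_rate`, `sleep_hours`) are all hypothetical, and no LLM call is made here.

```python
import json


def build_cf_prompt(instance, current_label, target_label, examples=()):
    """Assemble a zero-shot (no examples) or few-shot counterfactual prompt.

    `examples` is a sequence of (original_record, counterfactual_record) pairs.
    The wording below is illustrative only, not the paper's prompt.
    """
    lines = [
        "You are given a feature record and the label a classifier assigned to it.",
        "Change as few features as possible so the classifier predicts the target label.",
        "Keep every value physiologically realistic. Answer with a JSON object only.",
        "",
    ]
    # Few-shot demonstrations: each pair shows a record and an accepted counterfactual.
    for original, counterfactual in examples:
        lines.append(f"Record: {json.dumps(original, sort_keys=True)}")
        lines.append(f"Counterfactual: {json.dumps(counterfactual, sort_keys=True)}")
        lines.append("")
    # The query record comes last, leaving the answer slot open for the model.
    lines.append(f"Record: {json.dumps(instance, sort_keys=True)}")
    lines.append(f"Current label: {current_label}")
    lines.append(f"Target label: {target_label}")
    lines.append("Counterfactual:")
    return "\n".join(lines)


# Hypothetical records; real feature names depend on the dataset.
shots = [
    ({"heart_rate": 96, "sleep_hours": 5.0}, {"heart_rate": 96, "sleep_hours": 7.5}),
    ({"heart_rate": 104, "sleep_hours": 6.0}, {"heart_rate": 88, "sleep_hours": 6.0}),
    ({"heart_rate": 99, "sleep_hours": 4.0}, {"heart_rate": 90, "sleep_hours": 7.0}),
]
query = {"heart_rate": 101, "sleep_hours": 4.5}

zero_shot_prompt = build_cf_prompt(query, "stressed", "not stressed")
three_shot_prompt = build_cf_prompt(query, "stressed", "not stressed", shots)
```

The resulting string would be sent to the model as a user message, and the JSON reply parsed back into a feature record. Passing an empty `examples` sequence gives the zero-shot variant with the same instructions.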

Problem

Research questions and friction points this paper is trying to address.

Generating counterfactuals for abnormality prevention

Augmenting sensor data to train robust models

Improving clinical prediction tasks with LLM-generated CFs
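As an illustration of the augmentation use case listed above, the following sketch appends counterfactual rows to a training set under their target labels. It rests on assumptions not stated in the source: the helper name `augment_with_counterfactuals`, the optional validity filter, and the toy threshold classifier are all hypothetical.

```python
def augment_with_counterfactuals(X, y, cfs, cf_labels, predict=None):
    """Return copies of X and y with counterfactual rows appended.

    If `predict` is given, a counterfactual is kept only when the classifier
    assigns it the intended target label (an assumed filtering step).
    """
    if len(cfs) != len(cf_labels):
        raise ValueError("cfs and cf_labels must have the same length")
    X_aug, y_aug = list(X), list(y)
    for row, label in zip(cfs, cf_labels):
        if predict is not None and predict(row) != label:
            continue  # drop counterfactuals that do not flip the prediction
        X_aug.append(row)
        y_aug.append(label)
    return X_aug, y_aug


def toy_predict(row):
    """Stand-in classifier: label 1 when the first feature exceeds 90."""
    return 1 if row[0] > 90 else 0


# Hypothetical rows of [heart_rate, sleep_hours]; 1 = stressed, 0 = not stressed.
X_train = [[98, 5.0], [72, 8.0]]
y_train = [1, 0]
cfs = [[85, 5.0], [95, 8.0]]
cf_labels = [0, 0]  # the second row still predicts 1, so the filter drops it

X_aug, y_aug = augment_with_counterfactuals(
    X_train, y_train, cfs, cf_labels, predict=toy_predict
)
X_all, y_all = augment_with_counterfactuals(X_train, y_train, cfs, cf_labels)
```

A downstream classifier would then be retrained on the augmented lists. Whether to filter by validity first is a design choice; the source does not say which variant was used.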

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses an LLM to generate counterfactual explanations

Enhances model robustness with augmented CF data

Achieves high plausibility and validity metrics
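The metrics named above can be sketched with commonly used definitions. These are assumptions, since the exact formulas are not given here: validity as the fraction of counterfactuals that receive the target label, sparsity as the mean number of changed features, and plausibility as the fraction of counterfactuals whose values stay inside assumed per-feature ranges.

```python
def validity(counterfactuals, target_labels, predict):
    """Fraction of counterfactuals that the classifier assigns the target label."""
    hits = sum(1 for cf, t in zip(counterfactuals, target_labels) if predict(cf) == t)
    return hits / len(counterfactuals)


def sparsity(originals, counterfactuals, tol=1e-9):
    """Mean number of features changed per counterfactual (lower is sparser)."""
    changed = [
        sum(1 for a, b in zip(orig, cf) if abs(a - b) > tol)
        for orig, cf in zip(originals, counterfactuals)
    ]
    return sum(changed) / len(changed)


def plausibility(counterfactuals, feature_ranges):
    """Fraction of counterfactuals with every feature inside its (low, high) range."""
    ok = sum(
        1
        for cf in counterfactuals
        if all(low <= v <= high for v, (low, high) in zip(cf, feature_ranges))
    )
    return ok / len(counterfactuals)


def toy_predict(row):
    """Stand-in classifier: label 1 when the first feature exceeds 90."""
    return 1 if row[0] > 90 else 0


# Hypothetical rows of [heart_rate, sleep_hours] and hypothetical valid ranges.
originals = [[98, 5.0], [101, 4.5]]
cfs = [[85, 5.0], [88, 7.5]]
targets = [0, 0]
ranges = [(40, 180), (0, 14)]

v = validity(cfs, targets, toy_predict)
s = sparsity(originals, cfs)
p = plausibility(cfs, ranges)
```

In this toy example both counterfactuals flip the prediction and stay in range, while the first changes one feature and the second changes two.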
