Evaluating the Utility of Personal Health Records in Personalized Health AI

📅 2026-05-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study examines whether complex personal health record (PHR) data can support personalized health AI. We evaluate a large language model (Gemini 3.0 Flash) on 2,257 patient queries drawn from three sources, matched with PHRs from a pool of 1,945 de-identified records, under three input conditions: no PHR context, a basic PHR summary, and full clinical notes. Alongside the existing SHARP rating framework, we introduce a new evaluation framework for PHR-specific error modes, which surfaces issues such as temporal disorientation and rare but meaningful confabulations. Using autoraters on the full set and clinician ratings on a subset (n=95), we find that adding PHR context significantly improves answer helpfulness (p < 0.001, paired t-test), and we observe potential gains in safety, accuracy, relevance, and personalization.
📝 Abstract
Patient-managed Personal Health Records (PHRs) promise to empower patients to better understand their health, but information in the record is complex, potentially hindering insights. In this study, we assess the potential of a large language model (LLM), Gemini 3.0 Flash, to provide helpful answers to user health queries when provided clinical data from PHRs as context. A total of 2,257 user queries were drawn from 3 different distributions to represent patient questions: shorter web search queries, longer questions derived from templates of chatbot conversations, and questions patients asked their healthcare team (patient calls). Queries were matched with de-identified PHRs (from a pool of 1,945). Gemini responses were generated (1) without PHR context; (2) with a basic summary of demographics, conditions, and medications; (3) with full, extensive clinical notes. For evaluation, we leveraged an existing rating framework (SHARP) and developed a new framework for specific error modes when interpreting PHRs. Evaluation was performed using autoraters for the full set and clinician ratings for a subset (n=95), with both sets of raters knowing the full PHR context. We see significant improvements in the helpfulness of answers to all question types with PHR data (p < 0.001, paired t-test). We also observe potential gains in safety, accuracy, relevance, and personalization of answers. Our PHR evaluation framework further identifies gaps in LLM understanding of particular aspects of complex PHRs, such as temporal disorientation and rare but meaningful confabulations. These results suggest potential for PHR data to help people with a wide range of user needs, and provide a framework for monitoring for gaps in LLM answers based on PHR context. This study motivates further work to assess and realize potential benefits to users from understanding their health records.
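The abstract reports its helpfulness comparison as a paired t-test, where each query is scored once without PHR context and once with it. As a minimal sketch of how such a paired comparison could be computed (this is not the authors' analysis code; the function name `paired_t_statistic` and the ratings below are made up for illustration), using only the Python standard library:

```python
import math
import statistics


def paired_t_statistic(baseline, with_context):
    """Return (t, degrees_of_freedom) for a paired t-test on two equal-length score lists."""
    if len(baseline) != len(with_context):
        raise ValueError("paired samples must have the same length")
    # Each difference compares the two conditions on the same query.
    diffs = [w - b for b, w in zip(baseline, with_context)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical helpfulness ratings for the same five queries (NOT data from the study).
no_phr = [2, 3, 2, 4, 3]
with_phr = [4, 4, 3, 5, 4]
t_stat, dof = paired_t_statistic(no_phr, with_phr)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

A p-value would then be read from the t distribution with `dof` degrees of freedom; a statistics library such as SciPy (`scipy.stats.ttest_rel`) computes the statistic and p-value in one call.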
Problem

Research questions and friction points this paper is trying to address.

Personal Health Records
Large Language Models
Health AI
Clinical Data Interpretation
Patient Empowerment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Personal Health Records
Large Language Models
PHR-aware Evaluation
Clinical Context Integration
Health AI Personalization
Rory Sayres
Researcher, Google
Human / computer interaction, Medical artificial intelligence, Explainable AI, Vision, Deliberation
Kejia Chen
Technical University of Munich
Manipulation of Deformable Objects, Multi-robot Collaboration, LLM-based Planning
Ayush Jain
Google Research
Matthew Thompson
University of Washington
AI, diagnostics, infections, cancer, digital health
Jonathan Richina
Google Research
Xiang Yin
Google Research
Jimmy Hu
Google Research
Fan Zhang
Google
In-memory computing, Memristor, HW-SW Co-Design, AI Accelerator, Neuromorphic Computing
Bob Lou
Google Research
Mike Sanchez
Google Research
Ines Mezerreg
Google Research
Meredith Schreier
Google Research
Hamsa Subramaniam
Google Research
I-Ching Lee
Google Research
Yugang Jia
Google Research
Daniel McDuff
Google Research
Yossi Matias
Google
Avinatan Hassidim
Google Research
Dale Webster
Google Research
Yun Liu
Senior Staff Research Scientist, Google Research
Applied Machine Learning, Healthcare, Biomedical Data
Jackie Barr
Google Research
Quang Duong
Google Research