Small Data Explainer -- The impact of small data methods in everyday life

📅 2025-07-15
🤖 AI Summary
This paper examines how AI can support socially impactful decision-making in small-data settings, in particular where under-represented groups are poorly covered by data-driven policy and decision making, and where the health benefits of assistive technologies such as wearables are at stake. It gives a conceptual overview that contrasts small data with big data and draws common themes from exemplary case studies and application areas. A more detailed technical overview covers current data analysis and modelling techniques, bringing together knowledge-driven modelling from statistics and data-driven modelling from computer science. By linking application settings, conceptual contributions, and specific techniques, the authors highlight what is already feasible and outline an agenda for fully leveraging small data.

📝 Abstract
The emergence of breakthrough artificial intelligence (AI) techniques has led to a renewed focus on how small data settings, i.e., settings with limited information, can benefit from such developments. This includes societal issues such as how best to include under-represented groups in data-driven policy and decision making, or the health benefits of assistive technologies such as wearables. We provide a conceptual overview, in particular contrasting small data with big data, and identify common themes from exemplary case studies and application areas. Potential solutions are described in a more detailed technical overview of current data analysis and modelling techniques, highlighting contributions from different disciplines, such as knowledge-driven modelling from statistics and data-driven modelling from computer science. By linking application settings, conceptual contributions and specific techniques, we highlight what is already feasible and suggest what an agenda for fully leveraging small data might look like.
Problem

Research questions and friction points this paper is trying to address.

Exploring small data methods' impact in everyday life
Addressing under-represented groups in data-driven decisions
Combining knowledge-driven and data-driven modelling techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrasting small data with big data
Knowledge-driven modelling from statistics
Data-driven modelling from computer science
Authors

Maren Hackenberg
Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany; Freiburg Center for Data Analysis, Modeling and AI, University of Freiburg, Freiburg, Germany
Sophia G. Connor
Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Fabian Kabus
Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
June Brawner
The Royal Society, London, United Kingdom
Ella Markham
The Royal Society, London, United Kingdom; University of Edinburgh, Edinburgh, United Kingdom
Mahi Hardalupas
The Royal Society, London, United Kingdom
Areeq Chowdhury
The Royal Society, London, United Kingdom
Rolf Backofen
Professor für Bioinformatik, Universität Freiburg
Bioinformatics, non-coding RNAs, translational regulation
Anna Köttgen
Institute of Genetic Epidemiology, Faculty of Medicine and Medical Center, University of Freiburg, Germany
Angelika Rohde
Department of Mathematical Stochastics, Faculty of Mathematics and Physics, University of Freiburg, Germany
Nadine Binder
Freiburg Center for Data Analysis, Modeling and AI, University of Freiburg, Freiburg, Germany; Institute of General Practice/Family Medicine, Faculty of Medicine and Medical Center, University of Freiburg, Germany
Harald Binder
Director of the Institute of Medical Biometry and Statistics, University of Freiburg
Biostatistics, Machine Learning, Deep Learning
the Collaborative Research Center 1597 Small Data
University of Freiburg