🤖 AI Summary
This work addresses the main barriers to AI and digital health advancement in India: fragmented biomedical data, inconsistent data quality, and insufficient incentives for data sharing. To overcome them, the study proposes a multi-layered incentive policy framework that combines Shapley value-based revenue sharing in federated learning consortia, recognition of data papers in National Medical Commission (NMC) promotion criteria, and open data metrics in the National Institutional Ranking Framework (NIRF). The framework also incorporates mandatory data quality assessment, structured peer review, and compliance mechanisms aligned with Indian regulations such as the Digital Personal Data Protection Act (DPDPA), the National Data Sharing and Accessibility Policy (NDSAP), and the Biotech-PRIDE Guidelines. By aligning technical and institutional approaches, the framework aims to foster a high-quality, interoperable, and sustainable biomedical data ecosystem as a foundation for AI-driven healthcare research in India.
📝 Abstract
India generates vast biomedical data through postgraduate research, government hospital services and audits, government schemes, private hospitals and their electronic medical record (EMR) systems, insurance programs, and standalone clinics. These resources nevertheless remain fragmented across institutional silos and vendor-locked EMR systems. The fundamental bottleneck is not technological but economic and academic: a systemic misalignment of incentives renders data sharing a high-risk, low-reward activity for individual researchers and institutions. Until India's academic promotion criteria, institutional rankings, and funding mechanisms explicitly recognize and reward data curation as professional work, the nation's AI ambitions will remain constrained by fragmented, non-interoperable datasets. We propose a multi-layered incentive architecture integrating recognition of data papers in National Medical Commission (NMC) promotion criteria, incorporation of open data metrics into the National Institutional Ranking Framework (NIRF), adoption of Shapley value-based revenue sharing in federated learning consortia, and establishment of institutional data stewardship as a mainstream professional role. Critical barriers to data sharing, including fear of data quality scrutiny, concerns about misinterpretation, and selective reporting bias, are addressed through mandatory data quality assessment, structured peer review, and academic credit for auditing roles. The proposed framework directly addresses regulatory constraints introduced by the Digital Personal Data Protection Act, 2023 (DPDPA), while constructively engaging with the National Data Sharing and Accessibility Policy (NDSAP), the Biotech-PRIDE Guidelines, and the Anusandhan National Research Foundation (ANRF) guidelines.
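To make the Shapley value-based revenue sharing idea more concrete, the following is a minimal sketch of how a consortium's revenue might be split in proportion to each member's Shapley value. It is an illustration under stated assumptions, not the authors' method: the abstract does not say how a coalition's value is measured, so the three hospitals, the `UTILITY` table (a made-up model-quality score for each subset of participants), and the revenue figure are all hypothetical. The computation is exact and enumerates every coalition, which is only practical for a small number of participants.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of each player in a coalition game.

    players: list of hashable participant names.
    value:   function mapping a frozenset of players to the worth
             of that coalition (hypothetical utility in this sketch).
    """
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p to coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi


def split_revenue(players, value, revenue):
    """Split revenue in proportion to Shapley values (assumed rule)."""
    phi = shapley_values(players, value)
    total = sum(phi.values())
    if total == 0:
        raise ValueError("total Shapley value is zero; nothing to apportion")
    return {p: revenue * phi[p] / total for p in players}


# Hypothetical utilities: model quality achieved by each coalition.
UTILITY = {
    frozenset(): 0.0,
    frozenset({"hospital_a"}): 0.6,
    frozenset({"hospital_b"}): 0.6,
    frozenset({"hospital_c"}): 0.3,
    frozenset({"hospital_a", "hospital_b"}): 0.8,
    frozenset({"hospital_a", "hospital_c"}): 0.7,
    frozenset({"hospital_b", "hospital_c"}): 0.7,
    frozenset({"hospital_a", "hospital_b", "hospital_c"}): 0.9,
}

PLAYERS = ["hospital_a", "hospital_b", "hospital_c"]

phi = shapley_values(PLAYERS, UTILITY.__getitem__)
shares = split_revenue(PLAYERS, UTILITY.__getitem__, revenue=1_000_000)

for name in PLAYERS:
    print(f"{name}: shapley={phi[name]:.4f}, share={shares[name]:.2f}")
```

In this toy setup the Shapley values sum to the utility of the full consortium, and the two hospitals with identical contributions receive identical shares. A real consortium would need an agreed utility measure and, for more than a handful of members, a sampling-based approximation rather than full enumeration.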