An unsupervised decision-support framework for multivariate biomarker analysis in athlete monitoring

📅 2026-04-15

🤖 AI Summary
Current athlete monitoring is hindered by small sample sizes, heterogeneous biomarkers, difficulties in repeated sampling, and the absence of injury labels, which collectively limit the interpretability and practical utility of conventional univariate or binary risk models. This work proposes an unsupervised multivariate analytical framework that integrates modular preprocessing, clinical safety screening, Ward's hierarchical clustering, and Gaussian mixture modeling within a unified biomarker space, augmented with synthetic data to enhance robustness. Without requiring explicit injury labels, the approach effectively discriminates between mechanical injury and metabolic stress mechanisms, identifying latent risk phenotypes with physiological coherence. The method demonstrates robust performance in high-dimensional and synthetically augmented datasets, uncovering risk patterns overlooked by standard monitoring protocols and thereby supporting personalized decision-making in sports medicine.

๐Ÿ“ Abstract
Purpose. Athlete monitoring is constrained by small cohorts, heterogeneous biomarker scales, limited feasibility of repeated sampling, and the lack of reliable injury ground truth. These limitations reduce the interpretability and utility of traditional univariate and binary risk models. This study addresses these challenges by proposing an unsupervised multivariate framework to identify latent physiological states in athletes using real data.

Methods. We propose a modular computational framework that operates in the joint biomarker space, integrating preprocessing, clinical safety screening, unsupervised clustering, and centroid-based physiological interpretation. Profiles are learned exclusively from amateur soccer players during a competitive microcycle. Synthetic data augmentation evaluates robustness and scalability. Ward hierarchical clustering supports monitoring and etiological differentiation, while Gaussian Mixture Models (GMM) enable structural stability analysis in high-dimensional settings.

Results. The framework identifies coherent profiles that distinguish mechanical damage from metabolic stress while preserving homeostatic states. Synthetic data augmentation demonstrates feasibility and detection of latent silent risk phenotypes typically missed by univariate monitoring. Structural analyses indicate robustness under augmentation and higher-dimensional settings.

Conclusion. The framework enables interpretable identification of latent physiological states from multivariate biomarker data without injury labels. By distinguishing mechanisms and revealing silent risk patterns not captured by conventional monitoring, it provides actionable insights for individualized athlete monitoring and decision making.
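As an illustration of the clustering stage described in the Methods, the sketch below standardizes a small biomarker table, merges athletes with Ward's criterion, and reports each cluster's centroid in the original units. It is a minimal sketch under stated assumptions, not the authors' implementation: the two biomarkers (creatine kinase and lactate), the toy values, the choice of three clusters, and every function name are hypothetical, and the safety-screening and GMM stages are left out.

```python
import math
import random


def zscore(rows):
    """Standardize each biomarker column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def centroid(points):
    """Column-wise mean of a list of points."""
    return [sum(c) / len(c) for c in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward's criterion, down to k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([points[p] for p in clusters[i]],
                                 [points[p] for p in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy data: [creatine kinase (U/L), lactate (mmol/L)] per athlete.
random.seed(0)


def sample(ck, lactate, n):
    return [[random.gauss(ck, 20.0), random.gauss(lactate, 0.2)] for _ in range(n)]


rows = sample(150, 1.5, 6) + sample(600, 1.6, 6) + sample(160, 4.0, 6)

z = zscore(rows)               # put heterogeneous scales on a common footing
clusters = ward_clusters(z, 3)  # k=3 is an assumption made for this toy example

# Centroid-based reading: report each cluster's mean in the original units.
for members in clusters:
    raw_centroid = centroid([rows[i] for i in members])
    print(sorted(members), [round(v, 1) for v in raw_centroid])
```

Standardizing first keeps a biomarker with a large numeric range from dominating the Euclidean distances that Ward's criterion relies on. The pairwise search is cubic in the number of athletes, which is tolerable only for small cohorts like the ones the abstract describes.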
Problem

Research questions and friction points this paper is trying to address.

athlete monitoring
multivariate biomarker analysis
unsupervised learning
latent physiological states
injury prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

unsupervised learning
multivariate biomarker analysis
latent physiological states
synthetic data augmentation
Gaussian Mixture Models
Fernando Barcelos Rosito
Federal University of Health Sciences of Porto Alegre (UFCSPA), RS, Brazil
Sebastião De Jesus Menezes
Levino Inova, AC, Brazil
Simone Ferreira Sturza
Levino Inova, AC, Brazil
Adriana Seixas
Federal University of Health Sciences of Porto Alegre (UFCSPA), RS, Brazil
Muriel Figueredo Franco
Federal University of Health Sciences of Porto Alegre (UFCSPA)
Cybersecurity, Network Management, Cybersecurity Planning, Healthcare