🤖 AI Summary
This study addresses a limitation of convex biclustering: its sensitivity to noisy features, which degrades performance in high-dimensional settings. The authors propose a biconvex modification that learns feature weights jointly with the biclusters, so that informative features receive larger weights. Feature selection and biclustering therefore happen simultaneously, rather than by discarding a subset of noisy features beforehand. The authors establish finite-sample bounds on the objective function under sub-Gaussian errors and extend these guarantees to non-uniform input affinities. The method is fit with a proximal alternating minimization algorithm, accompanied by efficient solutions to the optimization subproblems and guidance on hyperparameter tuning. In simulations, the method consistently recovers the underlying biclusters and outperforms peer methods. On a gene microarray dataset of lymphoma samples, it recovers biclusters matching an underlying classification, with the column groupings and fitted weights giving additional interpretation of the mRNA samples.
📝 Abstract
This article proposes a biconvex modification to convex biclustering in order to improve its performance in high-dimensional settings. In contrast to heuristics that discard a subset of noisy features a priori, our method jointly learns which features are informative and weights them accordingly while discovering biclusters. Moreover, the method is adaptive to the data, and is accompanied by an efficient algorithm based on proximal alternating minimization, complete with detailed guidance on hyperparameter tuning and efficient solutions to optimization subproblems. These contributions are theoretically grounded: we establish finite-sample bounds on the objective function under sub-Gaussian errors, and generalize these guarantees to cases where input affinities need not be uniform. Extensive simulation results reveal that our method consistently recovers underlying biclusters while weighting and selecting features appropriately, outperforming peer methods. An application to a gene microarray dataset of lymphoma samples recovers biclusters matching an underlying classification, while giving additional interpretation to the mRNA samples via the column groupings and fitted weights.
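For readers unfamiliar with proximal alternating minimization, the following is a minimal sketch on a toy biconvex problem, not the objective or algorithm from the article. It assumes a rank-one least-squares fit, min over (u, v) of 0.5 * ||X - u v^T||_F^2, which is convex in u for fixed v and in v for fixed u. The function name `pam_rank_one` and the parameters `step` and `n_iter` are hypothetical, chosen for illustration. Each block update minimizes the objective plus a quadratic proximal term that keeps the new iterate near the previous one; in this toy case the update happens to have a closed form.

```python
import numpy as np


def pam_rank_one(X, step=1.0, n_iter=500, seed=0):
    """Proximal alternating minimization on a toy biconvex problem:
    minimize 0.5 * ||X - u v^T||_F^2 over vectors u and v.
    Illustrative only; this is not the article's objective."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    u = rng.standard_normal(n)
    v = rng.standard_normal(p)
    inv = 1.0 / step  # weight of the proximal term (1 / (2 * step)) * ||z - z_old||^2
    history = []
    for _ in range(n_iter):
        # u-step: argmin_u 0.5*||X - u v^T||_F^2 + (inv/2)*||u - u_old||^2
        u = (X @ v + inv * u) / (v @ v + inv)
        # v-step: same form, with the freshly updated u held fixed
        v = (X.T @ u + inv * v) / (u @ u + inv)
        history.append(0.5 * np.linalg.norm(X - np.outer(u, v)) ** 2)
    return u, v, history


# Usage on an exactly rank-one matrix (made-up values for the demonstration).
X = np.outer([1.0, 2.0, 3.0], [1.0, -1.0, 0.5, 2.0])
u, v, history = pam_rank_one(X)
```

Because each block update minimizes the objective plus a nonnegative proximal penalty that vanishes at the previous iterate, the recorded objective values in `history` cannot increase from one iteration to the next. That descent property is what makes the scheme attractive for biconvex problems, where a joint minimization over both blocks is not convex.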