🤖 AI Summary
Out-of-distribution (OOD) detection remains challenging in regression and survival analysis due to the absence of discrete class labels and the difficulty of quantifying predictive uncertainty for continuous outputs.
Method: We propose a model-aware and subspace-aware OOD detection framework for continuous-output tasks. Instead of relying on global distance metrics or full-feature density estimation, the approach uses the fitted predictor to build localized neighborhoods around each test case, emphasizing the features that drive the model's learned relationship and downweighting directions that are less relevant to prediction. Variable prioritization is embedded directly in the detection step. The implementation uses random forests, whose rule structure yields transparent neighborhoods.
Results: Experiments on synthetic and real-data benchmarks designed to isolate functional shifts show consistent improvements over existing methods. In an esophageal cancer survival study, the framework detects distribution shifts related to lymphadenectomy and identifies patterns relevant to surgical guidelines.
📝 Abstract
Out-of-distribution (OOD) detection is essential for determining when a supervised model encounters inputs that differ meaningfully from its training distribution. While widely studied in classification, OOD detection for regression and survival analysis remains limited due to the absence of discrete labels and the challenge of quantifying predictive uncertainty. We introduce a framework for OOD detection that is simultaneously model aware and subspace aware, and that embeds variable prioritization directly into the detection step. The method uses the fitted predictor to construct localized neighborhoods around each test case that emphasize the features driving the model's learned relationship and downweight directions that are less relevant to prediction. It produces OOD scores without relying on global distance metrics or estimating the full feature density. The framework is applicable across outcome types, and in our implementation we use random forests, where the rule structure yields transparent neighborhoods and effective scoring. Experiments on synthetic and real data benchmarks designed to isolate functional shifts show consistent improvements over existing methods. We further demonstrate the approach in an esophageal cancer survival study, where distribution shifts related to lymphadenectomy identify patterns relevant to surgical guidelines.
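To make the idea of model-aware, subspace-aware neighborhoods concrete, the following is a minimal illustrative sketch and not the authors' algorithm, whose details the abstract does not give. It assumes scikit-learn's `RandomForestRegressor`, defines a test point's neighborhood by leaf co-membership across trees, and uses impurity-based feature importances as a stand-in for variable prioritization. The function name `forest_ood_scores`, the scoring rule, and the synthetic data are all hypothetical choices made for illustration; a survival outcome would need a survival forest instead.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def forest_ood_scores(forest, X_train, X_test, eps=1e-12):
    """Hypothetical score: importance-weighted distance from each test point
    to the training points that share leaves with it in the fitted forest.
    Assumes X_train is the data the forest was fitted on, so every leaf
    reached by a test point contains at least one training point."""
    train_leaves = forest.apply(X_train)   # leaf ids, shape (n_train, n_trees)
    test_leaves = forest.apply(X_test)     # leaf ids, shape (n_test, n_trees)
    weights = forest.feature_importances_  # assumed proxy for variable priority
    scale = X_train.std(axis=0) + eps      # put features on a common scale

    scores = np.empty(len(X_test))
    for i, (x, leaves) in enumerate(zip(X_test, test_leaves)):
        # Proximity: fraction of trees in which a training point shares x's leaf.
        prox = (train_leaves == leaves).mean(axis=1)
        # Squared distance in which prediction-relevant features dominate.
        d2 = (((X_train - x) / scale) ** 2 * weights).sum(axis=1)
        # Proximity-weighted average over the model-defined neighborhood.
        scores[i] = np.sqrt((prox * d2).sum() / prox.sum())
    return scores


# Synthetic check: only feature 0 drives the outcome.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(300, 5))
y_train = 3.0 * X_train[:, 0] + 0.1 * rng.normal(size=300)
forest = RandomForestRegressor(
    n_estimators=50, min_samples_leaf=5, random_state=0
).fit(X_train, y_train)

X_in = rng.uniform(-1, 1, size=(50, 5))   # in-distribution test points
X_relevant = X_in.copy()
X_relevant[:, 0] = 3.0                    # shift in the predictive feature
X_irrelevant = X_in.copy()
X_irrelevant[:, 1] = 3.0                  # shift in a noise feature

s_in = forest_ood_scores(forest, X_train, X_in)
s_relevant = forest_ood_scores(forest, X_train, X_relevant)
s_irrelevant = forest_ood_scores(forest, X_train, X_irrelevant)
print("mean score, in-distribution: ", s_in.mean())
print("mean score, relevant shift:  ", s_relevant.mean())
print("mean score, irrelevant shift:", s_irrelevant.mean())
```

In this toy setup a shift along the predictive feature should score well above the same-sized shift along a noise feature, which is the behavior the abstract attributes to downweighting directions that are less relevant to prediction. A global, unweighted distance would treat the two shifts identically.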