🤖 AI Summary
This paper identifies an inherent misalignment in unsupervised combinatorial optimization (UCO): during training, models optimize continuous probabilistic solutions with differentiable losses, whereas testing relies on non-differentiable derandomization to obtain deterministic solutions, so reducing the training loss does not necessarily improve actual test performance. We are the first to systematically formulate this misalignment and validate it empirically across diverse combinatorial optimization tasks. To mitigate it, we integrate differentiable derandomization into the end-to-end training pipeline, enabling consistent optimization of both probabilistic solution modeling and deterministic solution generation. Experiments show that our approach significantly strengthens the correlation between training objectives and test performance, suggesting a new design paradigm for UCO. At the same time, we uncover nontrivial challenges introduced by differentiable derandomization, particularly gradient instability, motivating further investigation into robust training strategies for unsupervised discrete optimization.
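To make the summary's key idea concrete, here is a minimal sketch of one common way to make a thresholding-style derandomization step differentiable: replace the hard threshold on a logit with a temperature-sharpened sigmoid. This is my own illustration of the general technique (in the spirit of temperature annealing, as in Gumbel-softmax-style relaxations), not necessarily the paper's exact scheme; the function names and the temperature values are my choices. It also shows, in miniature, where the gradient instability mentioned above can come from.

```python
# Sketch (illustrative, not the paper's implementation): a differentiable
# surrogate for hard thresholding via a temperature-sharpened sigmoid.
import math

def hard_threshold(theta):
    """Non-differentiable test-time rounding of a logit to a 0/1 decision."""
    return 1.0 if theta >= 0.0 else 0.0

def soft_threshold(theta, tau):
    """Differentiable surrogate sigmoid(theta / tau); tau -> 0 recovers hard."""
    return 1.0 / (1.0 + math.exp(-theta / tau))

def soft_threshold_grad(theta, tau):
    """d soft_threshold / d theta = p * (1 - p) / tau."""
    p = soft_threshold(theta, tau)
    return p * (1.0 - p) / tau

# (a) At low temperature, the soft decision matches the hard one:
assert abs(soft_threshold(2.0, tau=0.05) - hard_threshold(2.0)) < 1e-6
assert abs(soft_threshold(-2.0, tau=0.05) - hard_threshold(-2.0)) < 1e-6

# (b) ... but near the decision boundary (theta = 0) the gradient scales
# like 1/tau, a simple source of the gradient instability noted above:
g_mild = soft_threshold_grad(0.0, tau=1.0)    # 0.25
g_sharp = soft_threshold_grad(0.0, tau=0.01)  # ~25, i.e. 100x larger
assert abs(g_sharp / g_mild - 100.0) < 1e-9
```

The trade-off in (b) is the crux: sharper surrogates track the deterministic decision more faithfully, but their gradients blow up near undecided entries.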
📝 Abstract
In unsupervised combinatorial optimization (UCO), training aims to produce, for each training instance, continuous decisions that are promising in a probabilistic sense; this relaxation enables end-to-end training on problems that are originally discrete and non-differentiable. At test time, for each test instance, derandomization is typically applied to the continuous decisions to obtain the final deterministic decisions. Researchers have developed increasingly powerful test-time derandomization schemes to improve both the empirical performance and the theoretical guarantees of UCO methods. However, we notice a misalignment between training and testing in existing UCO methods: lower training losses do not necessarily entail better post-derandomization performance, even on the training instances themselves, where no data distribution shift exists. Empirically, we indeed observe such undesirable cases. As a preliminary remedy, we explore better aligning training and testing by incorporating a differentiable version of derandomization into training. Our empirical study shows that this idea indeed improves training-test alignment, but it also introduces nontrivial challenges into training.