🤖 AI Summary
This work addresses the difficulty non-expert users face in mapping natural language descriptions of combinatorial optimization problems to expert-authored constraint programming (CP) models. We propose the first unsupervised semantic retrieval system for CP models, which requires no human-annotated data. Our method employs a dual-encoder architecture built on pretrained language models that jointly encodes natural language queries and structured CP models (including constraints, variables, objectives, and their semantics) and optimizes cross-modal semantic similarity via contrastive learning. Evaluated on a benchmark of 42 real-world CP models, the system achieves 89.3% Top-1 accuracy under diverse user phrasings, substantially outperforming baselines, and reduces average modeling time by 67%, demonstrating effective automation of expert knowledge reuse. Key contributions include: (1) the first unsupervised semantic retrieval framework tailored to CP models, and (2) an interpretable, non-expert-friendly assistance system that significantly accelerates CP model construction.
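To make the contrastive objective mentioned above concrete, the sketch below computes an InfoNCE-style loss over already-encoded query and model vectors. It is only an illustration under stated assumptions: the summary does not name the loss, so InfoNCE is a common choice swapped in here, and the function names, the temperature value, and the toy vectors are all hypothetical rather than taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(query_vecs, model_vecs, temperature=0.1):
    """Mean InfoNCE loss, assuming query_vecs[i] is paired with model_vecs[i].

    Every other model in the batch acts as a negative for query i.
    The temperature of 0.1 is an arbitrary illustrative value.
    """
    total = 0.0
    for i, q in enumerate(query_vecs):
        logits = [cosine(q, m) / temperature for m in model_vecs]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of the matching model.
        total += log_denominator - logits[i]
    return total / len(query_vecs)

# Toy 2-D "embeddings": in the aligned case each query points at its own model.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_models = [[1.0, 0.0], [0.0, 1.0]]
swapped_models = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(queries, aligned_models)
loss_swapped = info_nce(queries, swapped_models)
```

The loss is small when each query is most similar to its own model and large when the pairing is swapped, which is the behavior a contrastive objective of this kind rewards during training.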
📝 Abstract
Constraint Programming and its high-level modeling languages have long been recognized for their potential to achieve the holy grail of problem-solving. However, the complexity of modeling languages, the large number of global constraints, and the art of creating good models have often hindered non-experts from choosing CP to solve their combinatorial problems. While generating an expert-level model from a natural-language description of a problem would be the dream, we are not there yet. We propose a tutoring system called CP-Model-Zoo that exploits expert-written models accumulated over the years. CP-Model-Zoo retrieves the closest source-code model from a database based on a user's natural-language description of a combinatorial problem. It ensures that expert-validated models are presented to the user while eliminating the need for human data labeling. Our experiments show excellent accuracy in retrieving the correct model from user-input problem descriptions simulated at different levels of expertise.
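The retrieval step described above can be pictured as a nearest-neighbor search over text associated with each stored model. The sketch below is a minimal stand-in, not the CP-Model-Zoo implementation: it assumes each model is indexed by a short description and uses a simple TF-IDF weighting with cosine similarity in place of whatever representation the authors use. The `MODEL_ZOO` entries and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini database: model name -> short text describing the model.
# These entries are placeholders, not the contents of the paper's database.
MODEL_ZOO = {
    "nqueens": "place n queens on a chessboard so that no two queens attack each other",
    "knapsack": "select items with weights and values to maximize total value within a capacity",
    "jobshop": "schedule jobs on machines to minimize the makespan with no overlap on a machine",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_idf(docs):
    """Smoothed inverse document frequency for every token in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}

def embed(text, idf):
    """Sparse TF-IDF vector; tokens unseen in the corpus are ignored."""
    counts = Counter(tokenize(text))
    return {tok: cnt * idf[tok] for tok, cnt in counts.items() if tok in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, zoo, k=1):
    """Return the names of the k stored models most similar to the query."""
    idf = build_idf(list(zoo.values()))
    query_vec = embed(query, idf)
    scored = [(cosine(query_vec, embed(desc, idf)), name) for name, desc in zoo.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]

best_for_queens = retrieve("I want to put queens on a board so none attack each other", MODEL_ZOO)
best_for_packing = retrieve("pack items into a bag maximizing value under a weight capacity", MODEL_ZOO)
```

Because only stored, expert-written models can be returned, a design like this never shows the user generated code; the quality of the match then depends entirely on how queries and models are represented, which is where a learned text encoder would replace the TF-IDF vectors used here.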