🤖 AI Summary
This work addresses "specification overfitting" in AI systems driven by formal specifications: models that syntactically satisfy a specification yet violate its semantic intent, leading to poor generalization. We give the first formal definition and empirical validation of this phenomenon, introducing a diagnostic framework and benchmark suite that explicitly distinguish syntactic compliance from semantic consistency. The methodology integrates formal verification, adversarial specification generation, behavioral consistency assessment, and large language model-based reasoning analysis. Across multi-task AI verification experiments, 68% of specification-compliant models exhibit semantic failures, and the proposed mitigation strategies improve generalization accuracy by 23.5%. Together, these results establish both theoretical foundations and practical tools for specification-driven development of trustworthy AI systems.
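The core distinction between syntactic compliance and semantic consistency can be illustrated with a minimal, hypothetical sketch (the checks, models, and test cases below are illustrative assumptions, not the paper's actual framework): a degenerate model can satisfy a formal output specification on every input while completely missing the specification's intent.

```python
def check_syntactic(output):
    """Formal specification (illustrative): output must be a non-negative integer."""
    return isinstance(output, int) and output >= 0

def check_semantic(model, cases):
    """Semantic intent (illustrative): the output should equal the sum of the inputs."""
    return all(model(xs) == sum(xs) for xs in cases)

# A degenerate "model" that always returns 0: every output passes the formal
# specification, yet the intended behavior is never realized.
overfit_model = lambda xs: 0
correct_model = lambda xs: sum(xs)

cases = [(1, 2), (3, 4, 5), (0,)]

for name, model in [("overfit", overfit_model), ("correct", correct_model)]:
    syntactic = all(check_syntactic(model(xs)) for xs in cases)
    semantic = check_semantic(model, cases)
    print(f"{name}: syntactic={syntactic}, semantic={semantic}")
# → overfit: syntactic=True, semantic=False
# → correct: syntactic=True, semantic=True
```

The overfit model is exactly the failure mode the paper's diagnostics target: it would pass a checker that only verifies the formal specification, and only a behavioral consistency check against the intended semantics exposes it.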