AI Summary
This work addresses the vulnerability of static validation to strong search processes: a frozen evaluator may erroneously accept spurious solutions that lack genuine mechanistic insight. To overcome this limitation, the authors propose the DASES framework, which orchestrates the co-evolution of an Innovator, an Abyss Falsifier, and a Mechanistic Causal Extractor under a fixed scientific contract. This setup jointly generates executable scientific artifacts and adversarial counterexample environments for rigorous, dynamic evaluation. Through its adaptive falsification mechanism, DASES moves beyond the conventional static-validation paradigm and, in a controlled setting, discovers the first candidate to survive its falsification frontier: a transferable loss function, FNG-CE. Empirical results demonstrate that DASES rejects spurious solutions that static validation would have accepted, and that FNG-CE consistently outperforms standard cross-entropy (CE) and its L2-regularized variant (CE+L2) on benchmarks including ImageNet.
Abstract
Autonomous scientific discovery is entering a more dangerous regime: once the evaluator is frozen, a sufficiently strong search process can learn to pass the exam without learning the mechanism the task was meant to reveal. This is the idea behind our title: to let the abyss stare back is to make evaluation actively push against the candidate through adaptive falsification, rather than passively certify it through static validation. We introduce DASES, a falsification-driven framework in which an Innovator, an Abyss Falsifier, and a Mechanistic Causal Extractor co-evolve executable scientific artifacts and scientifically admissible counterexample environments under a fixed scientific contract. In a controlled loss-discovery problem with a single editable locus, DASES rejects artifacts that static validation would have accepted, identifies the first candidate that survives the admissible falsification frontier, and discovers FNG-CE, a loss that transfers beyond the synthetic discovery environment and consistently outperforms CE and CE+L2 under controlled comparisons across standard benchmarks, including ImageNet.
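To make the contrast with static validation concrete, the sketch below shows the shape of a DASES-style loop: the Innovator proposes an executable candidate, the Abyss Falsifier searches an admissible environment family for a counterexample, and only a candidate that exhausts the falsification budget moves on to mechanistic extraction. This is a minimal toy illustration, not the paper's implementation; every name here (`innovator_propose`, `abyss_falsify`, the `Candidate` and `Environment` types, the toy task of recovering y = x², and the interval-based admissibility rule) is a hypothetical stand-in.

```python
"""Minimal toy sketch of a DASES-style falsification loop.

All names, the toy task (recovering y = x^2), and the admissibility rule
(evaluation intervals inside [0, 1]) are illustrative assumptions, not the
paper's actual implementation.
"""

import random
from dataclasses import dataclass
from typing import Callable, Optional


def true_mechanism(x: float) -> float:
    """Ground-truth law the discovery task is meant to reveal."""
    return x * x


@dataclass
class Candidate:
    name: str
    f: Callable[[float], float]  # executable scientific artifact


@dataclass
class Environment:
    lo: float  # admissible evaluation interval within [0, 1]
    hi: float


def max_error(cand: Candidate, env: Environment, grid: int = 50) -> float:
    """Worst-case disagreement with observed data on the environment's interval."""
    xs = [env.lo + (env.hi - env.lo) * i / (grid - 1) for i in range(grid)]
    return max(abs(cand.f(x) - true_mechanism(x)) for x in xs)


def innovator_propose(rng: random.Random) -> Candidate:
    """Innovator: usually a linear fit to the narrow static exam region
    [0.4, 0.6] (locally accurate, mechanistically wrong); occasionally the
    quadratic form that matches the true mechanism."""
    if rng.random() < 0.2:
        return Candidate("quadratic", lambda x: x * x)
    a, b = 1.0, -0.25  # tangent line to x^2 at x = 0.5
    return Candidate("tangent_line", lambda x: a * x + b)


def static_validation(cand: Candidate, tol: float = 0.05) -> bool:
    """Frozen exam: only ever tests the fixed region [0.4, 0.6]."""
    return max_error(cand, Environment(0.4, 0.6)) < tol


def abyss_falsify(cand: Candidate, budget: int, rng: random.Random,
                  tol: float = 0.05) -> Optional[Environment]:
    """Abyss Falsifier: search the admissible environment family for a
    counterexample where the candidate breaks the tolerance."""
    for _ in range(budget):
        lo = rng.uniform(0.0, 0.8)
        env = Environment(lo, lo + 0.2)
        if max_error(cand, env) >= tol:
            return env
    return None


def dases_loop(rounds: int = 20, budget: int = 200, seed: int = 1):
    rng = random.Random(seed)
    for r in range(rounds):
        cand = innovator_propose(rng)
        if not static_validation(cand):
            continue  # even the frozen exam rejects it
        cx = abyss_falsify(cand, budget, rng)
        if cx is None:
            # Survived the falsification frontier; a real system would now run
            # the Mechanistic Causal Extractor before final acceptance.
            print(f"round {r}: {cand.name} survived {budget} falsification attempts")
            return cand
        print(f"round {r}: {cand.name} passed the static exam "
              f"but was falsified on [{cx.lo:.2f}, {cx.hi:.2f}]")
    return None


if __name__ == "__main__":
    dases_loop()
```

In this toy, the tangent line always passes the frozen exam (its error on [0.4, 0.6] is at most 0.01) yet is quickly falsified on intervals near the edges of the admissible domain, while the mechanistically correct quadratic exhausts the falsification budget; the Mechanistic Causal Extractor step is reduced to a comment here.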