Language Models are Crossword Solvers

📅 2024-06-13
🏛️ arXiv.org
🤖 AI Summary
This work investigates how well large language models (LLMs) solve crossword puzzles, addressing two challenges: deciphering individual cryptic clues and filling complete grids. The authors show that current LLMs are competent at deciphering cryptic crossword clues, outperforming previously reported state-of-the-art results by a factor of 2-3 on relevant benchmarks. Building on this per-clue performance, they develop a search algorithm that solves full crossword grids with out-of-the-box LLMs, which they describe as a first, and report 93% accuracy on New York Times crossword puzzles. They also show that LLMs generalize well and can support their answers with sound rationale.

📝 Abstract
Crosswords are a form of word puzzle that requires a solver to demonstrate a high degree of proficiency in natural language understanding, wordplay, reasoning, and world knowledge, along with adherence to character and length constraints. In this paper we tackle the challenge of solving crosswords with large language models (LLMs). We demonstrate that the current generation of language models shows significant competence at deciphering cryptic crossword clues and outperforms previously reported state-of-the-art (SoTA) results by a factor of 2-3 on relevant benchmarks. We also develop a search algorithm that builds on this performance to tackle the problem of solving full crossword grids with out-of-the-box LLMs for the very first time, achieving an accuracy of 93% on New York Times crossword puzzles. Additionally, we demonstrate that LLMs generalize well and are capable of supporting answers with sound rationale.
Problem

Research questions and friction points this paper is trying to address.

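Solving crosswords, including cryptic clues, with large language models
Outperforming previously reported state-of-the-art results on cryptic clue benchmarks
Solving full crossword grids accurately under character and length constraints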
Innovation

Methods, ideas, or system contributions that make the work stand out.

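Current LLMs decipher cryptic crossword clues, beating prior SoTA results by a factor of 2-3
Search algorithm that solves full crossword grids with out-of-the-box LLMs
93% accuracy on New York Times crossword puzzles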
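
To make the grid-solving idea concrete, here is a minimal sketch of how per-clue candidate answers might be reconciled with a grid's character and length constraints. This listing does not describe the paper's actual search algorithm, so the code uses plain backtracking as a stand-in technique. The function name `solve_grid`, the `slots` and `candidates` structures, and the toy 3x3 grid are all hypothetical, and the hard-coded candidate lists stand in for answers an LLM would propose.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column)


def solve_grid(
    slots: Dict[str, List[Cell]],
    candidates: Dict[str, List[str]],
) -> Optional[Dict[str, str]]:
    """Pick one candidate per slot so that all crossing cells agree.

    Generic backtracking search (an illustrative assumption, not the
    paper's algorithm). Returns None if no consistent fill exists.
    """
    # Try the most constrained slots (fewest candidates) first.
    order = sorted(slots, key=lambda s: len(candidates.get(s, [])))
    letters: Dict[Cell, str] = {}  # letters currently placed in the grid
    chosen: Dict[str, str] = {}    # slot name -> selected answer

    def fits(slot: str, word: str) -> bool:
        cells = slots[slot]
        if len(word) != len(cells):  # length constraint
            return False
        # Character constraint: agree with letters already placed.
        return all(letters.get(c, ch) == ch for c, ch in zip(cells, word))

    def backtrack(i: int) -> bool:
        if i == len(order):
            return True
        slot = order[i]
        for word in candidates.get(slot, []):
            if not fits(slot, word):
                continue
            newly_set = [c for c in slots[slot] if c not in letters]
            for c, ch in zip(slots[slot], word):
                letters[c] = ch
            chosen[slot] = word
            if backtrack(i + 1):
                return True
            # Undo only the cells this word introduced.
            for c in newly_set:
                del letters[c]
            del chosen[slot]
        return False

    return dict(chosen) if backtrack(0) else None


# Toy grid: two across and two down entries around an empty centre cell.
slots = {
    "1A": [(0, 0), (0, 1), (0, 2)],
    "1D": [(0, 0), (1, 0), (2, 0)],
    "2D": [(0, 2), (1, 2), (2, 2)],
    "3A": [(2, 0), (2, 1), (2, 2)],
}
# Hypothetical ranked guesses; the first guess is not always consistent.
candidates = {
    "1A": ["DOG", "CAT"],
    "1D": ["DIP", "COW"],
    "2D": ["GUM", "TEN"],
    "3A": ["WIN"],
}
print(solve_grid(slots, candidates))
```

In this toy case the top-ranked guesses for three of the four entries conflict with the crossing letters, so the search falls back to lower-ranked candidates until every intersection agrees. A real system would also need to handle clues for which no candidate is correct, which this sketch does not attempt.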