Next-Gen CAPTCHAs: Leveraging the Cognitive Gap for Scalable and Diverse GUI-Agent Defense

📅 2026-02-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Traditional CAPTCHAs struggle to distinguish humans from machines in the face of multimodal GUI agents endowed with advanced reasoning capabilities. This work proposes the first scalable human verification framework based on dynamic task generation, leveraging a backend-driven, infinitely generative mechanism to design challenges that exploit cognitive gaps between humans and AI in interactive perception, memory, and intuitive decision-making. By transcending the limitations of static datasets, the framework employs adaptive, non-procedural tasks that effectively counter state-of-the-art models such as GPT-5.2-Xhigh, drastically reducing their success rate on complex logical puzzles from 90% to near-chance levels, thereby reestablishing a robust boundary for human–machine differentiation in the age of intelligent agents.

📝 Abstract

The rapid evolution of GUI-enabled agents has rendered traditional CAPTCHAs obsolete. While previous benchmarks like OpenCaptchaWorld established a baseline for evaluating multimodal agents, recent reasoning-heavy models such as Gemini3-Pro-High and GPT-5.2-Xhigh have effectively collapsed this security barrier, achieving pass rates as high as 90% on complex logic puzzles like "Bingo". In response, we introduce Next-Gen CAPTCHAs, a scalable defense framework designed to secure the next-generation web against advanced agents. Unlike static datasets, our benchmark is built upon a robust data generation pipeline that allows large-scale, easily extended evaluations. Notably, for backend-supported types, our system can generate effectively unbounded CAPTCHA instances. We exploit the persistent human-agent "Cognitive Gap" in interactive perception, memory, decision-making, and action. By engineering dynamic tasks that require adaptive intuition rather than granular planning, we re-establish a robust distinction between biological users and artificial agents, offering a scalable and diverse defense mechanism for the agentic era.
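
The abstract does not describe how the backend generates instances, so the following is only a minimal sketch of one way a backend-driven generator could work, not the authors' pipeline. It assumes a hypothetical "count the target symbol" task, a hypothetical in-memory store `_PENDING`, and hypothetical function names (`generate_challenge`, `issue_challenge`, `verify`). The point it illustrates is that seeding a generator per request yields an effectively unbounded instance space, while the answer never leaves the server.

```python
import random
import secrets
from typing import Optional

# Hypothetical symbol set; the paper's actual task types are not specified here.
SYMBOLS = ["circle", "square", "triangle", "star"]

# Hypothetical in-memory store mapping one-time tokens to expected answers.
_PENDING: dict = {}


def generate_challenge(seed: int, size: int = 4) -> dict:
    """Deterministically build one puzzle instance from a seed."""
    rng = random.Random(seed)
    grid = [[rng.choice(SYMBOLS) for _ in range(size)] for _ in range(size)]
    target = rng.choice(SYMBOLS)
    answer = sum(row.count(target) for row in grid)
    return {"grid": grid, "target": target, "answer": answer}


def issue_challenge(seed: Optional[int] = None) -> dict:
    """Create a fresh instance and return only the client-visible part."""
    if seed is None:
        seed = secrets.randbits(64)  # unpredictable seed per request
    challenge = generate_challenge(seed)
    token = secrets.token_hex(16)
    _PENDING[token] = challenge["answer"]  # answer stays server-side
    return {"token": token, "grid": challenge["grid"], "target": challenge["target"]}


def verify(token: str, response: int) -> bool:
    """Check a response once; the token is consumed whether or not it matches."""
    expected = _PENDING.pop(token, None)
    return expected is not None and expected == response
```

Because each token is removed on first use, a replayed response fails even if it was correct the first time. A counting task like this one would be trivial for a current agent; the sketch shows only the generation and verification plumbing, not a task that exploits the cognitive gap.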

Problem

Research questions and friction points this paper is trying to address.

CAPTCHA

GUI agents

Cognitive Gap

AI security

bot detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Next-Gen CAPTCHAs

Cognitive Gap

Scalable Defense