🤖 AI Summary
To address inefficient prompt routing and the difficulty of adversarial prompt detection in LLM production environments, both stemming from the absence of pre-prompt task-difficulty estimation, this paper introduces "Number of Thoughts" (NofT), the first pre-prompt metric that quantifies task difficulty from chain-of-thought (CoT) reasoning trajectories. NofT turns CoT metadata into computable difficulty features, supporting both intelligent prompt routing and adversarial prompt detection. Methodologically, it pairs a lightweight classifier with multi-scale quantized distilled models (DeepSeek 1.7B/7B/14B) for efficient deployment. Evaluated on MathInstruct, NofT achieves a 2% end-to-end latency reduction and 95% accuracy in adversarial prompt detection, demonstrating both performance optimization and enhanced robustness.
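The routing idea described above can be sketched as a simple threshold rule over the pre-prompt NofT estimate. The thresholds and model identifiers below are illustrative assumptions, not values from the paper:

```python
def route_by_noft(noft: int, low: int = 3, high: int = 8) -> str:
    """Pick a model tier from a pre-prompt Number-of-Thoughts estimate.

    The thresholds `low`/`high` and the model names are hypothetical;
    the paper routes among quantized, distilled DeepSeek models of
    1.7B, 7B, and 14B parameters.
    """
    if noft <= low:
        return "deepseek-1.7b"   # easy prompt: cheapest distilled model
    if noft <= high:
        return "deepseek-7b"     # moderate difficulty: mid-size model
    return "deepseek-14b"        # hard prompt: largest model

# Example routing decisions
print(route_by_noft(2))   # deepseek-1.7b
print(route_by_noft(5))   # deepseek-7b
print(route_by_noft(12))  # deepseek-14b
```

Because easy prompts never touch the larger models, average end-to-end latency drops relative to always serving the 14B model.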
📝 Abstract
In this work, we propose a metric called Number of Thoughts (NofT) to determine task difficulty before prompting and to support Large Language Models (LLMs) in production contexts. By setting thresholds on the number of thoughts, the metric discerns prompt difficulty and enables more effective prompt routing. A 2% decrease in latency is achieved when routing prompts from the MathInstruct dataset through quantized, distilled versions of DeepSeek with 1.7 billion, 7 billion, and 14 billion parameters. Moreover, the metric detects adversarial prompts used in prompt injection attacks with high efficacy: a classifier informed by the Number of Thoughts achieves 95% accuracy in adversarial prompt detection. Our experiments and datasets are available on our GitHub page: https://github.com/rymarinelli/Number_Of_Thoughts/tree/main.
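The adversarial-detection side can likewise be sketched as a lightweight classifier over the NofT feature. The one-dimensional threshold fit and the toy data below are illustrative stand-ins for the paper's classifier, not its actual implementation:

```python
def fit_noft_threshold(nofts, labels):
    """Fit a 1-D threshold classifier on NofT values.

    labels: 1 = adversarial, 0 = benign. Returns the threshold t that
    maximizes training accuracy for the rule "adversarial if noft >= t".
    A minimal sketch of a NofT-informed classifier; the paper's model
    and data are not reproduced here.
    """
    candidates = sorted(set(nofts))
    best_t, best_acc = candidates[0], 0.0
    for t in candidates:
        acc = sum((n >= t) == bool(y) for n, y in zip(nofts, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Toy data: suppose injected prompts elicit anomalously long CoT traces
nofts  = [1, 2, 2, 3, 9, 10, 12, 15]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
t = fit_noft_threshold(nofts, labels)
print(t)  # 9: separates this toy sample perfectly
```

Because the feature is computed pre-prompt, the same signal used for routing doubles as an inexpensive screen for prompt injection.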