🤖 AI Summary
This paper identifies a critical security vulnerability in AI-powered search engines (AIPSEs): even under routine user queries, they frequently generate responses containing malicious URLs—introducing a novel, pervasive threat to end users.
Method: We establish the first AIPSE-specific threat model and a four-tier risk assessment framework; empirically evaluate seven mainstream AIPSEs; demonstrate real-world risks via document forgery and phishing attack case studies; and propose a two-module proxy defense system integrating GPT-4o–based content refinement, XGBoost–driven malicious URL detection, and multi-source threat intelligence fusion (PhishTank, ThreatBook, LevelBlue).
Contribution/Results: Our evaluation shows the defense significantly reduces harmful response rates. Findings reveal severe security deficiencies in current AIPSEs, with risk heightened when users query URLs directly and mitigated by natural-language queries, and provide both theoretical foundations and practical methodologies for robust security evaluation and mitigation standards.
📝 Abstract
Recent advancements in Large Language Models (LLMs) have significantly enhanced the capabilities of AI-Powered Search Engines (AIPSEs), offering precise and efficient responses by integrating external databases with pre-existing knowledge. However, we observe that these AIPSEs raise risks such as quoting malicious content or citing malicious websites, leading to the dissemination of harmful or unverified information. In this study, we conduct the first safety risk quantification on seven production AIPSEs by systematically defining the threat model and risk levels, and by evaluating responses to various query types. With data collected from PhishTank, ThreatBook, and LevelBlue, our findings reveal that AIPSEs frequently generate harmful content containing malicious URLs even under benign queries (e.g., with benign keywords). We also observe that directly querying a URL increases the risk level, while querying in natural language mitigates it. We further perform two case studies on online document spoofing and phishing to show how easily AIPSEs can be deceived in real-world settings. To mitigate these risks, we develop an agent-based defense with a GPT-4o-based content refinement tool and an XGBoost-based URL detector. Our evaluation shows that our defense can effectively reduce the risk, but at the cost of reducing the available information. Our research highlights the urgent need for robust safety measures in AIPSEs.
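The paper's XGBoost-based URL detector is not specified in detail here; the sketch below shows the kind of lexical feature extraction that such malicious-URL classifiers typically use. The feature names and thresholds are illustrative assumptions, not the authors' actual feature set, and the resulting feature dicts would be vectorized and fed to a trained `xgboost.XGBClassifier`.

```python
import re
from urllib.parse import urlparse

def url_features(url: str) -> dict:
    """Extract simple lexical features from a URL.

    Hypothetical features commonly used in malicious-URL detection;
    the paper's real XGBoost feature set may differ.
    """
    parsed = urlparse(url)
    host = parsed.netloc.split(":")[0]  # drop any port
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),            # many subdomains can signal phishing
        "num_hyphens": host.count("-"),
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "num_digits": sum(c.isdigit() for c in url),
        "has_at_symbol": "@" in url,            # '@' can hide the real host
        "uses_https": parsed.scheme == "https",
    }

# In the full pipeline, dicts like these would be labeled using feeds
# such as PhishTank and used to train an XGBoost classifier.
```

A detector like this would sit inside the proxy defense: each URL in an AIPSE response is featurized, scored, and stripped or flagged before the response reaches the user.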