GEO-Detective: Unveiling Location Privacy Risks in Images with LLM Agents

📅 2025-11-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Images shared on social media often carry latent geographic privacy risks, yet existing large vision-language models (LVLMs) are not optimized for geolocation inference, which limits their potential and hinders accurate risk assessment. To address this, we propose a human-inspired, reasoning-chain-based LVLM agent framework that integrates visual reverse search, external knowledge retrieval, and an adaptive multi-step reasoning mechanism, enabling dynamic strategy adjustment and tool invocation. Compared to baseline models, our approach achieves an improvement of over 11.1% in country-level localization accuracy, a gain of around 5.2% in fine-grained (e.g., city-level) localization accuracy, and a reduction of more than 50.6% in the unknown-prediction rate. These results indicate improved robustness and practicality. The framework also provides interpretable, scalable, and modular reasoning for image-based geographic privacy risk assessment.

📝 Abstract

Images shared on social media often expose geographic cues. While early geolocation methods required expert effort and lacked generalization, the rise of Large Vision Language Models (LVLMs) now enables accurate geolocation even for ordinary users. However, existing approaches are not optimized for this task. To explore the full potential and associated privacy risks, we present GEO-Detective, an agent that mimics human reasoning and tool use for image geolocation inference. It follows a four-step procedure that adaptively selects strategies based on image difficulty, and it is equipped with specialized tools such as visual reverse search, which emulates how humans gather external geographic clues. Experimental results show that GEO-Detective outperforms baseline LVLMs overall, particularly on images lacking visible geographic features. On country-level geolocation tasks, it achieves an improvement of over 11.1% compared to baseline LVLMs, and even at finer-grained levels it still provides a performance gain of around 5.2%. Moreover, when equipped with external clues, GEO-Detective becomes more likely to produce accurate predictions, reducing the "unknown" prediction rate by more than 50.6%. We further explore multiple defense strategies and find that GEO-Detective exhibits stronger robustness, highlighting the need for more effective privacy safeguards.
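
The two headline quantities in the abstract, country-level accuracy and the "unknown" prediction rate, can be illustrated with a minimal sketch. The abstract does not say how predictions are normalized or how an abstention is encoded, so the `UNKNOWN` label, the case-insensitive string match, and the function names below are assumptions for illustration, not the paper's evaluation code.

```python
from typing import Sequence

# Assumed label for an abstained prediction; the paper's actual encoding is not given.
UNKNOWN = "unknown"


def _norm(label: str) -> str:
    """Normalize a label for comparison (assumption: case-insensitive exact match)."""
    return label.strip().lower()


def country_accuracy(preds: Sequence[str], truths: Sequence[str]) -> float:
    """Fraction of images whose predicted country equals the ground truth."""
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    if not preds:
        return 0.0
    hits = sum(_norm(p) == _norm(t) for p, t in zip(preds, truths))
    return hits / len(preds)


def unknown_rate(preds: Sequence[str]) -> float:
    """Fraction of images for which the model declined to name a location."""
    if not preds:
        return 0.0
    return sum(_norm(p) == UNKNOWN for p in preds) / len(preds)


if __name__ == "__main__":
    preds = ["France", "unknown", "Japan", "Brazil"]
    truths = ["France", "Spain", "Japan", "Peru"]
    print(country_accuracy(preds, truths))  # 0.5
    print(unknown_rate(preds))              # 0.25
```

An abstention counts as a miss for accuracy in this sketch, so lowering the unknown rate can raise accuracy only when the added guesses are correct.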

Problem

Research questions and friction points this paper is trying to address.

Develops an agent for image geolocation using human-like reasoning and tools

Addresses privacy risks from accurate location inference in social media images

Improves geolocation accuracy, especially for images lacking visible geographic features

Innovation

Methods, ideas, or system contributions that make the work stand out.

Agent mimics human reasoning for geolocation

Adaptive strategy selection based on image difficulty

Uses specialized tools like visual reverse search
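
The combination of adaptive strategy selection and tool use listed above can be sketched roughly as follows. The source does not spell out the four steps, how image difficulty is measured, or the tool interfaces, so everything here is hypothetical: the `Observation` type, the cue-count difficulty heuristic, the `min_cues` threshold, and the tool signature are illustrative assumptions rather than GEO-Detective's actual design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Observation:
    """Hypothetical container for what the agent knows about one image."""
    visible_cues: List[str]                       # e.g. signage text, landmarks
    external_clues: List[str] = field(default_factory=list)


# Hypothetical tool interface: takes the observation, returns extra clues.
Tool = Callable[[Observation], List[str]]


def choose_tools(obs: Observation, tools: Dict[str, Tool], min_cues: int = 2) -> List[str]:
    """Assumed difficulty heuristic: images with few visible cues are 'hard'
    and trigger every available tool; 'easy' images skip tool calls."""
    if len(obs.visible_cues) >= min_cues:
        return []
    return list(tools)


def locate(obs: Observation, tools: Dict[str, Tool],
           guess: Callable[[List[str]], str]) -> str:
    """Gather external clues if needed, then ask the guesser for a location."""
    for name in choose_tools(obs, tools):
        obs.external_clues.extend(tools[name](obs))
    clues = obs.visible_cues + obs.external_clues
    # With no clues at all, abstain instead of guessing.
    return guess(clues) if clues else "unknown"
```

In use, `tools` would map a name such as `"reverse_search"` to a function wrapping a visual reverse search service, and `guess` would wrap the LVLM call; both are stand-ins here and would need real implementations.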

🔎 Similar Papers

The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies