From Prompts to Packets: A View from the Network on ChatGPT, Copilot, and Gemini

📅 2025-10-13
🤖 AI Summary
Problem: Prior work lacks a systematic characterization of the network traffic generated by generative AI chatbots (ChatGPT, Copilot, and Gemini) on Android mobile devices. Method: We build two complementary datasets, one with unconstrained real-world prompts and one controlled, using a capture architecture that enables one-to-one comparisons across apps and content types. We analyze the traffic at three granularities (trace, flow, and protocol), quantify feature contributions through payload masking (occlusion), and model packet sequences with Multimodal Markov Chains. Contribution/Results: We identify three characteristic traffic patterns: sustained upstream activity, a predominance of TLS (ChatGPT uses TLS 1.3 exclusively, while Gemini makes extensive use of QUIC), and app- and content-specific Server Name Indication (SNI) values. These traits distinguish GenAI traffic from conventional messaging apps carrying the same content and introduce new stress factors for mobile networks. Masking the SNI reduces the classification F1-score by up to 20 percentage points, confirming it as an important feature for traffic classification. The datasets are publicly released to support AI traffic identification, network optimization, and security governance.

📝 Abstract
Generative AI (GenAI) chatbots are now pervasive in digital ecosystems, yet their network traffic remains largely underexplored. This study presents an in-depth investigation of traffic generated by three leading chatbots (ChatGPT, Copilot, and Gemini) when accessed via Android mobile apps for both text and image generation. Using a dedicated capture architecture, we collect and label two complementary workloads: a 60-hour generic dataset with unconstrained prompts, and a controlled dataset built from identical prompts across GenAI apps and replicated via conventional messaging apps to enable one-to-one comparisons. This dual design allows us to address practical research questions on the distinctiveness of GenAI traffic, its differences from widely deployed traffic categories, and its novel implications for network usage. To this end, we provide fine-grained traffic characterization at trace, flow, and protocol levels, and model packet-sequence dynamics with Multimodal Markov Chains. Our analyses reveal app- and content-specific traffic patterns, particularly in volume, uplink/downlink profiles, and protocol adoption. We highlight the predominance of TLS, with Gemini extensively leveraging QUIC, ChatGPT exclusively using TLS 1.3, and app- and content-specific Server Name Indication (SNI) values. A payload-based occlusion analysis quantifies SNI's contribution to classification: masking it reduces F1-score by up to 20 percentage points in GenAI app traffic classification. Finally, compared with conventional messaging apps when carrying the same content, GenAI chatbots exhibit unique traffic characteristics, highlighting new stress factors for mobile networks, such as sustained upstream activity, with direct implications for network monitoring and management. We publicly release the datasets to support reproducibility and foster extensions to other use cases.
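To make the packet-sequence modeling mentioned in the abstract more concrete, the following is a minimal sketch of a first-order Markov chain fitted over packet sequences. It assumes each packet is reduced to a joint state of direction and binned payload length; this state definition, the `bin_width` parameter, and the function names are illustrative assumptions, not the authors' actual Multimodal Markov Chain formulation.

```python
from collections import Counter, defaultdict


def to_state(direction, payload_len, bin_width=256):
    """Map a packet to a joint (direction, size-bin) state (assumed encoding)."""
    return (direction, payload_len // bin_width)


def fit_transitions(sequences, bin_width=256):
    """Estimate first-order transition probabilities from packet sequences.

    Each sequence is a list of (direction, payload_len) tuples for one flow.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        states = [to_state(d, n, bin_width) for d, n in seq]
        # Count every observed transition between consecutive packets.
        for cur, nxt in zip(states, states[1:]):
            counts[cur][nxt] += 1
    # Normalize the counts of each row into probabilities.
    return {
        cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for cur, nxts in counts.items()
    }


# Toy flows with made-up packet sizes, for illustration only.
flows = [
    [("up", 120), ("down", 1400), ("down", 1400), ("up", 60)],
    [("up", 100), ("down", 1300), ("up", 80)],
]
P = fit_transitions(flows)
for state, row in P.items():
    print(state, row)
```

Comparing such transition matrices across apps or content types is one simple way to look for differences in uplink/downlink dynamics; the paper's own models may use different or additional modalities.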
Problem

Research questions and friction points this paper is trying to address.

Investigating network traffic patterns of ChatGPT, Copilot, and Gemini chatbots
Comparing GenAI traffic characteristics with conventional messaging applications
Analyzing protocol usage and traffic classification challenges in mobile networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Network traffic analysis using dedicated capture architecture
Modeling packet dynamics with Multimodal Markov Chains
Quantifying SNI impact via payload-based occlusion analysis
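The occlusion idea in the last item can be sketched as follows: overwrite the SNI bytes in a raw payload with a constant, then compare classifier performance on masked versus unmasked inputs. This is only an illustrative sketch. It assumes the hostname is already known and appears verbatim in the payload, whereas a real pipeline would parse the TLS ClientHello to locate the extension. The function name `mask_sni` and the hostname `genai.example` are hypothetical.

```python
def mask_sni(payload: bytes, sni: str, fill: int = 0) -> bytes:
    """Return a copy of payload with every occurrence of the SNI hostname
    overwritten by a constant byte, keeping the payload length unchanged."""
    needle = sni.encode("ascii")
    if not needle:
        return payload
    return payload.replace(needle, bytes([fill]) * len(needle))


# Toy payload: a few header-like bytes, a length-prefixed hostname, a trailer.
hostname = "genai.example"  # hypothetical SNI value
payload = (
    b"\x16\x03\x01"
    + len(hostname).to_bytes(2, "big")
    + hostname.encode("ascii")
    + b"\xff"
)
masked = mask_sni(payload, hostname)
print(masked)
```

Keeping the payload length fixed means the classifier input keeps its shape, so any drop in F1-score between the two runs can be attributed to the hidden bytes rather than to a change in input size.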
Antonio Montieri
University of Napoli Federico II, Italy
Alfredo Nascita
University of Napoli Federico II, Italy
Antonio Pescapè
Professor, University of Napoli Federico II, Italy
Computer Networks, Network Security, Artificial Intelligence, Network Monitoring