🤖 AI Summary
This study addresses the challenge of verifying the attribution of social media screenshots, particularly screenshots of Twitter posts, which are frequently exploited to spread disinformation because they carry no verifiable provenance. We propose an approach that uses web archives (specifically the Internet Archive's Wayback Machine) to trace such screenshots back to their source. The method applies OCR to extract textual content from a screenshot, then uses the account handle, timestamp, and tweet text to query archived pages through the archive's API; string matching and metadata alignment are then used to locate and authenticate the original tweet. Because it relies on archived captures rather than real-time APIs or platform-provided data, the approach can also attribute tweets that have been deleted or are otherwise inaccessible. The method is evaluated on a dataset of 1,571 single-tweet screenshots, offering a forensic approach for countering online disinformation.
📝 Abstract
Screenshots of social media posts are a common way to share information. Unfortunately, before sharing a screenshot, users rarely verify whether the post's attribution is genuine. There are numerous legitimate reasons to share screenshots. However, sharing screenshots of social media posts is also a vector for spreading mis-/disinformation on social media. We are exploring methods to verify the attribution of a social media post shown in a screenshot, using resources found on the live web and in web archives. We focus on the use of web archives, since the attribution of non-deleted posts can be verified relatively easily using the live web. We show how information from a Twitter screenshot (Twitter handle, timestamp, and tweet text) can be extracted and used to locate potential archived tweets in the Internet Archive's Wayback Machine. We evaluate our method on a dataset of 1,571 single-tweet screenshots.
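The lookup step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the regular expressions, the assumed timestamp layout ("10:30 AM · Jun 5, 2020"), and the one-day slack are all hypothetical choices made for illustration. It takes OCR output as plain text, pulls out a handle and timestamp, and builds a query URL for the Wayback Machine's CDX server, restricted to captures of that account's status pages.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Wayback Machine CDX server endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Assumption: the first @mention in the OCR text is the author's handle.
HANDLE_RE = re.compile(r"@([A-Za-z0-9_]{1,15})")
# Assumption: the timestamp is rendered like "10:30 AM · Jun 5, 2020".
TIMESTAMP_RE = re.compile(
    r"(\d{1,2}:\d{2}\s?[AP]M)\s*·\s*([A-Z][a-z]{2} \d{1,2}, \d{4})"
)


def parse_screenshot_text(ocr_text):
    """Return (handle, posted) extracted from OCR text; either may be None."""
    handle_match = HANDLE_RE.search(ocr_text)
    stamp_match = TIMESTAMP_RE.search(ocr_text)
    posted = None
    if stamp_match:
        # Drop whitespace inside the time so "10:30 AM" parses as "10:30AM".
        time_part = re.sub(r"\s", "", stamp_match.group(1))
        posted = datetime.strptime(
            f"{time_part} {stamp_match.group(2)}", "%I:%M%p %b %d, %Y"
        )
    handle = handle_match.group(1) if handle_match else None
    return handle, posted


def build_cdx_query(handle, posted, slack_days=1):
    """Build a CDX query URL for archived status pages of one account.

    A capture cannot predate the tweet, so only a lower bound is set.
    The slack allows for the screenshot showing the viewer's local time.
    """
    params = {
        "url": f"twitter.com/{handle}/status/",
        "matchType": "prefix",  # match every URL under this path
        "from": (posted - timedelta(days=slack_days)).strftime("%Y%m%d"),
        "output": "json",
        "filter": "statuscode:200",  # keep successful captures only
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    sample = "Jane Doe\n@jane_doe\nHello world\n10:30 AM · Jun 5, 2020"
    handle, posted = parse_screenshot_text(sample)
    print(build_cdx_query(handle, posted))
```

The captures returned by such a query would still need to be fetched and compared against the extracted tweet text before any attribution can be confirmed; the sketch covers only candidate retrieval.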