Archiving Twitter

In the wake of billionaire Elon Musk’s purchase and reorganization of Twitter, many users are raising questions about the long-term prospects for the survival of the social media platform. Staff reductions and the downgrading of microservices have already caused problems, such as glitches in the platform’s two-factor authentication (2FA) system. The bungled launch of a paid verified check-mark service was exploited by impersonators, further alienating advertisers who were already nervous about the new leadership. Sensing chaotic times ahead, some users are leaving to join alternative platforms such as Mastodon or Post. Cliff Lampe, professor and associate dean for academic affairs at the University of Michigan's School of Information, expressed a dire view of Twitter’s future under Musk, stating that he expected the sale “would eventually kill the platform.”

Anxieties about Twitter’s vulnerability are a reminder that social media accounts are not permanent or reliable repositories for our personal digital content. If we want to preserve some of what’s valuable about these digital environments, it's important that we take control of our own content before the platform itself breaks or becomes unusable.

There are a few ways to make a copy of your Twitter content. One is a built-in feature of Twitter itself which enables users to request archive packages of their account data. As Twitter's services continue to degrade, you may want to take advantage of this feature sooner rather than later, even if you plan on keeping your Twitter account active.

To request your Twitter archive:

  1. In the left sidebar menu, click the “More” icon (circle with three dots inside)
  2. Under “Settings and Support,” click “Settings and privacy”
  3. Under “Your Account,” click “Download an archive of your data”
  4. Enter your password. You may then need to enter a verification code sent via email or text message
  5. Click the “Request archive” button

After you request your data, it may take several days to receive the email with the link to download the ZIP archive package. After you download it and extract the files, you’ll be able to view a version of your account by opening the “Your archive.html” file in your browser. For more detailed instructions, including screenshots, check out this guide from Ars Technica: How to Download a Backup Copy of Your Twitter Data (or Deactivate Your Account).

Twitter's archive package is mostly limited to what you’ve put into the system. It doesn't include most of the content created by other users you interacted with. For example, in your Likes section, you can see the text of tweets you liked, but the tweeter’s handle, images, and other media will be replaced by a “t.co” shortened outlink to the live tweet, which relies on the linked account remaining active. Likewise, you will only see your half of Direct Messages. Johan van der Knijff shares tools and workarounds for some of these shortcomings on his blog: How to Preserve Your Personal Twitter Archive.

If you want the Internet Archive's Wayback Machine to retain a public copy of your Twitter feed, there's now an option to upload the "tweets.js" file from your Twitter archive's Data folder to a Save Page Now Google Sheets interface (free Archive.org user account required). The full instructions are here: How to Archive Your Tweets with the Wayback Machine.
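If you would like to check what the "tweets.js" file contains before uploading it anywhere, the following is a minimal Python sketch for reading it. It rests on assumptions that are not stated in this article and that may not hold for every archive: that the file is a JavaScript assignment whose value is a JSON array, and that each entry wraps its data in a "tweet" object with "created_at" and "full_text" fields. The function names are made up for illustration.

```python
import json
from pathlib import Path


def load_tweets(js_path):
    """Return the list of records stored in an archive's tweets.js file."""
    text = Path(js_path).read_text(encoding="utf-8")
    # Assumption: the file looks like `window.YTD.tweets.part0 = [ ... ]`,
    # so the JSON array starts at the first "[" in the file.
    start = text.index("[")
    # raw_decode parses one JSON value and ignores anything after it,
    # such as a trailing semicolon or newline.
    records, _end = json.JSONDecoder().raw_decode(text, start)
    return records


def summarize(records):
    """Yield (created_at, full_text) pairs, tolerating missing fields."""
    for record in records:
        # Assumption: each record nests its data under a "tweet" key.
        tweet = record.get("tweet", record)
        yield tweet.get("created_at", ""), tweet.get("full_text", "")
```

Calling load_tweets with the path to the "tweets.js" file in your extracted Data folder, then looping over summarize(...), would print a plain list of dates and tweet texts. If your file is laid out differently, the field names above will need adjusting.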

Another relatively user-friendly option for copying Twitter content is Webrecorder's free ArchiveWeb.page Desktop App and Chrome extension. Webrecorder harvests data through the browser, which can be done manually by clicking on and loading the tweets and media you want to save, or through an automatic setting that will open and load each tweet. The archived data is contained in a Web Archive Collection Zipped (WACZ) package that you can download to your computer. WACZ files can be viewed in your browser using ReplayWeb.page.  
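Because a WACZ package is a zipped collection, you can peek inside one without any special software. The sketch below uses only Python's standard zipfile module to list the files in a package and their sizes; it assumes the download is an ordinary ZIP container, and the function name is hypothetical. It is a quick way to confirm that a capture is not empty, not a replacement for viewing it in ReplayWeb.page.

```python
import zipfile


def list_wacz_entries(wacz_path):
    """Return (name, size_in_bytes) for every file inside a WACZ package."""
    # Assumption: the .wacz file can be opened as a standard ZIP archive.
    with zipfile.ZipFile(wacz_path) as package:
        return [(info.filename, info.file_size) for info in package.infolist()]
```

Running list_wacz_entries on a downloaded package and printing the result shows what the capture contains and how large each piece is, which can help you spot a capture that stopped early.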

For tech-savvy users who are comfortable with the command line and Docker, there are more advanced tools for collecting data from Twitter's Application Programming Interface (API). The University of Michigan’s Bentley Historical Library uses a tool called Twarc, developed by Documenting the Now, to archive institutional Twitter accounts. Social Feed Manager, a project maintained by George Washington University Libraries, collects data from the Twitter API to support academic research. The outputs of these tools tend to be data-focused and are less concerned with replicating the “look and feel” of Twitter itself. Results may be exported as JavaScript Object Notation (JSON) files or spreadsheets that are not designed for an easy reading experience.
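To give a sense of what working with this kind of output involves, here is a hedged Python sketch that copies a few fields from a JSON export into a spreadsheet-friendly CSV file. It assumes the export holds one JSON object per line, and the field names passed in (for example "id", "created_at", and "text") are placeholders; the actual names depend on the tool and API version that produced the file. The function itself is illustrative and is not part of Twarc or Social Feed Manager.

```python
import csv
import json


def jsonl_to_csv(jsonl_path, csv_path, fields):
    """Copy the chosen top-level fields from a JSON-lines file into a CSV file."""
    count = 0
    with open(jsonl_path, encoding="utf-8") as source, \
            open(csv_path, "w", newline="", encoding="utf-8") as target:
        writer = csv.DictWriter(target, fieldnames=fields)
        writer.writeheader()
        for line in source:
            if not line.strip():
                continue  # skip blank lines
            # Assumption: every non-blank line is one complete JSON object.
            record = json.loads(line)
            # Missing fields become empty cells instead of raising an error.
            writer.writerow({name: record.get(name, "") for name in fields})
            count += 1
    return count  # number of rows written, not counting the header
```

Only top-level fields are handled here. Nested values such as user profiles or media lists would need extra flattening before they read well in a spreadsheet.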

Twitter users have multiple options to copy content to ensure that it survives beyond the end of the platform itself. However, much of what makes Twitter a valuable digital space can't be preserved. Twitter's particular information landscape, with its networks of professional and personal connections and layers of real-time interactivity, is still at risk. Its loss would have a huge impact on journalists, historians, and researchers who rely on Twitter as a valuable record of human experience. Will another social media platform be able to emulate or replicate Twitter's function as an information resource, or will this aspect of Twitter disappear with the platform itself?