Archiving my tweets


As you may be aware, Twitter has been going down the toilet for a while, especially since noted asshole Jack Dorsey sold it to noted even bigger asshole Elon Musk. As you may also be aware, I used to use Twitter quite a bit, and I also generally enjoy referring back to my older social media posts. So while I moved to Mastodon a while ago (first the now-defunct mastodon.technology, then Wikis World), I would still like to have access to my old tweets.

After opening tabs for several Twitter archivers and then not doing anything with them for ca. two years, I’ve now finally tried them out, picked one I liked, and started setting it up. My archiver of choice is Darius Kazemi aka Tiny Subversions’ simple Twitter archiver, which is very easy to use (it runs in the browser: drop in your ZIP file, get a new ZIP file out), fast, and produces output that I find useful and pleasing. Its URL format also lends itself to supporting multiple archives next to each other, which should be useful later; I only had to tweak the app.js and index.html files slightly to make this work.

You can find the @LucasWerkmeistr Twitter archive at twitter.lucaswerkmeister.de/LucasWerkmeistr/, and individual tweets at URLs like twitter.lucaswerkmeister.de/LucasWerkmeistr/status/1154827181387898882, easily rewritable from the original URL twitter.com/LucasWerkmeistr/status/1154827181387898882 – just replace the twitter.com domain with twitter.lucaswerkmeister.de. The archive also includes a search feature; I’m slightly nervous about someone using it to find stupid things I wrote years ago, but I think on the whole it should be okay as long as people keep in mind that this is an archive of things I posted in the past, not necessarily an enthusiastic endorsement that I would post all of it in exactly the same way again today.
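If you have a pile of old links to convert, the domain swap described above can be sketched in a few lines of Python. This is only an illustration: the function name `to_archive_url` and the set of accepted Twitter hostnames are my own assumptions, and dropping query strings like `?s=20` is a choice made here, not something the archive requires. It also only makes sense for accounts that are actually archived on that domain.

```python
from urllib.parse import urlsplit, urlunsplit

ARCHIVE_HOST = "twitter.lucaswerkmeister.de"
# Assumption: these are the hostnames old links are likely to use.
TWITTER_HOSTS = {"twitter.com", "www.twitter.com", "mobile.twitter.com"}


def to_archive_url(url: str) -> str:
    """Rewrite a twitter.com tweet URL to the equivalent archive URL."""
    # Accept scheme-less input such as "twitter.com/user/status/123".
    parts = urlsplit(url if "://" in url else "https://" + url)
    if parts.netloc.lower() not in TWITTER_HOSTS:
        raise ValueError(f"not a Twitter URL: {url}")
    # Keep only the path; query string and fragment are dropped (assumption).
    return urlunsplit(("https", ARCHIVE_HOST, parts.path, "", ""))


if __name__ == "__main__":
    print(to_archive_url("twitter.com/LucasWerkmeistr/status/1154827181387898882"))
```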

The biggest shortcoming of this Twitter archiver, which it shares with the other options I tried, is that it doesn’t include the alt text of the images I posted. This is very unfortunate, as the alt text is important for accessibility, and also I put a lot of work into those alt texts if you add it up and I don’t want to lose it. (The alt text is not included in Twitter’s own exports / archives, so any tool that’s based on them is going to have the same limitation. I assume in principle it would be possible for someone to build a tool that fetches the alt text from Twitter now, as long as the tweets haven’t been deleted yet, but I don’t know if anyone’s done that and I’m certainly not interested in switching to a different archiver now.) So I picked out my most popular tweets – using the command sed 's/^const searchDocuments = //' searchDocuments.js | jq '[.[] | select(.full_text | contains("tweets_media"))] | sort_by(.favorite_count | tonumber | - .) | .[]' | less which picks them out of the search data – and manually copied the alt text for those, as well as for plenty of other tweet threads where I wanted to keep the alt text, just by copying it from the existing tweets (which I haven’t deleted yet) and hand-editing it into the HTML files. (Emacs’ syntax highlighting tells me when I need to switch between alt="" and alt='', and/or HTML-escape individual single quotes as an entity like &#39;, because of quote characters in the alt text.) I expect I’ll keep doing this for a few more days after this blog post goes up, and once I decide I’ve copied enough alt text, then it will probably finally be time for me to, at long last, delete my account.
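For anyone without jq at hand, the same selection can be sketched in Python. This is a rough equivalent, not what I actually ran: it assumes, like the command above, that searchDocuments.js consists of the prefix `const searchDocuments = ` followed by a JSON array whose entries have `full_text` and a numeric-string `favorite_count`. The function names are made up for this example.

```python
import json

PREFIX = "const searchDocuments = "


def load_search_documents(js_text: str) -> list:
    """Strip the JavaScript assignment prefix and parse the JSON array after it."""
    if not js_text.startswith(PREFIX):
        raise ValueError("unexpected start of searchDocuments.js")
    # raw_decode stops at the end of the JSON value, so a trailing ";" is ignored.
    docs, _end = json.JSONDecoder().raw_decode(js_text[len(PREFIX):].lstrip())
    return docs


def media_tweets_by_likes(docs: list) -> list:
    """Tweets whose text mentions tweets_media, most-liked first."""
    with_media = [d for d in docs if "tweets_media" in d["full_text"]]
    # Sorting on the negated count mirrors the jq expression and stays stable.
    return sorted(with_media, key=lambda d: -float(d["favorite_count"]))


if __name__ == "__main__":
    with open("searchDocuments.js", encoding="utf-8") as f:
        for tweet in media_tweets_by_likes(load_search_documents(f.read())):
            print(tweet["favorite_count"], tweet["full_text"][:80])
```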

(Side note one: as the archive shows the full thread of a tweet on each page, but still has a separate page per tweet, threaded tweets are included in several copies – once per tweet in the thread. I’ve generally only manually added the alt text to the page for the “top” tweet in the thread – for instance, this page has the alt text for this tweet – so if you ended up on a page in the middle of a thread, you might want to follow the link at the top. If I can be bothered, maybe I’ll later write a little program that syncs the alt= attributes between different pages.)
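Such a sync program does not exist yet, but its core might look roughly like the sketch below. Everything here is an assumption: that images show up as plain `<img>` tags with quoted `src` and `alt` attributes, that the same image has the same `src` string on every page, and that regular expressions are good enough for this markup (they will misfire if an attribute value itself contains something like `src="..."`). Using `html.escape` also takes care of the quote juggling mentioned above.

```python
import html
import re

# An <img> tag, allowing ">" inside quoted attribute values.
IMG_TAG = re.compile(r"""<img\b(?:[^>"']|"[^"]*"|'[^']*')*>""", re.IGNORECASE)
# A quoted src or alt attribute (but not e.g. data-src).
ATTR = re.compile(r"""(?<![\w-])(src|alt)\s*=\s*(?:"([^"]*)"|'([^']*)')""", re.IGNORECASE)


def _attrs(tag: str) -> dict:
    """Map 'src'/'alt' to (unescaped value, span within the tag)."""
    found = {}
    for m in ATTR.finditer(tag):
        value = m.group(2) if m.group(2) is not None else m.group(3)
        found[m.group(1).lower()] = (html.unescape(value), m.span())
    return found


def collect_alts(page: str) -> dict:
    """Collect src -> alt text for every image with a non-empty alt."""
    alts = {}
    for tag in IMG_TAG.findall(page):
        attrs = _attrs(tag)
        if "src" in attrs and "alt" in attrs and attrs["alt"][0].strip():
            alts[attrs["src"][0]] = attrs["alt"][0]
    return alts


def apply_alts(page: str, alts: dict) -> str:
    """Fill in missing or empty alt attributes; existing alt text is never overwritten."""
    def fill(match):
        tag = match.group(0)
        attrs = _attrs(tag)
        src = attrs["src"][0] if "src" in attrs else None
        if src not in alts:
            return tag
        new_attr = 'alt="' + html.escape(alts[src], quote=True) + '"'
        if "alt" not in attrs:
            end = -2 if tag.endswith("/>") else -1
            return tag[:end].rstrip() + " " + new_attr + tag[end:]
        if attrs["alt"][0].strip():
            return tag
        start, stop = attrs["alt"][1]
        return tag[:start] + new_attr + tag[stop:]
    return IMG_TAG.sub(fill, page)
```

The idea would be to run `collect_alts` over every page first, merge the results, and then write `apply_alts(page, merged)` back to each file, ideally after making a backup.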

(Side note two: in the original output of the archiver, links to other tweets in the tweets themselves weren’t updated; I fixed the links pointing to my own tweets using the command sed -i.sedbak -E 's|<a href="https://twitter.com/LucasWerkmeistr/status/([1-9][0-9]*)">https://twitter.com/LucasWerkmeistr/status/\1</a>|<a href="https://twitter.lucaswerkmeister.de/LucasWerkmeistr/status/\1">https://twitter.lucaswerkmeister.de/LucasWerkmeistr/status/\1</a>|g' */index.html – i.e. making them point back to the archive rather than the original URL. If anyone else I’ve interacted with has also archived their tweets in a similar way, let me know and I can make my archived tweets point to your archive 🎉)
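If more accounts end up in the archive, the same rewrite could be done for several usernames at once. Here is a hedged Python sketch of the substitution; the function name and the `accounts` parameter are invented for this example, and unlike the sed command it escapes the dots in `twitter.com`. Like the sed version, it only touches links whose text is identical to their target.

```python
import re

ARCHIVE_HOST = "twitter.lucaswerkmeister.de"


def rewrite_own_links(page: str, accounts=("LucasWerkmeistr",)) -> str:
    """Point links to the given accounts' tweets at the archive instead of twitter.com."""
    names = "|".join(re.escape(account) for account in accounts)
    pattern = re.compile(
        r'<a href="https://twitter\.com/(' + names + r')/status/([1-9][0-9]*)">'
        r"https://twitter\.com/\1/status/\2</a>"
    )

    def repl(match):
        url = f"https://{ARCHIVE_HOST}/{match.group(1)}/status/{match.group(2)}"
        return f'<a href="{url}">{url}</a>'

    return pattern.sub(repl, page)


if __name__ == "__main__":
    link = "https://twitter.com/LucasWerkmeistr/status/1154827181387898882"
    print(rewrite_own_links(f'<a href="{link}">{link}</a>'))
```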

As part of copying the alt text, I also rediscovered many of my threads that I liked, so here’s a selection of some of them – some of my favorite tweets and threads:

As I mentioned, the archive’s URL format can accommodate multiple Twitter accounts’ archives on the same domain, so I expect at some point (hopefully soon) I’ll also set up twitter.lucaswerkmeister.de/WikidataFacts/ and twitter.lucaswerkmeister.de/ItsBiNotHetero/ – stay tuned 🙂