I’ve been blogging for a long time. Over the years, I’ve linked to tens of thousands of websites. Inevitably, some of those sites have gone. Even when sites still exist, webmasters seem to have forgotten that Cool URIs Don’t Change.
It doesn’t always work, of course. Sometimes the page will have been taken over by spammers, and the snapshot reflects that.
This isn’t some SEO gambit. I believe that the web works best when users can seamlessly surf between sites. Forcing them to search for information is user-hostile.
What I’m trying to achieve
When a visitor clicks on a link, they should get (in order of preference):
- The original page
- An archive.org view of the page
  - Ideally the most recent snapshot
  - If the recent snapshot doesn’t contain the correct content, a snapshot of the page around the time the link was made
  - A snapshot of the site’s homepage around the time the link was made
- A replacement page. For example, Topsy used to show who had Tweeted about your page. Apple killed Topsy – so now I point to Twitter’s search results for a URL.
- If there is no archive, and no replacement, and the link contains useful semantic information – leave it broken.
- Remove the link.
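Here is a rough sketch of that order of preference in Python. It assumes the Wayback Machine availability API at `https://archive.org/wayback/available`, which takes a `url` and an optional `timestamp` (YYYYMMDD) and returns the closest snapshot as JSON. Everything else is made up for illustration: `choose_target`, `is_alive`, `lookup`, `replacement` and `keep_broken` are hypothetical names, and the two callables are passed in so the network calls can be swapped out. It can't tell whether a snapshot has the *correct* content (spammers included), so that step still needs a human.

```python
from typing import Callable, Optional
from urllib.parse import urlencode, urlsplit

WAYBACK_API = "https://archive.org/wayback/available"


def availability_query(url: str, timestamp: Optional[str] = None) -> str:
    """Build an availability query; with no timestamp the API returns the latest capture."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp  # YYYYMMDD, e.g. the date the post was published
    return WAYBACK_API + "?" + urlencode(params)


def snapshot_url(response: dict) -> Optional[str]:
    """Pull the closest snapshot URL out of a decoded API response, if there is one."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def homepage(url: str) -> str:
    """Reduce a URL to the site's homepage."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def choose_target(
    url: str,
    linked_on: str,
    is_alive: Callable[[str], bool],   # hypothetical: does the original still resolve?
    lookup: Callable[[str], dict],     # hypothetical: fetch a query URL, return decoded JSON
    replacement: Optional[str] = None,
    keep_broken: bool = False,
) -> Optional[str]:
    """Return the best link target, or None if the link should be removed."""
    if is_alive(url):
        return url
    # Most recent snapshot, then one near the link date, then the homepage near that date.
    for candidate, when in ((url, None), (url, linked_on), (homepage(url), linked_on)):
        snap = snapshot_url(lookup(availability_query(candidate, when)))
        if snap:
            return snap
    if replacement:
        return replacement
    if keep_broken:  # the dead link still carries useful semantic information
        return url
    return None
```

In practice `lookup` would fetch the query URL and decode the JSON, and `is_alive` would make a request to the original page. Both are left out here because how strict to be about redirects and soft 404s is a judgement call.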
Some links are from people leaving comments, and setting their comments. Is it useful for future web historians to know that Blogger Profile 1234 commented on my blog and your blog?
Some links are only temporarily dead (for tax reasons?) – so I tend to leave them broken.
The Internet Archive say that “If you see something, save something”. So, going forward, I’ll submit every link out from my blog to the Archive. I’m hoping to find a plugin to automate that – any ideas?
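In the meantime, here is a minimal sketch of what such automation might do, in Python rather than as a real plugin. It pulls the outbound links from a post's HTML and turns each into a Save Page Now address of the form `https://web.archive.org/save/<url>`. The names `LinkCollector`, `outbound_save_urls` and `own_host` are hypothetical, and it only builds the addresses. Actually requesting them, and staying within whatever rate limits the Archive applies, is left out.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlsplit

SAVE_ENDPOINT = "https://web.archive.org/save/"


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a chunk of HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def outbound_save_urls(html: str, own_host: str) -> List[str]:
    """Return a Save Page Now address for each distinct external http(s) link."""
    parser = LinkCollector()
    parser.feed(html)
    urls: List[str] = []
    for href in parser.links:
        parts = urlsplit(href)
        # Skip relative links, mailto: and the like, and links back to the blog itself.
        if parts.scheme in ("http", "https") and parts.hostname != own_host:
            save = SAVE_ENDPOINT + href
            if save not in urls:
                urls.append(save)
    return urls
```

Running this over each post as it is published, then requesting every address it returns, would cover the “save something” half. Whether the Archive accepts each submission is not something this sketch checks.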