The future of the web, isn't the web


A fist emerges from a computer screen and punches the user.

My friends, and former employers, at the Government Digital Service have written a spectacularly good blog post, "Making GOV.UK more than a website". In it, they describe how adding Schema.org markup to their website has allowed search engines to extract semantic content and display it to a user. For example, the "Learn to drive" page has content which can appear directly in a search engine. Even better, if you ask Siri / Google / Alexa for something, it can give an answer from an…
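To make the idea of Schema.org markup a little more concrete, here is a minimal sketch of how a step-by-step page might be described as JSON-LD. This is not GOV.UK's actual markup: the choice of the `HowTo` and `HowToStep` types is an assumption, and the function names and step text are hypothetical placeholders.

```python
import json


def howto_jsonld(name, steps):
    """Build a Schema.org HowTo description as a plain dictionary.

    The HowTo / HowToStep types are an assumption about what such a page
    might use; the excerpt does not say which types GOV.UK chose.
    """
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [{"@type": "HowToStep", "text": text} for text in steps],
    }


def as_script_tag(data):
    """Wrap the JSON-LD in the script element that crawlers look for."""
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"


# Placeholder step text, for illustration only.
doc = howto_jsonld("Learn to drive", ["First step (placeholder)", "Second step (placeholder)"])
print(as_script_tag(doc))
```

A search engine or voice assistant that understands the vocabulary can then read the steps without having to scrape the visible page.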

Can you trust CloudFlare with your personal data?


Email with CloudFlare's new privacy policy.

I'm increasingly concerned about the power that CDNs wield - and CloudFlare in particular. So I decided to delete my CloudFlare account. While they claim to have removed my account, they still seem to count me as an active customer. I wonder how many people bought shares in their IPO based on inaccurate customer numbers?
Timeline
2019-08-04 I raised a support ticket to close my account.
2019-08-05 CloudFlare sent me confirmation that they'd removed my account.
2019-10-02 I received an…

The Great(er) Bear - using Wikidata to generate better artwork


A close up of the map.

One of my favourite works of art is The Great Bear by Simon Patterson. At first glance, it appears to be a normal London Tube map. But look closer... Cool! But there is something about it which has always bothered me. Each Tube line represents a theme - therefore, a station at the intersection of multiple lines should be represented by someone who matches all of those themes. For example, here's Barons Court - the intersection of the Explorer line and the Saint line - represented by…

Two years of home heating data


A complicated graph.

I have a Tado smart thermostat - part of my smarthome project. As well as letting me set the temperature from my phone, it records environmental data, and provides a handy API for me to retrieve it. This blog post will show you why I've gathered the data, let you download the full dataset, and explain what I learned from it. Why do this? There's a long-standing plan to use waste-heat from a nearby supermarket to provide communal heat to our neighbourhood. A low temperature heat main…

Personalisation is Asymmetric Psychological Warfare


Another privacy nightmare. An airline wants its cabin crew to know your birthday and favourite drinks order, to better personalise its service to you. My first instinct is to recoil in horror. It sounds like every dystopian sci-fi epic. But why do I feel this way? Partly it is the lack of genuine personality behind the interaction. It is the Uncanny Valley of sincerity. When Facebook wishes you happy birthday, it is a purely mechanical response - not an outpouring of genuine feeling. There's …

KYLI - because it is superior to JSON


This is a (silly) attempt to fix some of the shortcomings of JSON. Hence it is named after the goddess of music. It uses C0 Control Characters. Here is an example: ␜ ␁ This is a KYLI document ␂ ␝ GroupName ␞ data ␟ value ␛ Comments are supported too! They can be multilined easily. ␙ I've used Unicode Control Pictures so you can see what's happening. In reality, ␜ is the File Separator control character (U+001C) - which may not show up at all on your display. Why KYLI is 100x better than crummy J…
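The relationship between the visible Control Pictures and the real control characters can be sketched in a few lines. This relies only on the Unicode layout, where the pictures U+2400 to U+241F sit exactly 0x2400 above the C0 controls U+0000 to U+001F. The function name is hypothetical and this is not a KYLI parser.

```python
def pictures_to_controls(text):
    """Swap each Unicode Control Picture for the C0 control character it depicts.

    Only U+2400..U+241F are converted; everything else passes through unchanged.
    """
    converted = []
    for ch in text:
        code_point = ord(ch)
        if 0x2400 <= code_point <= 0x241F:
            # The picture for a control character sits 0x2400 above it.
            converted.append(chr(code_point - 0x2400))
        else:
            converted.append(ch)
    return "".join(converted)


# Part of the example above, turned into real control characters.
raw = pictures_to_controls("\u241c \u2401 This is a KYLI document \u2402")
print(repr(raw))
```

Printing with `repr` keeps the otherwise invisible characters readable as escapes such as `\x1c`.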

Which Twitter User Receives The Most Citations on Wikipedia?


The Twitter logo.

A few days ago, I was somewhat surprised to find that one of my Tweets had been used as a citation in Wikipedia! I began to wonder - how often are Tweets used in citations? It's possible to search for your own Tweets using this (somewhat obscure) link:
https://en.wikipedia.org/w/index.php?title=Special%3ALinkSearch&target=twitter.com%2Fedent
Just edit the end of it to see if you, or your friends, have been cited. Note - the username is case sensitive, so "Edent" isn't the same as "edent". …
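Rather than editing the link by hand, the same URL can be built for any username. This is a small sketch using only the query parameters visible in the link above; the function name is hypothetical.

```python
from urllib.parse import urlencode


def linksearch_url(username):
    """Build the Special:LinkSearch URL for a Twitter username.

    The username is passed through as-is because the search is case sensitive.
    """
    query = urlencode({
        "title": "Special:LinkSearch",
        "target": f"twitter.com/{username}",
    })
    return "https://en.wikipedia.org/w/index.php?" + query


print(linksearch_url("edent"))
```

`urlencode` percent-encodes the colon and the slash, so the result for "edent" is the same link as the one quoted above.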

A Complete List of Every UK Government Domain Name


The GOV.UK logo.

Eight years after I published this blog post, I helped officially release all these domain names as open data! Funny how life works out, eh? Would you like to know every domain name the UK Government had registered? Of course you would! There could be all sorts of interesting tit-bits hidden in there (ProtectAndSurvive.gov.uk? EbolaOutbreak2017.nhs.uk? MinistryOfTruth.police.uk?) Rather than relying on Freedom of Information requests, or Open Data, we can go straight to the source of domain …

Big Data As A Lethal Weapon


Yesterday I attended an OII talk on the Ethical Treatment of Data in New Digital Landscapes. Amy O'Donnell from Oxfam led a discussion about how the charity is seeking to improve the way that Aid Agencies deal with the data they collect. Oxfam collects data for many different reasons - sometimes it is incidental (for example the bank account details it needs to make payments), sometimes it is deliberate (for example when conducting a survey about how aid is used). Protecting personal data…

Shakespeare's Honor


As part of the Shakespeare Hackday I attended a few weeks ago, we discussed some interesting analysis which can be done on the text. Certain forms of analysis are hampered due to the archaic and inconsistent spelling. I wondered if that could be mined for anything interesting. For example, in modern UK English we use the word "honour". In modern US English, it loses the "u" to become "honor". So, how was it spelt in Shakespeare's day? I downloaded the XML representation of all the plays…
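The counting itself is simple to sketch. The following is an illustration rather than the script used for the post: it works on plain text instead of the XML, the function name is hypothetical, and the sample sentence is made up for the demonstration.

```python
import re
from collections import Counter


def count_spellings(text):
    """Count whole-word uses of "honor" and "honour", ignoring case."""
    # \b stops the pattern matching inside "dishonour" or "honorable".
    words = re.findall(r"\bhonou?r\b", text, flags=re.IGNORECASE)
    return Counter(word.lower() for word in words)


# A made-up sample, not a quotation from the plays.
sample = "Honor and honour; HONOUR, dishonour, honorable."
print(count_spellings(sample))
```

Run over the full text of the plays, the two counts would show which spelling dominates.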

Exporting TwitPic Images - Python


Logo of the Python programming language.

As part of my quest to ensure I have a reasonable backup of all my social media data, I've been investigating how easy it is to export photos from TwitPic. I've been using TwitPic since 2008 and have uploaded 1,200 images there. There's no official export function for TwitPic. The services which used to exist relied on their RSS feeds - which have since been killed off. This little Python script uses some undocumented APIs to grab all your images, save them in a directory, and make sure they …

Opt Out of Klout - Now!


Klout logo.

Sites like Klout and Kred are perfect examples of social media frippery. A vaguely plausible "score" that you can use to justify your "investment" in tweeting all day long. When they're used as a silly little badge, or an informal competition with friends, they're a (mostly) harmless way of gamification. Of continual annoyance is the complete lack of transparency these services show. How is your score calculated? Is anyone manipulating it? What can you do to improve it? Still, it doesn't…
