Pursuit Podcast - Life, The Unicode, And Everything

A beautiful hand drawing showing the flow of the conversation

The inimitable Jess Rose interviewed me for her Pursuit Podcast - talking about the Unicode Power Symbol proposal. We talked about how to subvert bureaucracy, building a team of supporters, adding new stuff to Unicode, and recognising that you're a background character in most people's lives. Bit of a ramble, but jolly good fun. Sketchnotes […] Read More

únicode is hard

In the last couple of months, I've been seeing the ú symbol on British receipts. Why? 1963 - ASCII In the beginning* was ASCII. A standard way for computers to exchange text. ASCII was originally designed with 7 bits - that means 128 possible symbols. That ought to be enough for everyone, right? Wrong! ASCII […] Read More

How Do You Sort Chinese Numbers?

Imagine you have a series of numbers you wish to sort. Sorting is a well-known computer science problem - generally speaking you compare one value to the next and then move the item either up or down a list. With "English" characters, that's fairly easy. When a computer sees the character 1 it's really […] Read More
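
A rough sketch of the problem, not taken from the post itself: a naive sort compares characters by their Unicode code points, which for Chinese numerals does not match numeric order. The lookup table `NUMERAL_VALUES` is a hypothetical helper made up for this illustration, and it only covers three numerals.

```python
# Hypothetical lookup table (an assumption for this sketch, not from the post).
NUMERAL_VALUES = {"一": 1, "二": 2, "三": 3}

numerals = ["三", "一", "二"]

# Default sort: compares Unicode code points, not numeric meaning.
by_code_point = sorted(numerals)

# Sort by numeric meaning, using the lookup table as the sort key.
by_value = sorted(numerals, key=NUMERAL_VALUES.__getitem__)

print(by_code_point)  # ['一', '三', '二']
print(by_value)       # ['一', '二', '三']
```

The code point order puts 三 (U+4E09) before 二 (U+4E8C), so any sort that relies on raw character values gets it wrong.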

How to type Emoji in Ubuntu

New tech site Gadgette has a great article on how to type Emoji on Mac and Windows - but they (understandably) didn't cover Ubuntu. So here I am to show you how. Get The Fonts If your computer doesn't have the requisite font, install the latest version of Symbola. Simply open up the .zip file, […] Read More

Twitter's Weird Control Character Handling

A little curio for you all. A StackOverflow user has pointed out that certain Twitter profiles contain very odd Unicode characters. What on Earth is going on? Let's take a look at Bill Clinton's profile on Twitter. Ok, that looks pretty normal. But let's take a look at the HTML source. Huh... What are those […] Read More

Searching For A Smile

What happens if you search the web for the Unicode character "☺"? On the one hand, it's a symbol just like the letter A or the punctuation mark "!" - on the other, it contains semantic meaning. A smiling, happy face. I decided to look at a few popular search engines to see what they'd […] Read More

Facebook Mangles Unicode URLs

Facebook rewrite URLs with Unicode in the path - this is not best practice and could be dangerous. It is possible to create a URL like http://bit.ly/😀 - the Unicode characters are valid in the path. The URL Encoded representation is: bit.ly/%F0%9F%98%80 Facebook mangles these URLs in such a way that it might be […] Read More
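
For the curious, the URL Encoded form quoted above can be reproduced with Python's standard library. This is only a minimal sketch of percent-encoding the emoji as UTF-8 bytes; it says nothing about what Facebook does to the URL.

```python
from urllib.parse import quote, unquote

emoji = "\U0001F600"  # 😀 GRINNING FACE

# quote() percent-encodes the UTF-8 bytes of any non-ASCII character.
encoded = quote(emoji)
print("bit.ly/" + encoded)  # bit.ly/%F0%9F%98%80

# Decoding the percent-escapes gives back the original character.
decoded = unquote(encoded)
print(decoded == emoji)  # True
```

The four escapes are simply the four bytes of the emoji's UTF-8 encoding, so both forms of the URL point at the same path.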

Evading Profanity Filters Using Bi-Directional Text

There are some very sensitive souls on the Internet who object to seeing swear words. To that end, a huge industry has sprung up around "Profanity Filters" - services which claim to be able to detect naughty words and automatically redact them. The approach of dumbly looking for strings of text leads to a range […] Read More