Banish the � with Unifont

by @edent | 6 comments | Read ~4,854 times.
Lots of Emoji.

The GNU Unifont project is amazing. It contains every Unicode glyph in one single file! I am going to argue that you should bundle it with your apps, your operating systems, and - at a pinch - your websites. The Unifont is a perfect fallback font. If your app or website uses a Unicode character… Continue reading →

What does "挨⎒" have to do with "<html"?

by @edent | 1 comment
Garbled text in an email.

I received this weird bit of mojibake in an email. Here's the raw text view:

------=_NextPart_001_009E_01D4D8BF.D0737E10
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

=E6=8C=A8=E2=8E=92tml xmlns:v=3D"urn:schemas-microsoft-com:vml" xmlns:o=3D"=
urn:schemas-microsoft-com:office:office" xmlns:w=3D"urn:schemas-microsoft-c=
om:office:word" xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml=
" xmlns=3D"http://www.w3.org/TR/REC-html40">

What's going on?

挨 is a Chinese, Japanese, Korean (CJK) unified ideograph (U+6328)
⎒ is the passive-pull-up-output symbol (U+2392)

That's somehow replaced: < - less-than… Continue reading →
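To check the two code points quoted in the excerpt, the first few bytes of the body can be decoded by hand. This is only a minimal sketch using Python's standard `quopri` and `unicodedata` modules. The variable names are illustrative, and it confirms what the bytes are, not why they replaced the start of the tag.

```python
import quopri
import unicodedata

# The first bytes of the message body, copied from the raw text view.
raw = b"=E6=8C=A8=E2=8E=92tml"

# Undo the quoted-printable transfer encoding, then decode as UTF-8,
# which is the charset declared in the Content-Type header.
decoded_bytes = quopri.decodestring(raw)
text = decoded_bytes.decode("utf-8")

# Report the code point and official name of the two unexpected characters.
for ch in text[:2]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The two characters come out as U+6328 and U+2392, matching the description in the excerpt, followed by the untouched `tml`.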

Amazon Prime Video's weird Unicode problems

by @edent | 1 comment | Read ~120 times.
Description with an error in it.

It's 2019 and high-tech devices are still plagued by text encoding bugs. I recently bought the new 4K Amazon Fire Stick. It's a little Android dongle which plays videos. It's neat - but quite often displays weird text errors. Take the kids' TV show House of Anubis. The Fire displays the description like this: Looking… Continue reading →

Domain hacks with unusual Unicode characters

by @edent | 2 comments | Read ~9,829 times.

Unicode contains a range of symbols which don't get much use. For example, there are separate symbols for TradeMark - ™, Service Mark - ℠, and Prescriptions - ℞. Nestling among the "Letterlike Symbols" are two curious entries. Both of these are single characters: Telephone symbol - ℡ Numero Sign - № What's interesting is… Continue reading →
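As a quick check that each of these really is a single character in the "Letterlike Symbols" block, here is a small sketch using Python's standard `unicodedata` module. The helper name `describe` is hypothetical, and the block range U+2100 to U+214F is written in by hand because Python does not expose block names.

```python
import unicodedata

# Range of the "Letterlike Symbols" block, hard-coded for this sketch.
LETTERLIKE_START, LETTERLIKE_END = 0x2100, 0x214F

# The five symbols mentioned in the excerpt.
symbols = ["™", "℠", "℞", "℡", "№"]

def describe(ch):
    """Return (code point, official name, is it in Letterlike Symbols?)."""
    cp = ord(ch)
    in_block = LETTERLIKE_START <= cp <= LETTERLIKE_END
    return (f"U+{cp:04X}", unicodedata.name(ch), in_block)

for s in symbols:
    # len(s) is 1 for every entry: each symbol is one code point.
    print(len(s), *describe(s))
```

Each symbol has length 1, so even ℡ and №, which look like several letters, are single code points.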

Forbidden Unicode

by @edent | 2 comments | Read ~647 times.

I have been receiving letters from a dear friend by the name of Ophiuchus. He has been researching some curious anomalies in the Unicode Standard. While I cannot vouch for all he has written, I thought it worth presenting his discoveries to you. My friend, I bring you a curiosity! I have been engaged in… Continue reading →

Why Android Pie Won't Be Getting the Copyleft Symbol

by @edent | Read ~484 times.
Wikipedia Copyleft page. The icon is a blank box.

Google is a company with nearly unlimited resources. It often chooses to use its power for the greater good of the Internet, creating amazing projects like digitizing every printed book, bringing Internet access via high-altitude balloons, and offering high-quality language translation. And sometimes it just gets bored and abandons them. Google Noto is such a… Continue reading →

Virgin Media don't understand Unicode

by @edent | 1 comment | Read ~352 times.
HTML code from Virgin.

More adventures with Unicode. I logged in to my Virgin Media account to see when my promotional discount would end. Here's what their billing PDF said. Let'S Ignore The Weird Capitalisation Virgin'S System Uses. What's that  doing there? Their website says: No  symbol, but also no £ sign. Ah, but let's look at… Continue reading →

Obsolete Technology in Unicode

by @edent | 6 comments | Read ~298 times.
Screenshot of the Unicode standard. The page shows symbols for Telephone Receivers, Pagers, and Fax Machines.

A short meander through some of the more obscure miscellany within Unicode. Languages hang around far longer than there are native speakers, and symbols get reused and repurposed (🍆). Here are some of the delightfully old-fashioned symbols hidden in your thoroughly modern smartphone.

Tapes

Long before solid-state drives, we used to record data on long… Continue reading →

Pursuit Podcast - Life, The Unicode, And Everything

by @edent
A beautiful hand drawing showing the flow of the conversation

The inimitable Jess Rose interviewed me for her Pursuit Podcast - talking about the Unicode Power Symbol proposal. We talked about how to subvert bureaucracy, building a team of supporters, adding new stuff to Unicode, and recognising that you're a background character in most people's lives. Bit of a ramble, but jolly good fun. Sketchnotes… Continue reading →

únicode is hard

by @edent | 15 comments | Read ~29,043 times.

In the last couple of months, I've been seeing the ú symbol on British receipts. Why?

1963 - ASCII

In the beginning* was ASCII. A standard way for computers to exchange text. ASCII was originally designed with 7 bits - that means 128 possible symbols. That ought to be enough for everyone, right? Wrong! ASCII… Continue reading →
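The 7-bit arithmetic can be sketched in a couple of lines of Python. The names `ASCII_BITS` and `fits_in_ascii` are made up for illustration. The sketch only shows that neither £ nor ú fits inside the 128 ASCII code points; it makes no claim about what happens on the receipts.

```python
# 7 bits give 2**7 distinct values, numbered 0 to 127.
ASCII_BITS = 7
ascii_size = 2 ** ASCII_BITS

def fits_in_ascii(ch):
    """True if the character's code point is inside the 7-bit ASCII range."""
    return ord(ch) < ascii_size

for ch in ["A", "£", "ú"]:
    print(ch, f"U+{ord(ch):04X}", fits_in_ascii(ch))
```

Only "A" passes the check. Both £ (U+00A3) and ú (U+00FA) sit above 127, so plain ASCII has no way to represent either of them.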