🔥.me.ss! You can't register emoji domains in South Sudan


Dear Terence, We have contacted the registry and they said they don't allow 2 successive dashes.

It's useful to share negative results. Not every experiment has an amazing or successful outcome. tl;dr: you can't register Punycode .ss domains. This also means Internet users in South Sudan can't register domains using their own writing system. Background: The Republic of South Sudan became independent and joined the United Nations back in 2011. A decade later, it's now possible to register .ss domains. Partly due to the history of the letters SS, and partly because of the way domains…

Continue reading →
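A rough sketch of why a ban on two successive dashes would rule out emoji labels such as the 🔥 in the post above. It assumes Python's built-in `punycode` codec and skips the IDNA mapping and validation steps a real registrar would apply. The function names are hypothetical, and the dash check is only a guess at the registry's rule based on the quoted email.

```python
ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded (ASCII-compatible) label


def to_a_label(label: str) -> str:
    """Return the ASCII form of a single domain label (no dots).

    Simplification: real IDNA processing also normalises and validates
    the label; this sketch only applies the raw Punycode step.
    """
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def has_successive_dashes(label: str) -> bool:
    """Hypothetical registry check: reject any label containing '--'."""
    return "--" in label


if __name__ == "__main__":
    for label in ("example", "🔥"):
        encoded = to_a_label(label)
        print(label, encoded, has_successive_dashes(encoded))
```

Every non-ASCII label picks up the `xn--` prefix, so under this assumed rule every internationalised label would be rejected, whatever script it is written in.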

⩵ != ==


Lots of mechanical fingers typing on a complicated keyboard.

One of the frustrating things about computers is their limited input options. A "standard" PC keyboard only has about 100 keys. Sure, some have some bonus buttons for controlling the machine, but it is becoming clear that there simply aren't enough buttons to efficiently program computers. Most programming languages have the concept of relational operators. How does variable X compare to variable Y? If we want to ask if X is less than or equal to Y, we write X <= Y. Which is a bit weird,…

Continue reading →

How not to sort a list of countries


A list of flags. Estonia, Spain, Finland, France, UK, Greece, Croatia, Hungary, Ireland.

Being from the United Kingdom is hard sometimes. When scrolling through a list of countries, we might be found down the bottom as "UK" or near the top as "Great Britain". Occasionally someone files us under "England" - thus ignoring Wales, Scotland, NI etc. Once in a while, it'll be "The UK". Truly, no one has suffered as we have suffered⸮ Here's a list of countries from the Curve Credit card (join and we both get a fiver!) - I scrolled all the way to the bottom looking for the UK, only to f…

Continue reading →

Should ₹ be part of the Latin font subset?


Stock photo of colourful Indian Rupee notes.

Some background reading. Skip if you're familiar with fonts. A font file contains a list of characters (usually letters, numbers, and punctuation) and glyphs (the drawn representation of that character). It is, of course, a lot more complicated than that. Each character has a codepoint which is represented in hexadecimal. For example, U+0057 is the Latin letter Capital W, U+20AC is the Euro Symbol €, and U+1F600 is the Emoji Grinning Face 😀. These codepoints are assigned by the Unicode Cons…

Continue reading →
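To make the codepoint notation in the ₹ post above concrete, here is a small sketch that prints a character's codepoint in `U+XXXX` form and checks it against a list of ranges. The `ILLUSTRATIVE_LATIN_RANGES` list is an assumption for demonstration only (the Basic Latin and Latin-1 Supplement blocks); it is not any font vendor's actual "Latin" subset definition.

```python
def codepoint(ch: str) -> str:
    """Format a single character's codepoint in the usual U+XXXX notation."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"U+{ord(ch):04X}"


# Assumption: an illustrative subset made of two Unicode blocks,
# Basic Latin (U+0000-U+007F) and Latin-1 Supplement (U+0080-U+00FF).
ILLUSTRATIVE_LATIN_RANGES = [(0x0000, 0x007F), (0x0080, 0x00FF)]


def in_subset(ch: str, ranges=ILLUSTRATIVE_LATIN_RANGES) -> bool:
    """Return True if the character's codepoint falls inside any range."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in ranges)


if __name__ == "__main__":
    for ch in "W€😀₹":
        print(ch, codepoint(ch), in_subset(ch))
```

With these example ranges, neither € nor ₹ is covered, since both live in the Currency Symbols block. A subset that wants them has to list their codepoints explicitly.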

iOS 14 gets support for the Unicode Power Symbol!


Power symbols displayed on the screen.

It has been four years since Unicode officially accepted our Power Symbols proposal into the standard. Now I’m delighted to announce that users on iOS 14 are finally able to use the full set of Power Symbols. ⏻ ⏼ ⭘ ⏽ ⏾ They’re available to use in the browser, in emails, and messages. Here’s how they look, in both dark and light mode: Terence Eden is on Mastodon (@edent). Anyone with iOS 14 able to see these 5 symbols? If not - who do I still know at Apple that is willing to listen to me grouch at …

Continue reading →

Buying a single character domain - and 3 character FQDN - for £15


Glowing computer text showing dot com dot info etc.

Short domains are useful for security testing. If you only have a limited number of characters, you need to be able to reference code on a remote server in as few characters as possible. A few years ago, I tried to find a Minimum Viable XSS. The conclusion that I (and others) came to is that 20 characters is the bare minimum. But it requires you have a 2 character domain name on a 2-character TLD. Something like xy.uk I don't think any 1- or 2-character domain names are available. If they're…

Continue reading →

Hashtag Steganography


Steganography (/ˌstɛɡəˈnɒɡrəfi/) is the practice of concealing a file, message, image, or video within another file, message, image, or video. I recently saw someone tweeting the hashtag #ManchesُterDerby. Do you see an odd character in the middle? It's an Arabic Damma (U+064F) - a vowel character. Although it comes after the "s" in Manchester, it appears after the "t" because it is a Right-To-Left (RTL) character. Yet, if you click on the hashtag with the extra character, you get through to …

Continue reading →
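One way to spot the kind of hidden character described in the hashtag post above is to scan for combining marks and invisible format characters. This is a minimal sketch using Python's standard `unicodedata` module. The function names are hypothetical, and the choice of categories to flag (marks and format characters) is an assumption, not a complete detector.

```python
import unicodedata


def find_hidden_marks(text: str):
    """List (index, codepoint, name) for combining marks and format characters.

    Assumption: general categories starting with 'M' (marks) or equal to
    'Cf' (format) are treated as suspicious inside a hashtag.
    """
    found = []
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        if category.startswith("M") or category == "Cf":
            name = unicodedata.name(ch, "UNNAMED")
            found.append((index, f"U+{ord(ch):04X}", name))
    return found


def strip_hidden_marks(text: str) -> str:
    """Return the text with the flagged characters removed."""
    flagged = {index for index, _, _ in find_hidden_marks(text)}
    return "".join(ch for i, ch in enumerate(text) if i not in flagged)


if __name__ == "__main__":
    tag = "#Manches\u064FterDerby"  # the hashtag with the Arabic Damma inserted
    print(find_hidden_marks(tag))
    print(strip_hidden_marks(tag))
```

Stripping marks this bluntly would damage legitimate text in scripts that rely on them, so this only makes sense as a diagnostic for tags expected to be plain Latin.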

Quirks and Limitations of Emoji Flags


A screenshot of a list of country flags

This blog post contains emoji which your system may not be able to display. You may see broken text, weird symbols, or other buggy rendering. The Transgender Flag is a draft candidate for Emoji 13.0 under the name Blue, Pink, and White Flag. A number of platforms include an image for this emoji, but do not show it on the emoji keyboard. As of June 2019 this is now supported on Twitter platforms that use Twemoji. Emojipedia The (proposed) Transgender Flag looks like this (image) or…

Continue reading →

Invisible Pink Unicorns - a Firefox emoji rendering bug


The upper image is partially transparent. The lower image is completely opaque.

Here's a curious bug I just discovered in Firefox 67 for Linux. Can you see this unicorn: →🦄 ← What happens if you use CSS to change the opacity of an emoji? Here's a unicorn, with a pink font colour: 🦄 Unicorn Let's wrap that in this scrap of CSS to make it 50% opaque. color: rgba(255, 105, 180, 0.5); 🦄 Unicorn Hopefully, you see a semi-transparent philosophical argument. What if we set the opacity to 0.0 - that is, completely transparent? 🦄 Unicorn There's a shunicorn there. If you …

Continue reading →

Banish the � with Unifont


Lots of Emoji rendered in small, monochrome pixels.

The GNU Unifont project is amazing. It contains every Unicode glyph in one single file! I am going to argue that you should bundle it with your apps, your operating systems, and - at a pinch - your websites. The Unifont is a perfect fallback font. If your app or website uses a Unicode character which isn't supported on a device, the user will usually see � - a replacement character. If you include Unifont, they'll see the correct character. There are two downsides: The TTF font is 12MB. T…

Continue reading →

What does "挨⎒" have to do with "<html"?


Garbled text in an email.

I received this weird bit of mojibake in an email. Here's the raw text view: ------=_NextPart_001_009E_01D4D8BF.D0737E10 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable =E6=8C=A8=E2=8E=92tml xmlns:v=3D"urn:schemas-microsoft-com:vml" xmlns:o=3D"= urn:schemas-microsoft-com:office:office" xmlns:w=3D"urn:schemas-microsoft-c= om:office:word" xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml= "…

Continue reading →
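The quoted-printable fragment in the mojibake post above can be unpacked with Python's standard `quopri` module. This sketch only decodes the first few bytes of the raw view and shows which characters they become when read as UTF-8, as the email's header declares. It does not attempt to explain how the bytes got there. The helper name is hypothetical.

```python
import quopri


def decode_qp_utf8(raw: bytes):
    """Decode quoted-printable bytes, then interpret the result as UTF-8.

    Returns the raw decoded bytes alongside the resulting text so the
    two views can be compared.
    """
    decoded_bytes = quopri.decodestring(raw)
    return decoded_bytes, decoded_bytes.decode("utf-8")


if __name__ == "__main__":
    fragment = b"=E6=8C=A8=E2=8E=92tml"  # start of the body in the raw view
    as_bytes, as_text = decode_qp_utf8(fragment)
    print(as_bytes.hex(" "))
    print(as_text)
    print([f"U+{ord(ch):04X}" for ch in as_text])
```

The six escaped bytes form two three-byte UTF-8 sequences, which is why two characters appear in front of the surviving `tml`.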

Amazon Prime Video's weird Unicode problems


Description with an error in it.

It's 2019 and high-tech devices are still plagued by text encoding bugs. I recently bought the new 4K Amazon Fire Stick. It's a little Android dongle which plays videos. It's neat - but quite often displays weird text errors. Take the kids' TV show House of Anubis. The Fire displays the description like this: Looking at the source code for the description: That's the character "private use two" (U+0092). What on earth is that doing there? Well, in the ancient Windows-1252 encoding, 0x92 …

Continue reading →
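A stray U+0092 like the one in the Amazon post above commonly appears when a Windows-1252 byte is decoded as Latin-1. The sketch below shows that mismatch and one possible repair. It assumes that this is the cause, which the excerpt does not confirm. The function name and the sample string are hypothetical.

```python
def repair_c1_controls(text: str) -> str:
    """Re-read C1 control characters (U+0080-U+009F) as Windows-1252 bytes.

    Assumption: the controls are the result of cp1252 bytes having been
    decoded as Latin-1. Bytes that cp1252 leaves undefined are kept as-is.
    """
    repaired = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                pass  # e.g. 0x81 has no cp1252 mapping in Python's codec
        repaired.append(ch)
    return "".join(repaired)


if __name__ == "__main__":
    # The same byte read two ways.
    print(repr(b"\x92".decode("latin-1")))  # the C1 control U+0092
    print(repr(b"\x92".decode("cp1252")))   # U+2019, a right single quotation mark

    sample = "It\u0092s"  # hypothetical damaged text, not the real description
    print(repair_c1_controls(sample))
```

Repairing character by character, and not re-encoding the whole string, leaves any correctly decoded non-Latin-1 text untouched.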