Should ₹ be part of the Latin font subset?

by @edent | 5 comments | Read ~106 times.

Some background reading. Skip if you’re familiar with fonts. A font file contains a list of characters (usually letters, numbers, and punctuation) and glyphs (the drawn representation of that character). It is, of course, a lot more complicated than that. Each character has a codepoint which is represented in hexadecimal. For example, U+0057 is the…
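As a rough illustration of the character-to-codepoint relationship described above, here is a minimal Python sketch. The helper name `codepoint_label` is made up for this example, and the sample characters are picked arbitrarily (₹ comes from the post title); it only shows how a character maps to its hexadecimal `U+XXXX` label and back.

```python
def codepoint_label(char: str) -> str:
    """Return the conventional U+XXXX label for a single character.

    `codepoint_label` is a hypothetical helper name used only in this sketch.
    """
    # ord() gives the codepoint as an integer; format it as at least
    # four uppercase hexadecimal digits, as Unicode charts do.
    return f"U+{ord(char):04X}"


# Sample characters chosen for illustration only.
for ch in ["₹", "A", "£"]:
    print(ch, codepoint_label(ch), ord(ch))

# chr() goes the other way: from the integer codepoint back to the character.
rupee = chr(0x20B9)
print(rupee)
```

Whether a given font actually contains a glyph for that codepoint is a separate question, which this sketch does not try to answer.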

Continue reading →

iOS 14 gets support for the Unicode Power Symbol!

by @edent | 1 comment | Read ~163 times.
Power symbols displayed on the screen.

It has been four years since Unicode officially accepted our Power Symbols proposal into the standard. Now I’m delighted to announce that users on iOS 14 are finally able to use the full set of Power Symbols. ⏻ ⏼ ⭘ ⏽ ⏾ They’re available to use in the browser, in emails, and messages. Here’s how…

Continue reading →

Buying a single character domain – and 3 character FQDN – for £15

by @edent | 15 comments | Read ~19,505 times.
Glowing computer text showing dot com dot info etc.

Short domains are useful for security testing. If you only have a limited number of characters, you need to be able to reference code on a remote server in as few characters as possible. A few years ago, I tried to find a Minimum Viable XSS. The conclusion that I (and others) came to is…

Continue reading →

Hashtag Steganography

by @edent | Read ~395 times.

Steganography (/ˌstɛɡəˈnɒɡrəfi/) is the practice of concealing a file, message, image, or video within another file, message, image, or video. I recently saw someone tweeting the hashtag #ManchesُterDerby. Do you see an odd character in the middle? It’s an Arabic Damma (U+064F) – a vowel character. Although it comes after the “s” in Manchester, it…
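One way to spot a character like this without relying on your eyes is to list everything in the hashtag that falls outside plain ASCII. This is only a sketch: the function name `unexpected_characters` is invented here, and treating "non-ASCII" as "suspicious" is an assumption that suits an English hashtag but not text in general.

```python
import unicodedata


def unexpected_characters(text: str):
    """List (index, U+XXXX label, Unicode name) for each non-ASCII character.

    Hypothetical helper for this sketch; "non-ASCII" is a crude filter.
    """
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 0x7F
    ]


# The hashtag from the excerpt, with the Damma written as an escape
# so it stays visible in the source code.
hashtag = "#Manches\u064fterDerby"

for index, label, name in unexpected_characters(hashtag):
    print(index, label, name)

# The Damma is a combining mark (category "Mn"), which is why it
# draws on top of the preceding letter instead of taking its own space.
print(unicodedata.category("\u064f"))
```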

Continue reading →

Quirks and Limitations of Emoji Flags

by @edent | 1 comment | Read ~2,318 times.
A screenshot of a list of country flags

This blog post contains emoji which your system may not be able to display. You may see broken text, weird symbols, or other buggy rendering. The Transgender Flag is a draft candidate for Emoji 13.0 under the name Blue, Pink, and White Flag. A number of platforms include an image for this emoji, but do…

Continue reading →

Invisible Pink Unicorns – a Firefox emoji rendering bug

by @edent | 7 comments | Read ~173 times.
The upper image is partially transparent. The lower image is completely opaque.

Here’s a curious bug I just discovered in Firefox 67 for Linux. Can you see this unicorn: →🦄 ← What happens if you use CSS to change the opacity of an emoji? Here’s a unicorn, with a pink font colour: 🦄 Unicorn Let’s wrap that in this scrap of CSS to make it 50% opaque.…

Continue reading →

Banish the � with Unifont

by @edent | 7 comments | Read ~5,519 times.
Lots of Emoji.

The GNU Unifont project is amazing. It contains every Unicode glyph in one single file! I am going to argue that you should bundle it with your apps, your operating systems, and – at a pinch – your websites. The Unifont is a perfect fallback font. If your app or website uses a Unicode character…

Continue reading →

What does “挨⎒” have to do with “<html”?

by @edent | 1 comment | Read ~115 times.
Garbled text in an email.

I received this weird bit of mojibake in an email. Here’s the raw text view:

------=_NextPart_001_009E_01D4D8BF.D0737E10
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

=E6=8C=A8=E2=8E=92tml xmlns:v=3D"urn:schemas-microsoft-com:vml" xmlns:o=3D"=
urn:schemas-microsoft-com:office:office" xmlns:w=3D"urn:schemas-microsoft-c=
om:office:word" xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml=
" xmlns=3D"http://www.w3.org/TR/REC-html40">

What’s going on?

挨 is a Chinese, Japanese, Korean (CJK) unified ideograph (U+6328)
⎒ is the passive-pull-up-output symbol (U+2392)

That’s somehow replaced: < – less-than…
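To check which characters those quoted-printable bytes really stand for, the start of the body can be decoded by hand. A minimal sketch using Python’s standard `quopri` module; it only confirms what the bytes decode to as UTF-8 and does not attempt to explain how they got there.

```python
import quopri

# The first few bytes of the quoted-printable body from the email above.
encoded = b"=E6=8C=A8=E2=8E=92tml"

# Undo the quoted-printable layer: each =XX becomes one raw byte.
raw = quopri.decodestring(encoded)
print(raw.hex())

# The email declares charset UTF-8, so interpret the bytes that way.
text = raw.decode("utf-8")
print(text)

# Show the codepoint of the two unexpected characters.
for ch in text[:2]:
    print(ch, f"U+{ord(ch):04X}")
```

Six bytes decode to two three-byte UTF-8 sequences, followed by the plain ASCII “tml”.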

Continue reading →

Amazon Prime Video’s weird Unicode problems

by @edent | 1 comment | Read ~214 times.
Description with an error in it.

It’s 2019 and high-tech devices are still plagued by text encoding bugs. I recently bought the new 4K Amazon Fire Stick. It’s a little Android dongle which plays videos. It’s neat – but quite often displays weird text errors. For the kids’ TV show House of Anubis, the Fire displays the description like this: Looking…

Continue reading →

Domain hacks with unusual Unicode characters

by @edent | 3 comments | Read ~15,391 times.

Unicode contains a range of symbols which don’t get much use. For example, there are separate symbols for TradeMark – ™, Service Mark – ℠, and Prescriptions – ℞. Nestling among the “Letterlike Symbols” are two curious entries. Both of these are single characters:

Telephone symbol – ℡
Numero Sign – №

What’s interesting is…
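One property of these symbols that is easy to check is what Unicode compatibility normalization (NFKC) turns them into. The sketch below assumes that property is relevant to domain names, since internationalised domain processing normalizes labels in a similar way; the helper name `compat_form` is made up for the example.

```python
import unicodedata


def compat_form(char: str) -> str:
    """Return the NFKC (compatibility) normalized form of a character.

    Hypothetical helper for this sketch.
    """
    return unicodedata.normalize("NFKC", char)


# The symbols mentioned in the excerpt, written as escapes for clarity.
symbols = {
    "TradeMark": "\u2122",
    "Service Mark": "\u2120",
    "Prescription": "\u211e",
    "Telephone symbol": "\u2121",
    "Numero Sign": "\u2116",
}

for label, ch in symbols.items():
    normalized = compat_form(ch)
    # A single character may expand into several ordinary letters,
    # or stay unchanged if it has no compatibility mapping.
    print(f"{label}: {ch} (U+{ord(ch):04X}) -> {normalized!r}")
```

A single-character symbol that expands into ordinary letters is exactly the kind of thing worth testing before registering a name that contains it.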

Continue reading →