Domain hacks with unusual Unicode characters


Unicode contains a range of symbols which don't get much use. For example, there are separate symbols for TradeMark - ™, Service Mark - ℠, and Prescriptions - ℞.

Nestling among the "Letterlike Symbols" are two curious entries: ℡ (Telephone Sign) and № (Numero Sign). Both of these are single characters.

What's interesting is that both .tel and .no are Top-Level Domains (TLDs) in the Domain Name System (DNS).

So my contact site - https://edent.tel/ - can be written as - https://edent.℡/

And the Norwegian domain name registry NORID can be accessed at https://www.norid.№/

Copy and paste those links - they work in any browser!

Is this limited to TLDs?

No! This works ANYWHERE in a domain name. Copy and paste these examples:

  • Script https://ℰ𝒳𝒜ℳ𝓟ℒℰ.𝒞𝓞ℳ/
  • Math Bold https://𝐞𝐱𝐚𝐦𝐩𝐥𝐞.𝐜𝐨𝐦/
  • Fraktur https://𝖊𝖝𝖆𝖒𝖕𝖑𝖊.𝖈𝖔𝖒/
  • Math bold italic https://𝒆𝒙𝒂𝒎𝒑𝒍𝒆.𝒄𝒐𝒎/
  • Math bold script https://𝓮𝔁𝓪𝓶𝓹𝓵𝓮.𝓬𝓸𝓶/
  • Double struck https://𝕖𝕩𝕒𝕞𝕡𝕝𝕖.𝕔𝕠𝕞/
  • Monospace https://𝚎𝚡𝚊𝚖𝚙𝚕𝚎.𝚌𝚘𝚖/
  • Super script https://ᵉˣᵃᵐᵖˡᵉ.ᶜᵒᵐ/
  • Sub script https://ₑₓₐₘₚₗₑ.cₒₘ/ NB not all characters supported
  • Math sans bold https://𝗲𝘅𝗮𝗺𝗽𝗹𝗲.𝗰𝗼𝗺/
  • Math sans bold italic https://𝙚𝙭𝙖𝙢𝙥𝙡𝙚.𝙘𝙤𝙢/
  • Math sans italic https://𝘦𝘹𝘢𝘮𝘱𝘭𝘦.𝘤𝘰𝘮/
  • Math Squared https://🄴🅇🄰🄼🄿🄻🄴.🄲🄾🄼/ NB the dot must not be squared
  • Circled https://ⓔⓧⓐⓜⓟⓛⓔ.ⓒⓞⓜ/ NB the dot must not be circled
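If you'd rather not paste strange links into a browser, here is a rough way to see what a few of those names collapse to. It is only a sketch: `fold_host` is a made-up helper, and it assumes that Unicode compatibility normalisation (NFKC) followed by case folding is a fair stand-in for whatever mapping a browser really applies.

```python
import unicodedata


def fold_host(host):
    """Approximate a browser's hostname mapping (assumption: NFKC + case folding)."""
    return unicodedata.normalize("NFKC", host).casefold()


# Three of the examples from the list above.
examples = [
    "𝐞𝐱𝐚𝐦𝐩𝐥𝐞.𝐜𝐨𝐦",   # Math Bold
    "𝕖𝕩𝕒𝕞𝕡𝕝𝕖.𝕔𝕠𝕞",   # Double struck
    "ⓔⓧⓐⓜⓟⓛⓔ.ⓒⓞⓜ",   # Circled
]

for host in examples:
    print(host, "->", fold_host(host))
```

Each of the three should come out as plain example.com. Real browsers use their own IDNA mapping tables, so treat this as an approximation rather than the exact algorithm.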

There are a whole bunch more miscellaneous characters you can use:

How does this work?

Magic! Which is to say, I think it is the browser doing the conversion. DNS Servers don't successfully reply to queries about .℡ domains.

The browser sees the .℡ and then follows the IDNA2008 process listed in RFC5895 to normalise it:

map characters to the "Simple_Lowercase_Mapping" property (the fourteenth column) in <http://www.unicode.org/Public/UNIDATA/UnicodeData.txt>, if any.

The ℡ entry is:

2121;TELEPHONE SIGN;So;0;ON;<compat> 0054 0045 004C;;;;N;T E L SYMBOL;;;;

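U+0054 is T, U+0045 is E, U+004C is L. Note that this sequence sits in the decomposition field (the sixth column) and the lowercase column is empty, so it looks like compatibility normalisation, rather than the lowercase mapping, is what turns ℡ into TEL. Lowercasing then gives tel.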
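The same UnicodeData.txt fields can be poked at from Python's built-in `unicodedata` module. This is a small sketch for illustration, not a description of what any particular browser does internally.

```python
import unicodedata

char = "℡"  # U+2121 TELEPHONE SIGN

# The decomposition field from UnicodeData.txt.
print(unicodedata.decomposition(char))  # <compat> 0054 0045 004C

# ℡ has no lowercase mapping, so lower() leaves it untouched.
print(char.lower() == char)  # True

# Compatibility normalisation (NFKC) expands it; lowercasing finishes the job.
nfkc = unicodedata.normalize("NFKC", char)
print(nfkc, nfkc.lower())  # TEL tel
```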

You can test this in Python using:

python3 -c 'import sys; print(sys.argv[1].encode("idna").decode("ascii"))' "℡"

Does this work?

Yes! I asked people on Twitter whether they could access my website using a .℡ - and it appeared to work on every modern browser and operating system.

It even works on command line tools like wget and curl.

It does fail in some circumstances, though.

What are the limitations?

Two main ones:

  • Sites like Twitter and Facebook don't recognise it as a valid URL and refuse to auto-link it.
  • Some command line tools like dig and host don't understand it:

dig edent.℡

; <<>> DiG 9.10.6 <<>> edent.℡
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 55282

Is this useful?

Obviously yes. This may be the most important discovery of the decade. You get cool-looking URLs and get to save a couple of characters on specific domains, at the minor expense of working inconsistently.

It could also be used for evading URL filters.
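To make the filter point concrete, here is a minimal sketch of a blocklist check that compares hostnames as typed, next to one that folds them first. Everything in it is hypothetical (the `BLOCKED_HOSTS` set and both function names), and the folding step assumes NFKC plus case folding is close enough to what browsers do.

```python
import unicodedata
from urllib.parse import urlsplit

BLOCKED_HOSTS = {"example.com"}  # hypothetical blocklist


def naive_is_blocked(url):
    """Compare the hostname exactly as it appears in the URL."""
    return urlsplit(url).hostname in BLOCKED_HOSTS


def normalised_is_blocked(url):
    """Fold the hostname (assumption: NFKC + case folding) before comparing."""
    host = urlsplit(url).hostname or ""
    folded = unicodedata.normalize("NFKC", host).casefold()
    return folded in BLOCKED_HOSTS


fancy = "https://𝐞𝐱𝐚𝐦𝐩𝐥𝐞.𝐜𝐨𝐦/"
print(naive_is_blocked(fancy))       # False: the fancy name slips past
print(normalised_is_blocked(fancy))  # True: folding catches it
```

A filter that skips the folding step sees a string it has never heard of, while the browser happily resolves it to the blocked site.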

Every modern browser supports these "fancy" domain names - but most websites won't automatically link to them. So sharing on Facebook doesn't work.

Where can it be used?

Here are the single characters which can be normalised down to a valid TLD. They're mostly country codes, but there are a few interesting exceptions:

  • - US Military
  • ℡ - .tel registry
  • № - Norway
  • - Australia
  • - Dominica
  • - Panama
  • - Namibia
  • - Morocco
  • - French Polynesia
  • - Norfolk Island
  • - Kyrgyzstan
  • - Mali
  • - Federated States of Micronesia
  • - Finland
  • - Myanmar
  • - Cameroon
  • & - Comoros
  • - Palestine
  • - Montserrat
  • & - Republic of Maldives.
  • - Palau
  • & - Malawi
  • - Cocos (Keeling) Islands
  • - Democratic Republic of Congo
  • - Guyana
  • - Philippines
  • - Saint Pierre and Miquelon
  • - Puerto Rico
  • - Suriname
  • - El Salvador
  • - San Marino
  • - Turkmenistan
  • & - São Tomé and Príncipe
  • - Great Britain (Obsolete)
  • ß - South Sudan (Not available)
  • - India and Indiana (subdomain of .us)
  • & - Virgin Islands and Virginia (subdomain of .us)
  • - Florida (subdomain of .us)
  • - New Mexico (subdomain of .us)
  • - Nevada (subdomain of .us)
  • - As part of .ovh
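If you fancy hunting for more, a brute-force search is one way to go about it. The sketch below walks every Unicode code point and keeps the single characters that fold down to a TLD you're interested in. The function names and the three-entry shortlist are invented for the example, and NFKC plus case folding is again assumed to approximate the browser's mapping, so anything it finds still needs checking in a real browser.

```python
import sys
import unicodedata


def fold(text):
    """Approximate hostname mapping (assumption: NFKC + case folding)."""
    return unicodedata.normalize("NFKC", text).casefold()


def find_single_char_hacks(tlds):
    """Return {tld: [chars]} for single characters that fold to a listed TLD."""
    wanted = set(tlds)
    hits = {tld: [] for tld in wanted}
    for code_point in range(sys.maxunicode + 1):
        if 0xD800 <= code_point <= 0xDFFF:
            continue  # skip surrogates, they are not characters
        char = chr(code_point)
        folded = fold(char)
        if folded != char and folded in wanted:
            hits[folded].append(char)
    return hits


if __name__ == "__main__":
    # Hypothetical shortlist; a fuller run would load the complete TLD list.
    for tld, chars in sorted(find_single_char_hacks(["tel", "no", "ss"]).items()):
        print(tld, " ".join(chars))
```

The scan covers a little over a million code points, so expect it to take a second or so.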

If you can find any more, please stick a comment in the box below.

You can always reach this blog post at:

https://🅂𝖍𝐤ₛᵖ𝒓.ⓜ𝕠𝒃𝓲/🆆🆃🅵/




4 thoughts on “Domain hacks with unusual Unicode characters”

    1. nik says:

      If people do end up in the vi domain, remember you can leave by pressing the esc key a couple of times (just in case) then type :q!

  1. Hendrik says:

    Unfortunately ℹ️ seems to work as “i”. Would be neat if it could be used as a full .info
