Unicode contains a range of symbols which don’t get much use. For example, there are separate symbols for TradeMark – ™, Service Mark – ℠, and Prescriptions – ℞.
Nestling among the “Letterlike Symbols” are two curious entries. Both of these are single characters: ℡ and №.
What’s interesting is that both .tel and .no are Top-Level Domains (TLDs) on the Domain Name System (DNS).
So my contact site – https://edent.tel/ – can be written as – https://edent.℡/
And the Norwegian domain name registry NORID can be accessed at https://www.norid.№/
Copy and paste those links – they work in any browser!
Is this limited to TLDs?
No! This works ANYWHERE in a domain name. Copy and paste these examples:
- Script https://ℰ𝒳𝒜ℳ𝓟ℒℰ.𝒞𝓞ℳ/
- Math Bold https://𝐞𝐱𝐚𝐦𝐩𝐥𝐞.𝐜𝐨𝐦/
- Fraktur https://𝖊𝖝𝖆𝖒𝖕𝖑𝖊.𝖈𝖔𝖒/
- Math bold italic https://𝒆𝒙𝒂𝒎𝒑𝒍𝒆.𝒄𝒐𝒎/
- Math bold script https://𝓮𝔁𝓪𝓶𝓹𝓵𝓮.𝓬𝓸𝓶/
- Double struck https://𝕖𝕩𝕒𝕞𝕡𝕝𝕖.𝕔𝕠𝕞/
- Monospace https://𝚎𝚡𝚊𝚖𝚙𝚕𝚎.𝚌𝚘𝚖/
- Super script https://ᵉˣᵃᵐᵖˡᵉ.ᶜᵒᵐ/
- Sub script https://ₑₓₐₘₚₗₑ.cₒₘ/ NB not all characters supported
- Math sans bold https://𝗲𝘅𝗮𝗺𝗽𝗹𝗲.𝗰𝗼𝗺/
- Math sans bold italic https://𝙚𝙭𝙖𝙢𝙥𝙡𝙚.𝙘𝙤𝙢/
- Math sans italic https://𝘦𝘹𝘢𝘮𝘱𝘭𝘦.𝘤𝘰𝘮/
- Math Squared https://🄴🅇🄰🄼🄿🄻🄴.🄲🄾🄼/ NB the dot must not be squared
- Circled https://ⓔⓧⓐⓜⓟⓛⓔ.ⓒⓞⓜ/ NB the dot must not be circled
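One way to see why these all end up at the same place is to apply Unicode compatibility normalisation (NFKC) and then lowercase the result. This is a minimal Python sketch of that idea, not the exact algorithm any particular browser runs; `fold_hostname` is a name made up for illustration.

```python
import unicodedata


def fold_hostname(host: str) -> str:
    """Approximate the mapping: NFKC-normalise, then lowercase.

    Illustrative helper only; real browsers apply fuller IDNA rules.
    """
    return unicodedata.normalize("NFKC", host).lower()


if __name__ == "__main__":
    # A few of the styled spellings from the list above
    for fancy in ("𝐞𝐱𝐚𝐦𝐩𝐥𝐞.𝐜𝐨𝐦", "𝕖𝕩𝕒𝕞𝕡𝕝𝕖.𝕔𝕠𝕞", "ⓔⓧⓐⓜⓟⓛⓔ.ⓒⓞⓜ"):
        print(fancy, "->", fold_hostname(fancy))
```

Each styled letter carries a compatibility decomposition to its plain ASCII counterpart, which is why the folded strings compare equal to the ordinary spelling.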
There are a whole bunch more miscellaneous characters you can use:
Wait, so one can use any of
㍳ ㏃ ㏇(!) ㏈ ﬀﬃﬄﬁﬂ ㎇㎓㎬㏉ ㏋㍱㎐ ㎄㎅㎑㏍㏎㎸㎾ ㎃㎆㎒㎫㎹㎷㎿㎽ ㎁㎋№㎵㎻ ㍵ ㎀㎩㎊㏗㏙㏚㎴㎺ ₨ ℠ßﬆ㏜ ℡㎔™ ㏝
ÅℬℂℭℰℱℐℑKℒℳℕℙℚℛℜℝℤℨ and more to leet-code URLs? @urlstandard
— Christoph Päper 🇪🇺 (@Cr1ss0v) October 8, 2018
How does this work?
Magic! Which is to say, I think it is the browser doing the conversion. DNS Servers don’t successfully reply to queries about .℡ domains.
The browser sees the .℡ and then follows the IDNA2008 process listed in RFC5895 to normalise it:
map characters to the “Simple_Lowercase_Mapping” property (the fourteenth column) in <http://www.unicode.org/Public/UNIDATA/UnicodeData.txt>, if any.
The ℡ entry is:
2121;TELEPHONE SIGN;So;0;ON;<compat> 0054 0045 004C;;;;N;T E L SYMBOL;;;;
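U+0054 is T, U+0045 is E, U+004C is L. Those code points sit in the sixth field of the entry, the compatibility decomposition, so ℡ decomposes to TEL, which is then lowercased to tel.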
You can test this in Python using:
python3 -c 'import sys; print(sys.argv[1].encode("idna").decode("ascii"))' "℡"
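For a closer look at where the mapping comes from, Python's standard `unicodedata` module exposes the same decomposition field quoted above. This is a small sketch assuming Python 3; `describe` is a hypothetical helper name, and the built-in `idna` codec is only a stand-in for whatever a browser actually does.

```python
import unicodedata


def describe(char: str) -> dict:
    """Collect the Unicode facts relevant to hostname folding for one character."""
    return {
        "name": unicodedata.name(char),
        # Raw decomposition field from the Unicode database
        "decomposition": unicodedata.decomposition(char),
        # Compatibility normalisation followed by lowercasing
        "folded": unicodedata.normalize("NFKC", char).lower(),
        # Result of Python's built-in IDNA codec, as ASCII text
        "idna": char.encode("idna").decode("ascii"),
    }


if __name__ == "__main__":
    for symbol in ("℡", "№"):
        print(symbol, describe(symbol))
```

The `decomposition` value is the `<compat>` field from UnicodeData.txt, and `folded` shows the lowercase ASCII string it collapses to.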
Does this work?
Yes! I asked people on Twitter whether they could access my website using a .℡ – and it appeared to work on every modern browser and operating system.
Hey gang! I have a little experiment for you 🙂
Does this URL resolve in your browser?
(That's https:// edent. ℡ /)
If it does or doesn't, could you let me know which browser and operating system?
— Terence Eden (@edent) October 8, 2018
It even works on command line tools like curl and wget:
Things used to retrieve web pages rather than web browsers
curl 7.59, Linux - Yes
wget 1.19, Linux - Yes
— Mike (@6byNine) October 8, 2018
It does fail in some circumstances:
Yes, Chrome/Safari/Firefox running on Mac. The TEL however changed from superscript to normal text. If I copied/pasted into Word and then into the browser, the superscript is preserved and it no longer resolves (takes you to the google page with this page being the first hit)— Ricardo Sueiras (@094459) October 8, 2018
What are the limitations?
Two main ones:
- Sites like Twitter and Facebook don’t recognise it as a valid URL and refuse to auto-link it.
- Some command line tools like dig and host don’t understand it:
dig edent.℡
; <<>> DiG 9.10.6 <<>> edent.℡
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 55282
Is this useful?
Obviously yes. This may be the most important discovery of the decade. You get cool-looking URLs and get to save a couple of characters on specific domains, at the minor expense of working inconsistently.
It could also be used for evading URL filters.
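To make the filter-evasion point concrete, here is a hedged Python sketch of a blocklist check that folds the hostname before comparing it. The function names and the blocklist entry are invented for illustration, and NFKC plus lowercasing is only an approximation of what browsers do.

```python
import unicodedata
from urllib.parse import urlsplit


def normalised_host(url: str) -> str:
    """Return the URL's hostname after NFKC folding and lowercasing."""
    host = urlsplit(url).hostname or ""
    return unicodedata.normalize("NFKC", host).lower()


def is_blocked(url: str, blocklist: set) -> bool:
    """Compare the folded hostname, not the raw text, against the blocklist."""
    return normalised_host(url) in blocklist


if __name__ == "__main__":
    blocklist = {"edent.tel"}  # example entry only
    url = "https://edent.℡/"
    print("naive substring match:", "edent.tel" in url)
    print("normalised match:", is_blocked(url, blocklist))
```

A filter that only matches the raw string misses the fancy spelling, whereas one that folds the hostname first treats both spellings alike.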
Every modern browser supports these “fancy” domain names – but most websites won’t automatically link to them. So sharing on Facebook doesn’t work.
Where can it be used?
Here are the single characters which can be normalised down to a valid TLD. They’re mostly country codes, but there are a few interesting exceptions:
- ㏕ – US Military
- ℡ – .tel registry
- ㎊ – French Polynesia
- ㎋ – Norfolk Island
- ㎙ – Federated States of Micronesia
- ㎹ – Republic of Maldives
- ㏄ – Cocos (Keeling) Islands
- ㏅ – Democratic Republic of Congo
- ㏘ – Saint Pierre and Miquelon
- ㏚ – Puerto Rico
- ㏜ – El Salvador
- ℠ – San Marino
- ﬅ – São Tomé and Príncipe
- ㎇ – Great Britain (Obsolete)
- ß – South Sudan (Not available)
- ㏌ – India and Indiana (subdomain of .us)
- ⅵ – Virgin Islands and Virginia (subdomain of .us)
- ﬂ – Florida (subdomain of .us)
- ㎚ – New Mexico (subdomain of .us)
- ㎵ – Nevada (subdomain of .us)
- ㍵ – As part of .ovh
If you can find any more, please stick a comment in the box below.
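If you want to hunt for more, one rough approach is to scan every code point and keep those whose NFKC-folded form equals a TLD you care about. The sketch below assumes you supply the TLD list yourself; `single_char_candidates` is a made-up name, results depend on the Unicode version bundled with your Python, and it does not check whether a browser or registry will actually accept the character.

```python
import sys
import unicodedata


def single_char_candidates(tlds):
    """Map each TLD to the single non-ASCII code points that NFKC-fold to it."""
    wanted = {t.lower() for t in tlds}
    found = {t: [] for t in wanted}
    for cp in range(0x80, sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not characters
        char = chr(cp)
        folded = unicodedata.normalize("NFKC", char).lower()
        if folded in wanted:
            found[folded].append(char)
    return found


if __name__ == "__main__":
    # The TLD list here is just a sample; pass in whichever ones you like
    for tld, chars in sorted(single_char_candidates(["tel", "no", "mil"]).items()):
        print(f".{tld}: {' '.join(chars)}")
```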
You can always reach this blog post at: