Facebook rewrites URLs that contain Unicode in the path – this is not best practice and could be dangerous.
It is possible to create a URL like http://bit.ly/😀 – the Unicode characters are valid in the path.
The URL Encoded representation is :
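URL encoding here just means taking the UTF-8 bytes of the character and percent-escaping each one. A minimal sketch using Python's standard library (the variable names are my own, purely for illustration):

```python
from urllib.parse import quote, unquote

url_path = "\U0001F600"  # 😀 GRINNING FACE

# quote() encodes the string as UTF-8, then percent-escapes each byte.
encoded = quote(url_path)
print("http://bit.ly/" + encoded)  # http://bit.ly/%F0%9F%98%80

# Decoding reverses the process and recovers the original character.
decoded = unquote(encoded)
print(decoded == url_path)  # True
```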
Facebook mangles these URLs in such a way that it might be possible to redirect a user to a malicious site.
Here’s what’s happening. When Facebook sees the “😀” character in text, it rewrites it to the “󾰀” character. That’s a “private use character”. This means Facebook can replace the user’s computer’s default smiley with a Facebook-supplied image or font glyph – if it wants.
In normal text – such as “I passed my exams 😀” – changing the smiley doesn’t present a problem, but Facebook also replaces the text in a URL!
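One way to check whether a character is private use is to look at its Unicode general category, which is `Co` for private use code points. This is a rough sketch only: the code point `U+FEC00` below is an assumption chosen for illustration (it sits in Supplementary Private Use Area-A), not a confirmed record of what Facebook emits.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the code point has general category 'Co' (private use)."""
    return unicodedata.category(ch) == "Co"


smiley = "\U0001F600"   # 😀 GRINNING FACE, a standard emoji
# Assumption for illustration: a code point in Supplementary Private Use Area-A.
private = "\U000FEC00"

print(is_private_use(smiley))   # False
print(is_private_use(private))  # True
```

Private use code points have no agreed meaning, so whatever glyph appears depends entirely on the fonts or images the site supplies.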
So, the URL :
Will point to a Facebook security page.
Facebook changes the URL to :
Which points elsewhere – bit.ly/.
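The effect can be reproduced with a blanket find-and-replace. The sketch below is my own reconstruction of the behaviour, not Facebook's code: `PRIVATE` is a hypothetical stand-in for the private use character, and `URL_RE` is a deliberately crude URL matcher. It compares a naive rewrite with one that leaves URLs alone.

```python
import re
from urllib.parse import quote

SMILEY = "\U0001F600"   # 😀
# Hypothetical stand-in for the private use replacement character.
PRIVATE = "\U000FEC00"

# Crude URL matcher, good enough for a demonstration only.
URL_RE = re.compile(r"https?://\S+")


def naive_rewrite(text: str) -> str:
    """Replace every smiley, including those inside URLs."""
    return text.replace(SMILEY, PRIVATE)


def url_aware_rewrite(text: str) -> str:
    """Replace smileys in prose, but copy anything that looks like a URL verbatim."""
    parts = []
    last = 0
    for match in URL_RE.finditer(text):
        parts.append(naive_rewrite(text[last:match.start()]))
        parts.append(match.group(0))  # URL left untouched
        last = match.end()
    parts.append(naive_rewrite(text[last:]))
    return "".join(parts)


post = f"I passed my exams {SMILEY} http://bit.ly/{SMILEY}"

broken = naive_rewrite(post)
safe = url_aware_rewrite(post)

# The two paths encode to different bytes, so they are different URLs.
print(quote(SMILEY) == quote(PRIVATE))  # False
```

With the naive version the link ends in the private use character; with the URL-aware version the prose smiley is swapped but the link still ends in the original 😀.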
I performed a couple of quick experiments. It is sometimes possible to post a link which displays a preview of a “good” site, but when clicked on leads to a bad site.
The chances of this being used as a successful attack vector are slim. Tricking the user into clicking on a link which subsequently steals their password is made marginally easier if the link and link preview don’t match – but I’m sure there are easier ways of deceiving the user.
The real issue here is that Facebook is altering the text that you write – and that can have unexpected consequences.
We live in a non-ASCII world now. A URL like https://莎士比亚.org/奥瑟罗 is perfectly valid. Facebook – and other sites – should not be confused by non-Latin characters.
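For software that can only cope with ASCII, a URL like that can still be carried safely by converting it rather than rewriting it: the hostname becomes Punycode and the path is percent-encoded. A minimal sketch, assuming Python's built-in `idna` codec (which implements the older IDNA 2003 rules) and ignoring ports and user info for brevity; `to_ascii_url` is a hypothetical helper name:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Convert an internationalised URL to an ASCII-only equivalent."""
    parts = urlsplit(url)
    # Hostname labels are converted to Punycode ("xn--..." form).
    host = parts.hostname.encode("idna").decode("ascii")
    # The path is UTF-8 encoded and percent-escaped; "/" is kept as-is.
    path = quote(parts.path)
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


ascii_url = to_ascii_url("https://莎士比亚.org/奥瑟罗")
print(ascii_url)
```

The conversion is reversible, so the original characters can be shown to the user while the ASCII form goes over the wire.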