Facebook Mangles Unicode URLs


Facebook rewrites URLs with Unicode in the path - this is not best practice and could be dangerous.

It is possible to create a URL like http://bit.ly/😀 - the Unicode characters are valid in the path.

The URL-encoded representation is:

bit.ly/%F0%9F%98%80
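
As a quick illustration of where that string comes from, here is a minimal Python sketch using the standard library's `urllib.parse`. Percent-encoding takes the UTF-8 bytes of the character and writes each one as `%XX`. The variable names are mine, purely for illustration.

```python
from urllib.parse import quote, unquote

emoji = "\U0001F600"  # 😀 GRINNING FACE

# quote() percent-encodes the UTF-8 bytes of any non-ASCII character
encoded = quote(emoji)
print(encoded)                    # %F0%9F%98%80

# unquote() reverses the process and gives back the original character
print(unquote(encoded) == emoji)  # True
```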

Facebook mangles these URLs in such a way that it might be possible to redirect a user to a malicious site.

Here's what's happening. When Facebook sees the "😀" character (U+1F600) in text, it rewrites it to the "󾰀" character (U+FEC00). That's a "private use character" - a code point with no standard meaning or glyph. This means Facebook can replace the user's computer's default smiley with a Facebook-supplied image or font glyph - if it wants.
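
To check the "private use" claim, Python's `unicodedata` module can report the general category of each character. This is only a sketch: the replacement code point U+FEC00 is my assumption, worked out by decoding the percent-encoded bytes `%F3%BE%B0%80` that appear later in this post.

```python
import unicodedata

GRINNING_FACE = "\U0001F600"  # 😀
REPLACEMENT = "\U000FEC00"    # assumed: decoded from %F3%BE%B0%80

for ch in (GRINNING_FACE, REPLACEMENT):
    # Category "So" is "Symbol, other"; "Co" is "Other, private use".
    # Private use characters have no name, so supply a default.
    name = unicodedata.name(ch, "<no name>")
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {name}")

# U+1F600 So GRINNING FACE
# U+FEC00 Co <no name>
```

Because the second character has no agreed meaning, anything outside Facebook that receives it has no idea it was ever a smiley.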

In normal text - such as "I passed my exams 😀" - changing the smiley doesn't present a problem, but Facebook also replaces the text in a URL!

So, the URL:

bit.ly/%F0%9F%98%80%F0%9F%98%80

Will point to a Facebook security page.

Facebook changes the URL to:

bit.ly/%F3%BE%B0%80%F3%BE%B0%80

Which points elsewhere - bit.ly/󾰀󾰀.
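
The effect can be reproduced with a few lines of Python. This is a rough sketch, not Facebook's actual code: `naive_emoji_rewrite` is a hypothetical stand-in for whatever substitution Facebook performs, and the replacement code point is assumed from the encoded URL above.

```python
from urllib.parse import quote, unquote

GRINNING_FACE = "\U0001F600"  # 😀, the character in the original link
PRIVATE_USE = "\U000FEC00"    # assumed replacement, from %F3%BE%B0%80

def naive_emoji_rewrite(text: str) -> str:
    """Hypothetical rewrite that swaps the emoji everywhere,
    without checking whether it sits inside a URL."""
    return text.replace(GRINNING_FACE, PRIVATE_USE)

original = "bit.ly/" + unquote("%F0%9F%98%80%F0%9F%98%80")
rewritten = naive_emoji_rewrite(original)

print(quote(original))        # bit.ly/%F0%9F%98%80%F0%9F%98%80
print(quote(rewritten))       # bit.ly/%F3%BE%B0%80%F3%BE%B0%80
print(original == rewritten)  # False
```

The two paths are different strings, so a URL shortener treats them as two unrelated links.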

I performed a couple of quick experiments. It is sometimes possible to post a link which displays a preview of a "good" site, but when clicked on leads to a bad site.


The chances of this being used as a successful attack vector are slim. Tricking the user into clicking on a link which subsequently steals their password is made marginally easier if the link and link preview don't match - but I'm sure there are easier ways of deceiving the user.

The real issue here is that Facebook is altering the text that you write - and that can have unexpected consequences.

We live in a non-ASCII world now. A URL like https://莎士比亚.org/奥瑟罗 is perfectly valid. Facebook - and other sites - should not be confused by non-Latin characters.
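
For software that needs an ASCII-only form of such a URL, the usual approach is to convert the host name to Punycode and percent-encode the path, rather than swapping characters for something else. Here is a minimal sketch using only Python's standard library. The helper name `to_ascii_url` is mine, and the built-in `idna` codec implements the older IDNA 2003 rules, so treat this as an illustration rather than a complete implementation.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit

def to_ascii_url(url: str) -> str:
    """Illustrative helper: return an ASCII-only form of a Unicode URL."""
    parts = urlsplit(url)
    # Host names use Punycode (labels starting with "xn--"), not percent-encoding
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # The path is percent-encoded as UTF-8 bytes; "/" is left untouched
    path = quote(parts.path)
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))

ascii_url = to_ascii_url("https://莎士比亚.org/奥瑟罗")
print(ascii_url)

# Decoding the path gives back exactly the characters that were written
print(unquote(urlsplit(ascii_url).path))  # /奥瑟罗
```

The conversion is reversible, which is the point: the characters the user typed survive the round trip.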

