Evading Profanity Filters Using Bi-Directional Text


There are some very sensitive souls on the Internet who object to seeing swear words. To that end, a huge industry has sprung up around "Profanity Filters" - services which claim to be able to detect naughty words and automatically redact them.

The approach of dumbly looking for strings of text leads to a range of problems, including false positives (known colloquially as the Scunthorpe Problem).
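To make that failure mode concrete, here is a minimal sketch of a "dumb" substring filter. The `BLOCKLIST` contents and the `naive_filter` name are invented for this illustration; they are not taken from any real filtering product.

```python
import re

# Hypothetical blocklist, deliberately mild for the sake of the example.
BLOCKLIST = ["hell"]

def naive_filter(text: str) -> str:
    """Replace every blocklisted substring with asterisks, ignoring case."""
    for word in BLOCKLIST:
        text = re.sub(re.escape(word), "*" * len(word), text, flags=re.IGNORECASE)
    return text

print(naive_filter("Go to hell"))       # the intended hit
print(naive_filter("Hello, Michelle"))  # two innocent words get mangled
```

Because the filter has no idea where words begin and end, a greeting and a perfectly ordinary name are both redacted.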

A common way to bypass these filters is to use homoglyphs - substituting a lower-case L for an upper-case i, for example. The filtering industry is quite wise to these attempts and does a reasonable job of Bowdlerising the offending text.

What it doesn't seem to have caught up with is the use of directional masks to hide textual content.

Consider the following:

Terence is an ‮toidi

With a normal swear filter, we might expect that to be printed as:

Terence is an *****

But it won't be. Why? Take a look at the source code of this page. What it actually says is:

Terence is an &#8238;toidi

The &#8238; entity is the Unicode right-to-left override character (U+202E). It tells the page to reverse the direction of the text which follows it. That's really useful when one is writing in a language which goes right to left - like Arabic - but it can be used for all sorts of subversive purposes.

I'm not naive enough to imagine that Chinese dissidents can communicate freely by using a basic reverse cipher - but it certainly can be used in lots of chat rooms.

For example, the text &#8238;kcuf successfully evaded the filters on a number of forums I frequent - including those running phpBB and XenForo.
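Here is a rough sketch of why a string-matching filter misses this. The `naive_filter` function is a hypothetical stand-in, not the code of any particular forum package; the point is only that the stored characters never spell the blocked word, even though the browser displays it.

```python
RLO = "\u202e"  # U+202E RIGHT-TO-LEFT OVERRIDE, written &#8238; in HTML

def naive_filter(text: str, blocked: str = "idiot") -> str:
    """Hypothetical filter: star out one blocked word wherever it appears."""
    return text.replace(blocked, "*" * len(blocked))

plain = "Terence is an idiot"
sneaky = "Terence is an " + RLO + "toidi"

print(naive_filter(plain))             # Terence is an *****
print(naive_filter(sneaky) == sneaky)  # True: nothing matched, text is unchanged
```

The filter sees the letters t-o-i-d-i in that order, finds nothing on its list, and passes the comment through untouched.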

Mitigating

The easiest way to stop this exploit is to filter out characters which reverse the direction of the text. Of course, if you run a site which has people writing in Arabic, Persian, Urdu, Hebrew, etc. this may not be practical.
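As a minimal sketch of that first approach, the snippet below strips the explicit override characters before the text is stored or displayed. The `strip_overrides` name is made up for this example, and limiting the pattern to U+202D and U+202E is an assumption; a real site would need to decide which directional controls its users legitimately need.

```python
import re

# Explicit directional overrides: U+202D (left-to-right) and U+202E (right-to-left).
# Assumption: only the overrides are stripped; other bidi controls are left alone.
BIDI_OVERRIDES = re.compile("[\u202d\u202e]")

def strip_overrides(text: str) -> str:
    """Remove directional override characters so text displays in stored order."""
    return BIDI_OVERRIDES.sub("", text)

comment = "Terence is an \u202etoidi"
print(strip_overrides(comment))  # Terence is an toidi
```

With the override gone, the reader sees the same harmless backwards string that the filter saw.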

The best solution is to ensure that your site is one which does not tolerate bad behaviour and that your community instils a healthy respect for people within its members.



One thought on “Evading Profanity Filters Using Bi-Directional Text”

  1. e says:

    I wonder what a blind user's screenreader would make of this directionality abuse, though.
    Reply
