Don't redact FOI answers with a marker pen


(Disclaimer - I currently work for GDS, although I don't work on FOI. This is an opinion piece and doesn't represent the views of any of my employers - past, present, or future.)

The Irish government recently complied with a Freedom of Information Act request from journalists at RTÉ.

The journalists wanted copies of messages sent via a WhatsApp group. The Irish government complied and sent out several pages of documents. Let's take a look at three of the core mistakes that they made.

FOI response poorly redacted

Marker Pens

The redaction process appears to consist of drawing over the offending words with a marker pen.

When the ink is insufficiently dark, as in this example, we can clearly make out the word which is being deleted.

The word "David" is clearly visible through the ink

Even when the ink is quite thick, we can digitally enhance the picture to clearly see some letters.

An enhanced image showing the name "Veronica" has been redacted
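As a rough illustration of what "digitally enhance" can mean, here is a minimal sketch of linear contrast stretching in Python. The `stretch_contrast` helper and the pixel values are hypothetical, and this is not necessarily how the image above was produced. The idea is that marker ink over blank paper and marker ink over printed text differ by only a few grey levels, and rescaling that narrow range makes the difference obvious.

```python
def stretch_contrast(pixels, low=None, high=None):
    """Linearly rescale greyscale values so that [low, high] maps to [0, 255]."""
    flat = [v for row in pixels for v in row]
    low = min(flat) if low is None else low
    high = max(flat) if high is None else high
    if high == low:
        # A completely flat patch has nothing to enhance.
        return [row[:] for row in pixels]
    scale = 255 / (high - low)
    return [
        [max(0, min(255, round((v - low) * scale))) for v in row]
        for row in pixels
    ]


# Hypothetical patch of a scan: marker ink alone reads about 20,
# marker ink on top of printed text reads about 12.
patch = [
    [20, 20, 12, 20],
    [20, 12, 12, 20],
    [20, 20, 12, 20],
]

enhanced = stretch_contrast(patch)
for row in enhanced:
    print(row)
```

A difference of eight grey levels is hard to see by eye. After stretching, the same pixels sit at opposite ends of the range.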

We only need a few letters to discover even more information.

Fixed Width Fonts

This blog is written in a proportional font.
The letter M is physically wider than the letter I.

The FOI response uses a monospace font - where every letter is the same width.

This allows us to accurately count how long a word is.

This is particularly useful in working out who is speaking - for example ALICE will take up more horizontal space than BOB.
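The counting can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the pixel measurements, the list of names, and the helper functions are all hypothetical. The width of one character cell is measured from text that was left visible, then the width of the marker bar is divided by it.

```python
def char_width(visible_text, visible_width_px):
    """Width of one character cell, measured from unredacted text on the page."""
    return visible_width_px / len(visible_text)


def redacted_length(bar_width_px, cell_width_px):
    """Estimate how many monospace characters fit under a redaction bar."""
    return round(bar_width_px / cell_width_px)


def names_of_length(names, length):
    """Keep only the candidate names with the estimated number of letters."""
    return [name for name in names if len(name) == length]


# Hypothetical measurements: the visible word "WhatsApp" spans 96 pixels,
# and the marker bar over a speaker's name spans 61 pixels.
cell = char_width("WhatsApp", 96)
hidden = redacted_length(61, cell)
speakers = names_of_length(["ALICE", "BOB", "CAROL", "DAVE"], hidden)

print(hidden)    # 5
print(speakers)  # ['ALICE', 'CAROL']
```

A marker stroke usually overshoots the text a little, so the estimate is approximate. Even so, it rules out BOB and DAVE before anyone has looked at a single letter.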

Even using proportional fonts is not a guarantee that information will stay hidden - but there's no doubt monospace makes information recovery easier.

Coverage

As you can see at the top of the page, not all the letters have been covered.

Text at the top of the page hasn't been fully obscured

We can make a good guess that any vertical line is likely to be one of the letters IJKLbdhlt.

Each of those letters has a unique layout in a monospaced font.

Using our knowledge of the font, the partial exposure, and how the English language is composed, we can make a reasonable guess at what some of the words are.

The redacted text seems to say "see the lay of"
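Here is a minimal sketch of that kind of guessing in Python. The word list and the stroke positions are made up for illustration, and `candidate_words` is a hypothetical helper rather than the method used to read the image above. It keeps only the words of the right length that have one of the letters IJKLbdhlt wherever a vertical stroke is visible.

```python
# The letters this post associates with a visible vertical stroke.
TALL_LETTERS = set("IJKLbdhlt")


def candidate_words(words, length, stroke_positions):
    """Words of the given length with a tall letter at every visible stroke."""
    return [
        word
        for word in words
        if len(word) == length
        and all(word[i] in TALL_LETTERS for i in stroke_positions)
    ]


# Hypothetical vocabulary of short English words.
vocabulary = ["the", "see", "lay", "of", "and", "top", "let"]

# A three-letter word with a stroke poking out above the first letter only.
print(candidate_words(vocabulary, 3, {0}))     # ['the', 'lay', 'top', 'let']

# The same word, but with strokes above the first two letters.
print(candidate_words(vocabulary, 3, {0, 1}))  # ['the']
```

Each extra visible stroke shrinks the candidate list, which is why a small amount of exposed ink matters so much.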

Our knowledge of English-language idioms may allow us to guess the next set of letters.

Thoughts

This post isn't intended to convince people to be more closed - but rather to encourage more openness.

This sort of physical redaction offers little protection against the determined reader. It's the digital equivalent of putting your door on the latch rather than locking it.

We've known for over a decade that it is possible to recover information from improperly redacted documents.

Scanned documents are difficult for researchers to use. Equally, delivering an FOI response in a non-accessible format is likely to be a breach of the rights of disabled citizens.

The correct response, in my personal opinion, is to take a digital copy of the records to be released and replace every instance of sensitive information with [REDACTED] or [23 WORDS REDACTED]. That way there can be no possibility of accidentally revealing information. Of course, if you leave "track changes" on - the data may still be there.
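A minimal sketch of that replacement in Python, assuming the records are available as plain text and that someone has already listed the phrases to withhold. The `redact` function and the sample sentence are hypothetical. Plain text has no revision history, which sidesteps the "track changes" problem.

```python
import re


def redact(text, sensitive_phrases):
    """Replace each sensitive phrase with a placeholder that keeps only a word count."""
    if not sensitive_phrases:
        return text

    # Try longer phrases first so a short phrase cannot eat part of a long one.
    ordered = sorted(sensitive_phrases, key=len, reverse=True)
    pattern = re.compile(
        "|".join(r"\b" + re.escape(phrase) + r"\b" for phrase in ordered),
        re.IGNORECASE,
    )

    def placeholder(match):
        count = len(match.group(0).split())
        return "[REDACTED]" if count == 1 else f"[{count} WORDS REDACTED]"

    # A single pass means the placeholders themselves are never re-scanned.
    return pattern.sub(placeholder, text)


message = "David asked Veronica to see the lay of the land."
released = redact(message, ["David", "Veronica", "lay of the land"])
print(released)
# [REDACTED] asked [REDACTED] to see the [4 WORDS REDACTED].
```

The placeholder is the same width whatever it replaces, so it leaks nothing about letter counts, and there is no ink to see through.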

Using ineffective techniques raises the risk of unwanted information leakage.

