Don't redact FOI answers with a marker pen
(Disclaimer - I currently work for GDS, although I don't work on FOI. This is an opinion piece and doesn't represent the views of any of my employers - past, present, or future.)
The Irish government recently complied with a Freedom of Information Act request from journalists at RTÉ.
The journalists wanted copies of messages sent via a WhatsApp group. The Irish government complied and sent out several pages of documents. Let's take a look at three of the core mistakes that they made.
Marker Pens
The redaction process appears to be drawing over the offending words with a marker pen.
When the ink is insufficiently dark, as in this example, we can clearly make out the word which is being deleted.
Even when the ink is quite thick, we can digitally enhance the picture to clearly see some letters.
We only need a few letters to discover even more information.
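As a rough illustration of what "digitally enhance" can mean, here is a minimal Python sketch of a linear contrast stretch. It assumes the scan still holds a small difference in darkness between the marker ink and the printed text underneath it. The function name and the pixel values are invented for the example and are not taken from the actual document.

```python
def stretch_contrast(pixels, low, high):
    """Rescale grey values so that `low` becomes 0 and `high` becomes 255.

    `pixels` is a list of rows of grey values from 0 (black) to 255 (white).
    Values outside the range low..high are clipped first.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    scale = 255 / (high - low)
    stretched = []
    for row in pixels:
        new_row = []
        for value in row:
            clipped = min(max(value, low), high)
            new_row.append(round((clipped - low) * scale))
        stretched.append(new_row)
    return stretched


# Invented sample: marker ink scans as 40, printed text beneath it as 30.
scan_row = [[40, 30, 40, 40, 30]]
enhanced = stretch_contrast(scan_row, low=30, high=40)
```

A difference of ten grey levels is hard to see by eye. After stretching, the same pixels sit at opposite ends of the scale.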
Fixed Width Fonts
This blog is written in a proportional font. The letter M is physically wider than the letter I.
The FOI response uses a monospace font - where every letter is the same width.
This allows us to accurately count how long a word is.
This is particularly useful in working out who is speaking - for example, ALICE will take up more horizontal space than BOB.
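To make the counting concrete, here is a minimal Python sketch. The function names and pixel measurements are my own assumptions for illustration. In practice you would measure the character width from an unredacted line in the same scan, and allow some tolerance because a marker stroke rarely starts and stops exactly on a letter boundary.

```python
def estimate_char_count(bar_width_px, char_width_px):
    """Estimate how many monospace characters fit under a redaction bar."""
    if char_width_px <= 0:
        raise ValueError("char_width_px must be positive")
    return round(bar_width_px / char_width_px)


def filter_names(candidates, bar_width_px, char_width_px, tolerance=0):
    """Keep only the candidate names whose length fits the bar width."""
    expected = estimate_char_count(bar_width_px, char_width_px)
    return [
        name for name in candidates
        if abs(len(name) - expected) <= tolerance
    ]


# Hypothetical measurements: a 60 pixel bar in a font 12 pixels per character.
speakers = ["ALICE", "BOB", "CAROL"]
possible = filter_names(speakers, bar_width_px=60, char_width_px=12)
```

With these made-up numbers the bar covers five characters, so BOB is ruled out while ALICE and CAROL both remain possible.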
Even using proportional fonts is not a guarantee that information will stay hidden - but there's no doubt monospace makes information recovery easier.
Coverage
As you can see at the top of the page, not all the letters have been covered.
We can make a good guess that any vertical line is likely to be one of the letters IJKLbdhlt.
Each of those letters has a unique layout in a monospaced font.
Using our knowledge of the font, the partial exposure, and how the English language is composed, we can make a reasonable guess at what some of the words are.
Our knowledge of English-language idioms may allow us to guess the next set of letters.
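The guessing process can be sketched in Python as a simple pattern match. This is an illustration under my own assumptions, not a description of any real tool: the pattern notation and the tiny word list are invented. In the pattern, `|` marks a position where a tall vertical stroke pokes out from under the ink, `?` marks a fully covered character, and any other character is a letter that can be read outright.

```python
# Letters assumed to show a tall vertical stroke, as listed above.
VERTICAL_STROKE_LETTERS = set("IJKLbdhlt")


def matches(word, pattern):
    """Return True if `word` is consistent with the partially visible pattern."""
    if len(word) != len(pattern):
        return False
    for letter, mark in zip(word, pattern):
        if mark == "?":
            continue  # fully covered, so any letter is possible
        if mark == "|":
            if letter not in VERTICAL_STROKE_LETTERS:
                return False
        elif letter != mark:
            return False
    return True


def candidates(pattern, word_list):
    """Return the words from `word_list` that fit the pattern, in order."""
    return [word for word in word_list if matches(word, pattern)]


# Hypothetical word list; a real attempt would use a full dictionary.
words = ["ball", "back", "tell", "mint", "balls"]
guesses = candidates("??||", words)
```

Here only "ball" and "tell" survive. The monospace font gives us the length, and the exposed strokes narrow down the letters.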
Thoughts
This post isn't intended to convince people to be more closed - but rather to encourage more openness.
This sort of physical redaction offers little protection against the determined reader. It's the equivalent of putting your door on the latch rather than locking it.
Scanned documents are difficult for researchers to use. Equally, delivering an FOI response in a non-accessible format is likely to be a breach of the rights of disabled citizens.
The correct response, in my personal opinion, is to take a digital copy of the records to be released and replace every instance of sensitive information with [REDACTED] or [23 WORDS REDACTED]. That way there can be no possibility of accidentally revealing information. Of course, if you leave "track changes" on - the data may still be there.
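A minimal Python sketch of that kind of replacement is below. It assumes the records are available as plain text and that someone has already decided which phrases are sensitive. The function name and the sample sentence are invented for illustration.

```python
import re


def redact(text, sensitive_phrases):
    """Replace each sensitive phrase with a fixed marker.

    Single words become [REDACTED]; longer phrases become
    [N WORDS REDACTED]. Matching is case-insensitive.
    """
    if not sensitive_phrases:
        return text
    # Try longer phrases first so a phrase is not split by a shorter one.
    ordered = sorted(sensitive_phrases, key=len, reverse=True)
    pattern = re.compile(
        "|".join(re.escape(phrase) for phrase in ordered),
        re.IGNORECASE,
    )

    def marker(match):
        word_count = len(match.group(0).split())
        if word_count == 1:
            return "[REDACTED]"
        return f"[{word_count} WORDS REDACTED]"

    return pattern.sub(marker, text)


released = redact(
    "Alice met Bob at the Grand Hotel.",
    ["Alice", "the Grand Hotel"],
)
```

The original characters are gone from the output rather than hidden under something, so there is no ink to see through and no bar to measure. This only handles the text itself; metadata and revision history need stripping separately.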
Using ineffective techniques raises the risk of unwanted information leakage.