Context-Aware Text Recognition?


A scanned document, the text is askew. Next to it is a computer-generated version of the text. A passage is highlighted.

I've been playing with Google's Cloud Vision API. It is OCR (Optical Character Recognition) - but in THE CLOUD, and it uses MACHINE LEARNING! When it works, it is indistinguishable from magic. When it fails, it reveals a very limited understanding of human text. Let's take a look at this quick example - a piece of […]

Continue reading →

Crowdsourcing Leveson


I've already blogged about the Leveson Inquiry's disturbing habit of releasing evidence as scanned in PDFs. I had a suggestion from digital journalist Kevin Anderson.

@edent (10:12 - Fri 11 May 2012): Gah! The #leveson witness statements are photocopied & scanned in levesoninquiry.org.uk/evidence/?witn… Disastrous for open justice - shkspr.mobi/blog/index.php…

Mr Anderson (@kevglobal): Replying to […]

Continue reading →

Leveson - Death By A Thousand (Paper) Cuts


I've been listening to the Leveson inquiry. A large part of the exchanges seem to go like this:

Jay: Turning to page 51.
Witness: Which bundle?
Jay: 1606.
Witness: 1660?
Leveson: No, the page after.
Jay: Paragraph 7.
Witness: I don't have a paragraph 7.
Jay: Ah, I have an earlier print out.
Leveson: You'll […]

Continue reading →