Jason Magee

Guernsey-based independent software consultant, founder of data.gg, and aspiring game dev at Granite Games Limited.

Interested in local and remote work.

Getting data out of image PDFs without losing your mind

Recently I’ve been gathering data for data.gg from gov.gg census reports going as far back as 1971. The earlier reports are scans, and the later ones were created electronically but don’t copy and paste correctly. By the time I had finished manually copying the data from the first table, I knew I’d have to find a better way. After a bit of searching I found Tesseract, an OCR (Optical Character Recognition) tool. OCR tools take an image and attempt to convert any text they find in it into usable text. Google has even had a hand in Tesseract, having sponsored its development.
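As a rough illustration of that idea, here is a minimal Python sketch that hands a single page image to the tesseract command-line tool and reads back the text it produces. It assumes tesseract is installed and on your PATH and that each PDF page has already been exported as an image; the function names and file names are made up for the example, not taken from the post.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list: tesseract <image> <output base> -l <language>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base, lang="eng"):
    """Run tesseract on one image and return the recognised text.

    Hypothetical helper: assumes the tesseract binary is available locally.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    # tesseract writes its result to "<output_base>.txt"
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Calling something like ocr_image("census-page.png", "census-page") would then return the page's text as a string. Expect to tidy up the result by hand, especially for tables of numbers from older scans.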

30 Jan 2015

Gamely Digest and Jekyll

We started Gamely Digest in 2013 to make following video games more digestible. The older I get, the less time I have to play games, and deciding what is worth playing is time-consuming in itself.

21 Jan 2015