Converting the Voynich Manuscript into an eBook
Three years ago I promised to convert the Voynich Manuscript into an eBook. The recent news that it may have been deciphered spurred me to finish my project.
So, here it is: the world-famous mystery that is the Voynich Manuscript, now in convenient eBook format.
- PDF - suitable for Kindle, nook, Kobo, Android, iOS and most other devices (60MB)
- CBZ - suitable for comic book readers, tablets, etc. (60MB)
As the book is pure images, I decided not to convert to .mobi or .epub. Those are great formats, but offer nothing to picture books that can't be provided by PDF.
Here's how I converted the book for all the different formats.
Size
The original files were quite large - over 4MB per image - leading to a total eBook file size of 660MB. That's a little on the chunky side to distribute easily.
Using mogrify, it was easy to convert the images to a more manageable size.
A simple
mogrify -resize 25% -quality 85 *.jpg
took the total file size down to 63MB with no visible reduction in quality.
But, of course, there were a series of problems.
Firstly, some of the pages are double or triple width. They are scanned at a different size from all the other files.
In this case, the quarter size file is too small to read. Here it is after resizing:
Indeed, each image is a subtly different size, so simply resizing by percentage doesn't work well. I checked the size of every image using ImageMagick's "identify" command. While the largest images had a height of around 6,000 pixels, the smallest had a height of 1,352. I figured that a height of 1,280 pixels would be enough to satisfy most eReaders and tablets.
mogrify -resize x1280 -quality 85 *.jpg
Once done, the total file size of all the pages was 59MB. Good enough! It also has the advantage that every page is the same height - if not the same width.
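For anyone repeating the size survey, here is a minimal sketch of how the "identify" step might be scripted, assuming the pages are JPGs in the current directory. The helper names page_sizes and smallest_height are made up for illustration; %h, %w and %f are ImageMagick's format escapes for height, width and filename.

```shell
# Print "height width filename" for each image, one per line.
# %h, %w and %f are ImageMagick format escapes.
page_sizes() {
  identify -format '%h %w %f\n' "$@"
}

# Read page_sizes output on stdin and print the smallest height.
smallest_height() {
  sort -n | head -n 1 | cut -d ' ' -f 1
}

# Usage (not run here): page_sizes *.jpg | smallest_height
```

Knowing the smallest height makes it easier to pick a target for -resize that never scales a page up.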
Cropping
The scans of the images contain lots of extraneous information - most notably the edges of other pages.

Manually cropping each image is possible - although tedious. It results in a slightly higher quality image.
Although it would be possible to automate a simple crop, each page has a unique layout. If there's further interest, I may spend some free evenings cropping the pages individually.
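As a rough idea of what an automated "simple crop" could look like, here is a sketch using mogrify's -shave option, which removes a fixed border from every edge. The function name shave_pages, the cropped directory and the 40x30 border are all hypothetical values chosen for illustration, and a fixed border will not suit pages with unusual layouts.

```shell
# Shave a fixed border off every page, writing copies into ./cropped
# so the original scans are left untouched.
# 40x30 means 40px off the left and right, 30px off the top and bottom.
# These numbers are made up; tune them against real pages.
shave_pages() {
  mkdir -p cropped
  mogrify -path cropped -shave 40x30 "$@"
}

# Usage (not run here): shave_pages *.jpg
```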
Page Order
As I said in my blog post of 2010:
The Yale site has all the scans available as high-res JPGs or MrSIDs - but it's a pain to download hundreds of images from the site.
So - I turned to a torrent. Don't worry! These images are hundreds of years old - they are in the public domain.
(NB, if your ISP is censoring the link from The Pirate Bay, you can use this Magnet Link instead).
I'm still slightly unsure of the order of all the pages. Some have numbers printed on them, but it's not clear whether they're sequential. Indeed, it's probably impossible to know how the pages were originally ordered. So, I've left them in the same order as the Yale scan.
Converting
I didn't transform the images to greyscale because, although current eInk devices are only black and white, future devices will have full colour displays. Although using greyscale would reduce the file size by a minor amount, most modern devices should have enough internal memory to store a colour file of that size and - hopefully - have enough power to render each page without difficulty.
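This post doesn't record which tools built the final files, so the following is only one possible approach, not necessarily the method used here. A CBZ is just a zip archive of page images, and ImageMagick's convert command can bundle JPGs into a PDF. The function names make_cbz and make_pdf and the filenames in the usage line are hypothetical.

```shell
# Bundle page images into a comic-book archive (a CBZ is a plain zip file).
# -j drops directory paths, -q keeps zip quiet.
make_cbz() {
  out=$1
  shift
  zip -j -q "$out" "$@"
}

# Bundle the same images into a single PDF, one page per image.
# Newer ImageMagick releases prefer "magick" over "convert".
make_pdf() {
  out=$1
  shift
  convert "$@" "$out"
}

# Usage (not run here):
#   make_cbz voynich.cbz *.jpg
#   make_pdf voynich.pdf *.jpg
```

Comic readers generally order pages by filename, so zero-padded names (001.jpg, 002.jpg and so on) help keep the pages in scan order.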
Kindle Formatting
I initially uploaded the PDF to Amazon's Kindle store. After it had finished converting, it ended up looking like this:
All the pages had been rotated and chopped in half. Helpful.
I decided that it would be easier to use Kindle Comic Creator - it seemed to work fine in Linux using WINE.
That had the advantage of making it easy to select sections of pages as "panels".
I had to whack the quality of the images down a little to fit within Kindle's 50MB upload limit. However, despite the total image size being 30MB, the resultant file was still over 60MB! It also turned out that the Kindle Comic Creator resized images so they were no taller than 1024px.
It turns out that the Kindle creation software imports and keeps your original files - doubling the file size! There are tools to strip the bloat, but they won't produce valid files to upload to the Kindle store.
So, I had to make a dirty compromise. I scaled the images to 1,000 pixels high, and set the JPG quality to 65. The images aren't as high quality as I would like - but the file size is a svelte 49MB (despite the source being 22MB!).
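In mogrify terms, that compromise would look something like the sketch below. It reuses the same -resize and -quality options as the earlier commands; the kindle output directory and the kindle_copies function name are assumptions added so that the full-size pages are not overwritten.

```shell
# Write Kindle-sized copies into ./kindle, leaving the originals untouched.
# x1000 fixes the height at 1,000 pixels and keeps the aspect ratio.
kindle_copies() {
  mkdir -p kindle
  mogrify -path kindle -resize x1000 -quality 65 "$@"
}

# Usage (not run here): kindle_copies *.jpg && du -sh kindle
```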
After all that kerfuffle, once Amazon converted it, it was back down to 23MB! If there's interest, I'll see if there's any way I can make it higher quality in the future.
Conclusion
I really don't know why I didn't complete this a few years ago. Just another of those projects which got lost on the back burner!