Converting the Voynich Manuscript into an eBook


Three years ago I promised to convert the Voynich Manuscript into an eBook. The recent news that it may have been deciphered spurred me to finish my project.

So, here it is: the world-famous mystery that is the Voynich Manuscript, now in convenient eBook format.

As the book is pure images, I decided not to convert to .mobi or .epub. Those are great formats, but offer nothing to picture books that can't be provided by PDF.

Here's how I converted the book for all the different formats.

Size

The original files were quite large - over 4MB per image - leading to a total eBook file size of 660MB. A little on the chunky side to easily distribute.

Using mogrify, it was easy to convert the images to a more manageable size.

A simple
mogrify -resize 25% -quality 85 *.jpg
took the total file size down to 63MB with no visible reduction in quality.

But, of course, there were a series of problems.

Firstly, some of the pages are double or triple width. They are scanned at a different size from all the other files.

In this case, the quarter size file is too small to read. Here it is after resizing:
[Image: Triple Page Small]

Indeed, each image is a subtly different size, so simply resizing by percentage doesn't work well. I checked the size of every image using ImageMagick's "identify" command. While the largest images had a height of around 6,000 pixels, the smallest had a height of 1,352. I figured that a height of 1,280 would be enough to satisfy most eReaders and tablets.

mogrify -resize x1280 -quality 85 *.jpg

Once done, the total file size of all the pages was 59MB. Good enough! With the advantage of having every page the same height - if not width.
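
If you want to repeat the survey-then-resize steps above, something like the following sketch should do it. The function names and the "resized" output directory are made up for illustration, and I'm assuming the classic ImageMagick command names (identify, mogrify) are on your path. Plain mogrify overwrites files in place, so the sketch uses its -path option to write copies and leave the originals alone.

```shell
#!/bin/sh
# Sketch only: function names and the "resized" directory are illustrative.

# Print "width height filename" for every JPG in the current directory.
list_dimensions() {
    identify -format '%w %h %f\n' *.jpg
}

# Read "width height filename" lines on stdin and print the smallest
# and largest height as "min max".
height_range() {
    awk 'NR == 1 { min = $2 + 0; max = $2 + 0 }
         ($2 + 0) < min { min = $2 + 0 }
         ($2 + 0) > max { max = $2 + 0 }
         END { if (NR > 0) print min, max }'
}

# Width a page ends up with after scaling to a target height,
# rounded to the nearest pixel. Arguments: width height target_height
scaled_width() {
    echo $(( ($1 * $3 + $2 / 2) / $2 ))
}

# Scale every JPG to 1280 pixels high, writing copies into ./resized
# so the original scans are not overwritten.
resize_all() {
    mkdir -p resized
    mogrify -path resized -resize x1280 -quality 85 *.jpg
}
```

Running "list_dimensions | height_range" shows the spread of heights before you pick a target, and scaled_width tells you how wide a double or triple page will end up once resize_all has run.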

Cropping

The scans of the images contain lots of extraneous information - most notably the edges of other pages.

[Image: Voynich with Edges]

Manually cropping each image is possible - although tedious. It results in a slightly higher quality image:
[Image: Voynich without Edges]

Although it would be possible to automate a simple crop, each page has a unique layout. If there's further interest, I may spend some free evenings cropping the pages individually.
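
For what it's worth, a simple automated crop could look something like the sketch below. It uses ImageMagick's -shave option to remove the same margin from every edge. The percentage you pass in, the file names and the function names are all assumptions for illustration, and a fixed margin is exactly the kind of crop that won't suit every page. On newer ImageMagick installs, "convert" may need to be written as "magick".

```shell
#!/bin/sh
# Sketch only: a fixed-margin crop, not a per-page layout-aware one.

# Pixels to shave from one edge: a percentage of that dimension,
# as a whole number. Arguments: size_in_pixels percent
margin_px() {
    echo $(( $1 * $2 / 100 ))
}

# Build the WIDTHxHEIGHT geometry for -shave.
# Arguments: width height percent
shave_geometry() {
    echo "$(margin_px "$1" "$3")x$(margin_px "$2" "$3")"
}

# Shave the same percentage off every edge of one image.
# Arguments: input_file output_file percent
crop_edges() {
    w=$(identify -format '%w' "$1") || return 1
    h=$(identify -format '%h' "$1") || return 1
    convert "$1" -shave "$(shave_geometry "$w" "$h" "$3")" +repage "$2"
}
```

For example, "crop_edges page.jpg page-cropped.jpg 5" would trim 5% from each side of a (hypothetical) page.jpg.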

Page Order

As I said in my blog post of 2010:

The Yale site has all the scans available as high-res JPGs or MrSIDs - but it's a pain to download hundreds of images from the site.
So - I turned to a torrent. Don't worry! These images are hundreds of years old - they are in the public domain.

(NB, if your ISP is censoring the link from The Pirate Bay, you can use this Magnet Link instead).

I'm still slightly unsure of the order of all the pages. Some have numbers printed on them, but it's not clear whether they're sequential. Indeed, it's probably impossible to know how the pages were originally ordered. So, I've left them in the same order as the Yale scan.

Converting

I didn't transform the images to greyscale because, although current eInk devices are only black and white, future devices will have full colour displays. Although using greyscale would reduce the file size by a minor amount, most modern devices should have enough internal memory to store a colour file of that size and - hopefully - have enough power to render each page without difficulty.
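
If you'd like to measure the greyscale saving for yourself, here's a rough sketch. It writes a greyscale copy of one page with ImageMagick's -colorspace Gray and compares the two file sizes. The function and file names are illustrative assumptions, and the saving will vary from page to page.

```shell
#!/bin/sh
# Sketch only: compares one colour page against a greyscale copy.

# Size of a file in bytes.
file_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Percentage saved going from one size to another, as a whole number.
# Arguments: original_bytes new_bytes
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# Write a greyscale copy of one image and report the saving.
# Arguments: input_file output_file
grey_saving() {
    convert "$1" -colorspace Gray "$2" || return 1
    echo "$(percent_saved "$(file_bytes "$1")" "$(file_bytes "$2")")% smaller"
}
```

Something like "grey_saving page.jpg page-grey.jpg" on a handful of pages should be enough to judge whether the smaller file is worth losing the colour.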

Kindle Formatting

I initially uploaded the PDF to Amazon's Kindle store. After it had finished converting, it ended up looking like this:
[Image: Kindle Mangled Page]
All the pages had been rotated and chopped in half. Helpful.

I decided that it would be easier to use Kindle Comic Creator - it seemed to work fine on Linux using Wine.

That had the advantage of making it easy to select sections of pages as "panels".
[Image: Kindle Comic Creator Panels]
I had to whack the quality of the images down a little to fit within Kindle's 50MB upload limit. However, despite the total image size being 30MB, the resultant file was still over 60MB! It also turned out that the Kindle Comic Creator resized images so they were no taller than 1024px.

The Kindle creation software imports and keeps your original files - doubling the file size! There are tools to strip the bloat, but they won't produce valid files to upload to the Kindle store.

So, I had to make a dirty compromise. I scaled the images to 1,000 pixels high, and set the JPG quality to 65%. The images aren't as high quality as I would like - but the file size is a svelte 49MB (despite the source being 22MB!).
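
A sketch of that compromise, for anyone trying the same thing. The output directory and function names are assumptions for illustration, and I'm counting 1MB as 1,000,000 bytes here, which may not be how Amazon counts it.

```shell
#!/bin/sh
# Sketch only: directory and function names are illustrative.

# Total size in bytes of every JPG in a directory. Argument: directory
total_jpg_bytes() {
    total=0
    for f in "$1"/*.jpg; do
        [ -f "$f" ] || continue
        size=$(wc -c < "$f")
        total=$(( total + size ))
    done
    echo "$total"
}

# Succeeds if a byte count fits within a limit given in MB
# (1MB is taken as 1,000,000 bytes). Arguments: bytes limit_mb
fits_limit() {
    [ "$1" -le $(( $2 * 1000000 )) ]
}

# Write Kindle-sized copies of the JPGs in the current directory:
# 1,000 pixels high, JPG quality 65. Argument: output_directory
make_kindle_images() {
    mkdir -p "$1"
    mogrify -path "$1" -resize x1000 -quality 65 *.jpg
}
```

After "make_kindle_images kindle", something like 'fits_limit "$(total_jpg_bytes kindle)" 50 && echo OK' checks the source images. Bear in mind that it's the built file which has to fit under the limit, and that came out at roughly double the size of the images for me, so check that file's size too.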

After all that kerfuffle, once Amazon converted it, it was back down to 23MB! If there's interest, I'll see if there's any way I can make it higher quality in the future.

Conclusion

I really don't know why I didn't complete this a few years ago. Just another of those projects which got lost on the back burner!



4 thoughts on “Converting the Voynich Manuscript into an eBook”

  1. James Lothian says:

    Hi Terence. I've been following your blog on and off for a while.
    I'm currently doing a little spare-time 'research' on Voynich, and
    while looking for likely resources on Amazon came across your recent
    ebook. I thought to myself 'There can't be that many Terence Edens in
    the world!', and sure enough it was the same one -- small world!

    As far as the page order is concerned, things are complicated by
    the fact that the pages are probably not currently bound in the
    original order -- Nick Pelling's 'Curse of the Voynich' details his
    detective work to try to find the original order.

  2. Gregory says:

    My suggestion for decoding the Voynich Manuscript rests on the fact that each of its individual pages encodes some other information. Encryption is not just a written form. There's a whole spectrum of gnosis which, because of limited capabilities (e.g. runic letters - the oldest inscriptions are from the second and third century AD - before that Egyptian hieratic writing, etc.), was also encoded in a different form, for example by means of signs and symbols: see semiotics, from the Greek "semasticos" (significant), "semasia" (meaning), "semeion" (a sign), "sema" (a sign, the image signal). The Voynich Manuscript is encoded in just such a manner: in my opinion it is not a classic written cipher, only a symbolic rebus - an ideogram. Below, to better illustrate the historical continuum in brief, is a summary of the earlier descriptions of each manuscript illustration (from 1R to 19R). http://gloriaolivae.pl/

    1R - Big Bang and collapse - cyclical nature of the universe.

    1V - Approximately 4.5 - 5 billion years ago - the formation of the Earth's crust.

    2R - About 3.5 billion years ago - the first organisms.

    2V - About a billion years ago - the first single-celled organisms (eukaryotes).

    3R - Approximately 900 - 700 million years ago - the first multi-cellular organisms.

    3V - Approximately 700 - 600 million years ago - the first invertebrates.

    4R - 500 million years ago - the first vertebrates.

    4V - 400 million years ago - vertebrates came out of the water.

    5R - 220 million years ago - the beginning of the reign of the dinosaurs.

    5V - 65 million years ago - extinction of the dinosaurs, evolution of mammals.

    6R - About 65 - 30 million years ago - carnivores.

    6V - About 30 - 7 million years ago - the formation of plants and animals.

    7R - About 12 million years ago - the first hominids.

    7V - About 7 - 5 million years ago - the appearance of man.

    8R - About 100 thousand years ago - the emergence of modern man.

    8V - Approximately 15 - 12 thousand years ago - human migration across the Bering "bridge".

    9R - Approximately 11.5 thousand years ago - the end of the last ice age.

    9V - About 10 thousand years ago - hunter-gatherers, the birth of agriculture.

    10R - Around 4000 BC - development of urban communities in Mesopotamia.

    10V - Around 3000 BC - the beginnings of the civilization of ancient Egypt.

    11R - The turn of the second and first millennium BC - Judaism, Jerusalem.

    11V - Turn of the century - Christianity. Rome.

    12R - None. According to me - Ancient Greece.

    12V - None. According to me - the Empire of Alexander the Great.

    13R - The Roman Empire.

    13V - Persian Empire.

    14R - Huns. Mongol Empire.

    14V - Byzantine Empire.

    15R - The State of the Franks.
