Open Source Shakespeare (in MySQL)
My good friend Richard Brent has often complained that my blog has very little Shakespeare content. Despite the domain name, I don't think I've ever blogged about The Big S. For shame! Fear not, my Brentish-Boy, this post is all about Shakespeare. And MySQL....
Ahem...
When I first started shkspr.mobi it was intended to be an easy way to get Shakespeare on your phone. At that time, there were no mobile-formatted texts of his plays and sonnets, so I had to create them. Finding Shakespeare's works in a suitable format for conversion wasn't too hard - but it meant lots of crufty code to read text files line-by-line. Yuck.
A few years later, I stumbled across Open Source Shakespeare. The project grew out of Eric Johnson's MA thesis. It's a remarkably good idea with only one minor problem. The database it uses is Microsoft Access.
MS Access, as a database, could best be described as
deformed, crooked, old and sere, ill faced, worse bodied, shapeless everywhere, vicious, ungentle, foolish, blunt, unkind, stigmatical in making, worse in mind
(Comedy of Errors, Act IV, Scene II)
There are a few Open Source Shakespeare projects on GitHub, but they don't seem very practical.
So, naturally, I've decided to create my own version of Shakespeare's works - in MySQL :-)
You can download it from GitHub.
I've stripped out a lot of the extraneous stuff from the original version, such as word counts, so it should be a fairly lean database which is easy to use. I'm not a database professional, so I would be grateful if you could suggest any improvements, either using this blog's comment form or on GitHub.
There are four tables:
Paragraphs
This is where the main body of text is. A typical row will look like this:
- WorkID: hamlet
- ParagraphID: 639015
- ParagraphNum: 3427
- CharID: hamlet
- PlainText: Has this fellow no feeling of his business, that he sings at grave-making?
- Act: 5
- Scene: 1
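Here's a minimal sketch of how a table like this could be queried. It is not the project's own code: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, the column types are guesses, and the helper name `lines_in_scene` is made up for illustration. Only the table and column names come from the row above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")

# Column names come from the sample row; the types are assumptions.
conn.execute("""
    CREATE TABLE Paragraphs (
        WorkID       TEXT,
        ParagraphID  INTEGER PRIMARY KEY,
        ParagraphNum INTEGER,
        CharID       TEXT,
        PlainText    TEXT,
        Act          INTEGER,
        Scene        INTEGER
    )
""")

# The sample row from the post.
conn.execute(
    "INSERT INTO Paragraphs VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("hamlet", 639015, 3427, "hamlet",
     "Has this fellow no feeling of his business, "
     "that he sings at grave-making?",
     5, 1),
)


def lines_in_scene(conn, work_id, act, scene):
    """Return (CharID, PlainText) pairs for one scene, in running order."""
    cur = conn.execute(
        "SELECT CharID, PlainText FROM Paragraphs "
        "WHERE WorkID = ? AND Act = ? AND Scene = ? "
        "ORDER BY ParagraphNum",
        (work_id, act, scene),
    )
    return cur.fetchall()


scene_lines = lines_in_scene(conn, "hamlet", 5, 1)
for char_id, text in scene_lines:
    print(f"{char_id}: {text}")
```

The SELECT itself should carry over to MySQL unchanged. The `?` placeholders are sqlite3's style; the common Python MySQL drivers use `%s` instead.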
Works
This is what translates the "WorkID" into something human-readable, plus some extra metadata.
- WorkID: hamlet
- Title: Hamlet
- LongTitle: Tragedy of Hamlet, Prince of Denmark, The
- Date: 1600
- GenreType: Tragedy
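To show what that translation looks like in practice, here's a rough sketch of a join from Paragraphs to Works. As before, SQLite stands in for MySQL, the column types are assumptions, the Paragraphs table is cut down to the columns the query needs, and `describe_paragraph` is a hypothetical helper rather than anything shipped with the database.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Works (
        WorkID TEXT PRIMARY KEY, Title TEXT, LongTitle TEXT,
        Date INTEGER, GenreType TEXT
    );
    CREATE TABLE Paragraphs (
        WorkID TEXT, ParagraphID INTEGER PRIMARY KEY, PlainText TEXT
    );
    INSERT INTO Works VALUES
        ('hamlet', 'Hamlet', 'Tragedy of Hamlet, Prince of Denmark, The',
         1600, 'Tragedy');
    INSERT INTO Paragraphs VALUES
        ('hamlet', 639015,
         'Has this fellow no feeling of his business, that he sings at grave-making?');
""")


def describe_paragraph(conn, paragraph_id):
    """Return (Title, Date, GenreType, PlainText) for a paragraph, or None."""
    return conn.execute(
        "SELECT w.Title, w.Date, w.GenreType, p.PlainText "
        "FROM Paragraphs AS p "
        "JOIN Works AS w ON w.WorkID = p.WorkID "
        "WHERE p.ParagraphID = ?",
        (paragraph_id,),
    ).fetchone()


row = describe_paragraph(conn, 639015)
if row is not None:
    title, date, genre, text = row
    print(f"{title} ({date}, {genre}): {text}")
```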
Character
This is what translates the CharID into a human-readable name and description.
- CharID: hamlet
- CharName: Hamlet
- Abbrev: Ham
- Works: Tragedy of Hamlet, Prince of Denmark, The
- Description: son of the former king and nephew to the present king
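The same idea works for speaker names. The sketch below assumes the table really is called "Character", as in the heading above, and that its key column matches CharID in Paragraphs. It again uses SQLite in place of MySQL with guessed column types, and `speeches_with_names` is an illustrative name. One thing to watch: CHARACTER is a reserved word in MySQL, so there the table name would probably need backticks; the sketch uses standard double quotes, which SQLite accepts.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Character" (
        CharID TEXT PRIMARY KEY, CharName TEXT, Abbrev TEXT,
        Works TEXT, Description TEXT
    );
    CREATE TABLE Paragraphs (
        WorkID TEXT, ParagraphID INTEGER PRIMARY KEY, ParagraphNum INTEGER,
        CharID TEXT, PlainText TEXT
    );
    INSERT INTO "Character" VALUES
        ('hamlet', 'Hamlet', 'Ham',
         'Tragedy of Hamlet, Prince of Denmark, The',
         'son of the former king and nephew to the present king');
    INSERT INTO Paragraphs VALUES
        ('hamlet', 639015, 3427, 'hamlet',
         'Has this fellow no feeling of his business, that he sings at grave-making?');
""")


def speeches_with_names(conn, work_id):
    """Return (speaker name, PlainText) pairs for a work, in running order.

    LEFT JOIN plus COALESCE keeps paragraphs whose CharID has no match,
    falling back to the raw CharID as the speaker label.
    """
    cur = conn.execute(
        'SELECT COALESCE(c.CharName, p.CharID), p.PlainText '
        'FROM Paragraphs AS p '
        'LEFT JOIN "Character" AS c ON c.CharID = p.CharID '
        'WHERE p.WorkID = ? '
        'ORDER BY p.ParagraphNum',
        (work_id,),
    )
    return cur.fetchall()


speeches = speeches_with_names(conn, "hamlet")
for name, text in speeches:
    print(f"{name}: {text}")
```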
Chapters
This gives the setting for each Act and Scene.
- WorkID: hamlet
- ChapterID: 18893
- Act: 5
- Scene: 1
- Description: Elsinore. A churchyard.
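Putting the four tables together, a scene can be rendered with its setting as a heading and each speech labelled with the speaker's name. This is only a sketch under the same assumptions as before: SQLite instead of MySQL, guessed column types, trimmed-down tables, and a made-up helper called `format_scene`. It also assumes that Chapters and Paragraphs line up on WorkID, Act and Scene, which the sample rows suggest but which I haven't checked against the full data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Works (WorkID TEXT PRIMARY KEY, Title TEXT);
    CREATE TABLE "Character" (CharID TEXT PRIMARY KEY, CharName TEXT);
    CREATE TABLE Chapters (
        WorkID TEXT, ChapterID INTEGER PRIMARY KEY,
        Act INTEGER, Scene INTEGER, Description TEXT
    );
    CREATE TABLE Paragraphs (
        WorkID TEXT, ParagraphID INTEGER PRIMARY KEY, ParagraphNum INTEGER,
        CharID TEXT, PlainText TEXT, Act INTEGER, Scene INTEGER
    );
    INSERT INTO Works VALUES ('hamlet', 'Hamlet');
    INSERT INTO "Character" VALUES ('hamlet', 'Hamlet');
    INSERT INTO Chapters VALUES
        ('hamlet', 18893, 5, 1, 'Elsinore. A churchyard.');
    INSERT INTO Paragraphs VALUES
        ('hamlet', 639015, 3427, 'hamlet',
         'Has this fellow no feeling of his business, that he sings at grave-making?',
         5, 1);
""")


def format_scene(conn, work_id, act, scene):
    """Return a scene as text: a heading line, then one line per speech."""
    heading = conn.execute(
        "SELECT w.Title, ch.Description "
        "FROM Chapters AS ch "
        "JOIN Works AS w ON w.WorkID = ch.WorkID "
        "WHERE ch.WorkID = ? AND ch.Act = ? AND ch.Scene = ?",
        (work_id, act, scene),
    ).fetchone()
    if heading is None:
        return ""  # No such scene in the Chapters table.
    title, setting = heading
    out = [f"{title}, Act {act}, Scene {scene}: {setting}"]
    speeches = conn.execute(
        'SELECT c.CharName, p.PlainText '
        'FROM Paragraphs AS p '
        'JOIN "Character" AS c ON c.CharID = p.CharID '
        'WHERE p.WorkID = ? AND p.Act = ? AND p.Scene = ? '
        'ORDER BY p.ParagraphNum',
        (work_id, act, scene),
    )
    out.extend(f"{name}: {text}" for name, text in speeches)
    return "\n".join(out)


scene_text = format_scene(conn, "hamlet", 5, 1)
print(scene_text)
```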
What's Next?
The next steps for the project are fairly obvious:
- Write some high level example code to show people how to use the database.
- Make shkspr.mobi a showcase site which runs off the database.
- Fix any bugs and inconsistencies that people find.
You can download the Shakespeare MySQL Database from GitHub.