Every November there’s an informal hackathon called NaNoGenMo (aka National Novel Generation Month). It’s a play on National Novel Writing Month, which also takes place in November.
The idea is pretty simple: write code that will generate a “novel”. A novel here is defined as text containing over 50K words. The words are not required to be unique; you could even just print the same word 50,000 times and use that for a submission.
I’ve been wanting to explore Markov chains a bit more, so I decided to write a library to download a corpus from the Internet Archive and use the text to train a Markov model. I created an open source Python tool called IA-Markov to do exactly that.
The module is pretty easy to use: you just need to find a text file on the Internet Archive and pass the archive’s name to the MarkovModel like so:
m = MarkovModel()
'blah blah blah blah blah'
In the example, my archive’s name is 'FuturistManifesto', and after I have trained my model, I can generate random sentences from it.
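For a feel of what training and sentence generation involve, here is a minimal word-level Markov chain in plain Python. It is only a sketch and not the IA-Markov implementation: the names `build_chain` and `generate`, the two-word state size, and the toy corpus are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, state_size=2):
    """Map each run of `state_size` words to the words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - state_size):
        state = tuple(words[i:i + state_size])
        chain[state].append(words[i + state_size])
    return dict(chain)


def generate(chain, max_words=20, rng=random):
    """Random-walk the chain, starting from a randomly chosen state."""
    state = rng.choice(list(chain))
    output = list(state)
    while len(output) < max_words:
        followers = chain.get(state)
        if not followers:  # dead end: this state only occurs at the very end of the text
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Toy corpus (hypothetical); a real run would use a full text from the Internet Archive.
corpus = "the cat sat on the mat and the cat ate the rat on the mat"
chain = build_chain(corpus)
print(generate(chain, max_words=12, rng=random.Random(0)))
```

Every three-word window in the output also appears in the corpus, which is why Markov text tends to sound locally plausible while drifting globally.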
Ok, so after I had created a bunch of models trained on a bunch of different corpora, I wanted to organize the text in the form of a novel. Basically I wanted to break up all my 50,000 words into paragraphs and chapters.
I created another open source Python library called Markov-Novel that allows you to write a “random” Novel using Markov Chains.
The library is simple: we create a bunch of Paragraphs, which are 5 (or more) sentences generated by the Markov Model. We then compose those Paragraphs into Chapters and repeat ad infinitum.
Here’s a simple example of creating a single-chapter novel:
import markovify
import markov_novel

with open('path/to/corpus.txt') as f:
    text = f.read()
# Build the model.
text_model = markovify.Text(text)
# Create a one-chapter novel from the model.
novel = markov_novel.Novel(text_model, chapter_count=1)
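The paragraph-and-chapter structure described above could be sketched roughly as follows. This is not Markov-Novel’s actual code: `write_paragraph`, `write_chapter`, `write_novel`, the three-paragraph chapters, and the `toy_sentence` stand-in are hypothetical names and choices picked for illustration. The sketch only assumes a callable that returns a sentence, or `None` when generation fails, as markovify’s `make_sentence()` can.

```python
import random


def write_paragraph(make_sentence, min_sentences=5, max_attempts=100):
    """Collect sentences from `make_sentence` until the paragraph is long enough."""
    sentences = []
    for _ in range(max_attempts):
        sentence = make_sentence()
        if sentence:  # skip failed attempts (None or empty string)
            sentences.append(sentence)
        if len(sentences) >= min_sentences:
            break
    return " ".join(sentences)


def write_chapter(make_sentence, number, paragraph_count=3):
    """A chapter is a Markdown heading followed by several paragraphs."""
    paragraphs = [write_paragraph(make_sentence) for _ in range(paragraph_count)]
    return "# Chapter {}\n\n{}".format(number, "\n\n".join(paragraphs))


def write_novel(make_sentence, chapter_count=1):
    """A novel is its chapters joined together."""
    chapters = [write_chapter(make_sentence, n) for n in range(1, chapter_count + 1)]
    return "\n\n".join(chapters)


# Stand-in sentence generator so the sketch runs without a trained model.
rng = random.Random(0)


def toy_sentence():
    return rng.choice(["The line is long.", "A point has no parts.", "Speed is beautiful."])


print(write_novel(toy_sentence, chapter_count=2))
```

With a trained markovify model, `text_model.make_sentence` could be passed in place of `toy_sentence`.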
Mythology & The Slap
I then used both of my open source tools to create a submission for NaNoGenMo 2016. I built a large Markov chain from a combination of the following texts hosted on the Internet Archive:
- Flatland: A Romance of Many Dimensions - Edwin A. Abbott
- The Burden of Skepticism - Carl Sagan
- The Futurist Manifesto - FT Marinetti
- The Geometry - Rene Descartes
- The Surregionalist Manifesto and Other Writings - Max Cafard
- The Sound Manifesto - Michael J. O’Donnell; Ilia Bisnovatyi
- The Necessary Angel - Wallace Stevens
- Beelzebub’s Tales to His Grandson - GI Gurdjieff
- Industrial Society and Its Future - FC
- The Society of the Spectacle - Guy Debord
- The Revolution of Everyday Life - Raoul Vaneigem
- You Don’t Need A Weatherman To Know Which Way The Wind Blows
I combined all of the text from above into a large Markov Model and then I went about “writing” my novel. First I defined a twelve-chapter novel like this:
novel = Novel(markov_combo, chapter_count=12)
Then I wrote the novel to a file.
I wrote the file in Markdown since I wanted to convert it to PDF and compile it via LaTeX. This was actually really easy to do using Pandoc; it even helped me generate a table of contents too! All I had to do was use the following command:
$ pandoc -o novel.pdf myth.md --latex-engine=xelatex --toc
Once I ran that command, I had a beautiful, LaTeX-compiled PDF of my randomly generated novel entitled “Mythology & The Slap: A post-geometric manifesto in twelve chromatic parts”. You can see my code for this on GitHub: https://github.com/accraze/nanogenmo16
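The step of combining several texts into one large model can be illustrated with a small sketch. This is not the code from the repo: the pure-Python version below just merges per-text transition counts, and `transition_counts`, `combine`, and the two toy texts are assumptions made for illustration. (markovify itself provides a `markovify.combine` function for merging models, which is probably the more practical route.)

```python
from collections import Counter, defaultdict


def transition_counts(text, state_size=2):
    """Count how often each word follows each run of `state_size` words."""
    words = text.split()
    counts = defaultdict(Counter)
    for i in range(len(words) - state_size):
        state = tuple(words[i:i + state_size])
        counts[state][words[i + state_size]] += 1
    return counts


def combine(models):
    """Merge several count tables by adding their follower counts per state."""
    combined = defaultdict(Counter)
    for model in models:
        for state, followers in model.items():
            combined[state].update(followers)
    return combined


# Two toy "corpora" (hypothetical); the real project used full texts.
texts = [
    "the line is a point in motion",
    "the line is drawn by speed",
]
combo = combine(transition_counts(t) for t in texts)
print(combo[("line", "is")])
```

States shared by both texts, such as ("line", "is") here, end up with followers from each source, which is where a combined model starts jumping between authors.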
This approach is stochastic in nature, while also borrowing elements of the cut-up literary technique for fun. My implementation is pretty crude since I got started on the project halfway through the month. Going forward I would like to clean some of the text corpus before creating the Markov model used to write the novel. All in all, I had a lot of fun doing this project.
If you would like to read the “finished” work, here are the links: