Generative Metrics: Could Distant Reading Include Distant Scansion?

Having recently read portions of Nigel Fabb’s Language and Literary Structure, I was struck by how remarkably algorithmic his grid method for determining meter is. (A useful summary of the Grid Theory used by Fabb can be found in this document from UPenn.) Theoretically, there seems little reason why this “algorithm” couldn’t be converted into a proper, computer-executable one. The uses of such a program would seem to be manifold. With the ability to “distant scan” hundreds, if not hundreds of thousands, of poems, one could chart historical shifts in metrical form or even trace the evolution of metrical usage in the corpus of an individual poet. For my own work, the ability to quickly chart metrical variation between versions of The Prelude would be quite useful.
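To make the idea of “distant scanning” a little more concrete, here is a minimal sketch in Python. It does not implement Fabb’s grid method; it swaps in a much cruder technique, matching each line’s stress pattern against fixed iambic and trochaic templates. The lexicon, the function names, and the “?” convention for monosyllables whose stress is left open are all assumptions made for illustration. A real tool would need a pronouncing dictionary and a far better account of metrical variation.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption, not a real resource):
# "1" = stressed syllable, "0" = unstressed syllable,
# "?" = a monosyllable whose stress is left open for the meter to decide.
TOY_LEXICON = {
    "shall": "?", "i": "?", "compare": "01", "thee": "?", "to": "?",
    "a": "?", "summer's": "10", "day": "?",
    "wandered": "10", "lonely": "10", "as": "?", "cloud": "?",
}

FEET = {"iambic": "01", "trochaic": "10"}
LENGTHS = {3: "trimeter", 4: "tetrameter", 5: "pentameter", 6: "hexameter"}


def stress_pattern(line):
    """Join the lexicon entries for each word; return None if a word is unknown."""
    normalized = line.lower().replace("\u2019", "'")  # curly to straight apostrophe
    words = re.findall(r"[a-z']+", normalized)
    if not words or any(w not in TOY_LEXICON for w in words):
        return None
    return "".join(TOY_LEXICON[w] for w in words)


def guess_meter(line):
    """Return the first template the line is compatible with, else 'unknown'."""
    pattern = stress_pattern(line)
    if pattern is None:
        return "unknown"
    for foot_name, foot in FEET.items():
        for n_feet, length_name in LENGTHS.items():
            template = foot * n_feet
            # Compatible if the lengths agree and every fixed syllable matches.
            if len(template) == len(pattern) and all(
                p in ("?", t) for p, t in zip(pattern, template)
            ):
                return f"{foot_name} {length_name}"
    return "unknown"


def tally_meters(lines):
    """Count how many lines fall under each guessed meter."""
    return Counter(guess_meter(line) for line in lines)


lines = [
    "Shall I compare thee to a summer's day?",
    "I wandered lonely as a cloud",
]
print(tally_meters(lines))
```

Run over a dated corpus, a tally of this kind is what would let one chart shifts in metrical form over time. Note that a line made up entirely of open monosyllables fits both templates, and this sketch simply reports the first one it tries.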

Linguists have developed software to test various notions of generative grammar, most notably the Maxent Grammar Tool developed by Bruce Hayes and Colin Wilson. Yet, to the best of my understanding (and I confess that my knowledge of linguistics is extremely superficial), the Maxent Grammar Tool’s use in generative metrics seems to be largely related to stylistics. For instance, Hayes, Wilson, and Anne Shisko used a modified version of the software to generate a Shakespeare and Milton “grammar” to challenge the importance of the Stress Maximum Constraint in generative metrics, a constraint which essentially states that only the placement of stress maxima (strongly stressed syllables bordered on either side by syllables with relatively less stress) matters in determining meter. Such software might, indeed, be able to do the type of distant reading I imagine, though its complex ultimate purposes create a potentially unscalable learning curve for literary scholars without relatively intensive linguistics training. Doubtless, a more accessible program with more limited capabilities could be built.
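The definition of a stress maximum given above is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the numeric stress levels are hand-assigned, the function names are hypothetical, “strongly stressed” is reduced to “stressed more than both neighbours,” and the weak-position check follows one common reading of the constraint for iambic lines (stress maxima should not fall in weak positions). It has nothing to do with how the Maxent Grammar Tool itself works.

```python
def stress_maxima(levels):
    """Return 0-based indices of syllables stressed more than both neighbours.

    `levels` is a list of hand-assigned stress levels (higher = more stress).
    The first and last syllables lack a neighbour on one side, so they are
    never counted here.
    """
    return [
        i
        for i in range(1, len(levels) - 1)
        if levels[i - 1] < levels[i] > levels[i + 1]
    ]


def maxima_in_weak_positions(levels):
    """Return stress maxima that land in the weak positions of an iambic line.

    Assumption: in an iambic line the weak positions are the 1st, 3rd, 5th, ...
    syllables, that is, the even 0-based indices.
    """
    return [i for i in stress_maxima(levels) if i % 2 == 0]


# Hypothetical stress contours for two ten-syllable lines.
regular = [0, 2, 0, 2, 0, 2, 0, 2, 0, 2]
irregular = [0, 2, 0, 0, 2, 0, 0, 2, 0, 2]

print(stress_maxima(regular), maxima_in_weak_positions(regular))
print(stress_maxima(irregular), maxima_in_weak_positions(irregular))
```

A scansion program built on this idea would flag the second contour, where a stress maximum falls on the fifth syllable, and pass the first.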

TEI encoding, of course, allows for the marking of meter, but it is tempting to imagine how computer-aided scansion could greatly increase the reach of projects that aim to build databases of poetry organized by elements of poetic form. Google’s recent work in trying to get computers to translate poetry from one language to another while preserving rhyme and meter would also suggest that distant reading of metrics and other poetic elements should be possible. Indeed, the possibility seems so likely that it is difficult to imagine that a program hasn’t already been produced. Do you know of software, developed or in development, that would be able to analyze poetry for meter on a large scale? What uses would you find for such a program?


3 thoughts on “Generative Metrics: Could Distant Reading Include Distant Scansion?”

  1. tedunderwood says:

    I deeply agree with you here. This is a huge opportunity. As you say, it’s very achievable to scan poems algorithmically, and the distant-reading approach would be apt. I think whoever does this is going to get some nice juicy literary-historical theses out of it.

    Probably the primary initial challenge is the wee matter of actually locating a good corpus of verse. As you know if you study Romanticism (woo, fellow Romanticist here!), a lot of 19c poems are all mixed up with prose prefaces, and footnotes, and explanatory endnotes, etc. Not to mention publisher’s ads at the back, etc. So it’s not just a question of identifying volumes of poetry (which by the way is not a trivial task, itself); you would then need to extract the poetry from those volumes.

    But that’s also achievable algorithmically. It doesn’t have to be a massive manual task. I’ve got some work in progress that I’ll blog soon.

    As far as actually analyzing meter goes, I don’t know of a program yet, but I think @jessemenn and @cforster at Syracuse are doing some work on it — we had a twitterconvo about it a while back. They would know more than I do.

  2. Jesse Menn, a grad student at Syracuse U (and, in the interests of disclosure, a student in a class I taught last semester) is at least toying with this idea; indeed, he has running code for “a more accessible program with more limited capabilities” that uses some of the built-in capabilities of the Python NLTK to hazard a guess at the meter of unstructured text input. From what I’ve seen, it can separate iambic tetrameter from iambic pentameter readily (and it does so without *just* counting syllables and assuming iambs). It is still early in development, but it is moving in the direction of what you describe here. (I’ve just cc:’d Jesse on Twitter; he may chime in.)

  3. Chris and Ted: Thanks to both for the info! Finding a corpus definitely remains a challenge, but I’m glad to hear that some folks are in the process of developing these types of programs!
