Saturday, January 19, 2013

The Speech Length of The Two Gentlemen of Verona

In my previous post I looked at two recent stylometric studies that suggested that The Two Gentlemen of Verona was probably written later than usually thought, certainly later than the dates proposed by those who think it may have been Shakespeare’s first play. One of the studies I discussed was MacD. P. Jackson’s analysis of Hartmut Ilsemann’s use of speech length as a chronological indicator for Shakespeare’s plays.

In this post, I am going to look at one of Ilsemann’s own papers on the subject, where he analyses his speech length data in a very different way to Jackson, and comes up with some interesting results. In "More statistical observations on speech lengths in Shakespeare's plays", Ilsemann graphs the speech length distribution of all Shakespeare's plays to determine whether there are any patterns linking individual plays. To minimise stylistic differences due to genre, he analyses the Histories, Comedies and Tragedies separately. Below are his graphs for the comedies (the white line is the average of the individual curves):


As you can see from the graphs, Ilsemann determined that there were three distinct patterns of speech length distribution for the comedies. The first and largest group comprises plays generally believed to have been written in the late 1590s to early 1600s. Their speech length distribution is characterised by (using Ilsemann’s words) “a steep rise [that] goes up to four words, followed by the gentle decline towards the value twenty.” The second group comprises just two plays: The Two Gentlemen of Verona and Love’s Labour’s Lost. The broad shape of their speech length distribution is similar to that of the first group, but with a more gradual rise to a peak at nine words instead of four, followed by a very sharp drop towards longer speech lengths.

The big outlier is the third group, comprising The Taming of the Shrew and The Comedy of Errors, which “have two clearly distinct maxima, a smaller maximum at four words and the dominant one at nine words.” From Ilsemann’s graph, it is quite obvious that these two plays have remarkably similar speech length distributions, and ones that are significantly different to all the other comedies. This is a particularly interesting result, because Shrew and Errors - along with Two Gentlemen - are the plays which have most often been put forward as Shakespeare’s first comedy, perhaps even his first play.
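For readers who want to see what a "speech length distribution" with one or two maxima means in practice, here is a minimal Python sketch. It is my own illustration, not Ilsemann's method: it assumes speech length is simply the number of whitespace-separated words in a speech, it uses invented toy counts rather than data from any play, and the function names are hypothetical. Real curves would probably need smoothing before their peaks could be read off this simply.

```python
from collections import Counter

def speech_length_distribution(speeches, max_len=20):
    """Share of all speeches at each length from 1 to max_len words.

    Assumption: a speech's length is its count of whitespace-separated words.
    """
    lengths = [len(speech.split()) for speech in speeches]
    total = len(lengths)
    if total == 0:
        return {n: 0.0 for n in range(1, max_len + 1)}
    counts = Counter(lengths)
    return {n: counts.get(n, 0) / total for n in range(1, max_len + 1)}

def local_maxima(dist):
    """Lengths whose share is strictly higher than both neighbouring lengths."""
    keys = sorted(dist)
    return [
        cur
        for prev, cur, nxt in zip(keys, keys[1:], keys[2:])
        if dist[cur] > dist[prev] and dist[cur] > dist[nxt]
    ]

# Invented toy data (NOT from any play): number of speeches at each length,
# shaped to have a smaller peak at four words and a larger one at nine.
toy_counts = {1: 1, 2: 2, 3: 3, 4: 4, 5: 3, 6: 2, 7: 3, 8: 4, 9: 6, 10: 3, 11: 1}
speeches = [" ".join(["w"] * n) for n, c in toy_counts.items() for _ in range(c)]

dist = speech_length_distribution(speeches)
print(local_maxima(dist))          # [4, 9]
print(max(dist, key=dist.get))     # 9, the dominant maximum
```

On this toy input the sketch reports two maxima, a smaller one at four words and a dominant one at nine, which is the kind of shape Ilsemann describes for Shrew and Errors.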

How much can we deduce from Ilsemann's analysis? It does seem to indicate that, at the broad level, speech length distribution is a useful chronological indicator. There is a very marked difference between The Merry Wives of Windsor, Much Ado About Nothing, As You Like It, Twelfth Night, Measure For Measure and All's Well That Ends Well as one group and The Two Gentlemen of Verona, Love's Labour's Lost, The Taming of the Shrew and The Comedy of Errors taken together as another. Virtually all proposed chronologies for the comedies, including my own, would see the latter four plays as having been written before the former six (with some doubt, perhaps, about The Merry Wives of Windsor).

What can we deduce from the distribution patterns within the smaller group of four early plays? Not too much. We're on far less firm ground here, because there are only four plays in the group, and the more you drill down into any dataset, the less reliable the conclusions become. Ilsemann says that "the shape of the distribution of speech lengths suggests a kinship between The Two Gentlemen of Verona, and Love's Labour's Lost", i.e. Two Gentlemen has a kinship with a play usually dated in the range 1594-6. By itself, that's not enough to prove anything, but it's something you just might want to keep in mind the next time you read any claim that Two Gentlemen was Shakespeare's first play and originated in the 1580s*.

[* Roger Warren makes this claim in his 2008 Oxford Shakespeare edition of The Two Gentlemen of Verona. You can read Gabriel Egan's review of Warren's edition here.]