by Ross Eckler
Word Ways, 1985


One of the commonest ways to "prove" that an author is engaging in wordplay is to search for odd patterns of letters in his writings. For example, much is made of the fact that the initial letters of successive lines in a poem spell out a reasonably long word, or that the initial letters of successive words in a prose passage do the same thing. The question that is almost never put by such discoverers is the following: how likely is it that a word of the same length would have appeared by chance? Only if this probability is very low (say, 0.01 or less) can one confidently assert that the author put the message there.

In particular, let us consider the claim mentioned in the May 1984 Kickshaws that Shakespeare deliberately introduced the name TITANIA into the initial letters of a speech uttered by her in A Midsummer Night's Dream:

Thou shalt remain here, whether thou wilt or no.
I am a spirit of no common rate,
The summer still doth tend upon my state;
ANd I do love thee. Therefore go with me.
I'll give thee fairies to attend on thee;
And they shall fetch thee jewels from the deep,

On the face of it, this appears to be a remarkable coincidence. What is the probability that this happened by accident?

I recorded 1000 consecutive letters beginning the lines of poetic dialogue in Scene 1 of Act 1, Scenes 1 and 2 of Act 2, and Scene 2 of Act 3 in this Shakespearean play (I omitted Scene 2 of Act 1 and Scene 1 of Act 3 because these are largely written in prose). In this sequence, I noted the appearance of 110 two-letter words, 74 three-letter words, 6 four-letter words (fast, data, watt, sane, chap, wind) and one five-letter word (chaps), all in the Merriam-Webster Pocket Dictionary. (Of course one would not necessarily get 110 two-letter words in a second sample; all one can say is that, two times out of three, the number discovered would lie between 99 and 121.) These data can be adequately summarized by the following empirical rule:

Expected number of words i letters long = 2.2nS/26^i

where n is the number of i-letter words in the dictionary being used, and S is the sequence length. For the Shakespearean data, the predictions generated by this formula for words of two through five letters are 124, 68, 9 and 1, reasonably consistent with the observations.
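As a rough check on "reasonably consistent", one can compare each observed count with its prediction, treating the counts as approximately Poisson so that the standard deviation is the square root of the mean. The Poisson model is an assumption made here for illustration (the article does not name one), though it does reproduce the range of roughly 99 to 121 quoted earlier for the two-letter words. A minimal Python sketch:

```python
import math

# Counts from the 1000-letter Shakespearean sample, for word lengths 2 to 5.
observed = {2: 110, 3: 74, 4: 6, 5: 1}
predicted = {2: 124, 3: 68, 4: 9, 5: 1}  # from the empirical rule, as quoted

def one_sigma_range(count):
    """Return count minus and plus sqrt(count), assuming Poisson-like scatter."""
    sd = math.sqrt(count)
    return count - sd, count + sd

def z_score(obs, pred):
    """Deviation of obs from pred in units of sqrt(pred) (Poisson assumption)."""
    return (obs - pred) / math.sqrt(pred)

low, high = one_sigma_range(observed[2])
print(f"two-letter words: about {low:.1f} to {high:.1f}")

for length in sorted(observed):
    z = z_score(observed[length], predicted[length])
    print(f"{length}-letter: observed {observed[length]}, "
          f"predicted {predicted[length]}, z = {z:+.2f}")
```

Under this assumption, every observed count lies within about 1.3 standard deviations of its prediction.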

Using this rule, one can predict that in the 100,000 poetic lines of all Shakespearean plays the expected number of seven-letter words from the Pocket Dictionary is 2.2(4591)(100,000)/26^7 = 0.13; TITANIA looks like a rather unusual event. However, if one allows two letters on the same line (as Shakespeare did), then this quantity must be multiplied by six, since one might encounter any of the arrangements TI,T,A,N,I,A; T,IT,A,N,I,A; ... T,I,T,A,N,IA. This raises the expected number to 0.78, suggesting that it is not unlikely that a seven-letter word appears somewhere in Shakespeare in a modified acrostical form.
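The arithmetic in this paragraph can be reproduced with a short Python sketch. The function names `expected_words` and `doubled_arrangements` are illustrative, not from the article; the constants are the ones quoted in the text.

```python
def expected_words(n_words, seq_len, word_len, factor=2.2):
    """Empirical rule: factor * n * S / 26**i."""
    return factor * n_words * seq_len / 26 ** word_len

def doubled_arrangements(word):
    """List the ways one adjacent pair of letters can share a single line."""
    out = []
    for k in range(len(word) - 1):
        parts = list(word[:k]) + [word[k:k + 2]] + list(word[k + 2:])
        out.append(",".join(parts))
    return out

# 4591 seven-letter dictionary words, 100,000 poetic lines (both as quoted).
base = expected_words(4591, 100_000, 7)
arrangements = doubled_arrangements("TITANIA")

print(f"strict acrostics expected: {base:.2f}")
print(f"arrangements with one doubled line: {len(arrangements)}")
for a in arrangements:
    print("  ", a)
print(f"modified acrostics expected: {len(arrangements) * round(base, 2):.2f}")
```

The figure of 0.78 comes from multiplying the rounded value 0.13 by six; carrying the unrounded value through gives about 0.75, which does not change the conclusion.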

However, we have not made use of the fact that TITANIA appears in a speech uttered by Titania. If one asks for the probability that a self-referential acrostic appears somewhere in Shakespeare, the number of allowable words is not 4591 but 1, and the expectation correspondingly shrinks to 0.00016. Accordingly, I conclude that Shakespeare deliberately doctored this passage to spell out Titania's name. Do Shakespearean scholars know of any other self-referential acrostics, or is this one unique?
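A brief Python sketch of this last step, comparing the result with the 0.01 threshold proposed in the opening paragraph. Converting the expectation to a probability of at least one occurrence uses a Poisson approximation, which is an assumption added here and not something the article states.

```python
import math

FACTOR = 2.2        # empirical multiplier from the article
LINES = 100_000     # poetic lines in all the plays, as quoted
ARRANGEMENTS = 6    # one doubled line among the seven letters
THRESHOLD = 0.01    # level suggested in the opening paragraph

def self_referential_expectation():
    # Only the speaker's own name is allowed, so n = 1.
    return ARRANGEMENTS * FACTOR * 1 * LINES / 26 ** 7

lam = self_referential_expectation()
p_at_least_one = 1 - math.exp(-lam)  # Poisson approximation (assumption)

print(f"expected number: {lam:.5f}")
print(f"chance of at least one: {p_at_least_one:.5f}")
print("below threshold" if p_at_least_one < THRESHOLD else "not below threshold")
```

For an expectation this small, the probability of at least one occurrence is practically equal to the expectation itself.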

Actually, the probability of finding Titania in an acrostic is even less, if the question posed is "What is the expected number of self-referential acrostics of actors with seven-letter names?" for there are far fewer than 100,000 lines uttered by such characters. By the same token, the corresponding expectation for a self-referential acrostic of an actor with (say) a four-letter name is much greater, and perhaps one should not be surprised if one is found.
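To see how strongly the name length matters, one can apply the same rule with n = 1 and look at the expectation per line spoken. This is only a sketch: it considers strict acrostics (no doubled lines) and leaves out the actual number of lines spoken by each character, which the article does not give.

```python
FACTOR = 2.2  # empirical multiplier from the article

def per_line_expectation(name_length):
    """Expected strict self-referential acrostics per line spoken (n = 1, S = 1)."""
    return FACTOR / 26 ** name_length

for length in (4, 5, 6, 7):
    print(f"{length}-letter name: {per_line_expectation(length):.3e} per line")

ratio = per_line_expectation(4) / per_line_expectation(7)
print(f"four-letter vs seven-letter: about {ratio:,.0f} times more likely per line")
```

Each letter removed from the name multiplies the expectation by 26, so a four-letter name is 26^3 = 17,576 times more likely per line than a seven-letter one.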

One can ask whether the 2.2 factor is representative of modern prose. To assess this, I took the initial letters of 1000 consecutive words from Harold J. Leavitt's Managerial Psychology, Second Edition (University of Chicago, 1964) and found in this sequence 114, 95, 11 and 1 words of two through five letters. These data suggest a slightly larger value for the multiplicative factor, say 2.5.
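One possible way to turn such counts into a revised factor is to rescale 2.2 so that the total predicted count matches the total observed. The sketch below is not the author's stated method. It assumes the same dictionary and the same sequence length (1000) for both samples, and it works from the rounded predictions quoted earlier, so the results are approximate.

```python
# Predictions for a 1000-letter sequence at factor 2.2 (rounded, as quoted).
predicted_at_2_2 = [124, 68, 9, 1]

shakespeare_obs = [110, 74, 6, 1]  # two- through five-letter words
prose_obs = [114, 95, 11, 1]       # Leavitt sample, same word lengths

def fitted_factor(observed, predicted, base_factor=2.2):
    """Rescale base_factor so the total predicted count equals the total observed."""
    return base_factor * sum(observed) / sum(predicted)

shakespeare_factor = fitted_factor(shakespeare_obs, predicted_at_2_2)
prose_factor = fitted_factor(prose_obs, predicted_at_2_2)

print(f"Shakespeare sample: {shakespeare_factor:.2f}")
print(f"prose sample:       {prose_factor:.2f}")
```

On these assumptions the prose sample gives a factor of roughly 2.4 and the Shakespearean sample roughly 2.1, which points in the same direction as the suggested value of 2.5.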
