IN QUEST OF A PANGRAM
by Lee Sallows
Word Ways, 1992
In the August 1991 Word Ways, Douglas Greenwood mentions Kee Dewdney's widely referenced October 1984 Scientific American account of my search for a self-descriptive pangram. In fact, Dewdney's column was based on a long and rather technical article of mine that appeared subsequently in the Spring 1985 Abacus. The following is a shortened adaptation of the first part of the original.
The Pangram Problem
In 1983, a Dutch newspaper carried an astonishing translation of a rather tongue-in-cheek sentence of mine that had previously appeared in Douglas Hofstadter's January 1982 Scientific American column. The translation was by Rudy Kousbroek, a well-known journalist in Holland (two chapters of his 1984 book, De Logologische Ruimte, appeared in the November 1986 and November 1987 Word Ways). Here is the original sentence:
Only the fool would take trouble to verify that this sentence was composed of ten a's, three b's, four c's, four d's, forty-six e's, sixteen f's, four g's, thirteen h's, fifteen i's, two k's, nine l's, four m's, twenty-five n's, twenty-four o's, five p's, sixteen r's, forty-one s's, thirty-seven t's, ten u's, eight v's, eight w's, four x's, eleven y's, twenty-seven commas, twenty-three apostrophes, seven hyphens and, last but not least, a single !
Composing such self-descriptive sentences can be exacting, to say the least. The process has points in common with playing a diabolically-conceived game of patience. How does one begin? My approach is to decide first what the sentence is going to say and then make a flying guess at the frequencies of each sign. Writing out this provisional version, the real totals can be counted up and the initial guess updated. The process is repeated, trial and error leading to successively closer approximations. This opening soon shades into the middle game. By now all of the putative totals ought to have been corrected to within two or three of the true sums. There are, say, 9 f's but only seven being claimed, and 27 real t's where twenty-nine are declared. Switching seven with the nine in twenty-nine corrects both totals at a single stroke. Introducing further careful changes with a view to pulling off this sort of mutual cancellation of errors should eventually lead to the final phase.
The end game is reached when discrepant totals number some four or less. The goal is in sight, but, as in a maze, proximity can be misleading. For instance, suppose painstaking effort has at last yielded a near-perfect specimen: only the x's are wrong. Instead of the five claimed, in reality there are 6. Writing six in place of five will not merely invalidate the totals for e, f, s and v; the x in six means that the number of x's has now become 7. Yet replacing six with seven will only return the total to 6. What now?
Paradoxical situations of this kind are the norm in this activity. Interlocking feedback loops magnify tiny displacements into far-reaching upheavals; harmless truths cannot be stated without disconfirming themselves. It seems the only hope of dehydrating this Hydra and getting every snake-head to eat its own tail lies in doctoring the text accompanying the listed items. In looking at the above case, for example, only a fool will fail to spot instances where style has been sacrificed to arithmetic. This is what made Kousbroek's translation so stunning. Totals excepted, his rendering not only adhered closely to the original in meaning, it was simultaneously a self-descriptor in Dutch!
Or at least, so it appeared at first sight. Counting up, I was amused to discover a few incorrect totals. So I wrote to the author pointing out these discrepancies. This resulted, a month later, in a second article in the same newspaper. Kousbroek wrote of his dismay on being caught out by the original author of the sentence, "specially come over from America, it seems, to put me right." The disparities I'd pointed to, however, were nothing new to him. A single flaw had been spotted in the supposedly finished translation on the very morning of submitting his manuscript. But a happy flash revealed a way to rectify the error in the nick of time. Later a more careful check revealed that this brainwave had in fact introduced even more errors elsewhere. He'd been awaiting "the dreaded letter with its merciless arithmetic" ever since. The account went on to tell of his titanic struggle in getting the translation straight. The new version was included; it is a spectacular achievement. However, the tail concealed a sting. At the end, Kousbroek threw out a new self-descriptor of his own:
Dit pangram bevat vijf a's, twee b's, twee c's, drie d's, zesenveertig e's, vijf f's, vier g's, twee h's, vijftien i's, vier j's, een k, twee l's, twee m's, zeventien n's, een o, twee p's, een q, zeven r's, vierentwintig s's, zestien t's, een u, elf v's, acht w's, een x, een y en zes z's.
A finer specimen of logological elegance is scarcely conceivable. The sentence is written in flawless Dutch beyond hope of refinement. It says "This pangram contains five a's, two b's, ... one y, and six z's." Following this came a devilish quip in my direction: "Lee Sallows will doubtless find little difficulty in producing a magic English translation of this pangram." Needless to say, I didn't manage to find a single error in this sentence of his!
Computerized Construction
Rudy's playful taunt came along at a time when I was already thinking about computer-aided construction of self-descriptors. (Recall this is 1983; advertisements of the "You just bought a personal WHAT???" kind are still in the future.) At first I envisaged no more than an aid to hand composition: a program to count letters so as to yield feedback on the results of keyboard-mediated surgery performed on a sentence displayed on screen. Later I began wondering about a program able to cycle through the list of totals and make automatic corrections along the way. Could solutions be evolved through a repetitive process of mutation and selection? Experiments were made, but processing always became trapped in an endless loop of repeated exchanges. What seemed to be needed was the ability to look ahead so as to evaluate and compare the results of prospective word changes.
I was pondering this problem when Kousbroek's challenge presented itself and sent me off on a different tack. Its sheer hopelessness caught my imagination. But was it actually impossible? What a comeback if it could be brought off! The task was to complete a sentence beginning "This pangram contains...". A solution, were it discoverable, must in a sense already exist out there in the abstract realm of logological space. It was like seeking a number that satisfies certain mathematical conditions. And nobody, Kousbroek least of all, knew whether it existed or not. The thought of finding it was tantalizing. Reckless of long odds, I put aside programs and launched into a resolute attempt to find it by hand trial. It was a foolhardy quest, a search for a needle in a haystack without even the comfort of knowing that a needle was concealed there. In the end, I had only the consolation prize of a near solution: all totals correct save one, 21 t's instead of the twenty-nine claimed.
To the purist in me, that single imperfection was a hideous fracture in an otherwise-flawless crystal. However, a promising new idea now presented itself. The totals in this near solution must be pretty close to those in the real solution, assuming it existed. Why not use it as the basis of a systematic computer search through neighbouring combinations of number-words? Each total above could be seen as centered in a range of, say, 10 consecutive possibilities within which the perfect total was likely to fall. With these ranges defined, a program could generate and test every possible combination. The test would consist in comparing sets of potential totals with the computed letter frequencies they gave rise to, until an exact match was found. Or until all cases had been examined. Blind searching might succeed where cunning was defeated. Work on the program began at once.
It isn't necessary to test all 26 totals, since in English there are just 10 letters which never occur in low-valued cardinals: A,B,C,D,J,K,M,P,Q,Z. Totals for these letters can thus be determined from the initial text and filled in directly:
This pangram contains five a's, one b, two c's, two d's, ? e's, ? f's, ? g's, ? h's, ? i's, one j, one k, ? l's, two m's, ? n's, ? o's, two p's, one q, ? r's, ? s's, ? t's, ? u's, ? v's, ? w's, ? x's, ? y's, and one z.
This leaves 16 critical totals. Counting up shows there are already 7 e's, 2 f's, ... , 1 y: 16 constants which must be added to those letters occurring in the trial list of totals. Similarly, in the program, number words are replaced by profiles or 16-element lists representing their letter content. The profile for twenty-seven would then be:
E F G H I L N O R S T U V W X Y
3 0 0 0 0 0 2 0 0 1 2 0 1 1 0 1
Profiles for all numbers up to fifty were stored in memory and a label associated with each. These labels were simply the corresponding decimal numbers; the label for twenty-seven would be 27.
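As a rough illustration of the profile idea, here is a minimal Python sketch. It is not the original program (the article does not say what language that was written in), and the names `number_word`, `profile` and `PROFILES` are hypothetical. It derives the 16 critical letters from the number-words themselves and then builds the profile table, labelled by number.

```python
from collections import Counter

# Number-words for 1..19 and the tens needed for totals up to fifty.
UNITS = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
         "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty"}


def number_word(n):
    """Spell out n (1..59) as an English cardinal, e.g. 27 -> 'twenty-seven'."""
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    return TENS[tens] + ("-" + UNITS[unit] if unit else "")


# Letters that occur in at least one number-word from one to fifty.
CRITICAL = "".join(sorted({ch for n in range(1, 51)
                           for ch in number_word(n) if ch.isalpha()}))


def profile(n):
    """16-element list: occurrences of each critical letter in number_word(n)."""
    counts = Counter(number_word(n))
    return [counts[ch] for ch in CRITICAL]


# Profiles for all numbers up to fifty, labelled by the numbers themselves.
PROFILES = {n: profile(n) for n in range(1, 51)}

print(CRITICAL.upper())   # the 16 critical letters
print(PROFILES[27])       # the profile of twenty-seven
```

The second line printed should agree with the table given above for twenty-seven, and the ten letters missing from `CRITICAL` are those whose totals can be filled in directly.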
Starting with the lowest, a simple algorithm could now generate successive combinations of labels (i.e., numbers) drawn from the 16 predefined ranges. Each set would be used to call up their associated set of profiles. The 16 profiles would then be added together element for element, and the resulting sums in turn added to the above set of constants so as to form the complete sentence profile. All that remained was to check whether the numbers in the sentence profile coincided with the present set of labels. If so, the self-descriptor had been found. If not, generate the next combination and try again.
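The generate-and-test loop just described can be sketched as follows. This is an illustrative Python reconstruction under stated assumptions, not the program that ran at Nijmegen: all function names are hypothetical; the fixed text is the "&" variant whose solution is quoted later in this article (chosen because a solution is known to exist); every critical total is assumed to be at least two, so each letter takes "'s"; and the ranges are shrunk to a handful of values so that the search finishes instantly.

```python
from collections import Counter
from itertools import product

CRITICAL = "efghilnorstuvwxy"   # the 16 letters that occur in number-words
UNITS = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
         "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty"}


def number_word(n):
    """Spell out n (1..59), e.g. 27 -> 'twenty-seven'."""
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    return TENS[tens] + ("-" + UNITS[unit] if unit else "")


def profile(n):
    """Letter content of number_word(n) over the 16 critical letters."""
    counts = Counter(number_word(n))
    return [counts[ch] for ch in CRITICAL]


# Fixed text with the 16 critical number-words left out ("{}" marks each gap).
TEMPLATE = ("This pangram contains four a's, one b, two c's, one d, {} e's, "
            "{} f's, {} g's, {} h's, {} i's, one j, one k, {} l's, two m's, "
            "{} n's, {} o's, two p's, one q, {} r's, {} s's, {} t's, {} u's, "
            "{} v's, {} w's, {} x's, {} y's, & one z.")

# The 16 constants: critical letters already present in the fixed text.
fixed = Counter(TEMPLATE.lower())
CONSTANTS = [fixed[ch] for ch in CRITICAL]


def sentence_profile(labels):
    """Add the 16 profiles element for element, then add the constants."""
    totals = list(CONSTANTS)
    for n in labels:
        for i, count in enumerate(profile(n)):
            totals[i] += count
    return totals


def search(ranges):
    """Try every combination of labels; return the first exact match, if any."""
    for labels in product(*ranges):
        if sentence_profile(labels) == list(labels):
            return labels
    return None


# Toy ranges: three candidate totals for e and s, a single one for the rest.
centre = (30, 6, 5, 7, 11, 2, 18, 15, 5, 27, 18, 2, 7, 8, 2, 3)
ranges = [range(c - 1, c + 2) if ch in "es" else [c]
          for ch, c in zip(CRITICAL, centre)]

found = search(ranges)
if found is not None:
    print("EUREKA!", TEMPLATE.format(*map(number_word, found)))
```

With full 10-deep ranges the same loop would have to visit 10^16 combinations, which is why the toy ranges here cover only nine.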
At length, the program was finished and set running as a batch job on a mainframe computer at the University of Nijmegen. Even so, the 16x16 additions needed to calculate each sentence profile made for slow processing; average speed was only about 10 candidates per second. Every morning I would hasten to call up the job file, running my eye down the screen in search of EUREKA! But as day succeeded day without result, the question of how long it would take to exhaust all possibilities gradually loomed in importance. It was a matter I had never seriously considered. Alas, the calculation is an absurdly simple one and even now I blush to recall first seeing what it implied. With 16 ranges of 10 values each there are 10^16 combinations, so at 10 candidates per second the time needed is 10^15 seconds. A pocket calculator soon converts this to intelligible units. There seemed to be something wrong with the one I was using. Every time I worked it out the answer was ridiculous: 31.7 million years!
I was so unprepared for the blow contained in this revelation that initially I could hardly take it in. The whole object of turning to a computer had been to win speed. Now that the truth had dawned, I began cursing my naivete in ever embarking on such a fool's errand. Doubtless a faster program could be written, but even checking one million candidates a second, it would still take 317 years to run through the 10-deep ranges!
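The arithmetic behind these two figures is easily reproduced. The following Python lines are a sketch only; the choice of a 365.25-day year is an assumption, and the name `years_needed` is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: a year of 365.25 days

# 16 critical totals, each ranging over 10 consecutive values.
COMBINATIONS = 10 ** 16


def years_needed(candidates_per_second):
    """Years required to test every combination at the given speed."""
    return COMBINATIONS / candidates_per_second / SECONDS_PER_YEAR


print(f"{years_needed(10):,.0f} years at 10 candidates per second")
print(f"{years_needed(1_000_000):,.0f} years at one million per second")
```

The first figure comes to roughly 31.7 million years and the second to about 317 years, in line with the numbers quoted above.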
Yet "millions" put me in mind of megahertz, which in turn brought my thoughts to electronics (my profession). This in turn prompted an idea, a fanciful notion, for the first few days no more than an idle phrase repeated in the head, a good title perhaps for a science fiction story: the Pangram Machine. Initially whimsical, suddenly the vague intuition began to crystallize. In a flash, I saw how a central process in the program could be simulated electronically. Taking this as a starting point, I tried translating other aspects of the algorithm into hardware. It worked; it was easy. A few hours later I was thrilled to find the broad outlines of an actual design already clear in my mind.
The phoenix now emerging from the ashes of the Pangram Quest soared serenely to the sky, smoothly circled, swiftly swooped, and soon bore me off, a helpless prisoner in its relentless talons. For the next three months I would be pouring all my energy into the construction of a high speed electronic Pangram Machine.
The Pangram Machine
How seriously should a word puzzle be taken? Though only the size of a smallish suitcase, the apparatus to emerge from three months' concentrated activity packed more than two thousand components onto thirteen specially-designed printed circuit cards. Foresight of this complexity may have dissuaded me from starting. In the event, the completed machine turned out to involve a good deal more circuitry than originally intended.
As indicated, machine operation mimics that of the program. Readers interested in technical details should consult my Spring 1985 Abacus article. For now, suffice it to say that electronic switching speed plus shortened range lengths tailored to individual letters brought running time down to about a month. Technical problems beset the design, but at last came a day when the rocket was ready for launching into logological space. Lift-off came in October 1983, some eight months following Kousbroek's audacious challenge. The ensuing period found me hovering nervously over the machine. There was the nagging worry of reliability. What guarantee was there of faultless operation over so long a period? None, of course. After a while the suspense became nerve-racking. Mornings were worst. On waking, the first thought in consciousness would be: has it halted? It took nerves of iron to go through the morning's ablutions before going down to the living room where the machine was installed on my writing bureau. Opening the door, there would be the flickering gleam of light-emitting diodes signaling the state of the counters as they switched their way through 2.71 x 10^12 trial sentences. One million a second for 32 days. It was a torturing experience. The novelty of watching the machine soon wore off, but a single second's distracted attention was accompanied by the thought that another million chances had already elapsed, so perhaps NOW??? ...and my glance would be wrenched back to the twinkling array.
At length, the grains of sand ran out. It looked like defeat. There never had been a needle in the haystack! But much had happened during the long weeks of waiting. Already, detailed plans were in hand for a Mark II machine in which automatic number-word selectors were to slice running time to a mere two hours! And alternative translations remained to be explored, top of the list being "This pangram comprises...". So it was, with this reconstruction work complete, that some weeks later found me sitting before the machine during its eighth run, following seven previous trials with different verbs, every one of which had ended in failure! How long could this go on? However, suddenly the EUREKA! lamp came on and my stomach turned a somersault. Decoding the diode display into the set of number-words represented, a careful check then verified the following perfect pangram:
This pangram lists four a's, one b, one c, two d's, twenty-nine e's, eight f's, three g's, five h's, eleven i's, one j, one k, three l's, two m's, twenty-two n's, fifteen o's, two p's, one q, seven r's, twenty-six s's, nineteen t's, four u's, five v's, nine w's, two x's, four y's, and one z.
Despite a hangover from the celebration, next morning saw me back at the machine. And the zenith of glory was still to come. Changing and to & in the rendering of Kousbroek's pangram, a last desperate bid for a perfect magic translation, had finally met with success. The Quest for the pangram had ended in triumph.
This pangram contains four a's, one b, two c's, one d, thirty e's, six f's, five g's, seven h's, eleven i's, one j, one k, two l's, two m's, eighteen n's, fifteen o's, two p's, one q, five r's, twenty-seven s's, eighteen t's, two u's, seven v's, eight w's, two x's, three y's, & one z.
I had a lot of fun with the Pangram Machine in the ensuing months. For a while I reconnoitered without any clear plan. Pangrams incorporating names of friends provided entertainment. Many new specimens were thus unearthed and the relation between initial text constants and the ranges in which solutions could be expected became clearer. After a time, the facility gained in prospecting prompted an ambitious new research program. The idea of a logological counterpart to a number series suggested itself: This first pangram..., This second pangram ..., This third pangram ..., in each case accompanied by a different verb, led to a series with one hundred terms in it.
The apparent elegance of self-descriptive sentences can be deceptive. Closer scrutiny may reveal imperfections. For instance, oughtn't "one z" to be seen as a redundant curlicue? Its inclusion is clearly a gratuitous addition whose only apparent function is to contribute an extra o, n and e, merely in order to make the sentence work. Appending number-words is just a cunning form of disguised text-doctoring. Purists will rightly insist on the crisp parsimony of:
This sentence employs two a's, two c's, two d's, twenty-six e's, four f's, two g's, seven h's, nine i's, three l's, two m's, thirteen n's, ten o's, two p's, six r's, twenty-eight s's, twenty-three t's, two u's, five v's, eleven w's, three x's, and five y's.
Bimagic Pairs and Bananagrams
A bimagic pair of self-descriptive sentences consists of two sentences in which the initial words are identical but the pangrams list different numbers of letters. Certain minds seem to balk at this confrontation with a single text comprised of nine a's this time and eight a's the next. I have even known the delight of hearing someone patiently explain to me that such a thing can only be a logical impossibility. Though at first sight twisty, the cunning interlock between bimagic pairs is neatly brought out through considering the following two lists:
ten i's         eleven i's
one l           two l's
eighteen t's    nineteen t's
seven w's       eight w's
The four numbers on the right are all one greater than those on the left, a difference on the descriptive level of one i, one l, one t and one w. But cancelling common letters in the two lists will leave precisely that: the text on the right contains an extra i, l, t and w. Denotational differences parallel those at the typographical level. A self-descriptor incorporating one of these lists remains a self-descriptor if that list is replaced by the other. A similar but more complicated pair of lists can be extracted from any bimagic solution.
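The cancellation of common letters can be checked mechanically. Here is a small Python sketch (the helper name `letters` is hypothetical) that counts the letters in each of the two lists and subtracts one count from the other.

```python
from collections import Counter


def letters(text):
    """Count the letters of text, ignoring spaces and punctuation."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


left = "ten i's, one l, eighteen t's, seven w's"
right = "eleven i's, two l's, nineteen t's, eight w's"

# Counter subtraction keeps only the letters left over after cancelling.
extra_on_right = letters(right) - letters(left)
extra_on_left = letters(left) - letters(right)

print(sorted(extra_on_right.elements()))   # letters found only on the right
print(sorted(extra_on_left.elements()))    # letters found only on the left
```

The surplus on the right is one each of i, l, t and w, and nothing at all is left over on the left, which is exactly the difference the four totals announce.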
Notice that despite suggestive associations, a pair of sublists so derived can never comprise true anagrams of each other. For if their letter content agreed then the numbers named would have to be the same, which would imply identity. Taking into account their slippery character, and the ban on anagrams, I propose a special name for these curiosities: bananagrams.
How rare are bimagic cases? Of the roughly one in eight initial texts to yield a simple self-descriptor, again something like one in eight of these turn out to have dual solutions. Trimagic cases are even rarer; several hundred runs with the machine located only one.
One sting in the tail deserves another. Kousbroek's challenge was to produce a magic translation of his pangram. Having offered one above, I should like to present another. It is the second half of what he had never imagined, a bimagic translation:
This pangram contains four a's, one b, two c's, one d, twenty-six e's, six f's, three g's, six h's, eleven i's, one j, one k, two l's, two m's, seventeen n's, fifteen o's, two p's, one q, eight r's, thirty s's, seventeen t's, four u's, four v's, six w's, six x's, three y's, & one z.
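Readers who doubt that one opening can support two different sets of totals may care to check by machine rather than by hand. The Python sketch below is an illustration only, with hypothetical names throughout: it reads the claimed totals out of a sentence, counts the letters actually present, and reports every letter for which the two disagree.

```python
import re
from collections import Counter

UNITS = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
         "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty", 6: "sixty",
        7: "seventy", 8: "eighty", 9: "ninety"}


def number_word(n):
    """Spell out n (1..99), e.g. 27 -> 'twenty-seven'."""
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    return TENS[tens] + ("-" + UNITS[unit] if unit else "")


WORD_TO_NUMBER = {number_word(n): n for n in range(1, 100)}

# A claim is a number-word followed by a single letter, e.g. "six f's".
CLAIM = re.compile(r"\b([a-z]+(?:-[a-z]+)?) ([a-z])(?:'s)?(?![a-z])")


def discrepancies(sentence):
    """Map each wrongly described letter to (claimed total, actual total)."""
    text = sentence.lower()
    actual = Counter(ch for ch in text if ch.isalpha())
    claimed = {letter: WORD_TO_NUMBER[word]
               for word, letter in CLAIM.findall(text)
               if word in WORD_TO_NUMBER}
    return {ch: (claimed.get(ch, 0), actual[ch])
            for ch in sorted(set(actual) | set(claimed))
            if claimed.get(ch, 0) != actual[ch]}


FIRST = ("This pangram contains four a's, one b, two c's, one d, thirty e's, "
         "six f's, five g's, seven h's, eleven i's, one j, one k, two l's, "
         "two m's, eighteen n's, fifteen o's, two p's, one q, five r's, "
         "twenty-seven s's, eighteen t's, two u's, seven v's, eight w's, "
         "two x's, three y's, & one z.")
SECOND = ("This pangram contains four a's, one b, two c's, one d, "
          "twenty-six e's, six f's, three g's, six h's, eleven i's, one j, "
          "one k, two l's, two m's, seventeen n's, fifteen o's, two p's, "
          "one q, eight r's, thirty s's, seventeen t's, four u's, four v's, "
          "six w's, six x's, three y's, & one z.")

for sentence in (FIRST, SECOND):
    print(discrepancies(sentence) or "every total is correct")
```

An empty result for both sentences is what makes the pair bimagic. Changing any single number-word by hand and running the check again gives a quick feel for how far the damage spreads.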
An act of magic consists in doing what others believe impossible. The magic of self-descriptive sentences lies in the unbelievable coincidence they effect between a message and its medium. This is a good example of what Freud would have called an over-determined structure, over-determined because it simultaneously satisfies independent sets of demands. Of course, a discipline devoted to over-determined texts already exists, a technical field in which distillation of meaning and coalescence of form with content have ever been focal concepts. Its name is poetry. Let none suppose that anything but poetry has been our purpose here.