ALPHABETIZING THE INTEGERS

by Ross Eckler
Word Ways, 1981

Recently, Howard Bergerson proposed the following research problem in logology:

I have had one research problem in mind for a long time--I once did some preliminary work on it--which could turn out to be anything from easy to formidable to practically impossible. It is this: imagine the one thousand vigintillion minus one (if you don't include zero) consecutively named numbers arranged in alphabetical order and also in numerical order. How many (if any) numbers have their positions the same in both lists? What is the least such number?

When I first read this, my reaction was that of a mathematician. Isn't this merely a verbal version of the well-known matching problem in probability--the game in which two well-shuffled decks of cards are matched, one card at a time from one deck against one card from the other, until identical cards appear? (William Feller, in his classic 1950 work An Introduction to Probability Theory, amusingly casts the problem in terms of a devil-may-care hat-check girl who returns patrons their hats at random.) The probability of one or more matches rapidly approaches the limiting value of 1 - exp(-1) = 0.63 as the number of cards in the deck becomes large. So, there would appear to be a fair chance of a match between the list in numerical order and the list in alphabetical order, but the mathematical theory holds out little hope for more than a few such matches, and searching for these among a thousand vigintillion integers is a task I would not care to contemplate.

But integers arranged alphabetically look far from random when compared with integers arranged numerically. This may not be evident in the first 20 integers, but if the integers from 1 through 99 are set out alphabetically the twenties are in one group, the thirties in another, and so on. I illustrate this with a general plan of the integers from 1 through 999:

```1       eight
2-101   eight hundred, eight hundred eight, ... eight hundred two
102     eighteen
103-112 eighty, eighty eight, ..., eighty two
113     eleven
114     fifteen
115-124 fifty, fifty eight, ..., fifty two
125     five
126-225 five hundred, five hundred eight, ..., five hundred two
226-235 forty, forty eight, ..., forty two
236     four
237-336 four hundred, four hundred eight, ..., four hundred two
337     fourteen
338     nine
339-438 nine hundred, nine hundred eight, ..., nine hundred two
439     nineteen
440-449 ninety, ninety eight, ..., ninety two
450     one
451-550 one hundred, one hundred eight, ..., one hundred two
551     seven
552-651 seven hundred, seven hundred eight, ..., seven hundred two
652     seventeen
653-662 seventy, seventy eight, ..., seventy two
663     six
664-763 six hundred, six hundred eight, ..., six hundred two
764     sixteen
765-774 sixty, sixty eight, ..., sixty two
775     ten
776     thirteen
777-786 thirty, thirty eight, ..., thirty two
787     three
788-887 three hundred, three hundred eight, ..., three hundred two
888     twelve
889-898 twenty, twenty eight, ..., twenty two
899     two
900-999 two hundred, two hundred eight, ..., two hundred two```

It is not difficult to verify that there are no integers in the same position in both the alphabetical and numerical lists. It sufficies to consider the nine blocks of 100 integers apiece beginning "x hundred," for the integers between 0 and 99 all (except for "eight") appear alphabetically too late. Among the blocks of 100, the only one that holds out any hope is the six hundred block, running from 664 through 763. This is quickly assessed by breaking it down further, by tens blocks:

```664     six hundred
665     six hundred eight
666     six hundred eighteen
667-676 six hundred eighty, six hundred eighty eight, ...
677     six hundred eleven
678     six hundred fifteen
679-688 six hundred fifty, six hundred fifty eight, ...
689     six hundred five
690-699 six hundred forty, six hundred forty eight, ...```

There are no matches--in fact not even any near-misses.

Turning to the Bergerson problem, the basic structure of the alphabetical list can be visualized by placing "vigintillion" after each of the 999 written-out integers in the first list. This accounts for 99.9 per cent of all integers, leaving only the integers between one and one vigintillion to insert among the 999 one-vigintillion-sized blocks. This will affect the numbering at the left of the table, but not very much; the numbers (all now in vigintillions) will start in the same way and change at most by one unit in the third place, since the intercalated integers are all of fractional vigintillion size. But, such a small change is not enough to create a match, because there were no near-matches in the thousand table.

Philip Cohen has pointed out that Bergerson's problem can be regarded as one member of a class of matching problems: the integers 1 through n can be alphabetized, where n takes on any integral value. To illustrate, I give the first nine lists and capitalize all matches:

```ONE   ONE  ONE   four   five  five   five   eight  eight
TWO  three one    four  four   four   five   five
two   THREE  one   one    one    four   four
two    three six    seven  one    nine
two   three  six    seven  one
two    three  SIX    seven
two    three  six
two    three
two```

The insertion of blocks of number names earlier in the list ensures that every integer (except for eight, and later on a few beginning with eight billion) will have a match for some value(s) of n. The list sizes that produce matches for the first twenty integers are given:

 1 1-3 6 8 11 87 16 41 2 2 7 11-13 12 13 17 44 3 4 8 never 13 16 18 815 4 11-14 9 40 14 46 19 49 5 18-49 10 14 15 800 20 39

Note that the list of thirteen integers has three matches: four, seven, twelve. Similarly, the list of thirty-one integers has three matches: five, twenty four, twenty seven. Do other triple-matches exist for larger n? [In a letter to the editor, Jeremy Morse showed four matches for lists of size 202, 211, 212, 214, 463, and 465, and five matches for lists of size 213 (101, 200, 204, 207. 212) and 231 (101,200,205,224,227).]

Certain values of n have no matches--for 1 through 99, the no-match cases are 5, 6, 7, 9, 10, 15, 17, 50, 54, 56, 58, 60-86, 88, 90, 92, 96 and 98. This is a total of 43/99, not far from the theoretical (random) value of 1/e = 0.37.

Throughout this article, the space is assumed to precede the letter A in alphabetization (e.g., eight hundred before eighteen). If spaces are ignored in alphabetization (e.g., eighteen before eight hundred), minimal changes occur in the order; however, no matches are created in either the thousand list or the thousand vigintillion one.