ALPHABETIZING THE INTEGERS
by Ross Eckler
Word Ways, 1981
Recently, Howard Bergerson proposed the following research problem in logology:
I have had one research problem in mind for a long time--I once did some preliminary work on it--which could turn out to be anything from easy to formidable to practically impossible. It is this: imagine the one thousand vigintillion minus one (if you don't include zero) consecutively named numbers arranged in alphabetical order and also in numerical order. How many (if any) numbers have their positions the same in both lists? What is the least such number?
When I first read this, my reaction was that of a mathematician. Isn't this merely a verbal version of the well-known matching problem in probability--the game in which two well-shuffled decks of cards are matched, one card at a time from one deck against one card from the other, until identical cards appear? (William Feller, in his classic 1950 work An Introduction to Probability Theory, amusingly casts the problem in terms of a devil-may-care hat-check girl who returns patrons their hats at random.) The probability of one or more matches rapidly approaches the limiting value of 1 - exp(-1) = 0.63 as the number of cards in the deck becomes large. So, there would appear to be a fair chance of a match between the list in numerical order and the list in alphabetical order, but the mathematical theory holds out little hope for more than a few such matches, and searching for these among a thousand vigintillion integers is a task I would not care to contemplate.
But integers arranged alphabetically look far from random when compared with integers arranged numerically. This may not be evident in the first 20 integers, but if the integers from 1 through 99 are set out alphabetically the twenties are in one group, the thirties in another, and so on. I illustrate this with a general plan of the integers from 1 through 999:
1 eight 2-101 eight hundred, eight hundred eight, ... eight hundred two 102 eighteen 103-112 eighty, eighty eight, ..., eighty two 113 eleven 114 fifteen 115-124 fifty, fifty eight, ..., fifty two 125 five 126-225 five hundred, five hundred eight, ..., five hundred two 226-235 forty, forty eight, ..., forty two 236 four 237-336 four hundred, four hundred eight, ..., four hundred two 337 fourteen 338 nine 339-438 nine hundred, nine hundred eight, ..., nine hundred two 439 nineteen 440-449 ninety, ninety eight, ..., ninety two 450 one 451-550 one hundred, one hundred eight, ..., one hundred two 551 seven 552-651 seven hundred, seven hundred eight, ..., seven hundred two 652 seventeen 653-662 seventy, seventy eight, ..., seventy two 663 six 664-763 six hundred, six hundred eight, ..., six hundred two 764 sixteen 765-774 sixty, sixty eight, ..., sixty two 775 ten 776 thirteen 777-786 thirty, thirty eight, ..., thirty two 787 three 788-887 three hundred, three hundred eight, ..., three hundred two 888 twelve 889-898 twenty, twenty eight, ..., twenty two 899 two 900-999 two hundred, two hundred eight, ..., two hundred twoIt is not difficult to verify that there are no integers in the same position in both the alphabetical and numerical lists. It sufficies to consider the nine blocks of 100 integers apiece beginning "x hundred," for the integers between 0 and 99 all (except for "eight") appear alphabetically too late. Among the blocks of 100, the only one that holds out any hope is the six hundred block, running from 664 through 763. This is quickly assessed by breaking it down further, by tens blocks:
664 six hundred 665 six hundred eight 666 six hundred eighteen 667-676 six hundred eighty, six hundred eighty eight, ... 677 six hundred eleven 678 six hundred fifteen 679-688 six hundred fifty, six hundred fifty eight, ... 689 six hundred five 690-699 six hundred forty, six hundred forty eight, ...There are no matches--in fact not even any near-misses.
Turning to the Bergerson problem, the basic structure of the alphabetical list can be visualized by placing "vigintillion" after each of the 999 written-out integers in the first list. This accounts for 99.9 per cent of all integers, leaving only the integers between one and one vigintillion to insert among the 999 one-vigintillion-sized blocks. This will affect the numbering at the left of the table, but not very much; the numbers (all now in vigintillions) will start in the same way and change at most by one unit in the third place, since the intercalated integers are all of fractional vigintillion size. But, such a small change is not enough to create a match, because there were no near-matches in the thousand table.
Philip Cohen has pointed out that Bergerson's problem can be regarded as one member of a class of matching problems: the integers 1 through n can be alphabetized, where n takes on any integral value. To illustrate, I give the first nine lists and capitalize all matches:
ONE ONE ONE four five five five eight eight TWO three one four four four five five two THREE one one one four four two three six seven one nine two three six seven one two three SIX seven two three six two three twoThe insertion of blocks of number names earlier in the list ensures that every integer (except for eight, and later on a few beginning with eight billion) will have a match for some value(s) of n. The list sizes that produce matches for the first twenty integers are given:
1 1-3 6 8 11 87 16 41 2 2 7 11-13 12 13 17 44 3 4 8 never 13 16 18 815 4 11-14 9 40 14 46 19 49 5 18-49 10 14 15 800 20 39 Note that the list of thirteen integers has three matches: four, seven, twelve. Similarly, the list of thirty-one integers has three matches: five, twenty four, twenty seven. Do other triple-matches exist for larger n? [In a letter to the editor, Jeremy Morse showed four matches for lists of size 202, 211, 212, 214, 463, and 465, and five matches for lists of size 213 (101, 200, 204, 207. 212) and 231 (101,200,205,224,227).]
Certain values of n have no matches--for 1 through 99, the no-match cases are 5, 6, 7, 9, 10, 15, 17, 50, 54, 56, 58, 60-86, 88, 90, 92, 96 and 98. This is a total of 43/99, not far from the theoretical (random) value of 1/e = 0.37.
Throughout this article, the space is assumed to precede the letter A in alphabetization (e.g., eight hundred before eighteen). If spaces are ignored in alphabetization (e.g., eighteen before eight hundred), minimal changes occur in the order; however, no matches are created in either the thousand list or the thousand vigintillion one.