Representing Sets of Pure Order

If a man can expect to meet exactly N eligible women in his life, what strategy should he use to maximize his chances of choosing the best one? This is sometimes called the "Bachelor's Problem". The usual answer is that for sufficiently large N he should wait until he has seen N/e candidates and then select the next candidate that is better than any he's seen previously. This gives a probability of about 1/e for selecting the best candidate. (However, see the note “Optimizing Your Wife” for a critique of this strategy).

More interesting than the problem itself are the assumptions on which it's based. Obviously it's assumed the man can never go back and rekindle a romance once it's been terminated, and he can't have more than one relationship at a time. It's also assumed that he meets the prospective brides in random order of suitability. These assumptions are certainly debatable in real life, but they're at least conceivable.

This brings us to the most interesting assumption underlying the Bachelor's Problem: he can only discern relative suitability between two "known quantities"; he has no means whatsoever of assessing absolute suitability. For example, even if he's deliriously happy with the first woman, he still has to assume she may be the least suitable, because he hasn't met any of the others, and for all he knows, every woman may be as nice, or nicer, than this first one.

Even after he breaks up with the first woman and has a chance to evaluate the second woman, we're asked to assume that he still can't deduce anything about the underlying population (so to speak). He can determine which of those two was better, but he has no idea how these two women stand relative to the women he has yet to meet. The same is true with the third and fourth and so on. At each stage he's only able to determine the relative order of the women he has evaluated so far.

This is a very interesting premise, and it has clearly taxed the ingenuity of problem-posers as they try to present a situation in which this assumption would be strictly valid. For a more abstract example, consider an urn containing N marbles, on each of which is written a real number. The numbers are all different and there is no "a priori" upper or lower bound on their possible values. We draw out numbers one at a time until we decide to stop or we've drawn all the numbers. The last number we draw is our selection. What strategy will maximize our chances of selecting the largest number in the urn?

It's often been observed that this type of formulation is somewhat ill-posed, because there's no uniform probability distribution over the set of all numbers, so the very fact that the N numbers have somehow been chosen implies that they were drawn from some non-uniform distribution, which in turn suggests that perhaps some information about the absolute ordering of the numbers can be inferred from the early samples. However, since we're not told what that distribution is, we can't quantify any such inference. In its simplest form this conundrum is exposed in the notorious "High-Low Number Problem" (see “Choice Without Context?”).

People who worry about things like this usually eliminate the problematical aspect of the question by stating that when a number is drawn the man is not allowed to actually see it. Instead, a disinterested third party looks at the number and tells the man where it falls, in order, relative to the numbers already drawn. Thus the man really has access only to the relative ordering, and knows nothing about the absolute ordering.

This is okay, but it makes me wonder if it's the only way in which purely relative ordering can be represented. Why do we need to introduce a third party? Surely it must be possible to create N packets of information such that we can unambiguously establish the relative order of any packets we've examined, but we have zero information about their positions relative to the packets we have not examined.

The simplest way I can see to do this is not very simple. For concreteness, suppose we have 5 cards on which we want to place marks that will allow relative ordering but that will convey zero information about absolute order within the set of 5. Define ten symbols, let's say the letters A through J, and put four symbols on each card as follows

Card 1 Card 2 Card 3 Card 4 Card 5

------ ------ ------ ------ ------

A A B C D

B E F E G

C F H H I

D G I J J

The four letters on each card should be randomly arranged, and we should apply a randomly selected permutation to the 10 characters. Note that each card has exactly one character in common with each other card. To each of the 10 characters we assign a number randomly selected from the set {0,1,2}. This number is placed next to the corresponding character on the first card where it appears, and the number j+1 (mod 3) is placed next to the character on the second card where it appears. The result is a set of 5 cards that look like this

Card 1 Card 2 Card 3 Card 4 Card 5

------ ------ ------ ------ ------

E 0 G 1 D 0 G 2 J 1

B 0 H 0 C 1 C 2 F 1

H 2 D 2 E 1 B 1 A 2

A 1 J 0 F 0 I 2 I 0

Given any two cards, the relative order is determined by comparing the numbers n,m written next to the common character. The number n comes before m if and only if m-n = 1 (mod 3). For example, given Cards 2 and 5 we would note that the common character is J, for which the numbers are 0 and 1 respectively. This shows that Card 2 comes before Card 5. But nothing on these two cards gives us any clue as to their order relative to the remaining three cards.

This seems overly complicated, and there's probably a more economical way of doing it, but the fact that it can be done is interesting. In particular, this can be used to define a set of n purely ordered elements, for any natural number n. All that's required is that each element is characterized by n-1 characters (and associated digits) from an alphabet of n(n-1)/2 characters. In a sense, we could define these "cards" as "numbers", and they closely correspond to the set of numbers [1,2,..,n] (i.e., each element except the first has a unique predecessor, and so on).

What happens if we try to let n go to infinity? This seems fairly straightforward for the set [1,..,n]. We just imagine it extending to infinitely large values of n, and we arrive at our notion of the complete infinite set "N" of all natural numbers. However, for our purely ordered set of "cards" it seems a bit trickier. Presumably we cannot extrapolate this to a complete infinite set of purely ordered elements, because that would contradict the fact that there is no uniform distribution on the natural numbers.

What prevents us from conceiving an infinite set of purely ordered elements? We might think the problem is that we don't have a large enough alphabet to cover infinitely many cards, but why not just use the natural numbers themselves? In other words, A=1, B=2, and so on. We might begin to construct our infinite set of purely ordered numbers like this

Card1 Card2 Card3 Card4 Card5 Card6 Card7 Card8 Card9

------ ----- ----- ----- ----- ----- ----- ----- -----

2 2 4 8 16 32 64 128 256

4 3 9 27 81 243 729 2187

8 9 5 25 125 625 3125

16 27 25 7 49 343

32 81 625 49 11

64 243 3125 343 etc.

128 729 15625

256 2187

512

Clearly each card shares exactly one "character" with each other card, and we can uniformly select an "order code" from the set {0,1,2} for each of these characters. So now all that remains is to apply an arbitrary permutation to the labels of these characters. But this is where we run into a problem, because to perform a truly arbitrary permutation of our infinite set of characters we would need a uniform distribution on all these characters, which we don't have.

Nevertheless, this illustrates an interesting point. We can imagine infinitely many orderings of the elements of an infinite set, but evidently we can't conceive of all possible orderings of an infinite set, at least not in the symmetrical way that the symmetry of the definitions would lead us to expect. In other words, even though no individual element and no particular ordering of the elements of the set N (for example) is a priori preferred or distinguished over any other element or ordering, it appears that some orderings of the natural numbers are more natural than others.

Return to MathPages Main Menu