Optimizing Your Wife 

If a man can expect to meet exactly N eligible women in his life, what strategy should he use to maximize his chances of choosing the very best one? We typically make some idealized assumptions when solving this type of problem, such as assuming that he meets the women in random order of "goodness", that there is an unambiguous ordering of the women, and that the man is concerned only to maximize his chances of choosing the very best of the N candidates, placing no value at all on choosing the secondbest versus the worst. We also stipulate that the man is not able to infer anything about the absolute values of the candidates; he can determine only the relative orderings of their values. (See the note on Representing Sets of Pure Order for more on this interesting assumption.) 

Under these conditions the man's strategy can obviously never be to select a candidate that isn't the best he has met so far. Also, at each stage his only information is (1) how many women he has evaluated so far, and (2) whether the current woman is the best so far. At some stage, his strategy must be to select the current woman if she is the best so far, and clearly if he has reached this stage the optimum strategy can't subsequently be to pass on a "best so far" candidate. Thus, his optimum strategy for selecting the very best of a sequence of N candidates is to pass on the first k candidates (where k is a number to be determined), and then select the next "best so far" that he encounters. This strategy will result in choosing the very best woman if and only if there is one and only one "best so far" in the sequence from k+1 to N. 

The total number of possible orderings for the N women is N!, and, of these, the number of orderings with the very best candidate in the jth position is (N−1)! for any given j, to account for all possible permutations of the other N−1 candidates . Of these orderings, only a fraction would result in the best candidate actually being selected (using the strategy of passing on the first k candidates and then selecting the next “best so far”). Given that the best candidate is in position j, the necessary and sufficient condition for her to be selected is for the best candidate in the first j−1 positions to be in the first k positions. Hence the fraction of the (N−1)! orderings with the best candidate in the jth position that would result in the best candidate being selected is k/(j−1). Dividing the result by N!, we see that the probability of selecting the optimum woman at the jth round, for j in the range k+1 to N, based on some particular value of k, equals the product 



It follows that, for any given value of k, the probability of selecting the optimum woman out of all N candidates based on this strategy (i.e., pass on the first k, then choose the next "best so far") is the sum of the above expression for j ranging from k+1 to N. Thus we have 



If N is fairly large, the summation on the right is a large span of the simple harmonic series, i.e., the sum of the inverses of consecutive integers. Recalling that the sum of 1/m for m = 1 to s is asymptotically equal to ln(s)+g where g = 0.57 is Euler’s constant, it follows that the above probability approaches 



as N and k increase. To maximize this probability, we differentiate with respect to k and set the result to zero, which gives 



Thus, as k increases, the left side approaches 1, and we can take the exponential of both sides to give 



For a large number of candidates this approaches e = N/k, so the optimum strategy for maximizing the chances of selecting the very best woman from N sequential candidates is to pass on the first N/e women (approximately) and then select the next "best so far". With this strategy the probability of success approaches 1/e. 

The analysis described above gives a surprisingly good probability of finding the single best woman from N candidates even as N increases without limit. However, as noted previously, this solution essentially treats all but the very best woman as totally worthless, because the strategy is focused entirely on maximizing the probability of selecting the very best candidate, assigning no value at all to any of the other candidates. A more pragmatic criterion for choosing a strategy might be to maximize the expected "goodness" of the selection based on some weighting of the possible outcomes. 

To evaluate the expected goodness of our selection for any given strategy (i.e., any given value of k), we need to determine not just the probability of selecting the very best candidate, but the probabilities of selecting the second best, the third best, and so on. Then we can apply a suitable weight to each outcome, enabling us to compute the overall expectation. To do this, we need to augment our strategy by stipulating our course of action if we reach the Nth woman and she is not the best so far. Our stipulation will be that we select the Nth woman in this case, regardless of how she compares with the previous candidates. 

Consider the case N = 8 with the strategy k = 3, and let “1” denote the worst candidate and “8” the best. Of all the 8! possible orderings, how many result in the candidate “5” (for example) being selected? The answer is the sum of the orderings in which “5” is selected in positions 4 through 8. If “5” is in position 4, it will be selected if and only if all the candidates greater than “5” are located in the four positions 5 through 8. There are three such candidates (namely, “6”, “7”, and “8”) and there are C(4,3)(3!) ways for these candidates to be placed in this range. The remaining four candidates can be positioned anywhere, so there are 4! possible orderings of the lower candidates for each of the suitable placements of candidates “5” through “8”. It is also necessary for the best of the candidates positioned ahead of “5” to be in the range 1 to k, but this just gives a factor of 3/3, which doesn’t impose any additional restriction in this case. So the number of suitable arrangements is 



If “5” is in position 5, the three better candidates must be in the three higher positions in some arbitrary order, so this gives a factor of C(3,3)3!, and again there are 4! orderings of the lower candidates for each of these. However, in this case the requirement for the best of the candidates to ahead of “5” to be in the range 1 to k results in a factor of 3/4, so we the contribution 



Now, it’s clear that “5” cannot be selected if it is in the 6th or 7th position, because this implies at least one of the three better candidates is ahead of “5”, so it can’t be the “best so far”. This is also true for the 8th position, but in that case we have stipulated that we will select the woman in the last position if we haven’t selected an earlier one. This will be the case if and only if the very best candidate is in the range 1 to k. Therefore, with k choices for the best candidate, and assuming “5” is in the last position, there are (N2)! orderings of the remaining women, so this outcome contributes (k)(N2)! The overall probability of selecting “5” is given by summing these contributions and dividing by N!, so we have 




In general, let P(N,k,v) denote the probability if selecting candidate “v” from a total of N candidates using the strategy of passing on the first k candidates and then choosing the next “best so far” or the Nth candidate, whichever comes first. Obviously for the degenerate cases k = 0 and k = N1 we have 



Also, by the same reasoning described above, for any k from 1 to N2 we have 



with the understanding that the binomial coefficient is zero if the upper index is less than the lower, which is to say, if v is less than j. This can be simplified slightly and written in the form 



again with the understanding that the summand in the second term in square brackets is zero if v is less than j. To illustrate, suppose a man can expect to meet up to N = 8 women, and each of the 8! = 40320 different possible orderings of encountering them is equally likely. The table below lists the probabilities of selecting woman “v” (where v ranges from 1 to 8, with 1 being the worst and 8 being the best) given that he passes on the first k women and then selects the next “best so far” or the last, whichever comes first. 



This confirms that the best strategy for optimizing the probability of selecting the very best candidate (“8”) is to pass on the first k=3 and then select the next “best so far”. With that strategy, 16524 of the 40320 orderings result in selecting the very best candidate. However, that strategy also results in the selection of the very worst candidate in 2160 of the orderings. The strategy that minimizes the probability of selecting the very worst candidate is always to pass on just the first candidate and then select the next “best so far”, which in this case leads to selection of the worst candidate in just 720 of the orderings. 

To optimize our selection strategy, taking into account the different utilities of all the possible candidates, let us assign a weighting factor g(v) to each candidate. One simple weighting factor is to simply set g(v) = v, which is to say, we assign a value of 1 to candidate “1”, and we assign a value of 2 to candidate “2”, and so on. What is the expected value of goodness for any given strategy? Since we’ve set g(v) = v, this will also give the expected value of v. Obviously with k = 0 or k = 7 (corresponding to the strategies of always choosing the first candidate or always choosing the last candidate, respectively), we have an equal probability, 1/N, of selecting each of the N candidates, so the expected value is 



For any other value k (i.e., for k in the range from 1 to N2), the expected value is 



As an example, with N = 100 the expected value of the selected v for a given strategy parameter k is shown in the left hand figure below. The right hand figure shows the same function, but plotted against ln(k+1) instead of k. 

Evidently the optimum value of k is such that ln(k+1) is half of ln(N), which is to say 



Solving for k_{opt}, we get 



Thus the optimum strategy for N = 100 is to pass on the first 9 candidates and then select the next “best so far”. This gives an expected value of about 91.4, and the probability of selecting the very best candidate (“100”) with this strategy is 0.221. In contrast, if we used the strategy designed to optimize the probability of selecting the very best candidate, without regard to the value of any of the other candidates, we would pass on the first 100/e = 33 candidates. This gives a probability of 0.371 of selecting the very best candidate, but the expected value with this strategy is just 81.4. 

Incidentally, it's been called to my attention that the above formulas could, in certain circumstances, be used in a Bayesian way to estimate the number N of candidates that a man could have expected to encounter over his entire life, based on knowledge of having already met the very best woman. The "Amanda Rule" states that if a man knows, by some means, that the jth woman he has encountered is actually the very best, then a Bayesian estimate for the number N of women he would have expected to meet overall is roughly e times j. Of course, if he has already determined that the jth woman is the very best, he presumably has no interest in the remaining (e−1)j candidates, so the formula is of only academic interest. 
