Bernoulli Trials

If the probability of an individual event occurring on a given trial
is P, then within a sequence of S trials what is the probability that 
AT LEAST Y of these events will occur IN A ROW?

These are called Bernoulli trials.  Let Pr{k,q,n} denote the 
probability of at least one "run" (i.e., string of consecutive 
successes) of length n in a sequence of k trials, given that 
the probability of success on each trial is q.  By applying the 
inclusion-exclusion principle it's easy to show that the values 
of Pr{k,q,n} satisfy the recurrence

  Pr{k,q,n} = Pr{k-1,q,n} + (q^n)(1-q) [1 - Pr{k-n-1,q,n}]       (1)

with the initial values 

                Pr{j,q,n} = 0       for j = 0,1,..,n-1

                Pr{j,q,n} = q^n     for j = n

Equation (1) is convenient for computing the values of Pr{k,q,n} 
for k=n,n+1,...  recursively, but it also allows us to give an 
explicit expression for Pr{k,q,n} in terms of k, q, and n.  For 
this purpose it's convenient to deal with the probabilities of the 
complementary event, i.e., the event of having NO run of length n.  
For any given values of q and n, let S[k] denote the probability 
1 - P{k,q,n}.  Substituting into equation (1) gives the simple 
homogeneous recurrence relation

             S[m]  =  S[m-1] - (q^n)(1-q) S[m-n-1]                 (2)

whose characteristic polynomial is

               x^(n+1)  -  x^n  +  (q^n)(1-q)  =  0                (3)

Obviously x=q is a root.  Letting r_1, r_2,..,r_n denote the remaining 
roots, the values of S[k] are given explicitly by

    S[k]  =  C_1 (r_1)^k  +  C_2 (r_2)^k  +  ...  C_3 (r_n)^k

where the C_i are constant coefficients determined by the initial
conditions  S[0]=S[1]=..S[n-1]=1  and  S[n] = 1 - q^n.

For example, if n=2 and q=1/2, the probability that a sequence of k 
trials will contain NO two consecutive successes is exactly
                    _           _                _           _
            / 5 + 3/5 \  / 1 + /5 \ k    / 5 - 3/5 \  / 1 - /5 \ k
  S[k]  =  ( --------- )( -------- )  + ( --------- )( -------- )
            \   10    /  \    4   /      \   10    /  \    4   /

Similar formulas can be given for any specified run length n and 
success probability q.  However, it's usually more efficient to use 
equation (1) or (2) to compute the values recursively.

The above derivation assumed the roots r_i of equation (3) are 
distinct.  To prove this, note that a polynomial f(x) has a multiple
root r iff f(r)=f'(r)=0.  With f(x) = x^(n+1) - x^n + K  we have  
f'(x) = (x^n)((n+1)x - n), so the only possible multiple roots would
be x=0 or x=n/(n+1), which requires q = 0, 1, or n/(n+1).  So, if q 
is anything other than one of these three special values, the roots 
of f(x) are distinct.  (Of course, f(x) always has a factor of (x-q),
so even in a case like n=2, q=2/3 when f(x) has a double root the 
polynomial f(x)/(x-q) has distinct roots.)

Return to MathPages Main Menu