Cyclic Redundancy Checks

One of the most popular methods of error detection for digital 
signals is the Cyclic Redundancy Check (CRC).  The basic idea
behind CRCs is to treat the message string as a single binary 
word M, and divide it by a key word k that is known to both 
the transmitter and the receiver.  The remainder r left after 
dividing M by k constitutes the "check word" for the given message.
The transmitter sends both the message string M and the check word 
r, and the receiver can then check the data by repeating the 
calculation, dividing M by the key word k, and verifying that
the remainder is r.  The only novel aspect of the CRC process is
that it uses a simplified form of arithmetic, which we'll explain
below, in order to perform the division.

By the way, this method of checking for errors is obviously not
foolproof, because there are many different message strings that 
give a remainder of r when divided by k.  In fact, about 1 out of 
every k randomly selected strings will give any specific remainder.
Thus, if our message string is garbled in transmission, there is a 
chance (about 1/k, assuming the corrupted message is random) that 
the garbled version would agree with the check word.  In such a
case the error would go undetected.  Nevertheless, by making k 
large enough, the chances of a random error going undetected can 
be made extremely small.
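
As a rough sketch of this basic idea in ordinary integer arithmetic
(before any simplified arithmetic is introduced), the Python fragment
below uses an arbitrary key of 37 and an arbitrary message string.
The names KEY, check_word and is_consistent are assumptions made for
illustration only.

```python
# Sketch of the basic check-word idea using ordinary integer arithmetic.
# KEY and the function names are illustrative assumptions.
KEY = 37

def check_word(message_bits: str) -> int:
    """Remainder of the message (read as a binary integer) divided by KEY."""
    return int(message_bits, 2) % KEY

def is_consistent(message_bits: str, received_check: int) -> bool:
    """Receiver side: repeat the division and compare the remainders."""
    return check_word(message_bits) == received_check

message = "1011000101"
r = check_word(message)

# A garbled string that differs from the original by a multiple of KEY
# has the same remainder, so this particular corruption goes undetected.
garbled = format(int(message, 2) + KEY, "b")
print(is_consistent(message, r), is_consistent(garbled, r))
```

The garbled string is constructed deliberately here; a random
corruption would match the check word only about once in KEY tries.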

That's really all there is to it.  The rest of this discussion 
will consist simply of refining this basic idea to optimize 
its effectiveness, and describing the simplified arithmetic that 
is used to streamline the computations when processing binary 
strings.  

When discussing CRCs it's customary to present the key word k
in the form of a "generator polynomial" whose coefficients are 
the binary bits of the number k.  For example, suppose we want 
our CRC to use the key k=37.  This number written in binary is 
100101, and expressed as a polynomial it is  x^5 + x^2 + 1.  In 
order to implement a CRC based on this polynomial, the transmitter 
and receiver must have agreed in advance that this is the key word
they intend to use.  So, for the sake of discussion, let's say
we have agreed to use the generator polynomial 100101.  
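
The correspondence between a key word and its generator polynomial
can be sketched in a few lines of Python.  The helper name
poly_string is hypothetical, and the convention assumed is that bit i
of the key is the coefficient of x^i.

```python
def poly_string(key: int) -> str:
    """Write the bits of `key` as a polynomial in x, highest power first."""
    terms = []
    for power in range(key.bit_length() - 1, -1, -1):
        if (key >> power) & 1:
            if power == 0:
                terms.append("1")
            elif power == 1:
                terms.append("x")
            else:
                terms.append(f"x^{power}")
    return " + ".join(terms) if terms else "0"

# The key k = 37 from the text, in binary and as a polynomial.
print(format(37, "b"), "->", poly_string(37))
```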

By the way, it's worth noting that the remainder of any word 
divided by a 6-bit word will contain no more than 5 bits, so our 
CRC words based on the polynomial 100101 will always fit into 5 
bits.  Therefore, a CRC system based on this polynomial would be 
called a "5-bit CRC".  In general, a generator polynomial with n+1 
bits leads to an "n-bit CRC".

Now suppose I want to send you a message consisting of the 
string of bits  M = 00101100010101110100011, and I also want to 
send you some additional information that will allow you to check 
the received string for correctness.  Using our agreed key word 
k=100101, I'll simply "divide" M by k to form the remainder r, 
which will constitute the CRC check word.  However, I'm going to
use a simplified kind of division that is particularly well-suited 
to the binary form in which digital data is expressed.

If we interpret k as an ordinary integer (37), its binary
representation, 100101, is really shorthand for

   (1)2^5 + (0)2^4 + (0)2^3 + (1)2^2 + (0)2^1 + (1)2^0

Every integer can be expressed uniquely in this way, i.e., as 
a polynomial in the base 2 with coefficients that are either 0
or 1.  This is a very powerful form of representation, but it's
actually more powerful than we need for purposes of performing
a data check.  Also, operations on numbers like this can be
somewhat laborious, because they involve borrows and carries in
order to ensure that the coefficients are always either 0 or 1.
(The same is true for decimal arithmetic, except that all the 
digits are required to be in the range 0 to 9.)

To make things simpler, let's interpret our message M, key word 
k, and remainder r, not as actual integers, but as abstract
polynomials in a dummy variable x (rather than a definite base 
like 2 for binary numbers or 10 for decimal numbers).  Also, 
we'll simplify even further by agreeing to pay attention only
to the parity of the coefficients, i.e., if a coefficient is
an odd number we will simply regard it as 1, and if it is an
even number we will regard it as 0.  This is a tremendous 
simplification, because now we don't have to worry about 
borrows and carries when performing arithmetic.  This is
because every integer coefficient must obviously be either 
odd or even, so it's automatically either 0 or 1.

To give just a brief illustration, consider the two polynomials
x^2 + x + 1  and  x^3 + x + 1.  If we multiply these together by 
the ordinary rules of algebra we get

 (x^2 + x + 1)(x^3 + x + 1) = x^5 + x^4 + 2x^3 + 2x^2 + 2x + 1

but according to our simplification we are going to call every
'even' coefficient 0, so the result of the multiplication is 
simply x^5 + x^4 + 1.  You might wonder if this simplified way 
of doing things is really self-consistent.  For example, can we 
divide the product  x^5 + x^4 + 1  by one of its factors, say,
x^2 + x + 1,  to give the other factor?  The answer is yes,
and it's much simpler than ordinary long division.  To divide
the polynomial 110001 by 111 (which is the shorthand way of 
expressing our polynomials) we simply apply the bit-wise 
exclusive-OR operation repeatedly as follows

            1011
          ______
     111 |110001
          111
          ---
          0010
           000
           ---
           0100
            111
            ----
            0111
             111
             ---
             000

This is exactly like ordinary long division, only simpler, because
at each stage we just need to check whether the leading bit of the 
current three bits is 0 or 1.  If it's 0, we place a 0 in the 
quotient and exclusively OR the current bits with 000.  If it's 1, 
we place a 1 in the quotient and exclusively OR the current bits 
with the divisor, which in this case is 111.  As can be seen, the 
result of dividing 110001 by 111 is 1011, which was our other 
factor, x^3 + x + 1, leaving a remainder of 000.  (This kind of 
arithmetic is called the arithmetic of polynomials with coefficients 
from the field of integers modulo 2.)
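
The multiplication and division just described can be sketched in
Python by storing each polynomial as an integer whose bit i is the
coefficient of x^i.  The function names gf2_mul and gf2_divmod are
assumptions for illustration; this is one straightforward way to code
the arithmetic, not the only one.

```python
def gf2_mul(a: int, b: int) -> int:
    """Multiply two polynomials with coefficients reduced modulo 2."""
    product = 0
    while b:
        if b & 1:
            product ^= a      # adding without carries is exclusive OR
        a <<= 1
        b >>= 1
    return product

def gf2_divmod(dividend: int, divisor: int):
    """Long division modulo 2; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    quotient = 0
    width = divisor.bit_length()
    while dividend.bit_length() >= width:
        shift = dividend.bit_length() - width
        quotient |= 1 << shift          # place a 1 in the quotient
        dividend ^= divisor << shift    # XOR with the aligned divisor
    return quotient, dividend

product = gf2_mul(0b111, 0b1011)        # (x^2 + x + 1)(x^3 + x + 1)
print(format(product, "b"), gf2_divmod(product, 0b111))
```

Unlike the hand worksheet, the loop skips positions where the leading
bit is 0 instead of XORing with 000, which gives the same result.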

So now we're armed with everything we need to actually perform
a CRC calculation with the message string M and key word k defined
above.  We simply need to divide M by k using our simplified
polynomial arithmetic.  In fact, it's even simpler, because we
don't really need to keep track of the quotient - all we really
need is the remainder.  So we simply need to perform a sequence 
of 6-bit "exclusive ORs" with our key word k, beginning from the 
left-most "1 bit" of the message string, and at each stage 
thereafter bringing down enough bits from the message string 
to make a 6-bit word with leading 1.  A worksheet for the entire 
computation is shown below:

         _______________________
 100101 |00101100010101110100011
           100101                         
           ------                         
           00100101                  
             100101                   
             ------                     
             0000000101110       
                    100101      
                    ------         
                    00101110       
                      100101       
                      ------       
                      00101100
                        100101
                        ------
                        00100111
                          100101
                          ------
                          000010   remainder = CRC

Our CRC word is simply the remainder, i.e., the result of the last 
6-bit exclusive OR operation.  Of course, the leading bit of this 
result is always 0, so we really only need the last five bits.  This 
is why a 6-bit key word leads to a 5-bit CRC.  In this case, the
CRC word for this message string is 00010, so when I transmit 
the message word M I will also send this corresponding CRC word.
When you receive them you can repeat the above calculation on M
with our agreed generator polynomial k and verify that the resulting 
remainder agrees with the CRC word I included in my transmission.
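
The worksheet above can be reproduced with a short Python sketch.
The function name crc_remainder is hypothetical, and the bits are
processed one at a time through a small register rather than six at
a time, which yields the same remainder.

```python
def crc_remainder(message_bits: str, key_bits: str) -> str:
    """Remainder of the message divided by the key, modulo-2 arithmetic."""
    key = int(key_bits, 2)
    n = len(key_bits) - 1          # number of CRC bits
    top = 1 << n                   # position of the key's leading 1
    register = 0
    for bit in message_bits:
        register = (register << 1) | int(bit)   # bring down the next bit
        if register & top:                       # leading bit is 1
            register ^= key                      # XOR with the key word
    return format(register, f"0{n}b")

M = "00101100010101110100011"
K = "100101"
print(crc_remainder(M, K))   # 00010, as in the worksheet
```

The receiver would call the same function on the received message
and compare the result with the received check word.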

What we've just done is a perfectly fine CRC calculation, and many
actual implementations work exactly that way, but there is one
potential drawback in our method.  As you can see, the computation 
described above totally ignores any number of "0"s ahead of the 
first "1" bit in the message.  It so happens that many data strings 
in real applications are likely to begin with a long series of "0"s, 
so it's a little bothersome that the algorithm isn't working very 
hard in such cases.  To avoid this "problem", we can agree in advance 
that before computing our n-bit CRC we will always begin by exclusive 
ORing the leading n bits of the message string with a string of n 
"1"s.  With this convention (which of course must be agreed by 
the transmitter and the receiver in advance) our previous example 
would be evaluated as follows

   00101100010101110100011   <--  Original message string
   11111                      <-- "Fix" the leading bits
   -----------------------
   11010100010101110100011    <-- "Fixed" message string
   100101                         
   ------                         
   0100000                    
    100101                    
    ------                
    000101001                
       100101               
       ------              
       00110001               
         100101              
         ------              
         0101000             
          100101                
          ------             
          00110111             
            100101                     
            ------                  
            0100101              
             100101               
             ------               
             0000000100011                 
                    100101
                    ------
                    000110   remainder = CRC

So with the "leading zero fix" convention, the 5-bit CRC word for 
this message string based on the generator polynomial 100101 is 
00110.  That's really all there is to computing a CRC, and many
commercial applications work exactly as we've described.  People
sometimes use various table-lookup routines to speed up the
divisions, but that doesn't alter the basic computation or change
the result.  In addition, people sometimes agree to various
non-standard conventions, such as interpreting the bits in reverse 
order, or carrying out the division with a string of filler bits 
appended to the end of the message, but the essential computation is 
still the same.  (Of course, it's crucial for the transmitter and 
receiver to agree in advance on any unusual conventions they intend 
to observe.)
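
A sketch of the "leading zero fix" in Python follows.  The helper
names are assumptions, and the message is assumed to be at least n
bits long.  The last lines illustrate the motivation: extra leading
zeros leave the plain remainder unchanged but alter the fixed one.

```python
def crc_remainder(message_bits: str, key_bits: str) -> str:
    """Plain modulo-2 remainder of the message divided by the key."""
    key = int(key_bits, 2)
    n = len(key_bits) - 1
    register = 0
    for bit in message_bits:
        register = (register << 1) | int(bit)
        if register >> n:
            register ^= key
    return format(register, f"0{n}b")

def apply_leading_fix(message_bits: str, n: int) -> str:
    """XOR the leading n bits of the message with n ones."""
    head = "".join("1" if b == "0" else "0" for b in message_bits[:n])
    return head + message_bits[n:]

def crc_with_fix(message_bits: str, key_bits: str) -> str:
    n = len(key_bits) - 1
    return crc_remainder(apply_leading_fix(message_bits, n), key_bits)

M = "00101100010101110100011"
K = "100101"
print(apply_leading_fix(M, 5))   # 11010100010101110100011
print(crc_with_fix(M, K))        # 00110, as in the worksheet
print(crc_remainder("000" + M, K) == crc_remainder(M, K))
print(crc_with_fix("000" + M, K) == crc_with_fix(M, K))
```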

Now that we've seen how to compute CRCs for a given key polynomial,
it's natural to wonder whether some key polynomials work better
(i.e., give more robust "checks") than others.  From one point of
view the answer is obviously yes, because the larger our key word, 
the less likely it is that corrupted data will go undetected.  By
appending an n-bit CRC to our message string we are increasing the
total number of possible strings by a factor of 2^n, but we aren't
increasing the degrees of freedom, since each message string has
a unique CRC word.  Therefore, we have established a situation in
which only 1 out of 2^n total strings (message+CRC) is valid.  
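Notice that if the CRC is computed with n filler zeros appended to 
the message (one of the conventions mentioned above), then the 
message word followed by its CRC word is a multiple of our generator 
polynomial.  Under that convention, of all possible combined strings, 
only multiples of the generator polynomial are valid.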

So, if we assume that any corruption of our data affects our string 
in a completely random way, i.e., such that the corrupted string 
is totally uncorrelated with the original string, then the 
probability of a corrupted string going undetected is 1/(2^n). 
This is the basis on which people say a 16-bit CRC has a probability
of 1/(2^16) = 1.5E-5 of failing to detect an error in the data, 
and a 32-bit CRC has a probability of 1/(2^32), which is about 
2.3E-10 (less than one in a billion).
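
The "1 out of 2^n" count can be checked by brute force on a small
case.  The sketch below assumes the 5-bit generator 100101 from the
earlier example and an arbitrary message length of 8 bits; the names
are illustrative.

```python
def gf2_mod(value: int, key: int) -> int:
    """Remainder of value divided by key in modulo-2 polynomial arithmetic."""
    while value.bit_length() >= key.bit_length():
        value ^= key << (value.bit_length() - key.bit_length())
    return value

KEY = 0b100101       # generator from the earlier example
N = 5                # CRC width in bits
MESSAGE_BITS = 8     # arbitrary small message length

total = 0
valid = 0
for combined in range(1 << (MESSAGE_BITS + N)):
    message = combined >> N                # leading 8 bits
    check = combined & ((1 << N) - 1)      # trailing 5 bits
    total += 1
    if gf2_mod(message, KEY) == check:
        valid += 1

print(valid, total, valid / total)   # fraction equals 1/2^5
print(2.0 ** -16, 2.0 ** -32)        # the 16-bit and 32-bit figures
```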

Since most digital systems are designed around blocks of 8-bit words
(called "bytes"), it's most common to find CRC words whose lengths 
are a multiple of 8 bits.  The two most common lengths in practice 
are 16-bit and 32-bit CRCs (so the corresponding generator polynomials 
have 17 and 33 bits respectively).  A few specific polynomials have 
come into widespread use.  For 16-bit CRCs one of the most popular 
key words is 10001000000100001, and for 32-bit CRCs one of the most 
popular is 100000100110000010001110110110111.  In the form of 
explicit polynomials these would be written as

                x^16 + x^12 + x^5 + 1
and
       
   x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 

               + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1

The 16-bit polynomial is known as the "X25 standard", and the 32-
bit polynomial is the "Ethernet standard", and both are widely 
used in all sorts of applications.  (Another common 16-bit key 
polynomial familiar to many modem operators is 11000000000000101, 
which is the basis of the "CRC-16" protocol.)  These polynomials 
are certainly not unique in being suitable for CRC calculations, 
but it's probably a good idea to use one of the established 
standards, to take advantage of all the experience accumulated 
over many years of use.
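
As a quick consistency check, the bit strings quoted above can be
decoded back into their exponents in Python.  The dictionary labels
and function name are assumptions for illustration.  The hexadecimal
values are the forms in which these keys are commonly quoted, with
the leading bit dropped; they are not taken from this article.

```python
STANDARD_KEYS = {
    "X25": "10001000000100001",
    "CRC-16": "11000000000000101",
    "Ethernet": "100000100110000010001110110110111",
}

def exponents(key_bits: str) -> list:
    """Powers of x that appear in the generator, highest first."""
    top = len(key_bits) - 1
    return [top - i for i, bit in enumerate(key_bits) if bit == "1"]

def without_leading_bit(key_bits: str) -> int:
    """The key with its leading 1 removed, as an integer."""
    return int(key_bits[1:], 2)

for name, bits in STANDARD_KEYS.items():
    print(name, exponents(bits), format(without_leading_bit(bits), "#x"))
```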

Nevertheless, we may still be curious to know how these particular
polynomials were chosen.  It so happens that one could use just
about ANY polynomial of a certain degree and achieve most of 
the error detection benefits of the standard polynomials.  For 
example, ANY n-bit CRC whose generator has a constant term of 1 
will certainly catch any single "burst" of m consecutive "flipped 
bits" for any m less than n, basically because a smaller polynomial 
can't be a multiple of a larger polynomial.  Also, we can ensure 
the detection of any odd number of flipped bits simply by using a 
generator polynomial that is a multiple of the "parity polynomial", 
which is x+1.  A polynomial of our
simplified kind is a multiple of x+1 if and only if it has an even 
number of terms.
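
The even-number-of-terms rule is easy to test exhaustively for small
polynomials.  In the Python sketch below the function names are
hypothetical, and the small generator x^3 + 1 = (x+1)(x^2 + x + 1) is
chosen only as a convenient example of a multiple of x+1.

```python
def gf2_mod(value: int, key: int) -> int:
    """Remainder of value divided by key in modulo-2 polynomial arithmetic."""
    while value.bit_length() >= key.bit_length():
        value ^= key << (value.bit_length() - key.bit_length())
    return value

def has_parity_factor(key: int) -> bool:
    """True if the polynomial is a multiple of x + 1 (binary 11)."""
    return gf2_mod(key, 0b11) == 0

def term_count(key: int) -> int:
    return bin(key).count("1")

def odd_errors_all_detected(key: int, length: int) -> bool:
    """Check every error pattern of odd weight that fits in `length` bits."""
    for error in range(1, 1 << length):
        if term_count(error) % 2 == 1 and gf2_mod(error, key) == 0:
            return False
    return True

rule_holds = all(
    has_parity_factor(p) == (term_count(p) % 2 == 0)
    for p in range(1, 1 << 10)
)
print(rule_holds, odd_errors_all_detected(0b1001, 12))
```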

It's interesting to note that the standard 16-bit polynomials both 
include this parity check, whereas the standard 32-bit CRC does not.
It might seem that this represents a shortcoming of the 32-bit
standard, but it really doesn't, because the inclusion of a parity
check comes at the cost of some other desirable characteristics.
In particular, much emphasis has been placed on the detection of
two separated single-bit errors, and the standard CRC polynomials
were basically chosen to be as robust as possible in detecting such 
double-errors.  Notice that the basic "error word" E representing 
two erroneous bits separated by j bits is of the form x^j + 1 or, 
equivalently, x^j - 1.  Also, an error E superimposed on the message 
M will be undetectable if and only if E is a multiple of the key 
polynomial k.  Therefore, if we choose a key that is not a divisor 
of any polynomial of the form x^t - 1  for t=1,2,...,m, then we are 
assured of detecting any occurrence of precisely two erroneous bits 
that occur within m places of each other.
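
For a given key we can search directly for the smallest separation j
at which a two-bit error x^j + 1 becomes a multiple of the key, and
hence undetectable.  The Python sketch below assumes the key has a
constant term of 1, and the function name and search limit are
illustrative choices.

```python
def smallest_undetected_gap(key: int, limit: int = 100000):
    """Smallest j >= 1 such that key divides x^j + 1, or None up to limit.

    Uses the fact that key divides x^j + 1 exactly when x^j leaves a
    remainder of 1.  Assumes the key has a constant term of 1.
    """
    residue = 1                       # x^0 modulo the key
    for j in range(1, limit + 1):
        residue <<= 1                 # multiply by x
        if residue.bit_length() == key.bit_length():
            residue ^= key            # reduce modulo the key
        if residue == 1:
            return j
    return None

print(smallest_undetected_gap(0b100101))   # key from the first example
print(smallest_undetected_gap(0b111))      # x^2 + x + 1
print(smallest_undetected_gap(0b1011))     # x^3 + x + 1
```

Any two erroneous bits closer together than the returned value are
guaranteed to be caught by that key.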

How would we find such a polynomial?  For this purpose we can use 
a "primitive polynomial".  For example, suppose we want to ensure 
detection of two bits within 31 places of each other.  Let's factor 
the error polynomial x^31 - 1 into its irreducible components 
(using our simplified arithmetic with coefficients reduced modulo 
2).  We find that it splits into the factors

  x^31 - 1  =   (x+1)
               *(x^5 + x^3 + x^2 + x + 1)
               *(x^5 + x^4 + x^2 + x + 1)
               *(x^5 + x^4 + x^3 + x + 1)
               *(x^5 + x^2 + 1)
               *(x^5 + x^4 + x^3 + x^2 + 1)
               *(x^5 + x^3 + 1)

Aside from the parity factor (x+1), these are all primitive
polynomials, representing primitive roots of x^31 - 1, so they
cannot be divisors of any polynomial of the form x^j - 1 for
any j less than 31.  Notice that  x^5 + x^2 + 1  is the generator
polynomial 100101 for the 5-bit CRC in our first example.
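
The factorization can be confirmed by multiplying the factors back
together with coefficients reduced modulo 2 (where x^31 - 1 and
x^31 + 1 are the same polynomial).  The Python below is a sketch
with hypothetical names, storing bit i as the coefficient of x^i.

```python
def gf2_mul(a: int, b: int) -> int:
    """Multiply two polynomials with coefficients reduced modulo 2."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product

FACTORS = [
    0b11,        # x + 1
    0b101111,    # x^5 + x^3 + x^2 + x + 1
    0b110111,    # x^5 + x^4 + x^2 + x + 1
    0b111011,    # x^5 + x^4 + x^3 + x + 1
    0b100101,    # x^5 + x^2 + 1
    0b111101,    # x^5 + x^4 + x^3 + x^2 + 1
    0b101001,    # x^5 + x^3 + 1
]

product = 1
for factor in FACTORS:
    product = gf2_mul(product, factor)

print(product == (1 << 31) | 1)   # True if the product is x^31 + 1
```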

Another way of looking at this is via recurrence formulas.  For
example, the polynomial x^5 + x^2 + 1 corresponds to the recurrence
relation  s[n] = (s[n-3] + s[n-5]) modulo 2.  Beginning with the 
initial values 00001 this recurrence yields

                                     |--> cycle repeats
      0000100101100111110001101110101 00001

Notice that the sequence repeats with a period of 31, which is
another consequence of the fact that x^5 + x^2 + 1 is primitive.
You can also see that the sets of five consecutive bits run through
all the numbers from 1 to 31 before repeating.  In contrast, the
polynomial x^5 + x + 1 corresponds to the recurrence s[n] = (s[n-4]
+ s[n-5]) modulo 2, and gives the sequence

                           |--> cycle repeats
      000010001100101011111 00001

Notice that this recurrence has a period of 21, which implies that
the polynomial x^5 + x + 1  divides  x^21 - 1.  Actually, x^5 + x + 1
can be factored as (x^2 + x + 1)(x^3 + x^2 + 1), and both of those
factors divide x^21 - 1.  Therefore, the polynomial x^5 + x + 1 may 
be considered to give a less robust CRC than x^5 + x^2 + 1, at least 
from the standpoint of maximizing the distance by which two erroneous 
bits must be separated in order to go undetected.
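
Both recurrences are easy to run in Python.  In the sketch below the
function names are assumptions, and "taps" lists the lags that are
added modulo 2, so (3, 5) stands for s[n] = s[n-3] + s[n-5].

```python
def recurrence_bits(taps, seed, count):
    """First `count` terms of the modulo-2 recurrence, as a bit string."""
    s = list(seed)
    while len(s) < count:
        s.append(sum(s[-t] for t in taps) % 2)
    return "".join(str(bit) for bit in s[:count])

def period(taps, seed):
    """Number of steps before the window of len(seed) bits recurs."""
    start = list(seed)
    state = list(seed)
    steps = 0
    while True:
        new_bit = sum(state[-t] for t in taps) % 2
        state = state[1:] + [new_bit]
        steps += 1
        if state == start:
            return steps

SEED = [0, 0, 0, 0, 1]
print(recurrence_bits((3, 5), SEED, 36), period((3, 5), SEED))
print(recurrence_bits((4, 5), SEED, 26), period((4, 5), SEED))

# Five-bit windows of the first sequence, read as numbers.
bits = recurrence_bits((3, 5), SEED, 35)
windows = {int(bits[i:i + 5], 2) for i in range(31)}
print(windows == set(range(1, 32)))
```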

On the other hand, there are error patterns that would be detected
by x^5 + x + 1 but would NOT be detected by x^5 + x^2 + 1.  As
noted previously, any n-bit CRC increases the space of all strings
by a factor of 2^n, so a completely arbitrary error pattern really
is no less likely to be detected by a "poor" polynomial than by a
"good" one.  The distinction between good and bad generators is
based on the premise that the most likely error patterns in real
life are NOT entirely random, but are most likely to consist of a 
very small number of bits (e.g., one or two) very close together. 
To protect against this kind of corruption, we want a generator
that maximizes the number of bits that must be "flipped" to get
from one formally valid string to another.  We can certainly cover
all 1-bit errors, and with a suitable choice of generators we can 
effectively cover virtually all 2-bit errors.
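
To make the comparison concrete, we can list every 1-bit and 2-bit
error pattern within a short window that a given generator fails to
detect, i.e., every such pattern that is a multiple of the key.  The
Python sketch below uses an arbitrary window of 22 bits and
hypothetical function names.

```python
from itertools import combinations

def gf2_mod(value: int, key: int) -> int:
    """Remainder of value divided by key in modulo-2 polynomial arithmetic."""
    while value.bit_length() >= key.bit_length():
        value ^= key << (value.bit_length() - key.bit_length())
    return value

def undetected_patterns(key: int, length: int, weight: int) -> list:
    """Bit positions of all weight-`weight` errors that the key misses."""
    missed = []
    for positions in combinations(range(length), weight):
        error = 0
        for p in positions:
            error |= 1 << p
        if gf2_mod(error, key) == 0:
            missed.append(positions)
    return missed

for key in (0b100101, 0b100011):      # x^5 + x^2 + 1 and x^5 + x + 1
    print(format(key, "b"),
          undetected_patterns(key, 22, 1),
          undetected_patterns(key, 22, 2))
```

Within this window the only pattern missed is the pair of bits 21
places apart under x^5 + x + 1, in line with the period found above.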

Whether this particular failure mode deserves the attention it has 
received is debatable.  If our typical data corruption event flips
dozens of bits, then the fact that we can cover all 2-bit errors
seems less important.  Some cynics have gone so far as to suggest 
that the focus on the "2-bit failure mode" is really just an excuse 
to give communications engineers an opportunity to deploy some non-
trivial mathematics.  I personally wouldn't go quite that far, since
I believe it makes sense to use a primitive generator polynomial,
just as it would make sense to use a prime number key if we were
working with ordinary integer arithmetic, because one big coincidence
seems intuitively less likely than several small ones.  However, the
fact remains that our overall estimate for the probability of an
error going undetected by an n-bit CRC is 1/(2^n), regardless of
which (n+1)-bit generator polynomial we use.

The best argument for using one of the industry-standard generator
polynomials may be the "spread-the-blame" argument.  Any CRC (like
a pseudo-random number generator) COULD be found to be particularly
unsuitable in some special circumstance, e.g., in an environment
that tends to produce error patterns in multiples of our generator
at a rate significantly greater than would be predicted for a truly
random process.  This would be incredibly bad luck, but if it ever
happened, you'd like to at least be able to say you were using an
industry standard generator, so the problem couldn't be attributed
to any unauthorized creativity on your part.