Programmer notes on CRC computation
Notes on different techniques used to calculate and implement a cyclic redundancy check generator
Introduction
CRC is a function that receives a sequence of bits of arbitrary size and returns a sequence of bits of fixed size:
uint8_t Result = CRC(ArbitraryLengthData);
The function name depends on how many bits it returns:
uint8 ResultA = CRC8(Data); // Returns 8 bits
uint16 ResultB = CRC16(Data); // Returns 16 bits
uint32 ResultC = CRC32(Data); // Returns 32 bits
Given the same input the function always returns the same output:
uint8_t ResultA = CRC8(DataA);
uint8_t ResultB = CRC8(DataA);
assert(ResultA == ResultB);
However two different inputs could return the same output, but dependending on the CRC polynomial in use the chances can be low (As a general note if the polynomial has a name it means that a couple of PHDs crafted the polynomial and demonstrated the low chances of repetition under certain criteria).
uint8_t ResultA = CRC8(DataA);
uint8_t ResultB = CRC8(DataB);
if (DataA != DataB &&
ResultA == ResultB)
{
printf("Collision");
}
If you transmit a message of arbitrary size along with the respective computed CRC it's possible for a receiver to recalculate the CRC and compare if it is equal to the received CRC (Used like this by USB, RFID, SD, Bluetooth, Ethernet, SCTP, PNG, SATA, cksum, Layers on OSI model and much more).
// At transmitter
Transmit(Message, CRC32(Message));
// At receiver
DataFrame Received = GetData();
if (Received.GetCRC() == CRC32(Received.GetMessage()))
{
printf("Received CRC matches recomputed CRC");
}
You can also use the CRC function as a hash function, which is useful if you're building a hash map or something else that requires a hash (For example, Unreal Engine uses CRC to hash various data structures inside TMaps).
// Unreal Engine function that creates a
// hash value from an IntVector2 data
template<typename T>
uint32 GetTypeHash(const TIntVector2<T>& Vector)
{
return FCrc::MemCrc32(&Vector, sizeof(Vector));
}
Calculation
CRC computation consists on calculating:
Where:
Is a polynomial whose coefficients encode an arbitrary chunk of data, and:
Is a polynomial known as the divisor, generator or CRC polynomial. The degree of the generator polynomial states the name of the CRC in use and also the max degree of:
For example, the following CRC polynomial is known as CRC16-CCITT:
The coefficients of these polynomials can be only 1 or 0,
or more rigurously, these coefficients are elements of the underlying set of a mathematical field
known as GF(2), coders can interpret this as fancy terminology to refer to boolean true
, false
, XOR
and AND
.
Description of GF(2)
Underlaying set of the field is defined as:
Binary operations of the field are:
The field is defined as:
Intuitively it means that addition, subtraction, multiplication, and division are defined and behave as the corresponding operations on rational and real numbers.
All the valid operations in GF(2) are:
A | B | A + B | A - B |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 1 | 1 | 1 |
1 | 0 | 1 | 1 |
1 | 1 | 0 | 0 |
Note above that subtraction and addition looks the same and is equivalent to the boolean XOR operation.
A | B | A * B | A / B |
---|---|---|---|
0 | 0 | 0 | Undefined |
0 | 1 | 0 | 0 |
1 | 0 | 0 | Undefined |
1 | 1 | 1 | 1 |
We conventionally state division by zero to be undefined, apart from that, multiplication and division looks very similar and are equivalent to the boolean AND operation.
Polynomials over GF(2)
Some example of operations between polynomials whose coefficients are elements of GF(2).
The following github repo contains a toy C++ class to perform operations between polynomials over GF(2) using a syntax similar to the one of math texts (Sacrificing performance as is intended only for small experimentation) and it will be used in the following sections.
Polynomial A = (1 * x{6} + 1 * x{ 0 }) * x{ 1 };
Polynomial B = x{ 7 } + x{ 1 };
Polynomial SumResult = SumA + SumB;
assert(SumResult == Polynomial::Zero);
Polynomials to binary
With the tools previously described we can encode a polynomial using binary strings:
And perform polynomial operations using bitwise XOR and AND:
CRC computation 1 bit at a time
This is based on the standard long division algorithm taught in school,
where multiples of the divisor g(x)
are subtracted from the dividend d(x)
until the remainder is of smaller degree than g(x)
.
The following slide illustrates the computation of:
Let n
be the degree of the CRC
polynomial, in this case n=3, the algorithm above
consists on
- Load the first
(n + 1)
terms of the polynomiald(x)
- Subtract the
CRC
polynomial - (1) Store the result
- (2) Left shift the result
- Subtract the
CRC
polynomial, repeat... - Stop after 4 iterations because d(x) encodes a 4 bits long message
C++ example implementation below (see full source code here):
Polynomial Message = x{3} + x{2} + x{0};
Polynomial Crc3 = x{3} + x{1} + x{0};
// Load n+1 bits of the message
// (we depend here on Message
// and CRC having the same degree,
// otherwise this is incorrect,
// read next examples for a demostration on how to handle
// message and CRC not having the same degree)
Polynomial Remainder = Message;
assert(Remainder.GetDegree() == Crc3.GetDegree());
for (size_t Counter = 0;
Counter < Message.TotalTerms();
++Counter)
{
if (Remainder.GetDegree() == Crc3.GetDegree())
{
Remainder = Remainder - Crc3;
}
// Left shift the remainder one bit
Remainder = Remainder * x{ 1 };
}
return Remainder;
The above code won't work if the message has greater degree than the CRC polynomial, next we will handle the loading of abitrary sized messages:
// This example is based on
// https://en.wikipedia.org/wiki/Cyclic_redundancy_check#Computation
Polynomial Crc3 = x{3} + x{1} + x{0};
// A message polynomial of degree 13, encoding 14 bits
PolynomialBitString Message{ 0b11'0100'1110'1100, 14 };
// Load first n bits of the message a.k.a 110
Polynomial Remainder =
Polynomial::FromBitString(
Message.Substring(Crc3.GetDegree())
);
for (size_t Index = 0;
Index < Message.TotalTerms();
++Index)
{
Remainder =
// Align remainder and CRC, e.g. make 110, to be 1100
// so it has the same degree of CRC
Remainder * x{ 1 }
// Load the next bit of the message
+ Message[Crc3.GetDegree() + Index] * x{ 0 };
// Subtract and update the remainder
// if both polynomials share the same degree
if (Remainder.GetDegree() == Crc3.GetDegree())
{
Remainder = Remainder - Crc3;
}
}
return Remainder;
Note that we can ignore the term of degree n as it always gets zeroed when subtracting the CRC from the Remainder.
The following example is the most used in hardware, it doesn't requires preloading the first n bits of the message and only uses n registers to store the result (instead of n + 1), to achive this, it leverages on the associative property of addition, i.e. it performs the same additions per digit noted above, but in a distinct order, continue reading for an illustrated explanation.
// This example is based on
// https://en.wikipedia.org/wiki/Cyclic_redundancy_check#Computation
Polynomial Crc3 = x{3} + x{1} + x{0};
// A message polynomial of degree 13, encoding 14 bits
PolynomialBitString Message{ 0b11'0100'1110'1100, 14 };
// No more need to load first n bits of the message
Polynomial Remainder = Polynomial::Zero();
for (size_t Index = 0;
Index < Message.TotalTerms();
++Index)
{
Remainder =
// No more need of left shifting one bit
Remainder
// Load the next bit of the message
+ Message[Index] * x{ Crc3.GetDegree() - 1 };
// Is the term n-1 at remainder 1? will the remainder
// be aligned to the crc after one left shift?
// 0110 (This will be aligned to CRC after one left shift)
// 1011
if (Remainder.GetDegree() == (Crc3.GetDegree() - 1))
{
// Each register of remainder will accumulate
// the operations consider the first iteration:
// r= x*1x^(n-1) = 1x^n
// crc= 1x^n + x + 1
// updated r= x + 1
// the terms x and 1 will eventually be subtracted
// from a new loaded bit
Remainder = Remainder * x{ 1 } - Crc3;
}
else
{
// Shift left everything if loaded n-1 term was zero
Remainder = Remainder * x{ 1 };
}
}
return Remainder;
Below is an illustrated step by step debugging of the above algorithm, each bit has an identifier tracking when was it read for first time so you can track the order of operations.
For more credibility compare the above graphic with all the operations "expected" to happen on each digit in the first algorithm studied.
Iteration 0
result bits
1 1 0 1 0 0 0 = d(x)
1 0 1 1 = g(x)
1+1 1+0 0+1 1+1 operations at iteration 0
Iteration 1
result bits
0 1 1 0 0 0 0 = d(x)
1 0 1 1 = g(x)
1+1 1+0 0+1 1+1 operations at iteration 0
1+1 1+0 0+1 0+1 operations at iteration 1
Iteration 2
result bits
0 0 1 1 1 0 0 = d(x)
1 0 1 1 = g(x)
1+1 1+0 0+1 1+1 operations at iteration 0
1+1 1+0 0+1 0+1 operations at iteration 1
1+1 1+0 1+1 0+1 operations at iteration 2
Iteration 3
result bits
0 0 0 1 0 1 0 = d(x)
1 0 1 1 = g(x)
1+1 1+0 0+1 1+1 operations at iteration 0
1+1 1+0 0+1 0+1 operations at iteration 1
1+1 1+0 1+1 0+1 operations at iteration 2
1+1 0+0 1+1 0+1 operations at iteration 3
Which expands to:
digit 6 = 1+1
digit 5 = (1+0)+1
digit 4 = ((0+1)+0)+1
digit 3 = ((((1+1)+1)+0)+1)
digit 2 = (((0+1)+1)+0)
digit 1 = ((0+1)+1)
digit 0 = (0+1)
Where digit 2, 1 and 0 conform the result.
Next example uses bitwise operators directly to removes usage of the class Polynomial.
// Binary message
// 00010000 00010001 00010010 00010011
// 00010100 00010101 00010110 00010111
constexpr uint8_t Message[] {
0x10, 0x11, 0x12, 0x13,
0x14, 0x15, 0x16, 0x17
};
// Most significant bit first (big-endian),
// note that the x power n bit is ignored,
// aka the coefficient of the msb bit
// (x16)+x12+x5+1 = (1) 0001 0000 0010 0001 = 0x1021
constexpr uint32_t Crc = 0x1021;
constexpr uint32_t CrcN = 16;
// Result of the CRC function
uint32_t Remainder = 0;
for (uint8_t MessageByte : Message)
{
// Load the next byte and
// Align the byte with the generator
// polynomial e.g. make 0x12 to be 0x1200 so
// it represents a polynomial of degree n-1
uint32_t MessageByteShifted = MessageByte << 8;
// Calculate the remainder
Remainder = Remainder ^ MessageByteShifted;
// Process the next 8 bits
for (uint32_t BitIndex = 0; BitIndex < 8; ++BitIndex)
{
// Check the coefficient of polynomial n-1,
// i.e. 1000 0000 0000 0000
if (Remainder & 0x8000)
{
Remainder = (Remainder << 1) ^ Crc;
}
else
{
Remainder = Remainder << 1;
}
}
// We need to mask 16bits on the remainder because we are ignoring the n coefficient of the crc polynomial
// so we are not guaranteeing the n coefficient at remainder and crc to cancel each other
Remainder = Remainder & 0xffff;
}
return Remainder;
If we encode the polynomial coefficients using a LSB endianess the code simplifies to:
// Binary message
// 00010000 00010001 00010010 00010011
// 00010100 00010101 00010110 00010111
constexpr uint8_t Message[] {
0x10, 0x11, 0x12, 0x13,
0x14, 0x15, 0x16, 0x17
};
// Least significant bit first (little-endian)
// 1+x5+x12+(x16) = 1000 0100 0000 1000 (1) = 0x8408
constexpr uint32_t Crc = 0x8408;
constexpr uint32_t CrcN = 16;
// Result of the CRC function
uint32_t Remainder = 0;
for (uint8_t MessageByte : Message)
{
// Calculate the remainder
Remainder = Remainder ^ MessageByte
// Process the next 8 bits
for (uint32_t BitIndex = 0; BitIndex < 8; ++BitIndex)
{
// Check the coefficient of polynomial n-1, i.e. 0000 0001
if (Remainder & 0x1)
{
Remainder = (Remainder >> 1) ^ Crc;
}
else
{
Remainder = Remainder >> 1;
}
}
}
CRC computation using Dilip V. Sarwate algorithm
Sarwate's algorithm consists of a table containing 256 rows where each element represents the remainder of a polynomial division:
Such that:
And:
Where:
This way, if we wanted to calculate the remainder of:
We would only need to look-up an entry from a table T.
Next example shows how to compute a CRC16 from a 4 bytes long message.
Iteration Result bytes
A3 A2 A1 A0 A{-1} A{-2}
0: T=A3 C1=F1[T] C0=F0[T]
1: T=C1+A2 C1=C0+F1[T] C0=F0[T]
1: T=C1+A1 C1=C0+F1[T] C0=F0[T]
1: T=C1+A0 C1=C0+F1[T] C0=F0[T]
In the above example consider that only one byte is loaded from the message at a time, e.g. at iteration 0 only A3 is loaded, at iteration 1 only A2 and so on.
For a detailed explanation on how to generate the table F[T]
please refer to appendix at Sarwate's paper.
Further lecture
Sarwate's work was extendend by Kounavis and Berry to perform 4 bytes and 8 bytes per iteration, this is the most performant software implementation so far and is widely used in a lot of mainstream systems like the Linux kernel and Unreal Engine (Requires access to the github repo).
Check the slide by 8 implementation used by Unreal with a lot of extra comments here
Links
- How do CRCs work? By Ben Eater
- Computation of cyclic redundancy checks via table look-up By Dilip V. Sarwate
- A systematic approach to building high performance, software based, CRC generators By Michael E. Kounavis and Frank L. Berry
- Online CRC calculator by Bastian Molkenthin
- Understanding CRC by Bastian Molkenthin
- CRC Playground in C++ by Romualdo Villalobos
- CRC documentation at Kernel.org
- CRC Unreal Source Code At Github
- Cyclic redundancy check at Wikipedia
- Computation of cyclic redundancy checks at Wikipedia
- Algebra group at Wikipedia
- Algebra field at Wikipedia
- Field GF(2) at Wikipedia
- Long division at Wikipedia
Credits
Written by Romualdo Villalobos