Reed–Solomon codes for coders/Additional information

This section presents additional information about the structure of QR codes. The information here is currently incomplete. The full specification is in ISO standard 18004; unfortunately, this document is not freely available. Readers in search of the full details might turn to examining the source code of one of the many open-source QR code decoders available.

Standard QR codes

A standard QR symbol has the following features:

Three bull's-eye locator patterns in the corners
Format information (red) placed around the locators
Alternating light and dark timing marks in one row and column connecting the locators
Smaller bull's-eye alignment patterns within the symbol (only in symbols that are 25×25 and larger)
Version information (blue) placed near the corners (45×45 and larger)
Quiet zone 4 modules wide around the symbol

Micro QR codes

Micro QR symbols have these features:

One locator pattern in the corner
Format information around the locator
Timing marks along two edges
Quiet zone 2 modules wide around the symbol

Model 1 QR codes

Model 1 QR symbols are an older version of the standard that is rarely seen anymore. They have the following features.

Three locator patterns in the corners
Format information placed around the locators
Timing marks in one row and column connecting the locators
Extension patterns along the right and lower edges
Quiet zone 4 modules wide around the symbol

Symbol size

There are 40 sizes of standard QR codes, and 4 sizes of micro QR codes. Confusingly, these are referred to as "versions" by the QR code specification document. The table below gives the size in modules and capacity in 8-bit codewords for each version. Some of the capacity is used to store data and some is used to store error correction information, depending on the error correction level chosen.

The capacities for the non-micro QR codes can be given by the following formula: $\left\lfloor {\frac {1}{8}}\left(16\left(V^{2}+8V+4\right)-25\left(\left\{V\geq 2:1,0\right\}+\left\lfloor {\frac {V}{7}}\right\rfloor \right)^{2}-\left\{V\geq 7:1,0\right\}\left(36+40\left\lfloor {\frac {V}{7}}\right\rfloor \right)\right)\right\rfloor$

Version	Size	Capacity	Version	Size	Capacity	Version	Size	Capacity	Version	Size	Capacity
M1	11	4½	8	49	242	19	93	991	30	137	2185
M2	13	10	9	53	292	20	97	1085	31	141	2323
M3	15	16½	10	57	346	21	101	1156	32	145	2465
M4	17	24	11	61	404	22	105	1258	33	149	2611
1	21	26	12	65	466	23	109	1364	34	153	2761
2	25	44	13	69	532	24	113	1474	35	157	2876
3	29	70	14	73	581	25	117	1588	36	161	3034
4	33	100	15	77	655	26	121	1706	37	165	3196
5	37	134	16	81	733	27	125	1828	38	169	3362
6	41	172	17	85	815	28	129	1921	39	173	3532
7	45	196	18	89	901	29	133	2051	40	177	3706

Note that there are two slightly different codeword layouts for version 3 micro QR codes, depending on whether the error correction level is L or M.

Model 1 size

Model 1 symbols have the same size as the corresponding model 2 version, but the codeword capacity is slightly different.

Version	Size	Capacity	Version	Size	Capacity	Version	Size	Capacity	Version	Size	Capacity
1	21	25½	5	37	133½	9	53	305½	13	69	541½
2	25	45½	6	41	169½	10	57	357½	14	73	609½
3	29	71½	7	45	211½	11	61	415½
4	33	99½	8	49	255½	12	65	475½

Alignment pattern

Standard QR codes of version 2 or higher contain a grid of alignment markers. The markers are placed in rows and columns given by the table below. The numbering starts at zero and counts from the upper left corner. Three corners of the grid overlap the locator patterns, and those alignment markers are omitted.

Version	Locations	Version	Locations	Version	Locations
2	18	15	6, 26, 48, 70	28	6, 26, 50, 74, 98, 122
3	22	16	6, 26, 50, 74	29	6, 30, 54, 78, 102, 126
4	26	17	6, 30, 54, 78	30	6, 26, 52, 78, 104, 130
5	30	18	6, 30, 56, 82	31	6, 30, 56, 82, 108, 134
6	34	19	6, 30, 58, 86	32	6, 34, 60, 86, 112, 138
7	6, 22, 38	20	6, 34, 62, 90	33	6, 30, 58, 86, 114, 142
8	6, 24, 42	21	6, 28, 50, 72, 94	34	6, 34, 62, 90, 118, 146
9	6, 26, 46	22	6, 26, 50, 74, 98	35	6, 30, 54, 78, 102, 126, 150
10	6, 28, 50	23	6, 30, 54, 78, 102	36	6, 24, 50, 76, 102, 128, 154
11	6, 30, 54	24	6, 28, 54, 80, 106	37	6, 28, 54, 80, 106, 132, 158
12	6, 32, 58	25	6, 32, 58, 84, 110	38	6, 32, 58, 84, 110, 136, 162
13	6, 34, 62	26	6, 30, 58, 86, 114	39	6, 26, 54, 82, 110, 138, 166
14	6, 26, 46, 66	27	6, 34, 62, 90, 118	40	6, 30, 58, 86, 114, 142, 170

Format information

All QR codes use 5 bits of format information, encoded with generator 10100110111 to produce 15 bits.

In standard QR codes, the first two bits give the error correction level, as shown below. The remaining three bits identify the mask pattern for the data area.

ECL	Bits
L	01
M	00
Q	11
H	10

Model 2 codes (the current standard) use the format mask 101010000010010, while model 1 codes use 010100000100101 instead. Note that the first five bits are inverted. This means that, for example, L is represented by two dark modules in model 2, but by two light modules in model 1. The masks were chosen so that valid masked format information from one model is never valid for the other model.

In Micro QR codes, the format information has a different structure. The first three bits give the version and ECL, as shown below; only certain error correction levels are possible for each version. The other two bits select the mask pattern for the data area.

Version	ECL	Bits
M1	L	000
M2	L	001
M2	M	010
M3	L	011
M3	M	100
M4	L	101
M4	M	110
M4	Q	111

Micro QR format information is masked with 100010001000101.

Data masking

Data masking inverts certain modules in the data area based on the formulas below. The variables i and j are the zero-based row and column numbers of a module. The purpose of the masking is to break up large solid blocks of black or white, and to avoid patterns that look confusingly similar to the locator marks. There are 8 different mask patterns available for standard QR codes, and 4 for micro codes.

Mask	Micro	Invert if
000		(i + j) % 2 = 0
001	00	i % 2 = 0
010		j % 3 = 0
011		(i + j) % 3 = 0
100	01	((i / 2) + (j / 3)) % 2 = 0
101		(i * j) % 2 = 0 and (i * j) % 3 = 0
110	10	(i * j + (i * j) % 3) % 2 = 0
111	11	(i + j + (i * j) % 3) % 2 = 0

RS block size

Because the Reed–Solomon coding uses 8-bit codewords, an individual code block cannot exceed 255 codewords in length. Since the larger QR symbols contain much more data than that, it is necessary to break the message up into multiple blocks. The QR specification does not use the largest possible block size, though; instead, it defines the block sizes so that no more than 30 error-correction symbols appear in each block. This means that at most 15 errors per block can be corrected, which limits the complexity of certain steps in the decoding algorithm.

The table below shows the data capacity and block sizes for each error correction level. The parenthesized pairs (n,k) give the total block length n and the number of data codewords in the block k. Capacity is in codewords, and does not take mode indicator overhead (see below) into account so the number of bytes that can be stored is slightly less.

Version	L Data	L Blocks	M Data	M Blocks	Q Data	Q Blocks	H Data	H Blocks
M1	2½	(5,3)
M2	5	(10,5)	4	(10,4)
M3	10½	(17,11)	8½	(17,9)
M4	16	(24,16)	14	(24,14)	10	(24,10)
1	19	(26,19)	16	(26,16)	13	(26,13)	9	(26,9)
2	34	(44,34)	28	(44,28)	22	(44,22)	16	(44,16)
3	55	(70,55)	44	(70,44)	34	2×(35,17)	26	2×(35,13)
4	80	(100,80)	64	2×(50,32)	48	2×(50,24)	36	4×(25,9)
5	108	(134,108)	86	2×(67,43)	62	2×(33,15), 2×(34,16)	46	2×(33,11), 2×(34,12)
6	136	2×(86,68)	108	4×(43,27)	76	4×(43,19)	60	4×(43,15)
7	156	2×(98,78)	124	4×(49,31)	88	2×(32,14), 4×(33,15)	66	4×(39,13), (40,14)
8	194	2×(121,97)	154	2×(60,38), 2×(61,39)	110	4×(40,18), 2×(41,19)	86	4×(40,14), 2×(41,15)
9	232	2×(146,116)	182	3×(58,36), 2×(59,37)	132	4×(36,16), 4×(37,17)	100	4×(36,12), 4×(37,13)
10	274	2×(86,68), 2×(87,69)	216	4×(69,43), (70,44)	154	6×(43,19), 2×(44,20)	122	6×(43,15), 2×(44,16)
11	324	4×(101,81)	254	(80,50), 4×(81,51)	180	4×(50,22), 4×(51,23)	140	3×(36,12), 8×(37,13)
12	370	2×(116,92), 2×(117,93)	290	6×(58,36), 2×(59,37)	206	4×(46,20), 6×(47,21)	158	7×(42,14), 4×(43,15)
13	428	4×(133,107)	334	8×(59,37), (60,38)	244	8×(44,20), 4×(45,21)	180	12×(33,11), 4×(34,12)
14	461	3×(145,115), (146,116)	365	4×(64,40), 5×(65,41)	261	11×(36,16), 5×(37,17)	197	11×(36,12), 5×(37,13)
15	523	5×(109,87), (110,88)	415	5×(65,41), 5×(66,42)	295	5×(54,24), 7×(55,25)	223	11×(36,12), 7×(37,13)
16	589	5×(122,98), (123,99)	453	7×(73,45), 3×(74,46)	325	15×(43,19), 2×(44,20)	253	3×(45,15), 13×(46,16)
17	647	(135,107), 5×(136,108)	507	10×(74,46), (75,47)	367	(50,22), 15×(51,23)	283	2×(42,14), 17×(43,15)
18	721	5×(150,120), (151,121)	563	9×(69,43), 4×(70,44)	397	17×(50,22), (51,23)	313	2×(42,14), 19×(43,15)
19	795	3×(141,113), 4×(142,114)	627	3×(70,44), 11×(71,45)	445	17×(47,21), 4×(48,22)	341	9×(39,13), 16×(40,14)
20	861	3×(135,107), 5×(136,108)	669	3×(67,41), 13×(68,42)	485	15×(54,24), 5×(55,25)	385	15×(43,15), 10×(44,16)
21	932	4×(144,116), 4×(145,117)	714	17×(68,42)	512	17×(50,22), 6×(51,23)	406	19×(46,16), 6×(47,17)
22	1006	2×(139,111), 7×(140,112)	782	17×(74,46)	568	7×(54,24), 16×(55,25)	442	34×(37,13)
23	1094	4×(151,121), 5×(152,122)	860	4×(75,47), 14×(76,48)	614	11×(54,24), 14×(55,25)	464	16×(45,15), 14×(46,16)
24	1174	6×(147,117), 4×(148,118)	914	6×(73,45), 14×(74,46)	664	11×(54,24), 16×(55,25)	514	30×(46,16), 2×(47,17)
25	1276	8×(132,106), 4×(133,107)	1000	8×(75,47), 13×(76,48)	718	7×(54,24), 22×(55,25)	538	22×(45,15), 13×(46,16)
26	1370	10×(142,114), 2×(143,115)	1062	19×(74,46), 4×(75,47)	754	28×(50,22), 6×(51,23)	596	33×(46,16), 4×(47,17)
27	1468	8×(152,122), 4×(153,123)	1128	22×(73,45), 3×(74,46)	808	8×(53,23), 26×(54,24)	628	12×(45,15), 28×(46,16)
28	1531	3×(147,117), 10×(148,118)	1193	3×(73,45), 23×(74,46)	871	4×(54,24), 31×(55,25)	661	11×(45,15), 31×(46,16)
29	1631	7×(146,116), 7×(147,117)	1267	21×(73,45), 7×(74,46)	911	(53,23), 37×(54,24)	701	19×(45,15), 26×(46,16)
30	1735	5×(145,115), 10×(146,116)	1373	19×(75,47), 10×(76,48)	985	15×(54,24), 25×(55,25)	745	23×(45,15), 25×(46,16)
31	1843	13×(145,115), 3×(146,116)	1455	2×(74,46), 29×(75,47)	1033	42×(54,24), (55,25)	793	23×(45,15), 28×(46,16)
32	1955	17×(145,115)	1541	10×(74,46), 23×(75,47)	1115	10×(54,24), 35×(55,25)	845	19×(45,15), 35×(46,16)
33	2071	17×(145,115), (146,116)	1631	14×(74,46), 21×(75,47)	1171	29×(54,24), 19×(55,25)	901	11×(45,15), 46×(46,16)
34	2191	13×(145,115), 6×(146,116)	1725	14×(74,46), 23×(75,47)	1231	44×(54,24), 7×(55,25)	961	59×(46,16), (47,17)
35	2306	12×(151,121), 7×(152,122)	1812	12×(75,47), 26×(76,48)	1286	39×(54,24), 14×(55,25)	986	22×(45,15), 41×(46,16)
36	2434	6×(151,121), 14×(152,122)	1914	6×(75,47), 34×(76,48)	1354	46×(54,24), 10×(55,25)	1054	2×(45,15), 64×(46,16)
37	2566	17×(152,122), 4×(153,123)	1992	29×(74,46), 14×(75,47)	1426	49×(54,24), 10×(55,25)	1096	24×(45,15), 46×(46,16)
38	2702	4×(152,122), 18×(153,123)	2102	13×(74,46), 32×(75,47)	1502	48×(54,24), 14×(55,25)	1142	42×(45,15), 32×(46,16)
39	2812	20×(147,117), 4×(148,118)	2216	40×(75,47), 7×(76,48)	1582	43×(54,24), 22×(55,25)	1222	10×(45,15), 67×(46,16)
40	2956	19×(148,118), 6×(149,119)	2334	18×(75,47), 31×(76,48)	1666	34×(54,24), 34×(55,25)	1276	20×(45,15), 61×(46,16)

Note that version 1 and 3 micro QR codes only store 4 bits for the final data codeword. Four zero bits are appended before processing by the error correction algorithm.

The message is broken into blocks by simply putting the first k codewords in the first block, the next k in the second, and so on.

Interleaving

There may be localized damage to a QR symbol; for example, it may be partially obscured by a smudge. The symbol can still be decoded properly if the allowable number of errors is not exceeded in any of its error correction blocks. For this reason, the blocks are spread out by interleaving their codewords together when the symbol is created. The codeword sequence is built by taking the first codeword from each block, then the second from each, and so on. If one block is exhausted, skip over it and continue placing codewords from the other blocks. After all data codewords have been placed in sequence, the error correction codewords are placed similarly.

For example, a version 5 symbol stores 134 codewords. At error correction level Q, there are 62 data codewords and 72 error correction codewords. The message is broken into blocks by filling each row in the data section of the table below in turn. The first two blocks are one codeword shorter than the last two.

Block	Data codewords																Error correction codewords
1	D₁	D₂	D₃	D₄	D₅	D₆	D₇	D₈	D₉	D₁₀	D₁₁	D₁₂	D₁₃	D₁₄	D₁₅	—	E₁	E₂	E₃	E₄	E₅	E₆	E₇	E₈	E₉	E₁₀	E₁₁	E₁₂	E₁₃	E₁₄	E₁₅	E₁₆	E₁₇	E₁₈
2	D₁₆	D₁₇	D₁₈	D₁₉	D₂₀	D₂₁	D₂₂	D₂₃	D₂₄	D₂₅	D₂₆	D₂₇	D₂₈	D₂₉	D₃₀	—	E₁₉	E₂₀	E₂₁	E₂₂	E₂₃	E₂₄	E₂₅	E₂₆	E₂₇	E₂₈	E₂₉	E₃₀	E₃₁	E₃₂	E₃₃	E₃₄	E₃₅	E₃₆
3	D₃₁	D₃₂	D₃₃	D₃₄	D₃₅	D₃₆	D₃₇	D₃₈	D₃₉	D₄₀	D₄₁	D₄₂	D₄₃	D₄₄	D₄₅	D₄₆	E₃₇	E₃₈	E₃₉	E₄₀	E₄₁	E₄₂	E₄₃	E₄₄	E₄₅	E₄₆	E₄₇	E₄₈	E₄₉	E₅₀	E₅₁	E₅₂	E₅₃	E₅₄
4	D₄₇	D₄₈	D₄₉	D₅₀	D₅₁	D₅₂	D₅₃	D₅₄	D₅₅	D₅₆	D₅₇	D₅₈	D₅₉	D₆₀	D₆₁	D₆₂	E₅₅	E₅₆	E₅₇	E₅₈	E₅₉	E₆₀	E₆₁	E₆₂	E₆₃	E₆₄	E₆₅	E₆₆	E₆₇	E₆₈	E₆₉	E₇₀	E₇₁	E₇₂

Next, error correction codes are calculated for each row of the table. Finally, the table is read column-by-column from left to right, resulting in the following codeword sequence:

D₁ D₁₆ D₃₁ D₄₇ D₂ D₁₇ D₃₂ D₄₈ ... D₁₅ D₃₀ D₄₅ D₆₁ D₄₆ D₆₂ E₁ E₁₉ E₃₇ E₅₅ ... E₁₈ E₃₆ E₅₄ E₇₂

Model 1 block size

Model 1 symbols use different block sizes, as shown below. Only 4 bits are stored for the first codeword, and 4 zero bits are prepended before processing with the error-correction algorithms. Multiple block sizes are not used within a symbol, so there are sometimes leftover codewords.

Version	L Data	L Blocks	M Data	M Blocks	Q Data	Q Blocks	H Data	H Blocks
1	18½	(26,19)	15½	(26,16)	12½	(26,13)	8½	(26,9)
2	35½	(46,36)	29½	(46,30)	23½	(46,24)	15½	(46,16)
3	56½	(72,57)	43½	(72,44)	35½	(72,36)	23½	(72,24)
4	79½	(100,80)	59½	(100,60)	49½	(100,50)	33½	(100,34)
5	107½	(134,108)	81½	(134,82)	67½	(134,68)	45½	2×(67,23)
6	135½	(170,136)	105½	2×(85,53)	85½	2×(85,43)	57½	2×(85,29)
7	169½	(212,170)	131½	2×(106,66)	107½	2×(106,54)	71½	3×(70,24)
8	207½	2×(128,104)	159½	2×(128,80)	127½	2×(128,64)	86½	3×(85,29)
9	245½	2×(153,123)	185½	2×(153,93)	155½	3×(102,52)	101½	3×(102,34)
10	289½	2×(179,145)	221½	2×(179,111)	182½	3×(119,61)	123½	4×(89,31)
11	335½	2×(208,168)	255½	4×(104,64)	207½	4×(104,52)	144½	5×(83,29)
12	383½	2×(238,192)	291½	4×(119,73)	243½	4×(119,61)	164½	5×(95,33)
13	431½	3×(180,144)	331½	4×(135,83)	275½	4×(135,69)	191½	6×(90,32)
14	488½	3×(203,163)	367½	4×(152,92)	309½	5×(122,62)	209½	6×(101,35)

Model 1 QR symbols do not use interleaving. The data codewords for each block are simply concatenated in order, followed by the error correction codewords.

Encoding modes

Standard QR codes use a 4-bit encoding mode indicator followed by a length field. The length field has a variable number of bits, which depends on the mode and the symbol version.

Encoding Mode	Indicator	Length (ver 1 to 9)	Length (ver 10 to 26)	Length (ver 27 to 40)
Numeric	0001	10	12	14
Alphanumeric	0010	9	11	13
Byte	0100	8	16	16
Kanji	1000	8	10	12

After the length, the encoded data bits appear, followed by another mode indicator. This allows different encoding modes to be used for different portions of the message. The special indicator 0000 signals the end of the message, but it can be omitted if there are not enough data bits left in the symbol to store it.

Here is a list of special indicator values. They do not initiate an encoding mode, but are used for other purposes.

Indicator	Meaning
0000	End of message
0011	Structured Append
0101	FNC1 (first position)
0111	Extended Channel Interpretation (ECI)
1001	FNC1 (second position)

Micro QR codes use a different indicator system. Version 1 micro symbols do not use an indicator; only numeric encoding is possible. Version 2 uses a 1-bit indicator, and only the numeric and alphanumeric encodings can be used. Versions 3 and 4 can use all four encoding modes.

Encoding Mode	V1 Length	V2 Indicator	V2 Length	V3 Indicator	V3 Length	V4 Indicator	V4 Length
Numeric	3	0	4	00	5	000	6
Alphanumeric	—	1	3	01	4	001	5
Byte	—	—	—	10	4	010	5
Kanji	—	—	—	11	3	011	4

End-of-message can be signaled with a numeric field of length zero.

Numeric encoding

In numeric encoding, each group of three digits is encoded with ten bits. If the number of digits is not a multiple of three, the last group is shortened. If there is one leftover digit, it is encoded in four bits. If there are two digits in the last group, it is stored with seven bits.

Alphanumeric encoding

The alphanumeric encoding stores 45 different characters, including numbers, upper-case letters, space, and a few symbols, as shown in the table below.

Code	Char	Code	Char	Code	Char	Code	Char	Code	Char
0	0	9	9	18	I	27	R	36	SP
1	1	10	A	19	J	28	S	37	$
2	2	11	B	20	K	29	T	38	%
3	3	12	C	21	L	30	U	39	*
4	4	13	D	22	M	31	V	40	+
5	5	14	E	23	N	32	W	41	-
6	6	15	F	24	O	33	X	42	.
7	7	16	G	25	P	34	Y	43	/
8	8	17	H	26	Q	35	Z	44	:

Two characters A and B are combined with the following formula, and the result is stored in an 11-bit value.

45 × A + B

If there are an odd number of characters to be encoded, the last one is stored by itself in a 6-bit field.

Byte encoding

This encoding simply stores a sequence of 8-bit bytes. According to the QR 2005 standard, the ISO Latin-1 encoding is used by default, but in previous standards the default was JIS-8 (the 8-bit subset of Shift JIS). ECI markers are used to select a different encoding; unfortunately, many QR code readers do not understand ECI markers, so encoders frequently do not include them. In practice, QR codes that store characters outside the basic ASCII set often use UTF-8 or various language-specific encodings, and the decoder must guess which encoding is being used.

ECI

After the ECI indicator are one, two, or three bytes that specify the Extended Channel Interpretation to be used. Examine the high bits of the first data byte to determine the number of data bytes present.

Initial byte	Number of bytes	Maximum value stored
0xxxxxxx	1	000127
10xxxxxx	2	016383
110xxxxx	3	999999

Here is a partial list of defined ECI codes.

ECI	Encoding	ECI	Encoding	ECI	Encoding
2	Code page 437 (original MS-DOS character set)	12	ISO-8859-10 (Latin-6 Nordic)	23	Windows-1252 (English/Western)
3	ISO-8859-1 (Latin-1 Western European)	13	ISO-8859-11 (Thai)	24	Windows-1256 (Arabic)
4	ISO-8859-2 (Latin-2 Eastern European)	14	Reserved for ISO-8859-12, which was abandoned	25	UTF-16BE
5	ISO-8859-3 (Latin-3 Southern European)	15	ISO-8859-13 (Latin-7 Baltic Rim)	26	UTF-8
6	ISO-8859-4 (Latin-4 Northern European)	16	ISO-8859-14 (Latin-8 Celtic)	27	ASCII
7	ISO-8859-5 (Cyrillic)	17	ISO-8859-15 (Latin-9 Western European variant)	28	Big5 (Chinese)
8	ISO-8859-6 (Arabic)	18	ISO-8859-16 (Latin-10 South-Eastern European)	29	GB 2312 (Chinese)
9	ISO-8859-7 (Greek)	20	Shift JIS (Japanese)	30	EUC-KR (Korean)
10	ISO-8859-8 (Hebrew)	21	Windows-1250 (Central and Eastern European)
11	ISO-8859-9 (Latin-5 Turkish)	22	Windows-1251 (Cyrillic)

This list is based on information from the ZXing project.

Kanji encoding

QR codes were developed in Japan and have special support for storing Japanese text. The Kanji encoding uses 13 bits for each Kanji character stored. It is based on the Shift JIS encoding, which uses two bytes per kanji character. The first Shift JIS byte is in the range 0x81 to 0x9f or 0xe0 to 0xef, and the second is 0x40 to 0xfc. The encoded value in the QR code is as follows.

(A–0x81)×0xc0 + (B–0x40) if 0x81≤A≤0x9f

(A–0xc1)×0xc0 + (B–0x40) if 0xe0≤A≤0xeb

Note that characters higher than 0xeb 0xbf cannot be encoded with this method. This means that the JIS X 0208 character set can be encoded, but not the JIS X 0213 extension.

Structured append

Structured append mode allows a long message to be split between several separate QR symbols. The indicator is followed by a 4-bit field that gives the number of this symbol within the set, counting from 0. Next is another 4-bit field which gives the number of the last symbol in the set (equal to the number of symbols minus 1). Finally, there is an 8-bit exclusive-or checksum of the bytes in the complete message. (For kanji data, the bytes of the ShiftJIS encoding are checksummed.) Each symbol in the message contains the same checksum value, and this helps ensure that parts from different messages are not mixed up when decoding.

Reed-Solomon

Universal Reed-Solomon Codec

In the main article, a simple Reed-Solomon codec was described, but for the sake of simplicity, some non essential parameters were hidden. These parameters do not affect the capabilities of the decoder, in fact they just shift the internal values, but the structure remains totally equivalent, and so the resulting repairing capability is the same (ie, the Singleton bound still applies).

A universal Reed-Solomon codec is a Reed-Solomon encoder/decoder which allows to setup those hidden parameters. This allows to decode the RS codes generated by (almost) any other Reed-Solomon codec, however it was implemented.

Here is the list of the non essential, shifting parameters:

fcr: first consecutive root, this allows to specify the value to "jump" when computing the sequential terms of the generator polynomial. Usually set at 0, but bounded between 0 and the Galois Field's cardinality (eg, 255 for GF(2^8)).
generator: the integer value that will replace alpha in the standard academic nomenclatura. This must be a prime integer, such as 2, 3, 5, etc. In practice, what is the generator? You know, when we use the log/antilog tables to do our computations, from what are these tables generated from? You guessed it: the generator number. Usually, it's set to 2, so the log/antilog tables are generated by computing the sequence of powers: 0, 2^0, 2^1, ..., 2^p. We can then almost entirely work with these log/antilog tables, except at the first stage of encoding, and at the first and last stage of decoding, because we need to convert back and forth from/to the input numbers to the log/antilog tables values. So basically, you will see we need to supply the generator to only a few functions that needs to convert from or to the log space.
prim: the primitive polynomial to compute the reductions in the field. Warning: you cannot use just any number! If prim is wrong, you will get an infinite loop when trying to compute. Prim is always an integer between field's cardinality and its double (eg, for GF(2^8) prim must be between 256 and 512), but it must be irreducible, which depends on the chosen generator. So Prim depends on both the generator and c_exp.

To compute the list of possible primitive polynomials for the chosen generator and field's exponent, you can use the function:

def find_prime_polys(generator=2, c_exp=8, fast_primes=False, single=False):

Beware that for c_exp > 14, the process may be so long that it never finishes! This is because finding the prime polynomials is a NP-Hard problem, and this function uses an exhaustive search (so the result is always correct but the time complexity is exponential). But in practice, if you just need one primitive polynomial to make the RS codec work (and not the list of all possible prim polynomials), you can set fast_primes=True and single=True, this will return you the first primitive polynomial the function can find, which will usually be computed in just a few seconds.

This universal Reed-Solomon codec goes further by allowing to work in any GF(2^p) field, by providing these parameters:

c_exp: the Galois Field's exponent (ie, the p in GF(2^p)). For example, c_exp=16 means that you will work in GF(2^16) == GF(65536)
field_charac: an internal variable you don't have to modify. It just computes the full range/cardinality of the field minus 1, eg, for GF(2^16), field_charac = 65535.
field_charac_full: just like field_charac but +1 (ie, the full field's range including 0).

Here is the code of the universal Reed-Solomon codec. You can compare it with a diff to the code presented in the main article to see the differences:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

###################################################
### Universal Reed-Solomon Codec
### initially released at Wikiversity
###################################################

################### INIT and stuff ###################

try: # compatibility with Python 3+
    xrange
except NameError:
    xrange = range

class ReedSolomonError(Exception):
    pass

gf_exp = [0] * 512 # For efficiency, gf_exp[] has size 2*GF_SIZE, so that a simple multiplication of two numbers can be resolved without calling % 255. For more infos on how to generate this extended exponentiation table, see paper: "Fast software implementation of finite field operations", Cheng Huang and Lihao Xu, Washington University in St. Louis, Tech. Rep (2003).
gf_log = [0] * 256
field_charac = int(2**8 - 1)


################### GALOIS FIELD ELEMENTS MATHS ###################

def rwh_primes1(n):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    ''' Returns  a list of primes < n '''
    sieve = [True] * (n/2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]

def find_prime_polys(generator=2, c_exp=8, fast_primes=False, single=False):
    '''Compute the list of prime polynomials for the given generator and galois field characteristic exponent.'''
    # fast_primes will output less results but will be significantly faster.
    # single will output the first prime polynomial found, so if all you want is to just find one prime polynomial to generate the LUT for Reed-Solomon to work, then just use that.

    # A prime polynomial (necessarily irreducible) is necessary to reduce the multiplications in the Galois Field, so as to avoid overflows.
    # Why do we need a "prime polynomial"? Can't we just reduce modulo 255 (for GF(2^8) for example)? Because we need the values to be unique.
    # For example: if the generator (alpha) = 2 and c_exp = 8 (GF(2^8) == GF(256)), then the generated Galois Field (0, 1, a, a^1, a^2, ..., a^(p-1)) will be galois field it becomes 0, 1, 2, 4, 8, 16, etc. However, upon reaching 128, the next value will be doubled (ie, next power of 2), which will give 256. Then we must reduce, because we have overflowed above the maximum value of 255. But, if we modulo 255, this will generate 256 == 1. Then 2, 4, 8, 16, etc. giving us a repeating pattern of numbers. This is very bad, as it's then not anymore a bijection (ie, a non-zero value doesn't have a unique index). That's why we can't just modulo 255, but we need another number above 255, which is called the prime polynomial.
    # Why so much hassle? Because we are using precomputed look-up tables for multiplication: instead of multiplying a*b, we precompute alpha^a, alpha^b and alpha^(a+b), so that we can just use our lookup table at alpha^(a+b) and get our result. But just like in our original field we had 0,1,2,...,p-1 distinct unique values, in our "LUT" field using alpha we must have unique distinct values (we don't care that they are different from the original field as long as they are unique and distinct). That's why we need to avoid duplicated values, and to avoid duplicated values we need to use a prime irreducible polynomial.

    # Here is implemented a bruteforce approach to find all these prime polynomials, by generating every possible prime polynomials (ie, every integers between field_charac+1 and field_charac*2), and then we build the whole Galois Field, and we reject the candidate prime polynomial if it duplicates even one value or if it generates a value above field_charac (ie, cause an overflow).
    # Note that this algorithm is slow if the field is too big (above 12), because it's an exhaustive search algorithm. There are probabilistic approaches, and almost surely prime approaches, but there is no determistic polynomial time algorithm to find irreducible monic polynomials. More info can be found at: http://people.mpi-inf.mpg.de/~csaha/lectures/lec9.pdf
    # Another faster algorithm may be found at Adleman, Leonard M., and Hendrik W. Lenstra. "Finding irreducible polynomials over finite fields." Proceedings of the eighteenth annual ACM symposium on Theory of computing. ACM, 1986.

    # Prepare the finite field characteristic (2^p - 1), this also represent the maximum possible value in this field
    root_charac = 2 # we're in GF(2)
    field_charac = int(root_charac**c_exp - 1)
    field_charac_next = int(root_charac**(c_exp+1) - 1)

    prim_candidates = []
    if fast_primes:
        prim_candidates = rwh_primes1(field_charac_next) # generate maybe prime polynomials and check later if they really are irreducible
        prim_candidates = [x for x in prim_candidates if x > field_charac] # filter out too small primes
    else:
        prim_candidates = xrange(field_charac+2, field_charac_next, root_charac) # try each possible prime polynomial, but skip even numbers (because divisible by 2 so necessarily not irreducible)

    # Start of the main loop
    correct_primes = []
    for prim in prim_candidates: # try potential candidates primitive irreducible polys
        seen = [0] * (field_charac+1) # memory variable to indicate if a value was already generated in the field (value at index x is set to 1) or not (set to 0 by default)
        conflict = False # flag to know if there was at least one conflict

        # Second loop, build the whole Galois Field
        x = 1
        for i in xrange(field_charac):
            # Compute the next value in the field (ie, the next power of alpha/generator)
            x = gf_mult_noLUT(x, generator, prim, field_charac+1)

            # Rejection criterion: if the value overflowed (above field_charac) or is a duplicate of a previously generated power of alpha, then we reject this polynomial (not prime)
            if x > field_charac or seen[x] == 1:
                conflict = True
                break
            # Else we flag this value as seen (to maybe detect future duplicates), and we continue onto the next power of alpha
            else:
                seen[x] = 1

        # End of the second loop: if there's no conflict (no overflow nor duplicated value), this is a prime polynomial!
        if not conflict: 
            correct_primes.append(prim)
            if single: return prim

    # Return the list of all prime polynomials
    return correct_primes # you can use the following to print the hexadecimal representation of each prime polynomial: print [hex(i) for i in correct_primes]

def init_tables(prim=0x11d, generator=2, c_exp=8):
    '''Precompute the logarithm and anti-log tables for faster computation later, using the provided primitive polynomial.
    These tables are used for multiplication/division since addition/substraction are simple XOR operations inside GF of characteristic 2.
    The basic idea is quite simple: since b**(log_b(x), log_b(y)) == x * y given any number b (the base or generator of the logarithm), then we can use any number b to precompute logarithm and anti-log (exponentiation) tables to use for multiplying two numbers x and y.
    That's why when we use a different base/generator number, the log and anti-log tables are drastically different, but the resulting computations are the same given any such tables.
    For more infos, see https://en.wikipedia.org/wiki/Finite_field_arithmetic#Implementation_tricks
    '''
    # generator is the generator number (the "increment" that will be used to walk through the field by multiplication, this must be a prime number). This is basically the base of the logarithm/anti-log tables. Also often noted "alpha" in academic books.
    # prim is the primitive/prime (binary) polynomial and must be irreducible (ie, it can't represented as the product of two smaller polynomials). It's a polynomial in the binary sense: each bit is a coefficient, but in fact it's an integer between field_charac+1 and field_charac*2, and not a list of gf values. The prime polynomial will be used to reduce the overflows back into the range of the Galois Field without duplicating values (all values should be unique). See the function find_prime_polys() and: http://research.swtch.com/field and http://www.pclviewer.com/rs2/galois.html
    # note that the choice of generator or prime polynomial doesn't matter very much: any two finite fields of size p^n have identical structure, even if they give the individual elements different names (ie, the coefficients of the codeword will be different, but the final result will be the same: you can always correct as many errors/erasures with any choice for those parameters). That's why it makes sense to refer to all the finite fields, and all decoders based on Reed-Solomon, of size p^n as one concept: GF(p^n). It can however impact sensibly the speed (because some parameters will generate sparser tables).
    # c_exp is the exponent for the field's characteristic GF(2^c_exp)

    global gf_exp, gf_log, field_charac
    field_charac = int(2**c_exp - 1)
    gf_exp = [0] * (field_charac * 2) # anti-log (exponential) table. The first two elements will always be [GF256int(1), generator]
    gf_log = [0] * (field_charac+1) # log table, log[0] is impossible and thus unused

    # For each possible value in the galois field 2^8, we will pre-compute the logarithm and anti-logarithm (exponential) of this value
    # To do that, we generate the Galois Field F(2^p) by building a list starting with the element 0 followed by the (p-1) successive powers of the generator a : 1, a, a^1, a^2, ..., a^(p-1).
    x = 1
    for i in xrange(field_charac): # we could skip index 255 which is equal to index 0 because of modulo: g^255==g^0 but either way, this does not change the later outputs (ie, the ecc symbols will be the same either way)
        gf_exp[i] = x # compute anti-log for this value and store it in a table
        gf_log[x] = i # compute log at the same time
        x = gf_mult_noLUT(x, generator, prim, field_charac+1)

        # If you use only generator==2 or a power of 2, you can use the following which is faster than gf_mult_noLUT():
        #x <<= 1 # multiply by 2 (change 1 by another number y to multiply by a power of 2^y)
        #if x & 0x100: # similar to x >= 256, but a lot faster (because 0x100 == 256)
            #x ^= prim # substract the primary polynomial to the current value (instead of 255, so that we get a unique set made of coprime numbers), this is the core of the tables generation

    # Optimization: double the size of the anti-log table so that we don't need to mod 255 to stay inside the bounds (because we will mainly use this table for the multiplication of two GF numbers, no more).
    for i in xrange(field_charac, field_charac * 2):
        gf_exp[i] = gf_exp[i - field_charac]

    return [gf_log, gf_exp]

def gf_add(x, y):
    return x ^ y

def gf_sub(x, y):
    return x ^ y # in binary galois field, substraction is just the same as addition (since we mod 2)

def gf_neg(x):
    return x

def gf_mul(x, y):
    if x == 0 or y == 0:
        return 0
    return gf_exp[(gf_log[x] + gf_log[y]) % field_charac]

def gf_div(x, y):
    if y == 0:
        raise ZeroDivisionError()
    if x == 0:
        return 0
    return gf_exp[(gf_log[x] + field_charac - gf_log[y]) % field_charac]

def gf_pow(x, power):
    return gf_exp[(gf_log[x] * power) % field_charac]

def gf_inverse(x):
    return gf_exp[field_charac - gf_log[x]] # gf_inverse(x) == gf_div(1, x)

def gf_mult_noLUT(x, y, prim=0, field_charac_full=256, carryless=True):
    '''Galois Field integer multiplication using Russian Peasant Multiplication algorithm (faster than the standard multiplication + modular reduction).
    If prim is 0 and carryless=False, then the function produces the result for a standard integers multiplication (no carry-less arithmetics nor modular reduction).'''
    r = 0
    while y: # while y is above 0
        if y & 1: r = r ^ x if carryless else r + x # y is odd, then add the corresponding x to r (the sum of all x's corresponding to odd y's will give the final product). Note that since we're in GF(2), the addition is in fact an XOR (very important because in GF(2) the multiplication and additions are carry-less, thus it changes the result!).
        y = y >> 1 # equivalent to y // 2
        x = x << 1 # equivalent to x*2
        if prim > 0 and x & field_charac_full: x = x ^ prim # GF modulo: if x >= 256 then apply modular reduction using the primitive polynomial (we just substract, but since the primitive number can be above 256 then we directly XOR).

    return r


################### GALOIS FIELD POLYNOMIALS MATHS ###################

def gf_poly_scale(p, x):
    return [gf_mul(p[i], x) for i in xrange(len(p))]

def gf_poly_add(p,q):
    r = [0] * max(len(p),len(q))
    for i in xrange(len(p)):
        r[i+len(r)-len(p)] = p[i]
    for i in xrange(len(q)):
        r[i+len(r)-len(q)] ^= q[i]
    return r

def gf_poly_mul(p, q):
    '''Multiply two polynomials, inside Galois Field (but the procedure is generic). Optimized function by precomputation of log.'''
    # Pre-allocate the result array
    r = [0] * (len(p) + len(q) - 1)
    # Precompute the logarithm of p
    lp = [gf_log[p[i]] for i in xrange(len(p))]
    # Compute the polynomial multiplication (just like the outer product of two vectors, we multiply each coefficients of p with all coefficients of q)
    for j in xrange(len(q)):
        qj = q[j] # optimization: load the coefficient once
        if qj != 0: # log(0) is undefined, we need to check that
            lq = gf_log[qj] # Optimization: precache the logarithm of the current coefficient of q
            for i in xrange(len(p)):
                if p[i] != 0: # log(0) is undefined, need to check that...
                    r[i + j] ^= gf_exp[lp[i] + lq] # equivalent to: r[i + j] = gf_add(r[i+j], gf_mul(p[i], q[j]))
    return r

def gf_poly_mul_simple(p, q): # simple equivalent way of multiplying two polynomials without precomputation, but thus it's slower
    '''Multiply two polynomials, inside Galois Field'''
    # Pre-allocate the result array
    r = [0] * (len(p) + len(q) - 1)
    # Compute the polynomial multiplication (just like the outer product of two vectors, we multiply each coefficients of p with all coefficients of q)
    for j in xrange(len(q)):
        for i in xrange(len(p)):
            r[i + j] ^= gf_mul(p[i], q[j]) # equivalent to: r[i + j] = gf_add(r[i+j], gf_mul(p[i], q[j])) -- you can see it's your usual polynomial multiplication
    return r

def gf_poly_neg(poly):
    '''Returns the polynomial with all coefficients negated. In GF(2^p), negation does not change the coefficient, so we return the polynomial as-is.'''
    return poly

def gf_poly_div(dividend, divisor):
    '''Fast polynomial division by using Extended Synthetic Division and optimized for GF(2^p) computations
    (doesn't work with standard polynomials outside of this galois field, see the Wikipedia article for generic algorithm).'''
    # CAUTION: this function expects polynomials to follow the opposite convention at decoding:
    # the terms must go from the biggest to lowest degree (while most other functions here expect
    # a list from lowest to biggest degree). eg: 1 + 2x + 5x^2 = [5, 2, 1], NOT [1, 2, 5]

    msg_out = list(dividend) # Copy the dividend list and pad with 0 where the ecc bytes will be computed
    #normalizer = divisor[0] # precomputing for performance
    for i in xrange(len(dividend) - (len(divisor)-1)):
        #msg_out[i] /= normalizer # for general polynomial division (when polynomials are non-monic), the usual way of using
                                  # synthetic division is to divide the divisor g(x) with its leading coefficient, but not needed here.
        coef = msg_out[i] # precaching
        if coef != 0: # log(0) is undefined, so we need to avoid that case explicitly (and it's also a good optimization).
            for j in xrange(1, len(divisor)): # in synthetic division, we always skip the first coefficient of the divisior,
                                              # because it's only used to normalize the dividend coefficient
                if divisor[j] != 0: # log(0) is undefined
                    msg_out[i + j] ^= gf_mul(divisor[j], coef) # equivalent to the more mathematically correct
                                                               # (but xoring directly is faster): msg_out[i + j] += -divisor[j] * coef

    # The resulting msg_out contains both the quotient and the remainder, the remainder being the size of the divisor
    # (the remainder has necessarily the same degree as the divisor -- not length but degree == length-1 -- since it's
    # what we couldn't divide from the dividend), so we compute the index where this separation is, and return the quotient and remainder.
    separator = -(len(divisor)-1)
    return msg_out[:separator], msg_out[separator:] # return quotient, remainder.

def gf_poly_eval(poly, x):
    '''Evaluates a polynomial in GF(2^p) given the value for x. This is based on Horner's scheme for maximum efficiency.'''
    y = poly[0]
    for i in xrange(1, len(poly)):
        y = gf_mul(y, x) ^ poly[i]
    return y


################### REED-SOLOMON ENCODING ###################

def rs_generator_poly(nsym, fcr=0, generator=2):
    '''Generate an irreducible generator polynomial (necessary to encode a message into Reed-Solomon)'''
    g = [1]
    for i in xrange(nsym):
        g = gf_poly_mul(g, [1, gf_pow(generator, i+fcr)])
    return g

def rs_generator_poly_all(max_nsym, fcr=0, generator=2):
    '''Generate all irreducible generator polynomials up to max_nsym (usually you can use n, the length of the message+ecc). Very useful to reduce processing time if you want to encode using variable schemes and nsym rates.'''
    g_all = {}
    g_all[0] = g_all[1] = [1]
    for nsym in xrange(max_nsym):
        g_all[nsym] = rs_generator_poly(nsym, fcr, generator)
    return g_all

def rs_simple_encode_msg(msg_in, nsym, fcr=0, generator=2, gen=None):
    '''Simple Reed-Solomon encoding (mainly an example for you to understand how it works, because it's slower than the inlined function below)'''
    global field_charac
    if (len(msg_in) + nsym) > field_charac: raise ValueError("Message is too long (%i when max is %i)" % (len(msg_in)+nsym, field_charac))
    if gen is None: gen = rs_generator_poly(nsym, fcr, generator)

    # Pad the message, then divide it by the irreducible generator polynomial
    _, remainder = gf_poly_div(msg_in + [0] * (len(gen)-1), gen)
    # The remainder is our RS code! Just append it to our original message to get our full codeword (this represents a polynomial of max 256 terms)
    msg_out = msg_in + remainder
    # Return the codeword
    return msg_out

def rs_encode_msg(msg_in, nsym, fcr=0, generator=2, gen=None):
    '''Reed-Solomon main encoding function, using polynomial division (Extended Synthetic Division, the fastest algorithm available to my knowledge), better explained at http://research.swtch.com/field'''
    global field_charac
    if (len(msg_in) + nsym) > field_charac: raise ValueError("Message is too long (%i when max is %i)" % (len(msg_in)+nsym, field_charac))
    if gen is None: gen = rs_generator_poly(nsym, fcr, generator)
    # Init msg_out with the values inside msg_in and pad with len(gen)-1 bytes (which is the number of ecc symbols).
    msg_out = [0] * (len(msg_in) + len(gen)-1)
    # Initializing the Synthetic Division with the dividend (= input message polynomial)
    msg_out[:len(msg_in)] = msg_in

    # Synthetic division main loop
    for i in xrange(len(msg_in)):
        # Note that it's msg_out here, not msg_in. Thus, we reuse the updated value at each iteration
        # (this is how Synthetic Division works: instead of storing in a temporary register the intermediate values,
        # we directly commit them to the output).
        coef = msg_out[i]

        # log(0) is undefined, so we need to manually check for this case.
        if coef != 0:
            # in synthetic division, we always skip the first coefficient of the divisior, because it's only used to normalize the dividend coefficient (which is here useless since the divisor, the generator polynomial, is always monic)
            for j in xrange(1, len(gen)):
                #if gen[j] != 0: # log(0) is undefined so we need to check that, but it slow things down in fact and it's useless in our case (reed-solomon encoding) since we know that all coefficients in the generator are not 0
                msg_out[i+j] ^= gf_mul(gen[j], coef) # equivalent to msg_out[i+j] += gf_mul(gen[j], coef)

    # At this point, the Extended Synthetic Divison is done, msg_out contains the quotient in msg_out[:len(msg_in)]
    # and the remainder in msg_out[len(msg_in):]. Here for RS encoding, we don't need the quotient but only the remainder
    # (which represents the RS code), so we can just overwrite the quotient with the input message, so that we get
    # our complete codeword composed of the message + code.
    msg_out[:len(msg_in)] = msg_in

    return msg_out


################### REED-SOLOMON DECODING ###################

def rs_calc_syndromes(msg, nsym, fcr=0, generator=2):
    '''Given the received codeword msg and the number of error correcting symbols (nsym), computes the syndromes polynomial.
    Mathematically, it's essentially equivalent to a Fourrier Transform (Chien search being the inverse).
    '''
    # Note the "[0] +" : we add a 0 coefficient for the lowest degree (the constant). This effectively shifts the syndrome, and will shift every computations depending on the syndromes (such as the errors locator polynomial, errors evaluator polynomial, etc. but not the errors positions).
    # This is not necessary, you can adapt subsequent computations to start from 0 instead of skipping the first iteration (ie, the often seen range(1, n-k+1)),
    synd = [0] * nsym
    for i in xrange(nsym):
        synd[i] = gf_poly_eval(msg, gf_pow(generator, i+fcr))
    return [0] + synd # pad with one 0 for mathematical precision (else we can end up with weird calculations sometimes)

def rs_correct_errata(msg_in, synd, err_pos, fcr=0, generator=2): # err_pos is a list of the positions of the errors/erasures/errata
    '''Forney algorithm, computes the values (error magnitude) to correct the input message.'''
    global field_charac
    # calculate errata locator polynomial to correct both errors and erasures (by combining the errors positions given by the error locator polynomial found by BM with the erasures positions given by caller)
    coef_pos = [len(msg_in) - 1 - p for p in err_pos] # need to convert the positions to coefficients degrees for the errata locator algo to work (eg: instead of [0, 1, 2] it will become [len(msg)-1, len(msg)-2, len(msg) -3])
    err_loc = rs_find_errata_locator(coef_pos, generator)
    # calculate errata evaluator polynomial (often called Omega or Gamma in academic papers)
    err_eval = rs_find_error_evaluator(synd[::-1], err_loc, len(err_loc)-1)[::-1]

    # Second part of Chien search to get the error location polynomial X from the error positions in err_pos (the roots of the error locator polynomial, ie, where it evaluates to 0)
    X = [] # will store the position of the errors
    for i in xrange(len(coef_pos)):
        l = field_charac - coef_pos[i]
        X.append( gf_pow(generator, -l) )

    # Forney algorithm: compute the magnitudes
    E = [0] * (len(msg_in)) # will store the values that need to be corrected (substracted) to the message containing errors. This is sometimes called the error magnitude polynomial.
    Xlength = len(X)
    for i, Xi in enumerate(X):

        Xi_inv = gf_inverse(Xi)

        # Compute the formal derivative of the error locator polynomial (see Blahut, Algebraic codes for data transmission, pp 196-197).
        # the formal derivative of the errata locator is used as the denominator of the Forney Algorithm, which simply says that the ith error value is given by error_evaluator(gf_inverse(Xi)) / error_locator_derivative(gf_inverse(Xi)). See Blahut, Algebraic codes for data transmission, pp 196-197.
        err_loc_prime_tmp = []
        for j in xrange(Xlength):
            if j != i:
                err_loc_prime_tmp.append( gf_sub(1, gf_mul(Xi_inv, X[j])) )
        # compute the product, which is the denominator of the Forney algorithm (errata locator derivative)
        err_loc_prime = 1
        for coef in err_loc_prime_tmp:
            err_loc_prime = gf_mul(err_loc_prime, coef)
        # equivalent to: err_loc_prime = functools.reduce(gf_mul, err_loc_prime_tmp, 1)

        # Compute y (evaluation of the errata evaluator polynomial)
        # This is a more faithful translation of the theoretical equation contrary to the old forney method. Here it is an exact reproduction:
        # Yl = omega(Xl.inverse()) / prod(1 - Xj*Xl.inverse()) for j in len(X)
        y = gf_poly_eval(err_eval[::-1], Xi_inv) # numerator of the Forney algorithm (errata evaluator evaluated)
        y = gf_mul(gf_pow(Xi, 1-fcr), y) # adjust to fcr parameter
        
        # Check: err_loc_prime (the divisor) should not be zero.
        if err_loc_prime == 0:
            raise ReedSolomonError("Could not find error magnitude")    # Could not find error magnitude

        # Compute the magnitude
        magnitude = gf_div(y, err_loc_prime) # magnitude value of the error, calculated by the Forney algorithm (an equation in fact): dividing the errata evaluator with the errata locator derivative gives us the errata magnitude (ie, value to repair) the ith symbol
        E[err_pos[i]] = magnitude # store the magnitude for this error into the magnitude polynomial

    # Apply the correction of values to get our message corrected! (note that the ecc bytes also gets corrected!)
    # (this isn't the Forney algorithm, we just apply the result of decoding here)
    msg_in = gf_poly_add(msg_in, E) # equivalent to Ci = Ri - Ei where Ci is the correct message, Ri the received (senseword) message, and Ei the errata magnitudes (minus is replaced by XOR since it's equivalent in GF(2^p)). So in fact here we substract from the received message the errors magnitude, which logically corrects the value to what it should be.
    return msg_in

def rs_find_error_locator(synd, nsym, erase_loc=None, erase_count=0):
    '''Find error/errata locator and evaluator polynomials with Berlekamp-Massey algorithm'''
    # The idea is that BM will iteratively estimate the error locator polynomial.
    # To do this, it will compute a Discrepancy term called Delta, which will tell us if the error locator polynomial needs an update or not
    # (hence why it's called discrepancy: it tells us when we are getting off board from the correct value).

    # Init the polynomials
    if erase_loc: # if the erasure locator polynomial is supplied, we init with its value, so that we include erasures in the final locator polynomial
        err_loc = list(erase_loc)
        old_loc = list(erase_loc)
    else:
        err_loc = [1] # This is the main variable we want to fill, also called Sigma in other notations or more formally the errors/errata locator polynomial.
        old_loc = [1] # BM is an iterative algorithm, and we need the errata locator polynomial of the previous iteration in order to update other necessary variables.
    #L = 0 # update flag variable, not needed here because we use an alternative equivalent way of checking if update is needed (but using the flag could potentially be faster depending on if using length(list) is taking linear time in your language, here in Python it's constant so it's as fast.

    # Fix the syndrome shifting: when computing the syndrome, some implementations may prepend a 0 coefficient for the lowest degree term (the constant). This is a case of syndrome shifting, thus the syndrome will be bigger than the number of ecc symbols (I don't know what purpose serves this shifting). If that's the case, then we need to account for the syndrome shifting when we use the syndrome such as inside BM, by skipping those prepended coefficients.
    # Another way to detect the shifting is to detect the 0 coefficients: by definition, a syndrome does not contain any 0 coefficient (except if there are no errors/erasures, in this case they are all 0). This however doesn't work with the modified Forney syndrome, which set to 0 the coefficients corresponding to erasures, leaving only the coefficients corresponding to errors.
    synd_shift = 0
    if len(synd) > nsym: synd_shift = len(synd) - nsym

    for i in xrange(nsym-erase_count): # generally: nsym-erase_count == len(synd), except when you input a partial erase_loc and using the full syndrome instead of the Forney syndrome, in which case nsym-erase_count is more correct (len(synd) will fail badly with IndexError).
        if erase_loc: # if an erasures locator polynomial was provided to init the errors locator polynomial, then we must skip the FIRST erase_count iterations (not the last iterations, this is very important!)
            K = erase_count+i+synd_shift
        else: # if erasures locator is not provided, then either there's no erasures to account or we use the Forney syndromes, so we don't need to use erase_count nor erase_loc (the erasures have been trimmed out of the Forney syndromes).
            K = i+synd_shift

        # Compute the discrepancy Delta
        # Here is the close-to-the-books operation to compute the discrepancy Delta: it's a simple polynomial multiplication of error locator with the syndromes, and then we get the Kth element.
        #delta = gf_poly_mul(err_loc[::-1], synd)[K] # theoretically it should be gf_poly_add(synd[::-1], [1])[::-1] instead of just synd, but it seems it's not absolutely necessary to correctly decode.
        # But this can be optimized: since we only need the Kth element, we don't need to compute the polynomial multiplication for any other element but the Kth. Thus to optimize, we compute the polymul only at the item we need, skipping the rest (avoiding a nested loop, thus we are linear time instead of quadratic).
        # This optimization is actually described in several figures of the book "Algebraic codes for data transmission", Blahut, Richard E., 2003, Cambridge university press.
        delta = synd[K]
        for j in xrange(1, len(err_loc)):
            delta ^= gf_mul(err_loc[-(j+1)], synd[K - j]) # delta is also called discrepancy. Here we do a partial polynomial multiplication (ie, we compute the polynomial multiplication only for the term of degree K). Should be equivalent to brownanrs.polynomial.mul_at().
        #print "delta", K, delta, list(gf_poly_mul(err_loc[::-1], synd)) # debugline

        # Shift polynomials to compute the next degree
        old_loc = old_loc + [0]

        # Iteratively estimate the errata locator and evaluator polynomials
        if delta != 0: # Update only if there's a discrepancy
            if len(old_loc) > len(err_loc): # Rule B (rule A is implicitly defined because rule A just says that we skip any modification for this iteration)
            #if 2*L <= K+erase_count: # equivalent to len(old_loc) > len(err_loc), as long as L is correctly computed
                # Computing errata locator polynomial Sigma
                new_loc = gf_poly_scale(old_loc, delta)
                old_loc = gf_poly_scale(err_loc, gf_inverse(delta)) # effectively we are doing err_loc * 1/delta = err_loc // delta
                err_loc = new_loc
                # Update the update flag
                #L = K - L # the update flag L is tricky: in Blahut's schema, it's mandatory to use `L = K - L - erase_count` (and indeed in a previous draft of this function, if you forgot to do `- erase_count` it would lead to correcting only 2*(errors+erasures) <= (n-k) instead of 2*errors+erasures <= (n-k)), but in this latest draft, this will lead to a wrong decoding in some cases where it should correctly decode! Thus you should try with and without `- erase_count` to update L on your own implementation and see which one works OK without producing wrong decoding failures.

            # Update with the discrepancy
            err_loc = gf_poly_add(err_loc, gf_poly_scale(old_loc, delta))

    # Check if the result is correct, that there's not too many errors to correct
    while len(err_loc) and err_loc[0] == 0: del err_loc[0] # drop leading 0s, else errs will not be of the correct size
    errs = len(err_loc) - 1
    if (errs-erase_count) * 2 + erase_count > nsym:
        raise ReedSolomonError("Too many errors to correct")    # too many errors to correct

    return err_loc

def rs_find_errata_locator(e_pos, generator=2):
    '''Compute the erasures/errors/errata locator polynomial from the erasures/errors/errata positions
       (the positions must be relative to the x coefficient, eg: "hello worldxxxxxxxxx" is tampered to "h_ll_ worldxxxxxxxxx"
       with xxxxxxxxx being the ecc of length n-k=9, here the string positions are [1, 4], but the coefficients are reversed
       since the ecc characters are placed as the first coefficients of the polynomial, thus the coefficients of the
       erased characters are n-1 - [1, 4] = [18, 15] = erasures_loc to be specified as an argument.'''

    e_loc = [1] # just to init because we will multiply, so it must be 1 so that the multiplication starts correctly without nulling any term
    # erasures_loc = product(1 - x*alpha**i) for i in erasures_pos and where alpha is the alpha chosen to evaluate polynomials.
    for i in e_pos:
        e_loc = gf_poly_mul( e_loc, gf_poly_add([1], [gf_pow(generator, i), 0]) )
    return e_loc

def rs_find_error_evaluator(synd, err_loc, nsym):
    '''Compute the error (or erasures if you supply sigma=erasures locator polynomial, or errata) evaluator polynomial Omega
       from the syndrome and the error/erasures/errata locator Sigma.'''

    # Omega(x) = [ Synd(x) * Error_loc(x) ] mod x^(n-k+1)
    _, remainder = gf_poly_div( gf_poly_mul(synd, err_loc), ([1] + [0]*(nsym+1)) ) # first multiply syndromes * errata_locator, then do a
                                                                                   # polynomial division to truncate the polynomial to the
                                                                                   # required length

    # Faster way that is equivalent
    #remainder = gf_poly_mul(synd, err_loc) # first multiply the syndromes with the errata locator polynomial
    #remainder = remainder[len(remainder)-(nsym+1):] # then slice the list to truncate it (which represents the polynomial), which
                                                     # is equivalent to dividing by a polynomial of the length we want

    return remainder

def rs_find_errors(err_loc, nmess, generator=2): # nmess is len(msg_in)
    '''Find the roots (ie, where evaluation = zero) of error polynomial by brute-force trial, this is a sort of Chien's search
    (but less efficient, Chien's search is a way to evaluate the polynomial such that each evaluation only takes constant time).'''
    errs = len(err_loc) - 1
    err_pos = []
    for i in xrange(nmess): # normally we should try all 2^8 possible values, but here we optimize to just check the interesting symbols
        if gf_poly_eval(err_loc, gf_pow(generator, i)) == 0: # It's a 0? Bingo, it's a root of the error locator polynomial,
                                                        # in other terms this is the location of an error
            err_pos.append(nmess - 1 - i)
    # Sanity check: the number of errors/errata positions found should be exactly the same as the length of the errata locator polynomial
    if len(err_pos) != errs:
        # couldn't find error locations
        raise ReedSolomonError("Too many (or few) errors found by Chien Search for the errata locator polynomial!")
    return err_pos

def rs_forney_syndromes(synd, pos, nmess, generator=2):
    # Compute Forney syndromes, which computes a modified syndromes to compute only errors (erasures are trimmed out). Do not confuse this with Forney algorithm, which allows to correct the message based on the location of errors.
    erase_pos_reversed = [nmess-1-p for p in pos] # prepare the coefficient degree positions (instead of the erasures positions)

    # Optimized method, all operations are inlined
    fsynd = list(synd[1:])      # make a copy and trim the first coefficient which is always 0 by definition
    for i in xrange(len(pos)):
        x = gf_pow(generator, erase_pos_reversed[i])
        for j in xrange(len(fsynd) - 1):
            fsynd[j] = gf_mul(fsynd[j], x) ^ fsynd[j + 1]
        #fsynd.pop() # useless? it doesn't change the results of computations to leave it there

    # Theoretical way of computing the modified Forney syndromes: fsynd = (erase_loc * synd) % x^(n-k)
    # See Shao, H. M., Truong, T. K., Deutsch, L. J., & Reed, I. S. (1986, April). A single chip VLSI Reed-Solomon decoder. In Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP'86. (Vol. 11, pp. 2151-2154). IEEE.ISO 690
    #erase_loc = rs_find_errata_locator(erase_pos_reversed, generator=generator) # computing the erasures locator polynomial
    #fsynd = gf_poly_mul(erase_loc[::-1], synd[1:]) # then multiply with the syndrome to get the untrimmed forney syndrome
    #fsynd = fsynd[len(pos):] # then trim the first erase_pos coefficients which are useless. Seems to be not necessary, but this reduces the computation time later in BM (thus it's an optimization).

    return fsynd

def rs_correct_msg(msg_in, nsym, fcr=0, generator=2, erase_pos=None, only_erasures=False):
    '''Reed-Solomon main decoding function'''
    global field_charac
    if len(msg_in) > field_charac:
        # Note that it is in fact possible to encode/decode messages that are longer than field_charac, but because this will be above the field, this will generate more error positions during Chien Search than it should, because this will generate duplicate values, which should normally be prevented thank's to the prime polynomial reduction (eg, because it can't discriminate between error at position 1 or 256, both being exactly equal under galois field 2^8). So it's really not advised to do it, but it's possible (but then you're not guaranted to be able to correct any error/erasure on symbols with a position above the length of field_charac -- if you really need a bigger message without chunking, then you should better enlarge c_exp so that you get a bigger field).
        raise ValueError("Message is too long (%i when max is %i)" % (len(msg_in), field_charac))

    msg_out = list(msg_in)     # copy of message
    # erasures: set them to null bytes for easier decoding (but this is not necessary, they will be corrected anyway, but debugging will be easier with null bytes because the error locator polynomial values will only depend on the errors locations, not their values)
    if erase_pos is None:
        erase_pos = []
    else:
        for e_pos in erase_pos:
            msg_out[e_pos] = 0
    # check if there are too many erasures to correct (beyond the Singleton bound)
    if len(erase_pos) > nsym: raise ReedSolomonError("Too many erasures to correct")
    # prepare the syndrome polynomial using only errors (ie: errors = characters that were either replaced by null byte
    # or changed to another character, but we don't know their positions)
    synd = rs_calc_syndromes(msg_out, nsym, fcr, generator)
    # check if there's any error/erasure in the input codeword. If not (all syndromes coefficients are 0), then just return the message as-is.
    if max(synd) == 0:
        return msg_out[:-nsym], msg_out[-nsym:]  # no errors

    # Find errors locations
    if only_erasures:
        err_pos = []
    else:
        # compute the Forney syndromes, which hide the erasures from the original syndrome (so that BM will just have to deal with errors, not erasures)
        fsynd = rs_forney_syndromes(synd, erase_pos, len(msg_out), generator)
        # compute the error locator polynomial using Berlekamp-Massey
        err_loc = rs_find_error_locator(fsynd, nsym, erase_count=len(erase_pos))
        # locate the message errors using Chien search (or bruteforce search)
        err_pos = rs_find_errors(err_loc[::-1], len(msg_out), generator)
        if err_pos is None:
            raise ReedSolomonError("Could not locate error")


    # Find errors values and apply them to correct the message
    # compute errata evaluator and errata magnitude polynomials, then correct errors and erasures
    msg_out = rs_correct_errata(msg_out, synd, (erase_pos + err_pos), fcr, generator) # note that we here use the original syndrome, not the forney syndrome
                                                         # (because we will correct both errors and erasures, so we need the full syndrome)
    # check if the final message is fully repaired
    synd = rs_calc_syndromes(msg_out, nsym, fcr, generator)
    if max(synd) > 0:
        raise ReedSolomonError("Could not correct message")     # message could not be repaired
    # return the successfully decoded message
    return msg_out[:-nsym], msg_out[-nsym:] # also return the corrected ecc block so that the user can check()

def rs_correct_msg_nofsynd(msg_in, nsym, fcr=0, generator=2, erase_pos=None, only_erasures=False):
    '''Reed-Solomon main decoding function, without using the modified Forney syndromes
    This demonstrates how the decoding process is done without using the Forney syndromes
    (this is the most common way nowadays, avoiding Forney syndromes require to use a modified Berlekamp-Massey
    that will take care of the erasures by itself, it's a simple matter of modifying some initialization variables and the loop ranges)'''
    global field_charac
    if len(msg_in) > field_charac:
        raise ValueError("Message is too long (%i when max is %i)" % (len(msg_in), field_charac))

    msg_out = list(msg_in)     # copy of message
    # erasures: set them to null bytes for easier decoding (but this is not necessary, they will be corrected anyway, but debugging will be easier with null bytes because the error locator polynomial values will only depend on the errors locations, not their values)
    if erase_pos is None:
        erase_pos = []
    else:
        for e_pos in erase_pos:
            msg_out[e_pos] = 0
    # check if there are too many erasures
    if len(erase_pos) > nsym: raise ReedSolomonError("Too many erasures to correct")
    # prepare the syndrome polynomial using only errors (ie: errors = characters that were either replaced by null byte or changed to another character, but we don't know their positions)
    synd = rs_calc_syndromes(msg_out, nsym, fcr, generator)
    # check if there's any error/erasure in the input codeword. If not (all syndromes coefficients are 0), then just return the codeword as-is.
    if max(synd) == 0:
        return msg_out[:-nsym], msg_out[-nsym:]  # no errors

    # prepare erasures locator and evaluator polynomials
    erase_loc = None
    #erase_eval = None
    erase_count = 0
    if erase_pos:
        erase_count = len(erase_pos)
        erase_pos_reversed = [len(msg_out)-1-eras for eras in erase_pos]
        erase_loc = rs_find_errata_locator(erase_pos_reversed, generator=generator)
        #erase_eval = rs_find_error_evaluator(synd[::-1], erase_loc, len(erase_loc)-1)

    # prepare errors/errata locator polynomial
    if only_erasures:
        err_loc = erase_loc[::-1]
        #err_eval = erase_eval[::-1]
    else:
        err_loc = rs_find_error_locator(synd, nsym, erase_loc=erase_loc, erase_count=erase_count)
        err_loc = err_loc[::-1]
        #err_eval = rs_find_error_evaluator(synd[::-1], err_loc[::-1], len(err_loc)-1)[::-1] # find error/errata evaluator polynomial (not really necessary since we already compute it at the same time as the error locator poly in BM)

    # locate the message errors
    err_pos = rs_find_errors(err_loc, len(msg_out), generator) # find the roots of the errata locator polynomial (ie: the positions of the errors/errata)
    if err_pos is None:
        raise ReedSolomonError("Could not locate error")

    # compute errata evaluator and errata magnitude polynomials, then correct errors and erasures
    msg_out = rs_correct_errata(msg_out, synd, err_pos, fcr=fcr, generator=generator)
    # check if the final message is fully repaired
    synd = rs_calc_syndromes(msg_out, nsym, fcr, generator)
    if max(synd) > 0:
        raise ReedSolomonError("Could not correct message")
    # return the successfully decoded message
    return msg_out[:-nsym], msg_out[-nsym:] # also return the corrected ecc block so that the user can check()

def rs_check(msg, nsym, fcr=0, generator=2):
    '''Returns true if the message + ecc has no error of false otherwise (may not always catch a wrong decoding or a wrong message, particularly if there are too many errors -- above the Singleton bound --, but it usually does)'''
    return ( max(rs_calc_syndromes(msg, nsym, fcr, generator)) == 0 )

Autodetecting the Reed-Solomon parameters

If you intend to work with the QR codes or RS codes generated by another RS codec than your own, you may have to find what are the parameters.

If you have a sample input message and its corresponding RS code, then you can use the following function to try to find the correct set of parameters to decode it.

This function is naive, and thus not fit for realtime computation. You should use it only once on a message (or a set of messages) to compute the parameters, and then hardcode these parameters in your RS codec.

There are a few research articles about more efficient algorithms, but this is outside the scope of this article.

from scipy.spatial.distance import hamming

def detect_reedsolomon_parameters(message, mesecc_orig, gen_list=[2, 3, 5], c_exp=8):
    '''Use an exhaustive search to automatically find the correct parameters for the ReedSolomon codec from a sample message and its encoded RS code.
    Arguments: message is the sample message, eg, "hello world" ; mesecc_orig is the message variable encoded with RS block appended at the end.
    '''
    # Description: this is basically an exhaustive search where we will try every possible RS parameter, then try to encode the sample message, and see if the resulting RS code is close to the supplied code.
    # All variables except the Galois Field's exponent are automatically generated and searched.
    # To compare with the supplied RS code, we compute the Hamming distance, so that even if the RS code is tampered, we can still find the closest set of RS parameters to decode this message.

    # Init the variables
    n = len(mesecc_orig)
    k = len(message)
    field_charac = int((2**c_exp) - 1)
    maxval = max(mesecc_orig)
    if (max(mesecc_orig) > field_charac):
        raise ValueError("The specified field's exponent is wrong, the message contains values (%i) above the field's cardinality (%i)!" % (maxval, field_charac))

    # Prepare the variable that will store the result
    best_match = {"hscore": -1, "params": [{"gen_nb": 0, "prim": 0, "fcr": 0}]}

    # Exhaustively search by generating every combination of values for the RS parameters and test the Hamming distance
    for gen_nb in gen_list:
        prim_list = find_prime_polys(generator=gen_nb, c_exp=c_exp, fast_primes=False, single=False)
        for prim in prim_list:
            init_tables(prim, gen_nb, c_exp)
            for fcr in xrange(field_charac):
                #g = rs_generator_poly_all(n, fcr=fcr, generator=gen_nb)
                # Generate a RS code from the sample message using the current combination of RS parameters
                mesecc = rs_encode_msg(message, n-k, fcr, gen_nb)
                # Compute the Hamming distance
                h = int(hamming(mesecc, mesecc_orig) * len(mesecc) + 0.5)
                # If the Hamming distance is lower than the previous best match (or if it's the first try), save this set of parameters
                if best_match["hscore"] == -1 or h <= best_match["hscore"]:
                    # If the distance is strictly lower than for the previous match, then we replace the previous match with the current one
                    if best_match["hscore"] == -1 or h < best_match["hscore"]:
                        best_match["hscore"] = h
                        best_match["params"] = [{"gen_nb": gen_nb, "prim": prim, "fcr": fcr}]
                    # Else there is an ambiguity: the Hamming distance is the same as for the previous best match, so we keep the previous set of parameters but we append the current set
                    elif h == best_match["hscore"]:
                        best_match["params"].append({"gen_nb": gen_nb, "prim": prim, "fcr": fcr})
                    # If Hamming distance is 0, then we have found a perfect match (the current set of parameters allow to generate the exact same RS code from the sample message), so we stop here
                    if h == 0: break

    # Printing the results to the user
    if best_match["hscore"] >= 0:
        perfect_match_str = " (0=perfect match)" if best_match["hscore"]==0 else ""
        print "Found closest set of parameters, with Hamming distance %i%s:" % (best_match["hscore"], perfect_match_str)
        for param in best_match["params"]:
            print "gen_nb=%s prim=%s(%s) fcr=%s" % (param["gen_nb"], param["prim"], hex(param["prim"]), param["fcr"])
    else:
        print "Parameters could not be automatically detected..."

Probabilistic prime polynomial generator

The deterministic prime polynomial provided above is a good first approach to find prime polynomials in Galois Fields, as it is guaranteed to find all the prime polynomials. However, this guarantee comes at a cost: algorithmic complexity. Thus, if you try to find a prime polynomial for a higher Galois Field such as 2^16 or worst 2^32, the process will either take an almost infinite amount of time or memory.

Indeed, the prime number theorem says that the probability that the n-th number is prime is equal to: $p_{n}~=nlogn$ , which means in other words that bigger is the Galois Field, more sparse are the prime polynomials, and thus more time we will spend search for one.

To avoid these issues, we can relax our guarantees by using a Monte-Carlo probabilistic generator. Thus, the generator below does not guarantee to find all prime polynomials, but it will find some, and anyway we need only one to initialize the Reed-Solomon codec.

How it works:

We pick a number randomly between 2^p and (2^p)*2
We check if it is prime, by using a probabilistic primality test called Miller-Rabin. Without going into the mathematical details, a primality test serves as a way to check if the number is prime or not. In our case, it is probabilistic, which means that we have a chance of having a false positive (but never a false negative). Thus, a composite number (ie, a non prime polynomial) can be declared as prime. To avoid that, the primality test is wrapped inside a tournament: we run the test k times to ensure the result is correct. Bigger is this number, more guarantees we get, but also more time the test takes (but this is insignificant compared to the deterministic approach).
If the number/polynomial seems prime, then build the whole Galois Field generator polynomial to check if there is no duplication of any value in the field (the field must not be surjective, no two values must be the same).
If the Galois Field generator polynomial only contains unique value, then we found a prime polynomial!

Below is a Python code implementing a probabilistic prime polynomial generator.

Note that the Bloom Filter is optional, it is provided here only as a way to be able to compute bigger fields without overblowing the memory because of the galois field generator polynomial test.

import random
from math import log, exp

def miller_rabin(n, k=40):
    # Implementation uses the Miller-Rabin Primality Test
    # The optimal number of rounds for this test is 6 to 40

    # If number is even, it's a composite number
    if n == 2:
        return True

    if n % 2 == 0:
        return False

    r, s = 0, n - 1
    while s % 2 == 0:
        r += 1
        s //= 2
    for _ in xrange(k):
        a = random.randrange(2, n - 1)
        x = pow(a, s, n)
        if x == 1 or x == n - 1:
            continue
        for _ in xrange(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def fastprimes_montecarlo(n, k=None, lower=4):
    lowPrimes = [3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97
           ,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,179
           ,181,191,193,197,199,211,223,227,229,233,239,241,251,257,263,269
           ,271,277,281,283,293,307,311,313,317,331,337,347,349,353,359,367
           ,373,379,383,389,397,401,409,419,421,431,433,439,443,449,457,461
           ,463,467,479,487,491,499,503,509,521,523,541,547,557,563,569,571
           ,577,587,593,599,601,607,613,617,619,631,641,643,647,653,659,661
           ,673,677,683,691,701,709,719,727,733,739,743,751,757,761,769,773
           ,787,797,809,811,821,823,827,829,839,853,857,859,863,877,881,883
           ,887,907,911,919,929,937,941,947,953,967,971,977,983,991,997]
    if not k: # estimate nb of prime nbs using prime numbers theorem 
        k = int(n * (1./log(n))) * 10
    for _ in range(k):
        nb = random.randrange(lower, n)
        notPrime = False
        for p in lowPrimes:
            if nb == p:
                break
            elif nb % p == 0:
                notPrime = True
                break
        if not notPrime and miller_rabin(nb):
            yield nb

def gf_mult_noLUT(x, y, prim=0, field_charac_full=256, carryless=True):
    '''Galois Field integer multiplication using Russian Peasant Multiplication algorithm (faster than the standard multiplication + modular reduction).
    If prim is 0 and carryless=False, then the function produces the result for a standard integers multiplication (no carry-less arithmetics nor modular reduction).'''
    r = 0
    while y: # while y is above 0
        if y & 1: r = r ^ x if carryless else r + x # y is odd, then add the corresponding x to r (the sum of all x's corresponding to odd y's will give the final product). Note that since we're in GF(2), the addition is in fact an XOR (very important because in GF(2) the multiplication and additions are carry-less, thus it changes the result!).
        y = y >> 1 # equivalent to y // 2
        x = x << 1 # equivalent to x*2
        if prim > 0 and x & field_charac_full: x = x ^ prim # GF modulo: if x >= 256 then apply modular reduction using the primitive polynomial (we just substract, but since the primitive number can be above 256 then we directly XOR).

    return r

class BloomFilter(object):
    """A simple bloom filter for lots of int()"""
    # https://gist.github.com/josephkern/2897618
    # http://www.maxburstein.com/blog/creating-a-simple-bloom-filter/

    def __init__(self, array_size=(1 * 1024), hashes=13):
        """Initializes a BloomFilter() object:
            Expects:
                array_size (in bytes): 4 * 1024 for a 4KB filter
                hashes (int): for the number of hashes to perform"""

        self.filter = bytearray(array_size)     # The filter itself
        self.bitcount = array_size * 8          # Bits in the filter
        self.hashes = hashes                    # The number of hashes to use

    def _hash(self, value):
        """Creates a hash of an int and yields a generator of hash functions
        Expects:
            value: int()
        Yields:
            generator of ints()"""

        # Build an int() around the sha256 digest of int() -> value
        value = value.__str__() # Comment out line if you're filtering strings()
        digest = int(sha256(value).hexdigest(), 16)
        for _ in range(self.hashes):
            # bitwise AND of the digest and all of the available bit positions 
            # in the filter
            yield digest & (self.bitcount - 1)
            # Shift bits in digest to the right, based on 256 (in sha256)
            # divided by the number of hashes needed be produced. 
            # Rounding the result by using int().
            # So: digest >>= (256 / 13) would shift 19 bits to the right.
            digest >>= (256 / self.hashes)

    def add(self, value):
        """Bitwise OR to add value(s) into the self.filter
        Expects:
            value: generator of digest ints()
        """
        for digest in self._hash(value):
            # In-place bitwise OR of the filter, position is determined 
            # by the (digest / 8) digest is described above in self._hash()
            # Bitwise OR is undertaken on the value at the location and 
            # 2 to the power of digest modulo 8. Ex: 2 ** (30034 % 8) 
            # to grantee the value is <= 128, the bytearray not being able 
            # to store a value >= 256. Q: Why not use ((modulo 9) -1) then?
            self.filter[(digest / 8)] |= (2 ** (digest % 8))
            # The purpose here is to spread out the hashes to create a unique 
            # "fingerprint" with unique locations in the filter array, 
            # rather than just a big long hash blob.

    def query(self, value):
        """Bitwise AND to query values in self.filter
        Expects:
            value: value to check filter against (assumed int())"""
        # If all() hashes return True from a bitwise AND (the opposite 
        # described above in self.add()) for each digest returned from 
        # self._hash return True, else False
        return all(self.filter[(digest / 8)] & (2 ** (digest % 8)) 
            for digest in self._hash(value))

    def __setitem__(self, idx, value):
        return self.add(idx)

    def __getitem__(self, value):
        return self.query(value)

def find_prime_polys_montecarlo(generator=2, c_exp=8, fast_primes=True, single=True, forcebloom=False, bloomsize=1E8):
    '''Compute the list of prime polynomials for the given generator and galois field characteristic exponent.'''
    # fast_primes will output less results but will be significantly faster.
    # single will output the first prime polynomial found, so if all you want is to just find one prime polynomial to generate the LUT for Reed-Solomon to work, then just use that.

    # A prime polynomial (necessarily irreducible) is necessary to reduce the multiplications in the Galois Field, so as to avoid overflows.
    # Why do we need a "prime polynomial"? Can't we just reduce modulo 255 (for GF(2^8) for example)? Because we need the values to be unique.
    # For example: if the generator (alpha) = 2 and c_exp = 8 (GF(2^8) == GF(256)), then the generated Galois Field (0, 1, a, a^1, a^2, ..., a^(p-1)) will be galois field it becomes 0, 1, 2, 4, 8, 16, etc. However, upon reaching 128, the next value will be doubled (ie, next power of 2), which will give 256. Then we must reduce, because we have overflowed above the maximum value of 255. But, if we modulo 255, this will generate 256 == 1. Then 2, 4, 8, 16, etc. giving us a repeating pattern of numbers. This is very bad, as it's then not anymore a bijection (ie, a non-zero value doesn't have a unique index). That's why we can't just modulo 255, but we need another number above 255, which is called the prime polynomial.
    # Why so much hassle? Because we are using precomputed look-up tables for multiplication: instead of multiplying a*b, we precompute alpha^a, alpha^b and alpha^(a+b), so that we can just use our lookup table at alpha^(a+b) and get our result. But just like in our original field we had 0,1,2,...,p-1 distinct unique values, in our "LUT" field using alpha we must have unique distinct values (we don't care that they are different from the original field as long as they are unique and distinct). That's why we need to avoid duplicated values, and to avoid duplicated values we need to use a prime irreducible polynomial.

    # Here is implemented a Monte-Carlo probabilistic approach with Miller-Rabin as a primality test and the generation of the generator polynomial to check that all values are unique and the Galois Field is correct.
    # This algorithm should allow to quickly find one or a few prime polynomials in little time, up to GF(2^32). It should work also above but you need exponentially more computing power and memory...

    # Prepare the finite field characteristic (2^p - 1), this also represent the maximum possible value in this field
    root_charac = 2 # we're in GF(2)
    field_charac = int(root_charac**c_exp - 1)
    field_charac_next = int(root_charac**(c_exp+1) - 1)

    prim_candidates = []
    if fast_primes:
        #prim_candidates = rwh_primes1(field_charac_next) # generate maybe prime polynomials and check later if they really are irreducible
        #prim_candidates = [x for x in prim_candidates if x > field_charac] # filter out too small primes
        prim_candidates = fastprimes_montecarlo(field_charac_next, lower=field_charac, k=10000)
    else:
        prim_candidates = xrange(field_charac+2, field_charac_next, root_charac) # try each possible prime polynomial, but skip even numbers (because divisible by 2 so necessarily not irreducible)

    # Start of the main loop
    correct_primes = []
    for prim in prim_candidates: # try potential candidates primitive irreducible polys
        if prim in correct_primes:
            continue
        # Use bloom filter (probabilistic filter) when memory overflow. TODO: try cuckoo filters, should be 20KB per entry (so consume less memory for same or lower fp rate?)
        try:
            if forcebloom:
                raise(MemoryError)
            seen = bytearray(field_charac+1) # memory variable to indicate if a value was already generated in the field (value at index x is set to 1) or not (set to 0 by default)
        except (MemoryError, OverflowError) as exc:
            array_size = int(bloomsize)
            k = 13
            fp_rate = exp(log(1. - exp(log(1. - exp(log(1) - log(array_size*8))) * (k*(field_charac+1)))) * k) # Optimized arithmetically precise version of the original false positive equation: fp_rate = ( 1 - (1. - 1./(array_size*8))**(k*(field_charac+1)) )**k
            if fp_rate > (1. / field_charac):
                raise OverflowError('Predicted false positive error rate of the bloom filter is too high: %.2f%% vs uniform chance: %.2f%%' % (fp_rate*100, (1. / field_charac) * 100))
            seen = BloomFilter(array_size=array_size, hashes=k) # TODO: autodetect memory usage (psutil?), for now just assign 1GB
        conflict = False # flag to know if there was at least one conflict

        # Second loop, build the whole Galois Field
        x = 1
        for i in xrange(field_charac):
            # Compute the next value in the field (ie, the next power of alpha/generator)
            x = gf_mult_noLUT(x, generator, prim, field_charac+1)

            # Rejection criterion: if the value overflowed (above field_charac) or is a duplicate of a previously generated power of alpha, then we reject this polynomial (not prime)
            if x > field_charac or seen[x]:
                conflict = True
                break
            # Else we flag this value as seen (to maybe detect future duplicates), and we continue onto the next power of alpha
            else:
                seen[x] = 1

        # End of the second loop: if there's no conflict (no overflow nor duplicated value), this is a prime polynomial!
        if not conflict: 
            correct_primes.append(prim)
            if single: return prim

    # Return the list of all prime polynomials
    correct_primes.sort()
    return correct_primes # you can use the following to print the hexadecimal representation of each prime polynomial: print [hex(i) for i in correct_primes]

find_prime_polys_montecarlo(2, 16)

Optimizing performances

There are several ways to optimize the performances of a Reed-Solomon codec, but most can be classified into three categories:

Algorithms with lower algorithmic complexity, which changes the asymptotic time, not just by a constant. This is the Holy Grail of error correction codes, and some O(n log n) algorithms exist for Reed-Solomon such as NTT (FFT applied to Galois Fields).
Mathematical tricks to reduce the complexity of operations, without changing the asymptotic time (or bending it a little).
Implementation tricks, such as inner loop optimization, parallel calculation using SIMD, double buffering by using thread to write a codeword down while the next one is being computed at the same time, etc. Most of the implementation optimization tricks for Reed-Solomon are in fact the same as those applied to matrix multiplication and factorization, with the only difference being that Reed-Solomon requires Galois Field arithmetics, and thus XOR (carryless addition) and modulo a prime number.

The following sections will describe some of the techniques, without being exhaustive.

Note that there is also a fourth way that will not be covered here: using specifically designed hardware designed to maximize parallelization of operations (such as VLSI or FPGA boards).

Algorithms with lower complexity

Fast Fourier Transform: As we saw in the tutorial, Reed-Solomon encoding is done by polynomial division of the message with the generator polynomial. Mathematically, polynomial division is equivalent (in some conditions) in the spatial domain with a deconvolution in the frequency domain. Hence, with the Fast Fourier Transform (FFT), we can convert a polynomial into the frequency domain, apply a division, and project back the result by using an Inverse Fourier Transform (IFFT). This algorithm runs asymptotically in O(n log n).
Number Theoretic Transform: An extension of the Fast Fourier Transform technique, where the modulo is done on any integer of choice instead of $e^{-2\pi i/N}$ . This has the great advantage that the results of the NTT can be directly in the target Galois Field instead of doing a modular reduction post FFT^[1]. This type of transform can potentially be more optimized than FFT, since Galois Field operations only rely of XOR, additional implementation optimization tricks can be applied (a benchmark in Python comparing NTT with FTT is available here).

Mathematical tricks

Mathematical tricks are half-way between a change of algorithm and an implementation optimization, because it does change parts of the algorithm in order to account for computers architectures and thus optimize computations on them, but by changing mathematical operators instead of computer architectural details. One such example is Cauchy-Reed-Solomon.

In our implementation, the generator polynomial was in fact a very simple Vandermonde matrix (in our case it was only a vector), but in Cauchy-Reed-Solomon, a Cauchy matrix is used instead. This provides nice mathematical properties, one being that only XOR operations are needed to encode, removing the overhead of galois field multiplications^[2], by considering the galois field's numbers not as bit vectors (polynomial forms) but as bit matrices, with the column i = 2 * (column i-1) in the chosen GF field. Another difference is that it adds an additional packet-size parameter which controls the number of symbols that are to be put in each ECC block (so that you are not anymore limited to 2⁸ = 256 characters but to (2⁸ × packet size, but the packet size must be smaller than the GF field used) and which is a lot faster and resilient to burst errors than standard Reed–Solomon.

Implementation optimization

Bit manipulation: since Galois Field operations are all using XOR at the low level, bit manipulation can greatly speed up calculations, one simple example is to use a << 1 to multiply by 2 instead of using gf_mul(a,2) or even a ^ a. Indeed, this command the shifting of the variable a to one bit left, which effectively add a bit and thus multiply a by 2^1. And we could multiply by any other multiple of two: a << 5 to multiply a by 2^5.
Reducing k and n: the number of ecc symbols can be reduced to decrease the calculation time of Reed-Solomon, since it is proportional. Also, the total size of a codeword can be reduced: this will allow you to compute smaller messages, but potentially faster. This is particularly useful when using Reed-Solomon in a GF(2^16) field: instead of having codewords of 65535 symbols, which would not fit very will in computers memory, you can set the codeword size to 1024, or any other value that fits your L1 or L2 CPU cache. This will allow your CPU to optimize the transfert of information back and forth from the hard drive: for one codeword calculation, only one access to the hard drive or the RAM is required to get the values, and then everything is done with CPU memory.
SIMD and parallelization: by encoding multiple messages at once, the CPU (if supporting SIMD instructions, which most can nowadays) will broadcast the same operations on all messages, since anyway you are using the same generator polynomial/matrix. This drastically reduces the calculation time. Note that for some programming language, you may find optimized distributions that increases SIMD performance on specific architectures, such as Intel Distribution for Python. In addition, working in parallel allows to load multiple messages at once from a file/stream, which reduces the I/O workload by the number of parallel messages you load.
Double buffering: I/O is probably the biggest bottleneck after the heavy CPU calculations required by Reed-Solomon. To alleviate a bit this issue, you can use a thread to save encoded messages while starting immediately the calculation of the next batch of codewords.
Using CRC to enable erasure decoding, thus doubling the number of recoverable symbols compared to error decoding. The idea is to use Cyclic Redundancy Code, a type of error detection code, to detect if there is any error in a source message, and if this is the case, which symbols were corrupted. The advantage of CRC is that it is a lot faster to compute than Reed-Solomon, thus for the same amount of symbols, you can reduce drastically your calculation time.

Resources: ParPar implementation notes.

↑ Lin, S. J., Chung, W. H., & Han, Y. S. (2014, October). Novel polynomial basis and its application to reed-solomon erasure codes. In Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on (pp. 316-325). IEEE.
↑ "Tutorial on Erasure Coding for Storage Applications, Part 1", James S.Planck, 2013, Oral communication, pp53-57. url: http://web.eecs.utk.edu/~plank/plank/papers/2013-02-11-FAST-Tutorial.pdf

[ntt-1] Lin, S. J., Chung, W. H., & Han, Y. S. (2014, October). Novel polynomial basis and its application to reed-solomon erasure codes. In Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on (pp. 316-325). IEEE.

[2] "Tutorial on Erasure Coding for Storage Applications, Part 1", James S.Planck, 2013, Oral communication, pp53-57. url: http://web.eecs.utk.edu/~plank/plank/papers/2013-02-11-FAST-Tutorial.pdf

[1]

[2]