Talk:Reed–Solomon codes for coders

From Wikiversity
Jump to navigation Jump to search

Some problems using this[edit source]

Hi, I was trying to use this on a Standard Datamatrix example I found in a text book. Attention, here the prime has to be set to

0x12d

Here the example:

The correct message is

c = [142,164,186,114,25,5,88,102]

with the first three being data, and the last 5 error symbols. For this message, the syndrome calculation correctly gives 0. If I then change the message to

c~ = [255,164,186,114,25,7,88,102]

The syndrome gets calculated correctly, if I use rs_calc_syndromes and change gf_pow(2,i) to gf_pow(2,i+1) in the loop of rs_calc_syndromes (I guess this is a error in your code, can you confirm this?). The correctly calculated syndrome is then

s = [126,161,114,243,137]

From this I get the right error locator polynomial via rs_find_error_locator, which is

[90,132,1]

using rs_find_errors I get [5,0] which are the correct positions, although I had to change gf_poly_eval(err_loc, gf_pow(2, i)) to gf_poly_eval(err_loc, gf_pow(2, -i)) in rs_find_errors (I guess this is another bug?) Still, using the correct syndrome and err_pos rs_correct_errata(mesecc, synd, err_pos) does not correct the message right, as the error magnitude seems to be wrong.

Can you help me get this example working?

You did not provide the parameters of the Galois Field, the generator etc that are needed. However, I personally tested the code here several times to ensure everything works fine. However, the code in the main article is a simplified example working only for Galois Field 2^8 and generator 2 and a specific fcr, so if the parameters of your example are different, you need to adapt the code. And indeed, since you had to do +1 in rs_calc_syndromes, it seems that your fcr = 1, so you cannot use the example from the main page (but your change in rs_find_errors is buggy, avoid changing i to -i without a very very good reason). An extended example code that supports different galois fields, generators and fcr is provided in the supplemental material, maybe you can try it out? --Lrq3000 (discusscontribs) 11:57, 16 June 2016 (UTC)[reply]

Question about rs_correct_errata[edit source]

Hi, Could someone explain, what do following lines in the rs_correct_errata function mean?

for i in range(0, len(pos)):            #loop 
      x = gf_exp[pos[i]+256-len(msg)]   #reciprocal of error location Xj
      y = gf_poly_eval(p, x)            #omega evaluation of reciprocal Xj
      z = gf_poly_eval(q, gf_mul(x,x))  #??
      msg[pos[i]] ^= gf_div(y, gf_mul(x,z)) #??

in relationship to explanation of Forney's algorithm on wikipedia.org? all other lines of the function are clear.

Thanks.

jan

It's evaluating the error values.
Since we're in a field of characteristic 2, Λ' is Λ with even-numbered coefficients set to zero. The code removes the even elements completely and then evaluates the resulting polynomial at x2 instead of x. Bobmath (discusscontribs) 20:08, 15 April 2013 (UTC)[reply]
In a field of characteristic p, as in GF(pm) ... when taking the formal derivative of a term of a polynomial, ciXi, i is an integer, not a field element, and the result is i field additions of the term ciXi-1, (which could be implemented as a special type of multiply that multiplies a field element by an integer). In the case where p = 2, then when i is even, the result of the i field additions = 0, and when i is odd, the result is ciXi-1. In the general case, the field additions are done modulo p, and when i is a multiple of p, the term = 0. Removing the terms = 0 is optional, they will evaluate to 0 and not affect the result, other than being a bit slower. Rcgldr (discusscontribs) 16:31, 17 February 2018 (UTC)[reply]

The magic number 0x11D (0b100011101)[edit source]

Hello. Could someone explain how the value 0x11D was picked? This value was used in preparing the list objects gf_exp and gf_log. What if I want to use different list sizes such as 16 or 65536? How do I determine the correct modulo value for these lists?

For finding larger polynomials, check out wolfram alpha. Any of the primitive polynomials will do. Hjfreyer (discusscontribs) 15:15, 24 April 2015 (UTC)[reply]

Thanks for your help - jccruz

i'm pretty sure 0x11b is the reducing polynomial of the field, basically if you represent it as a polynomial you cannot factor it. from what i read, i'm not entirely sure but you might be able to use every un-factorable polynomial with a degree of 8 as the reducing polynomial for the field, but just use 0x11b, maybe it was chosen for resones we cannot understand. and for the list sizes, they must be 256 because they must contain an item for each of the elements in the field. because the field contains 256 elements (2^8) both lists must be that size. i think that one of them is double that size to make implementation of the multiplication algorithm easier and faster (read the comments in that algorithm). so unless you will use a different finite field, use the exponent lists with sizes 256.

The math is a bit complicated; see w:Finite field. You can find a list of different "magic" numbers used by different codes in the zxing source.
I had the same question and looked into it. Added some explanation. Hjfreyer (discusscontribs) 15:13, 24 April 2015 (UTC)[reply]
11d is not a magic number, out of the 30 possible prime polynomials for GF(256), only 16 of them (such as 11d) have a primitive element α = 1x + 0 = 2, which makes the math simpler. 11b is used by AES encryption, one of it's α's is 1x + 1 = 3, which is a bit awkward, however AES mostly just needs to calculate 1/x in this field, which can be done using a lookup table in software or sub-field mapping aka composite fields in hardware to reduce gate count. Rcgldr (discusscontribs) 17:13, 30 January 2018 (UTC)[reply]
In the case of some tape backup drives used for computers, the "magic number" is 0x187. Rcgldr (discusscontribs) 21:25, 31 January 2018 (UTC)[reply]
Ok so it makes sense when deriving manually the calculations, but do these numbers provide any advantage when the calculations are automated by a program? --Lrq3000 (discusscontribs) 00:25, 12 February 2018 (UTC)[reply]
In software, assuming tables are used for multiply and divide, it makes no difference. I have test RS programs where any of the 30 prime polynomials can be chosen as the basis for GF(2^8), and other than initialization of the tables, the code is the same. In hardware, it reduces gate count to choose a polynomial where the primitive α = 2, and with fewer 1 bits in the polynomial, such as 0x11d or 0x187 which have five 1 bits, versus 0x15f or 0x1cf which have seven 1 bits. Rcgldr (discusscontribs) 11:07, 12 February 2018 (UTC)[reply]
Ok great info Rcgldr, this explains why functionality-wise the result is the same, but some super optimized RS codec still prefer to use specific primitive polynomials and elements as you explain, because they have fewer bits and reduce the number of low-level calculations. I'll try to add that to the wiki sometime, thank you! :-) --Lrq3000 (discusscontribs) 02:50, 13 February 2018 (UTC)[reply]
(unindent) For tape drives, both 0x11d and 0x187 are/were used in format specs and both have α = 1x + 0 (0x02) and both have five 1 bits. There's no clear reason for choosing one over the other. AES uses 0x11b, where α ≠ 1x + 0, the "smallest" α = 1x + 1 (0x03), but it only needs to calculate 1/x in GF(2^8), which could be done using a 256 byte lookup table. Rcgldr (discusscontribs) 13:54, 13 February 2018 (UTC)[reply]
Unrelated to this article, which is about software, I'll mention that in hardware, lookup tables are expensive in terms of gate count, and sub-field mapping is used to calculate 1/x in GF(). This method maps a field into a composite or nested field. For example an eight 1 bit field GF(2^8) can be mapped into a pair of 4 bit fields GF((2^4)^2) or even into four 2 bit fields GF(((2^2)^2)^2). Each of these fields will have a primitive α, where all non-zero numbers of each field are powers of α for each field, so powers of one fields α are mapped into powers of the other fields α. This results in multiplication and division working in the same manner for both fields, using M() to represent the mapping: if in the original field a · b = c, then M(a) · M(b) = M(c). The tricky part is the same mapping has to be compatible with addition: if a + b = c, then M(a) + M(b) = M(c). In some cases, there's some logic behind choosing the sub-field so that both multiplication and addition (and their inverse) are compatible, but a brute force search is often used to find a mapping that minimizes gate count. Here is a pdf file about mapping GF(2^8) into GF((2^4)^2): m8to44.pdf. Here is another article about the more extreme measures taken to do a brute force search for a minimum gate count where a piece of hardware might include 10 parallel encoders: compact AES S-box pdf. Note the mapping to a sub-field, calculating 1/x in the sub-field, and mapping back can be done with a single propagation delay through the circuitry used to calculate 1/x for GF(). Rcgldr (discusscontribs) 15:03, 13 February 2018 (UTC)[reply]
Trivia, if α = 1x+0 in the original and mapped fields (0x02 for GF(2^8), 0x10 for GF((2^4)^2), 0x02 for GF(2^4)), then there will be 4 of 64 possible mapping polynomials of the form 1x^2 + ax + b, and the product of those 4 polynomials will have the same coefficient values of 0 or 1 as the original fields polynomial. For GF(2^8) based on the polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d) and GF(2^4) based on x^4 + x + 1 (0x13), there are 4 compatible GF((2^4)^2) fields: x^2 + 2 x + 5, x^2 + 3 x + 4, x^2 + 4 x + 2, x^2 + 5 x + 3. The product of these 4 fields is x^8 + x^4 + x^3 + x^2 + 1. For the AES case of 0x11b, α ≠ 1x+0, but to reduce gate count the sub-fields have α = 1x+0 so the trivia about the mapping does not apply. Rcgldr (discusscontribs) 14:05, 13 February 2018 (UTC)[reply]
What about for prime base symbols, where the modular multiplication with primitive roots directly creates maximum period for exponent? Making Reed-Solomon codes for base41, base43, etc.? Is the prime base already its reducing polynomial? 2A01:119F:220:9400:F089:7A0F:F25F:21CA (discuss) 08:15, 1 June 2023 (UTC)[reply]

Some Mistakes I Think I Have Found in The Article[edit source]

1: The generator number: In the article, they say if you multiply 2 by itself (in the finite field) over and over you will eventually reach all numbers in the field except 0. From my experiments, that's wrong, you should instead use 3 or 5 or any other number from the list in this page: http://www.samiam.org/galois.html

It depends on the generating polynomial chosen. Most often, code designers choose a polynomial for which 2 is a primitive element. Bobmath (discusscontribs) 02:28, 14 November 2014 (UTC)[reply]
Yes but indeed a lot of academic papers and books advise to carefully choose the base number for the generator polynomial, as to maximize separability. But I don't know too much about QR codes standards, so it may be that they did not follow those advices and you are required to use 2... Lrq3000 (discusscontribs) 01:45, 4 June 2015 (UTC)[reply]
As mentioned by Bobmath, it depends on the polynomial. Here is a list of the 30 9 bit polynomials in hex used for GF(28) along with the smallest α for that polynomial: 0x11b:0x03, 0x11d:0x02, 0x12b:0x02, 0x12d:0x02, 0x139:0x03, 0x13f:0x03, 0x14d:0x02, 0x15f:0x02, 0x163:0x02, 0x165:0x02, 0x169:0x02, 0x171:0x02, 0x177:0x03, 0x17b:0x0b, 0x187:0x02, 0x18b:0x0b, 0x18d:0x02, 0x19f:0x03, 0x1a3:0x03, 0x1a9:0x02, 0x1b1:0x07, 0x1bd:0x07, 0x1c3:0x02, 0x1cf:0x02, 0x1d7:0x07, 0x1dd:0x07, 0x1e7:0x02, 0x1f3:0x0d, 0x1f5:0x02, 0x1f9:0x03 . In the case of 0x1f3, the smallest α = 0x0d. Normally one of the 16 fields where α = 2 and the fewest number of 1 bits in the generator polynomial is chosen to simplify hardware. So although both 0x11d and 0x15f have the same "minimum" α = 2, 0x11d has five 1 bits versus 0x15f with seven 1 bits. Some devices use 0x187 which also has five 1 bits and "minimum" α = 2. Trivia - the bits for a field polynomial can be reversed, but for α ≠ 2, α may be different for the bit reversed field, for example, 0x19f:0x03 and bit reversed, 0x1f3:0x0d.Rcgldr (discusscontribs) 01:20, 9 February 2018 (UTC)[reply]

Multiplication[edit source]

I think I see an error in the multiplication example:

10001001 times 00101010 is 1011001111010, and not 1010001111010 (137 times 42 is 5754, and not 5242)

In the third line of your calculation 2 x⁸ disappears, what happens to it...? (2 x⁸ = 512, exactly the difference between 5242 and 5754)

Why don't you fix it? After all you have the ability to edit it and correct it. --Goldenburg111 14:41, 1 December 2014 (UTC)[reply]

Sorry, i can not "fix" it because: the 'addition replaced with exclusive-or' is 0000100010010 ^ 0010001001000 ^ 1000100100000 = 1010001111010 in the example. This seems to be correct. (alas, where 2 x⁸ occurs we have 1 ^ 0 ^ 1 = 0; why is there no carrier to the x⁹ column....) I am an absolute noob and can not resolve this contradiction.

Alright, I see. Possibly someone down the line can do it or if anyone decides not to pay attention to this I'll do it myself. Thanks :) --Goldenburg111 16:23, 1 December 2014 (UTC)[reply]

On a side note I have tried the "Python function which implements the polynomial multiplication" (as a rewrite in C) and it produces the same result 137 * 42 = 5242 (which is wrong in my opinion). Bitwise operations in the Python function work exactly as in the multiplication example. Do I miss the point here or is the function missing something?

The definition of "multiplication" here is fundamentally different from the algorithm from grade school we all know and love. It's correct that it gives different answers. Also note that in this setting, 8 - 1 = 9. Just a different set of rules. Hjfreyer (discusscontribs) 15:19, 24 April 2015 (UTC)[reply]
I'd like to add that indeed your intuition is correct: additions and substractions of GF(2^p) elements are carry-less. Do not ask me why, I can't give the specifics, but I have read it in a lot of academic books and articles.Lrq3000 (discusscontribs) 01:43, 4 June 2015 (UTC)[reply]
I just checked the results, both programmatically and manually. The calculation is correct, and the result you've got is because you did not do the addition carry-less: if you do a normal addition inside the multiplication, you get 1011001111010, but if you do it carry-less like it should, you get 1010001111010. Note that doing with-carry addition might get you to the correct result after you do the modular reduction (polynomial division) for some numbers, but not for all. That's a pretty hard bug to catch, so be precautious in your implementation of carry-less operations, they are mandatory.--Lrq3000 (discusscontribs) 18:42, 22 June 2015 (UTC)[reply]

I get a different value for g_4(x)[edit source]

When calculating , I get a different coefficient for :

In hexadecimal, the coefficients would be: [ 0x01 0x0f 0x46 0x78 0x40 ]

Nope, just did it out by hand. His number is correct. Remember, the calculations are happening using the field operations. So, addition is replaced by XOR, and 14+56 = 54, not 70 like it does with normal arithmetic. Hjfreyer (discusscontribs) 20:26, 27 April 2015 (UTC)[reply]

More importantly, though, why did he choose g4(x) and not a higher number? Is that what's actually used in QR? Is that only for the scope of the example?

According to Wikipedia: "[In QR codes] Codewords are 8 bits long and use the Reed–Solomon error correction algorithm with four error correction levels." So yes, the given polynomial is the one used in the spec, I'm pretty sure. Hjfreyer (discusscontribs) 20:26, 27 April 2015 (UTC)[reply]

The RS encoding example, is a little confusing. I don't see how he get's the value that he's XOR-ing with. In the section above (the BCH part) he was XOR-ing with the generator polynomial (in binary) which made sense to me. Here, I'd expect he'd XOR with the generator polynomial as well (this time with the RS polynomial, that we've just computed). But no matter how I shift the generator polynomial back and forth, I can't get them to align, so that it'd result '12' in the left-most byte.

I can't see where he get's his 12 ee 2b 23 f4 from. As I understnand it - and how I remember if from school - I should multiply the whole byte string (the generator) by 12, but that doesn't seem to work either.

You lost me a little bit with the specifics, but re-read the section on finite field arithmetic (which I recently beefed up), and try your calculations again using those operations. Don't worry, it's legitimately tricky. I studied this kind of math in college and it still hurts my brain sometimes. Hjfreyer (discusscontribs) 20:26, 27 April 2015 (UTC)[reply]

Possible plagiat (with more comments)[edit source]

Another tutorial on Reed-Solomon with provided sourcecode, with more comments, can be found here: http://www.drdobbs.com/testing/error-correction-with-reed-solomon/240157266

Sourcecode: http://twimgs.com/ddj/images/article/2013/0613/ReedSolomon.py

This is probably a plagiat of this tutorial on wikiversity, since the structure is identical, only variables names and comments are different (there are a lot of different ways to implement Reed-Solomon, you won't find any two implementation having the same structure and using the exact same algorithms implementations). However, the comments and the more meaningful variables names can be useful as an additional study material for readers. Lrq3000 (discusscontribs) 23:11, 3 June 2015 (UTC)[reply]

The decoding part of the Dr.Dobb's tutorial seems to be highly similar to the decoding part of your tutorial except for one thing: Its _rsCorrect() function looks quite different from your rs_correct_errata() function. I made a simple performance test and found that _rsCorrect() is almost twice as fast as rs_correct_errata(). I do not know why there is such a difference, though.--Kenethcusten (discusscontribs) 06:40, 19 January 2017 (UTC)[reply]
I wrote this before I did a major overhaul of the article (I was talking about Bobmath's original code), so this might explain the differences. In particular, rs_correct_errata() is drastically different. Also, it can correct both errors and erasures, I think _rsCorrect() can only correct errors (as Bobmath's original code), so this might explain why it's slower. Finally, the code here is not made for optimized speed, I have a version of this code that runs more than 10x faster in native Python, but is rather optimized for readability. --Lrq3000 (discusscontribs) 13:17, 6 February 2017 (UTC)[reply]

Update 06/2015[edit source]

I updated the tutorial with more background information, more details, updated code and added comments, and added possible enhancements and current research topics such as list decoding. I hope the tutorial is still logical and natural to read, but if not, feel free to fix the mistakes you see.

Also feel free to ask if you feel like there needs to be some more info on a topic. I may not answer all questions, but I will do my best. Lrq3000 (discusscontribs) 01:38, 4 June 2015 (UTC)[reply]

Coders, help is needed![edit source]

Help us make this tutorial better by unifying the convention for polynomials ordering inside lists throughout the code and functions.

Let me explain: for now, there are a lot of list reversing in order to make it work, you never know the order of a polynomial, ie, if the first coefficient is the major degree or the constant term... This can be particularly frustrating for learners that try to understand what the code really does, and for expert coders as well when trying to debug.

Stating a clear convention from the start (eg, least degree first, major degree last) and following it throughout the code would help a lot. There's no need to understand the maths, so if you are an experienced coder and want to help, you are welcome to do that! Lrq3000 (discusscontribs) 16:51, 13 June 2015 (UTC)[reply]

For Berlekamp Massey decoder, the order doesn't matter as long as it's consistent. For Sugiyama's adaptation of extended Euclid algorithm, the polynomials should be most significant coefficient first, since it's similar to long division. However, hardware implementations often use a pair of shift registers that always shift left, so potential error locator polynomials are left to right, but potential "Omegas" are right to left, a pair of them in each register. Rcgldr (discusscontribs) 17:30, 30 January 2018 (UTC)[reply]
Thank you Rcgldr for the additional infos, this (partially) explains why the convention can change between codecs, since choosing one over the other might be more natural for some algorithms. But indeed since we are using BM here we can use any convention, so it would be nice to make it consistent on the whole article to ease understanding (the maths are complicated enough for an outsider!). Maybe one day I'll do it if I have enough spare time (I can dream...). --Lrq3000 (discusscontribs) 02:55, 13 February 2018 (UTC)[reply]
Follow up - BCH view codes treat the first coefficient as the most significant term. For original view, the codeword is a series of values, not a polynomial. I don't know if there was a convention for the original view non-systematic encoding, since the message itself is treated as the generating polynomial (so is the message treated as most significant or least significant coefficient first?). For original view systematic encoding, both the message and codeword are a series of values, and the convention of the generating polynomial doesn't matter. On the other hand, when expressing polynomials using summation notation, the first coefficient is the least significant (constant) term. Another issue is if the error locator polynomial has roots 1/Xi, reversing the coefficients results in a polynomial with roots Xi. Rcgldr (discusscontribs) 19:08, 26 February 2018 (UTC)[reply]

Optimized for generator=2[edit source]

I noticed that the code in this tutorial (but not the explanations) is optimized because the field is generated using 2, which gives hardly comprehensible code like the following (from the rs_generator_poly):

g = gf_poly_mul(g, [1, gf_exp[i]])

This isn't theoretically correct, the following is:

g = gf_poly_mul(g, [1, gf_pow(generator, i)])

However, since the precomputed log and anti-log tables are generated using 2, you get the following:

gf_pow(2, i) == gf_exp[(gf_log[2] * i) % 255] == gf_exp[(1 * i) % 255] == gf_exp[i]

Because of course log(2) == 1 if the base is 2.

Optimization explained! But not generalizable at all if you work with other implementations of the Reed-Solomon codec, so I will fix the tutorial by implementing generators in the code. Lrq3000 (discusscontribs) 23:56, 22 June 2015 (UTC)[reply]

Update 10/2015[edit source]

I updated the tutorial with a better code, which removes some very edgy errors in decoding that can happen, and also with a lot more comments and better mathematical clarity (I removed the optimizations that were very specific here and that you could not find in any book, the goal here is that the tutorial should be clear and close to the mathematical equations so that anyone can understand and compare with what you can read in books).

I also tested the code thoroughly to make sure everything works and is correct, and I also added a universal Reed-Solomon codec, which is based on the code in the main article, in the appendix here: https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders/Additional_information#Universal_Reed-Solomon_Codec

I hope it will help the readers work more confidently and better understand the Reed-Solomon algorithms. --Lrq3000 (discusscontribs) 19:04, 8 October 2015 (UTC)[reply]

Why "% 255"?[edit source]

GF(2^8) contains 256 values from 0 to 255. If we mod a value by 255 we will get a result in the range from 0 to 254, which contains only 255 different values. In the code, I see many "% 255". Did you mean "% 256" or the equivalent "& 255"? -- by Kenethcusten on 15 January 2017.

Um... that's a good question. Actually, IIRC, I did some tests, and I discovered that the first and the last elements of the log table are the same. So basically, even if indeed there are 256 elements in the field, element 1 == element 256, so it can be skipped. This is anyway something you will see in all RS implementations. You should ask a mathematician for a more formal proof than my anecdotical evidence :) --Lrq3000 (discusscontribs) 13:22, 6 February 2017 (UTC)[reply]
But I can confirm that if you use % 256, the maths won't work. You need to use nb_of_elements - 1, whatever field you are using. --Lrq3000 (discusscontribs) 13:23, 6 February 2017 (UTC)[reply]
Log and antilog tables are based on powers of α, which don't include the value 0. Cyclic codes such as BCH Code based implementations of Reed Solomon have maximum length 255 for GF(256), non cyclic codes such as the original view Reed Solomon codes have maximum length 256. Rcgldr (discusscontribs) 21:36, 31 January 2018 (UTC)[reply]
Thank you very much Rcgldr for the clarification, I think you should add this to the article, this is a very useful explanation :-) --Lrq3000 (discusscontribs) 02:50, 9 February 2018 (UTC)[reply]

original view decoders[edit source]

Somehow these got missed in the wiki article, even though Berlekamp Welch (1986) has it's own wiki article, but the original article wasn't clear (it got a rewrite). The wiki article now also includes the Gao (extended Euclid) decoder for original view codewords. Rcgldr (discusscontribs) 17:22, 30 January 2018 (UTC)[reply]

Principles of error correction - bad / misleading example[edit source]

From the article: "If we now get five corruptions, for example on that ... * * * * 20 8 1 * ...". This is misleading, since it implies correction could be made with 5 erasures, when it can't even correct a 4 erasure case such as: "t h * * 20 8 * *", it's equally distant from "t h i s 20 8 9 19" and "t h a t 20 8 1 20". Rcgldr (discusscontribs) 05:28, 1 February 2018 (UTC)[reply]

Yes you're right, we could probably make the example closer to a real hamming distance of 5 erasures by adding to each number the total sum of all, which would then correctly differenciate between the t of that and the t of this, but this would complicate further the example and anyway I'm pretty sure we could find other edge cases (only symbols encoded in Galois Field and with a primitive polynomial can guarantee a Hamming distance in any case). This was just meant as a simple intuitive and not necessarily generalizable example, but if you have an idea to make it more mathematically correct you're welcome (or if you have a suggestion to make the sentence clearer). --Lrq3000 (discusscontribs) 02:59, 13 February 2018 (UTC)[reply]
The example adds 4 elements of redundancy, so at best, it could correct 4 erasures in the general case. As for a simple example that can handle 2 errors or 4 erasures, the dictionary could just ensure that each entry has a unique set of redundancy characters with no duplicates of any redundancy characters in the same position, such as thisabcd, thatbcde, corncdef (max dictionary size is 23). Assuming that comparisons of characters only result in equal or not equal, the hamming distance is 5. The 4 erasure case would require a single search of the dictionary with the 4 non-erased characters. The 2 error case would require 28 searches of the dictionary for 6 character subsets of 8 characters.Rcgldr (discusscontribs) 22:15, 17 February 2018 (UTC)[reply]

Why use QR codes as an example?[edit source]

In my opinion, the section on QR codes takes up too much space in a article about Reed Solomon coding. It would be sufficient to note the cases where it's used similar to the Wiki Reed Solomon article (cd's, dvd's, ... , no mention of hard drives internal ecc?). Rcgldr (discusscontribs) 05:31, 1 February 2018 (UTC)[reply]

This is a choice of the original author Bobmath, and indeed myself I never made a single QR code. I agree it's not necessary, but I think it gives some appeal to the article, as a decoy to provoke interest and introduce gently with a practical and common case often met by mobile app developers nowadays, so I think we should keep it. But I would be happy if someone could make this section more concise and simpler (as it adds a certain layer of complexity to an already complex topic). --Lrq3000 (discusscontribs) 03:01, 13 February 2018 (UTC)[reply]
That section could be greatly shortened by including a link to wiki QR code, along with some comments how RS is used for QR codes. Rcgldr (discusscontribs) 16:35, 17 February 2018 (UTC)[reply]

Not all Reed Solomon codes are BCH codes[edit source]

The original view Reed Solomon codes are not a class of BCH codes, since they don't use a fixed generator polynomial, instead the polynomial is based on the message and the encoder and decoder use the same set of evaluation points, although if the set of evaluation points are successive powers of α, the code becomes cyclic (a rotated code word would still be a "valid" codeword). The first original view "decoder" was impractical, and since the concurrently developed BCH codes had practical decoders, the Reed Solomon codes were switched to using BCH style generator polynomials, making them a class of BCH codes and a class of Reed Solomon codes. However, in 1986, an original view practical decoder (time complexity O(n^3)) was developed by Berlekamp and Welch, and later in 2002 an improved original view decoder was developed by Gao. There are also "list" decoders that can generate a list of potential original view polynomials in polynomial time, and work continues on these. These original view decoders are now included, along with examples, in the Wiki Reed Solomon article. Rcgldr (discusscontribs) 05:42, 1 February 2018 (UTC)[reply]

Thank you for your input that's very interesting, I must admit I don't know much about the history of error correction codes as I have learnt by myself (my background is in artificial intelligence, we don't study much about signal processing). However, I can't find your commits in the history, can you give me a few links (or maybe you meant you updated the wikipedia article, not the wikiversity one)? BTW if you have some info on how to implement a list decoder or a fast NTT decoder, **please** feel free to contribute to the article, as I would very much like to learn how to implement these approaches :-D --Lrq3000 (discusscontribs) 03:00, 9 February 2018 (UTC)[reply]
The original view decoders and history have been updated in the wiki article wiki Reed Solomon. The wiki article also mentions list decoders with perhaps one link. Seems like the work on list decoders is still ongoing. One issue with the original view decoders is that they are slower than BCH view decoders. This becomes clear when you compare Gao's original view extended Euclid algorithm versus Sugiyama's BCH view extended Euclid algorithm. Gao's method starts off with degree n and n-1 polynomials, while Sugiyama's starts off with n-k and n-k-1 degree polynomials. The number of iterations for both decoders is the same as the number of errors, and with quadratic time complexity, the difference in time is ~(n/(n-k))2. Berlekamp Welch original view decoder has cubic time complexity so significantly slower. The list decoders are slower still, and meant for "low rate" code (smaller number of elements in a codeword) but attempt to handle cases beyond the normal error bound, by finding a list of original view polynomials of degree k-1 that match "most" of the n elements in a received codeword with errors. This is the same as Reed Solomon's original concept, but with optimized algorithms that can produce a list in polynomial time, but as mentioned, they're too slow to be useful for RS(n,n-k) codes where n = 255 is a common size for a BCH 8 bit code. Rcgldr (discusscontribs) 03:45, 9 February 2018 (UTC)[reply]
I should mention that the list decoders are sacrificing mis-correction detection in favor of a good chance of recovery. One example would be image or video transmission, where occasional mis-corrections would be preferred to retransmission of data. It's not clear to me what is the motivation behind the original view decoder research. If there was some advantage for these with a low rate code, such as RAID 6 on 10 or fewer hard drives, they could be used, but to my knowledge, most or all RAID 6 error correction methods are based on BCH view RS codes. The main advantage is you get one more element, for GF(2^8) original view codeword size can be 256, instead of BCH limit of 255. Rcgldr (discusscontribs) 03:59, 9 February 2018 (UTC)[reply]
Yes but still even if list decoders are slower, the ability of decoding above the Singleton bound can balance the slowness in non-communication setups, such as data archival (as opposed to data storage where the user still needs to be able to fastly access/modify the files, hence why probably it was not implemented for RAID yet, but for data archival time does not matter as long as it provides a greater possibility of recovery). Also I had in mind that list decoding could be used on the same codes as standard RS, such that we could imagine first trying a fast standard decoder, and in case of failure (possibly detected by hash since RS has a hard time detecting failures when above the Singleton bound) we could do a list decoding. Is that not the case, are these two types of decoding incompatible?
List decoders use the original view: they produce a list of original view encoding polynomials of degree less than k (where k is the number of symbols in the original message), that match the most data points in a received message and a preference for the lowest degree encoding polynomial(s). The few articles I've read on this imply improvement in either speed or mis-correction is needed and the research is ongoing.Rcgldr (discusscontribs) 05:39, 13 February 2018 (UTC)[reply]
Follow up on list decoders. Note that list decoders don't actually correct beyond the normal limits. Instead they provide a list of possible original view polynomials, one of which may the correct polynomial, but there's no way to know which of the polynomials in the list of possible polynomials is the correct one. Having some type of additional information about the original data could narrow down the list, possibly to a single polynomial, but a list decoder on it's own will not do this. In rare cases, a list decoder may result in a single possible polynomial, but there's no guarantee that it's the correct polynomial unless there is a constraint on the maximum number of errors. Rcgldr (discusscontribs) 01:54, 1 June 2018 (UTC)[reply]
About the "fast speed", I was more refering to FFT and NTT codecs, I am very eager to learn how to perform these as I did not find any way to understand/implement them yet. If you have a good reference or if you can add a section about it on the article it would be AWESOME! :-D --Lrq3000 (discusscontribs) 03:07, 13 February 2018 (UTC)[reply]
I'm not familiar with the other codes like Turbo, FFT, NTT, ... . Fourier transform for BCH code is the same as syndrome calculation, except that n mapped values are generated instead of n-k syndromes. The wiki article explains how a Fourier based decoder works. Note that the error locator polynomial is generated in the same manner as any of the syndrome based BCH decoders, so although it's possible to correct a Fourier mapped codeword then use inverse Fourier transform to convert it back, it's just extra overhead, and mostly an academic exercise. Rcgldr (discusscontribs) 05:39, 13 February 2018 (UTC)[reply]

General comments about the article[edit source]

As noted above, original view Reed Solomon codes are not BCH codes, and slow but "practical" decoders for the original view encoding now exist, such as Berlekamp Welch (1986) and Gao (2002). Rcgldr (discusscontribs)

Polynomial addition and subtraction never involves carries or borrows between terms (coefficients) of a polynomial. Rcgldr (discusscontribs) 09:11, 9 February 2018 (UTC)[reply]

Very pertinent, thank you Rcgldr, I will think and try to implement your remarks in the article :-) --Lrq3000 (discusscontribs) 03:12, 13 February 2018 (UTC)[reply]

Reed Solomon codes can be based on GF(pm) for p ≥ 2 and m ≥ 1. The wiki Reed Solomon article uses GF(929) (929 is prime) as an example. As an unusual example, GF(32) based on 1x2+2x+2, with primitive α = 1x+0 could be used. These are examples where addition and subtraction are not xor operations, and need to be kept track of (knowing when to subtract versus add). Rcgldr (discusscontribs) 09:10, 9 February 2018 (UTC)[reply]

I am not sure I understand this exception to the rule? Normally polynomial addition and subtraction never involves carries or borrows, but then why here by simply changing the GF and the primitive polynomial we would need to account carries and borrows? --Lrq3000 (discusscontribs) 03:12, 13 February 2018 (UTC)[reply]
For any GF(pm), addition or subtraction of coefficients for the same degree terms is performed modulo p, which involves carries or borrows for p > 2. Carries or borrows don't go across terms of different degree. For example, (a x^2 + b x + c) + (d x^2 + e x + f) = (a+d mod p) x^2 + (b+e mod p) x + (c+f mod p). Note that a carry from b+e mod p is just dropped and not included in the a+d term. It just happens that in the case of p = 2, then addition and subtraction of single bit terms are the same as xor. For m > 1, then as noted, carries or borrows don't go across terms of different degree. Consider the field GF(7^3), using the same addition example, the resulting sum is (a+d mod 7) x^2 + (b+e mod 7) x + (c+f mod 7). Complicating matters is composite or nested fields, such as GF(((2^2)^2)^2), in which case you have one 8 bit field composed of two 4 bit fields, each of which is composed of two 2 bit fields, where all the math on like terms is modulo 2 so the same as xor. Rcgldr (discusscontribs) 09:34, 13 February 2018 (UTC)[reply]
Continuing with my prior response, (a x^2 + b x + c) - (d x^2 + e x + f) = (a-d mod p) x^2 + (b-e mod p) x + (c-f mod p). If p = 2, then addition and subtraction modulo 2 involve 1 bit values and the result is the same as xor, but the generic implementation using the first terms of the example is (a+d)mod p or (a-d)mod p. Note that unlike C's % remainder operator, in mathematics, and proper mathematical programming languages like APL, -1 mod 2 = +1, which is part of the reason that addition and subtraction modulo 2 is the same as xor. Since addition and subtraction are the same as xor for p = 2, then addition or subtraction with groups of bits can be performed in parallel using xor. Rcgldr (discusscontribs) 04:15, 15 February 2018 (UTC)[reply]

I suggest removing principles of error correction or greatly shortening it. Rcgldr (discusscontribs) 14:52, 28 April 2018 (UTC)[reply]

I suggest removing the QR code section, replacing it with a link. Rcgldr (discusscontribs) 14:52, 28 April 2018 (UTC)[reply]

I suggest removing the BCH section. Only one type of Reed Solomon code is one type of a BCH code. The original view Reed Solomon code is not a BCH code. Binary BCH codes use 1 bit coefficients for the generator polynomial in an otherwise GF(2^m) environment, requiring a complicated process to determine a "least common multiple" generator polynomial, a situation which does not exist for BCH view Reed Solomon codes. Rcgldr (discusscontribs) 20:02, 22 June 2018 (UTC)[reply]

In the intro section, it mentions barcodes as an example, but most barcodes don't use Reed Solomon (UPC barcodes use checkdigits). It also mentions QR Codes are visual, but the masking pattern would make this difficult, so software would be needed to show an "unmasked" bar code to make it visible, and even then the layout can be hard to follow. Rcgldr (discusscontribs) 02:28, 12 May 2018 (UTC)[reply]

@Rcgldr: Be bold. -- Dave Braunschweig (discusscontribs) 14:19, 23 June 2018 (UTC)[reply]
@Dave Braunschweig: - What's the goal for this article? I wrote a small 20 page primer ecc.pdf just to cover the basic concepts for co-workers. Nasa has a 120+ page primer nasa ecc pdf. Even my small primer seems like it would be too big for a wiki article and it's missing the commonly used decoders. Neither primer deals with non-binary fields or original view Reed Solomon codes. The current wiki article Reed Solomon includes examples and links to algorithms. Rcgldr (discusscontribs) 00:17, 25 June 2018 (UTC)[reply]
@Rcgldr: The goal for any Wikiversity resource that isn't someone's homework assignment or similar self-learning experience should be to help others learn something, generally as learn by doing. It should specifically not duplicate content available elsewhere (Wikipedia, Wikibooks, NASA articles, etc.). Much better (generally) to link to those sources and then provide examples of how to use or apply them.
That said, I know nothing about Reed-Solomon except that this article is typically in the top 100 to 200 articles on Wikiversity each month (Wikiversity:Statistics). People look for this subject, and look at it here. You would likely know better than most what they are looking for, and how the article here can provide a learning experience that Wikipedia cannot. -- Dave Braunschweig (discusscontribs) 01:23, 25 June 2018 (UTC)[reply]
@Rcgldr: To be honest, these parts are leftovers from the first author. I found them interesting and somewhat useful, but they might be unnecessary, but I'm not expert enough to know :-) If you have an idea how to explain the concepts without needing to refer to these old implementations, feel free do so :-) --Signimu (discusscontribs) 17:56, 9 April 2019 (UTC)[reply]
@Signimu: Much of the subject is covered in the main Wiki article or at the web sites linked to from the Wiki article. In my opinion, this article should be focused on specific implementations of popular RS codes, such as how to modify syndromes for erasures, Berlekamp Massey and Sugiyama extended Euclid algorithms, but Berlekamp Massey already has it's own Wiki page, and Sugiyama's algorithm is described in a pdf file linked to from the Wiki article. The lower level stuff, such as using log and antilog tables for multiplication and division in finite fields is also covered by other web sites. It's not clear to me what should be covered here. I have C source code for an interactive GF(2^8) eccdemo program here: eccdemo8.zip, but it includes 3 decoders and some other special interest stuff. It's Euclid decoder was written to emulate a hardware implementation, which may be a bit tricky to follow. Rcgldr (discusscontribs) 09:10, 11 April 2019 (UTC)[reply]

non binary implementations[edit source]

Any knowledge of implementations (preferably python) for RS/BCH codes under general GF(p^m)? In particular - GF(5), GF(4), GF(7) --Leonanavy (discusscontribs) 16:10, 5 March 2018 (UTC)[reply]

The main issue is add and subtract of polynomial terms is modulo p, and if basing a program on a binary (p = 2) implementation that uses xor for both add and subtract, the code needs to be updated to add or subtract (modulo p) appropriately. For Forney algorithm used for BCH view, the "formal" derivative of the error locator polynomial of the term is i field additions of the term , which can be implemented via multiplication of a field element by an integer: .Rcgldr (discusscontribs) 04:16, 7 March 2018 (UTC)[reply]
GF(4) is binary, it's assumed to be GF(2^2), since 4 is not prime. Similarly GF(9) is GF(3^2), and GF(81) is GF(3^4). I have C code for BCH view Reed Solomon code GF(7), GF(3^2), GF(7^2), GF(3^4), GF(929) (929 is prime) with PGZ (matrix), Berlekamp Massey, (discrepancy) and Sugiyama (extended Euclid) decoders. I also have C code for original view decoders such as Berlekamp Welch (matrix) and Gao (extended Euclid), for some of these fields. Even though it's not practical in most cases, I also did the original view original decoder that goes through all k of n subsets to find the most popular encoding polynomial. For my example programs for GF(p^m), p is easily changed, but m is hard coded in low level math routines for add, subtract, to initialize log and antilog tables, and for multiply field element by integer (used to create derivative of error locator polynomial for Forney algorithm). The wiki article for Reed Solomon includes examples (but not code) for both original view and BCH view decoders (all of these are GF(929)). Wiki Berlekamp Welch article uses GF(7). This article only has Berlekamp Massey BCH view decoder. It's missing the other decoders mentioned in the wiki article. Rcgldr (discusscontribs) 07:03, 21 March 2018 (UTC)[reply]

I missed the part about BCH code. "Classic" BCH code uses GF(p) for data and generator polynomial coefficients, while the rest of the environment operates within GF(p^m) (with m > 1). The generator polynomials are complicated least common multiples required to produce the desired Hamming distance. I'm not aware of a "classic" BCH code where p ≠ 2, but it could be implemented. Since this article is about Reed Solomon (everything is GF(p^m)), I don't think example BCH code for p ≠ 2 is needed. Rcgldr (discusscontribs) 15:15, 28 April 2018 (UTC)[reply]

Unexpected ZeroDivisionError[edit source]

I was porting this code to C#, so when I was ready to test my implementation with loads of random data, I got an divide by zero exception. This Python code also threw an ZeroDivisionError with this input.

Here is the input:

Galois field = the default for this implementation (QR code field).
n = 9
k = 3

initial message = [188,182,44]

corrupted message+ecc = [188,182,44,183,164,72,74,150,20]

erase_pos = [1,2,3,4,8]

This example input contains 1 error and 5 erasures, so decoding is expected to fail, changing the 72 to 73 will throw an ReedSolomonError as expected.

Sonic The Hedgehog LNK1123 (discusscontribs) 16:53, 16 January 2019 (UTC)[reply]

Code has been updated, this should be fixed now. Sonic The Hedgehog LNK1123 (discusscontribs) 16:13, 2 February 2019 (UTC)[reply]
@Sonic The Hedgehog LNK1123: Thank you :-) --Signimu (discusscontribs) 18:01, 9 April 2019 (UTC)[reply]

Maximum error detection bound not reached[edit source]

The current Python implementation does not reach the bound of detectable errors.

Various sources, including Wikipedia state the following: A Reed–Solomon code can detect (but not correct) any combination of up to erroneous symbols.

Example:

Galois field = the default for this implementation (QR code field).
n = 10
k = 6

erase_pos = None

initial message = [136,98,229,248,135,143]

initial message+ecc = [136,98,229,248,135,143,111,77,106,183]

corrupted message+ecc = [106,89,178,146,135,143,111,77,106,183]

corrupted positions = 0, 1, 2, 3 (4 corruptions, matching t in this example)

This input should have thrown an ReedSolomonError (because the number of corrupted symbols is equal to ), but instead we get the following output:

corrected message+ecc = [106,89,178,146,135,13,9,77,106,183]

Sonic The Hedgehog LNK1123 (discusscontribs) 18:28, 15 January 2022 (UTC)[reply]