@@ -273,9 +273,9 @@ byte. A quick analysis:
for the column parity we use the par variable. When extending to 32 bits
we can in the end easily calculate rp0 and rp1 from it.
(because par now consists of 4 bytes, contributing to rp1, rp0, rp1, rp0
-respectively, from MSB to LSB in little endian mode)
+respectively, from MSB to LSB)
also rp2 and rp3 can be easily retrieved from par as rp3 covers the
-first two MSBs and rp2 covers the last two LSBs in little endian mode.
+first two MSBs and rp2 covers the last two LSBs.
Note that of course now the loop is executed only 64 times (256/4).
And note that care must taken wrt byte ordering. The way bytes are
@@ -387,7 +387,7 @@ Analysis 2
The code (of course) works, and hurray: we are a little bit faster than
the linux driver code (about 15%). But wait, don't cheer too quickly.
-THere is more to be gained.
+There is more to be gained.
If we look at e.g. rp14 and rp15 we see that we either xor our data with
rp14 or with rp15. However we also have par which goes over all data.
This means there is no need to calculate rp14 as it can be calculated from