Submitted by Mikael Pettersson on July 19, 2010, 9:58 p.m.

Message ID | 19524.51858.992299.119315@pilspetsen.it.uu.se |
---|---|

State | Accepted |

Delegated to: | David Miller |


Mikael Pettersson writes:
> The kernel's math-emu code contains a macro _FP_FROM_INT() which is
> used to convert an integer to a raw normalized floating-point value.
> It does this basically in three steps:
>
> 1. Compute the exponent from the number of leading zero bits.
> 2. Downshift large fractions to put the MSB in the right position
>    for normalized fractions.
> 3. Upshift small fractions to put the MSB in the right position.
>
> There is a boundary error in step 2, causing a fraction with its
> MSB exactly one bit above the normalized MSB position to not be
> downshifted. This results in a non-normalized raw float, which when
> packed becomes a massively inaccurate representation for that input.
>
> The impact of this depends on a number of arch-specific factors,
> but it is known to have broken emulation of FXTOD instructions
> on UltraSPARC III, which was originally reported as GCC bug 44631
> <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44631>.
>
> Any arch which uses math-emu to emulate conversions from integers to
> same-size floats may be affected.
>
> The fix is simple: the exponent comparison used to determine if the
> fraction should be downshifted must be "<=" not "<".
>
> I'm sending a kernel module to test this as a reply to this message.
> There are also SPARC user-space test cases in the GCC bug entry.

And here's the test module. To illustrate the bug it uses math-emu to
convert a series of single-bit power-of-two ints to corresponding floats
and back, and prints the intermediate representations and the differences
between the original values and the final ones. These ints all have exact
float representations, so there should be no differences. But for
0x08000000 the conversion goes wrong. The test case then converts a few
more numbers near this one, up to the next power of two, which did
convert OK.

With an unpatched kernel you get the following kernel messages after insmod:

mathemu_test: init
test_itof: 0x40000000 -> 0x4e800000 -> 0x40000000, diff 0
test_itof: 0x20000000 -> 0x4e000000 -> 0x20000000, diff 0
test_itof: 0x10000000 -> 0x4d800000 -> 0x10000000, diff 0
test_itof: 0x08000000 -> 0x4d800000 -> 0x10000000, diff 134217728
test_itof: 0x04000000 -> 0x4c800000 -> 0x04000000, diff 0
test_itof: 0x02000000 -> 0x4c000000 -> 0x02000000, diff 0
test_itof: 0x01000000 -> 0x4b800000 -> 0x01000000, diff 0
test_itof: 0x00800000 -> 0x4b000000 -> 0x00800000, diff 0
test_itof: 0x0f000000 -> 0x4de00000 -> 0x1c000000, diff 218103808
test_itof: 0x0e000000 -> 0x4dc00000 -> 0x18000000, diff 167772160
test_itof: 0x0d000000 -> 0x4da00000 -> 0x14000000, diff 117440512
test_itof: 0x0c000000 -> 0x4d800000 -> 0x10000000, diff 67108864
test_itof: 0x0b000000 -> 0x4de00000 -> 0x1c000000, diff 285212672
test_itof: 0x0a000000 -> 0x4dc00000 -> 0x18000000, diff 234881024
test_itof: 0x09000000 -> 0x4da00000 -> 0x14000000, diff 184549376
test_itof: 0x08000000 -> 0x4d800000 -> 0x10000000, diff 134217728
test_itof: 0x07000000 -> 0x4ce00000 -> 0x07000000, diff 0

With the patch applied, you instead get this:

mathemu_test: init
test_itof: 0x40000000 -> 0x4e800000 -> 0x40000000, diff 0
test_itof: 0x20000000 -> 0x4e000000 -> 0x20000000, diff 0
test_itof: 0x10000000 -> 0x4d800000 -> 0x10000000, diff 0
test_itof: 0x08000000 -> 0x4d000000 -> 0x08000000, diff 0
test_itof: 0x04000000 -> 0x4c800000 -> 0x04000000, diff 0
test_itof: 0x02000000 -> 0x4c000000 -> 0x02000000, diff 0
test_itof: 0x01000000 -> 0x4b800000 -> 0x01000000, diff 0
test_itof: 0x00800000 -> 0x4b000000 -> 0x00800000, diff 0
test_itof: 0x0f000000 -> 0x4d700000 -> 0x0f000000, diff 0
test_itof: 0x0e000000 -> 0x4d600000 -> 0x0e000000, diff 0
test_itof: 0x0d000000 -> 0x4d500000 -> 0x0d000000, diff 0
test_itof: 0x0c000000 -> 0x4d400000 -> 0x0c000000, diff 0
test_itof: 0x0b000000 -> 0x4d300000 -> 0x0b000000, diff 0
test_itof: 0x0a000000 -> 0x4d200000 -> 0x0a000000, diff 0
test_itof: 0x09000000 -> 0x4d100000 -> 0x09000000, diff 0
test_itof: 0x08000000 -> 0x4d000000 -> 0x08000000, diff 0
test_itof: 0x07000000 -> 0x4ce00000 -> 0x07000000, diff 0

Unfortunately it seems difficult to write a generic module
which uses math-emu:
- <math-emu/soft-fp.h> includes <asm/sfp-machine.h>,
  but only a handful of archs have it
- <asm/sfp-machine.h> isn't always self-contained and may depend
  on various $arch-specific declarations being present

The given test module works on sparc64 and ppc64, where it uses
the kernel's sfp-machine.h, and on x86, where it uses a stub
sfp-machine.h supplied by itself. I tried to cross-compile it
for alpha, but that failed due to its sfp-machine.h not being
self-contained. I didn't try sh or s390.

/Mikael
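For comparison with the logs above, the same int -> float -> int round trip can be sketched in user space. This is not the test module from the thread: it relies on the compiler's native conversion rather than math-emu, assumes that `float` is IEEE-754 single precision, and the names `float_bits` and `test_itof` are hypothetical stand-ins chosen to mirror the log format. On a platform with correct conversion every input from the logs should give a difference of 0.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float.
 * Assumes float is a 32-bit IEEE-754 single. */
static uint32_t float_bits(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Convert int -> float -> int using the native conversion (not math-emu),
 * print one line in the same layout as the kernel log, and return the
 * difference. Only meant for inputs whose float value still fits in
 * int32_t, such as the values used in the logs. */
static int32_t test_itof(int32_t i)
{
    float f = (float)i;
    int32_t back = (int32_t)f;

    printf("test_itof: 0x%08x -> 0x%08x -> 0x%08x, diff %d\n",
           (unsigned)i, (unsigned)float_bits(f), (unsigned)back,
           (int)(back - i));
    return back - i;
}
```

Calling `test_itof(0x08000000)` on such a platform gives the reference value the emulated conversion is expected to match.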

From: Mikael Pettersson <mikpe@it.uu.se>
Date: Mon, 19 Jul 2010 23:58:42 +0200

> The kernel's math-emu code contains a macro _FP_FROM_INT() which is
> used to convert an integer to a raw normalized floating-point value.
> It does this basically in three steps:
>
> 1. Compute the exponent from the number of leading zero bits.
> 2. Downshift large fractions to put the MSB in the right position
>    for normalized fractions.
> 3. Upshift small fractions to put the MSB in the right position.
>
> There is a boundary error in step 2, causing a fraction with its
> MSB exactly one bit above the normalized MSB position to not be
> downshifted. This results in a non-normalized raw float, which when
> packed becomes a massively inaccurate representation for that input.
>
> The impact of this depends on a number of arch-specific factors,
> but it is known to have broken emulation of FXTOD instructions
> on UltraSPARC III, which was originally reported as GCC bug 44631
> <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44631>.
>
> Any arch which uses math-emu to emulate conversions from integers to
> same-size floats may be affected.
>
> The fix is simple: the exponent comparison used to determine if the
> fraction should be downshifted must be "<=" not "<".
>
> I'm sending a kernel module to test this as a reply to this message.
> There are also SPARC user-space test cases in the GCC bug entry.
>
> Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>

Thanks for fixing this, Mikael:

Acked-by: David S. Miller <davem@davemloft.net>

Has anyone done an audit to compare the copies of math-emu in glibc,
gcc, and the Linux kernel so that we don't have bugs living in some
places but not others?

These sources really need to be consolidated somehow.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Tue, 20 Jul 2010 00:12:02 +0200
Mikael Pettersson <mikpe@it.uu.se> wrote:

> Unfortunately it seems difficult to write a generic module
> which uses math-emu:
> - <math-emu/soft-fp.h> includes <asm/sfp-machine.h>,
>   but only a handful of archs have it
> - <asm/sfp-machine.h> isn't always self-contained and may depend
>   on various $arch-specific declarations being present
>
> The given test module works on sparc64 and ppc64, where it uses
> the kernel's sfp-machine.h, and on x86 where it uses a stub
> sfp-machine.h supplied by itself. I tried to cross-compile it
> for alpha, but that failed due to its sfp-machine.h not being
> self-contained. I didn't try sh or s390.

It would be a challenge to try this on s390. The math emulation code
is only used for really old 31-bit machines. Starting with the G5, the
FPU can do IEEE 754, so I would say the math emulation code is
irrelevant for s390 by now.

Mikael Pettersson writes:
> The kernel's math-emu code contains a macro _FP_FROM_INT() which is
> used to convert an integer to a raw normalized floating-point value.
> It does this basically in three steps:
>
> 1. Compute the exponent from the number of leading zero bits.
> 2. Downshift large fractions to put the MSB in the right position
>    for normalized fractions.
> 3. Upshift small fractions to put the MSB in the right position.
>
> There is a boundary error in step 2, causing a fraction with its
> MSB exactly one bit above the normalized MSB position to not be
> downshifted. This results in a non-normalized raw float, which when
> packed becomes a massively inaccurate representation for that input.
>
> The impact of this depends on a number of arch-specific factors,
> but it is known to have broken emulation of FXTOD instructions
> on UltraSPARC III, which was originally reported as GCC bug 44631
> <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44631>.
>
> Any arch which uses math-emu to emulate conversions from integers to
> same-size floats may be affected.
>
> The fix is simple: the exponent comparison used to determine if the
> fraction should be downshifted must be "<=" not "<".
>
> I'm sending a kernel module to test this as a reply to this message.
> There are also SPARC user-space test cases in the GCC bug entry.
>
> Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>

I forgot to mention that this needs to be backported to older kernels,
so can the maintainer who picks this up please add

Cc: stable@kernel.org

Thanks,

/Mikael

> ---
>  include/math-emu/op-common.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff -rupN linux-2.6.35-rc5/include/math-emu/op-common.h linux-2.6.35-rc5.mathemu-FP_FROM_INT-fraction-downshift-condition/include/math-emu/op-common.h
> --- linux-2.6.35-rc5/include/math-emu/op-common.h 2010-05-17 19:51:32.000000000 +0200
> +++ linux-2.6.35-rc5.mathemu-FP_FROM_INT-fraction-downshift-condition/include/math-emu/op-common.h 2010-07-18 22:33:46.000000000 +0200
> @@ -799,7 +799,7 @@ do { \
>          X##_e -= (_FP_W_TYPE_SIZE - rsize); \
>      X##_e = rsize - X##_e - 1; \
>      \
> -    if (_FP_FRACBITS_##fs < rsize && _FP_WFRACBITS_##fs < X##_e) \
> +    if (_FP_FRACBITS_##fs < rsize && _FP_WFRACBITS_##fs <= X##_e) \
>        __FP_FRAC_SRS_1(ur_, (X##_e - _FP_WFRACBITS_##fs + 1), rsize);\
>      _FP_FRAC_DISASSEMBLE_##wc(X, ur_, rsize); \
>      if ((_FP_WFRACBITS_##fs - X##_e - 1) > 0) \

From: Mikael Pettersson <mikpe@it.uu.se>
Date: Mon, 19 Jul 2010 23:58:42 +0200

> The kernel's math-emu code contains a macro _FP_FROM_INT() which is
> used to convert an integer to a raw normalized floating-point value.
> It does this basically in three steps:
 ...
> The fix is simple: the exponent comparison used to determine if the
> fraction should be downshifted must be "<=" not "<".
>
> I'm sending a kernel module to test this as a reply to this message.
> There are also SPARC user-space test cases in the GCC bug entry.
>
> Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>

Applied and I'll make sure this gets into -stable too.

Thanks!

diff -rupN linux-2.6.35-rc5/include/math-emu/op-common.h linux-2.6.35-rc5.mathemu-FP_FROM_INT-fraction-downshift-condition/include/math-emu/op-common.h
--- linux-2.6.35-rc5/include/math-emu/op-common.h 2010-05-17 19:51:32.000000000 +0200
+++ linux-2.6.35-rc5.mathemu-FP_FROM_INT-fraction-downshift-condition/include/math-emu/op-common.h 2010-07-18 22:33:46.000000000 +0200
@@ -799,7 +799,7 @@ do { \
         X##_e -= (_FP_W_TYPE_SIZE - rsize); \
     X##_e = rsize - X##_e - 1; \
     \
-    if (_FP_FRACBITS_##fs < rsize && _FP_WFRACBITS_##fs < X##_e) \
+    if (_FP_FRACBITS_##fs < rsize && _FP_WFRACBITS_##fs <= X##_e) \
       __FP_FRAC_SRS_1(ur_, (X##_e - _FP_WFRACBITS_##fs + 1), rsize);\
     _FP_FRAC_DISASSEMBLE_##wc(X, ur_, rsize); \
     if ((_FP_WFRACBITS_##fs - X##_e - 1) > 0) \

The kernel's math-emu code contains a macro _FP_FROM_INT() which is
used to convert an integer to a raw normalized floating-point value.
It does this basically in three steps:

1. Compute the exponent from the number of leading zero bits.
2. Downshift large fractions to put the MSB in the right position
   for normalized fractions.
3. Upshift small fractions to put the MSB in the right position.

There is a boundary error in step 2, causing a fraction with its
MSB exactly one bit above the normalized MSB position to not be
downshifted. This results in a non-normalized raw float, which when
packed becomes a massively inaccurate representation for that input.

The impact of this depends on a number of arch-specific factors,
but it is known to have broken emulation of FXTOD instructions
on UltraSPARC III, which was originally reported as GCC bug 44631
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44631>.

Any arch which uses math-emu to emulate conversions from integers to
same-size floats may be affected.

The fix is simple: the exponent comparison used to determine if the
fraction should be downshifted must be "<=" not "<".

I'm sending a kernel module to test this as a reply to this message.
There are also SPARC user-space test cases in the GCC bug entry.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
---
 include/math-emu/op-common.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
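
The three steps and the boundary case can be illustrated with a small stand-alone model for 32-bit integers and single-precision floats. This is only a sketch, not the kernel macro: the function `model_itof`, the constant names, the choice of 3 working bits, and the simplified packing step (round to nearest even, fold a fraction carry into the exponent, drop the working bits) are all assumptions made for illustration. The `fixed` argument selects between the old "<" test and the corrected "<=" test.

```c
#include <stdint.h>

/* Single-precision layout constants; the names are illustrative only. */
#define FRACBITS  24                     /* fraction bits incl. hidden bit */
#define WORKBITS  3                      /* assumed extra bits for rounding */
#define WFRACBITS (FRACBITS + WORKBITS)  /* width of the raw fraction */
#define EXPBIAS   127

/*
 * Model of converting an unsigned 32-bit value to IEEE-754 single bits.
 * fixed == 0 uses the old "<" downshift test, fixed != 0 uses "<=".
 */
static uint32_t model_itof(uint32_t u, int fixed)
{
    uint32_t frac = u;
    int e = 0;

    if (u == 0)
        return 0;

    /* Step 1: the exponent is the bit position of the MSB. */
    while ((u >> e) > 1)
        e++;

    /* Step 2: downshift large fractions, keeping a sticky bit. */
    if (fixed ? (WFRACBITS <= e) : (WFRACBITS < e)) {
        int n = e - WFRACBITS + 1;
        uint32_t lost = frac & ((1u << n) - 1);

        frac = (frac >> n) | (lost != 0);
    }

    /* Step 3: upshift small fractions. */
    if (WFRACBITS - e - 1 > 0)
        frac <<= (WFRACBITS - e - 1);

    /* Pack (assumed, simplified): round to nearest even, treat a set
     * bit at position WFRACBITS as a rounding carry, drop working bits. */
    if ((frac & 15) != 4)
        frac += 4;
    if (frac & (1u << WFRACBITS)) {
        frac &= ~(1u << WFRACBITS);
        e++;
    }
    frac >>= WORKBITS;

    return ((uint32_t)(e + EXPBIAS) << 23) | (frac & 0x7fffff);
}
```

In this model, `model_itof(0x08000000, 0)` returns 0x4d800000 and `model_itof(0x08000000, 1)` returns 0x4d000000. With the "<" test the MSB stays at bit 27, one above the normalized position, and the packing step mistakes it for a rounding carry, so the exponent is bumped and the value doubles.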