Patchwork math-emu: correct test for downshifting fraction in _FP_FROM_INT()

Submitter: Mikael Pettersson
Date: July 19, 2010, 9:58 p.m.
Message ID: <19524.51858.992299.119315@pilspetsen.it.uu.se>
Permalink: /patch/59229/
State: Accepted
Delegated to: David Miller

Comments

Mikael Pettersson - July 19, 2010, 9:58 p.m.
The kernel's math-emu code contains a macro _FP_FROM_INT() which is
used to convert an integer to a raw normalized floating-point value.
It does this basically in three steps:

1. Compute the exponent from the number of leading zero bits.
2. Downshift large fractions to put the MSB in the right position
   for normalized fractions.
3. Upshift small fractions to put the MSB in the right position.

There is a boundary error in step 2, causing a fraction with its
MSB exactly one bit above the normalized MSB position to not be
downshifted.  This results in a non-normalized raw float, which when
packed becomes a massively inaccurate representation for that input.
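
The three steps and the effect of the boundary error can be sketched in
plain user-space C.  This is a simplified model, not the kernel macro:
it assumes single precision with 24 fraction bits (implicit bit
included) and 3 extra work bits, so the normalized MSB sits at bit 26,
and it assumes the packing step treats a set bit 27 as rounding
overflow.  The function name model_itof() and the "inclusive" flag are
hypothetical; the flag selects between the old "<" test and a "<=" test.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed single-precision parameters; the names only mirror math-emu. */
enum {
    FRACBITS  = 24,                  /* significand bits, implicit 1 included */
    WORKBITS  = 3,                   /* extra low bits kept for rounding */
    WFRACBITS = FRACBITS + WORKBITS, /* raw fraction width: MSB belongs at bit 26 */
    EXPBIAS   = 127,
    RSIZE     = 32                   /* width of the source integer */
};

/* Convert a nonzero unsigned 32-bit integer to IEEE single bits.
 * inclusive == 0 models the "<" comparison, inclusive != 0 models "<=". */
static uint32_t model_itof(uint32_t x, int inclusive)
{
    uint32_t frac = x;
    int e = RSIZE - 1;

    assert(x != 0);                  /* zero has no leading one bit */

    /* Step 1: the exponent is the index of the most significant set bit. */
    while (!(x >> e))
        e--;

    /* Step 2: downshift large fractions, folding lost bits into a sticky bit. */
    if (FRACBITS < RSIZE && (inclusive ? WFRACBITS <= e : WFRACBITS < e)) {
        int n = e - WFRACBITS + 1;
        uint32_t lost = frac & (((uint32_t)1 << n) - 1);
        frac = (frac >> n) | (lost != 0);
    }

    /* Step 3: upshift small fractions. */
    if (WFRACBITS - e - 1 > 0)
        frac <<= WFRACBITS - e - 1;

    /* Pack: round to nearest even on the work bits ... */
    if ((frac & 7) && (frac & 15) != 4)
        frac += 4;
    /* ... and treat a set bit WFRACBITS as rounding overflow (assumption). */
    if (frac & ((uint32_t)1 << WFRACBITS)) {
        frac &= ~((uint32_t)1 << WFRACBITS);
        e++;
    }
    frac >>= WORKBITS;

    return ((uint32_t)(e + EXPBIAS) << 23) | (frac & 0x7fffff);
}
```

Under these assumptions model_itof(0x08000000, 0) yields 0x4d800000,
which is the encoding of 0x10000000, while model_itof(0x08000000, 1)
yields 0x4d000000: the stray MSB at bit 27 is mistaken for a rounding
carry, which bumps the exponent and drops the leading bit.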

The impact of this depends on a number of arch-specific factors,
but it is known to have broken emulation of FXTOD instructions
on UltraSPARC III, which was originally reported as GCC bug 44631
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44631>.

Any arch which uses math-emu to emulate conversions from integers to
same-size floats may be affected.

The fix is simple: the exponent comparison used to determine if the
fraction should be downshifted must be "<=" not "<".
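
To see which inputs the changed comparison affects, the two tests can
be compared for every possible MSB position.  The sketch below assumes
the single-precision working fraction is 27 bits wide (the constant
WFRACBITS_S is an assumption made for illustration) and that the source
is a 32-bit integer; the helper names are hypothetical.

```c
#include <stdint.h>

enum { WFRACBITS_S = 27 };  /* assumed width of the raw working fraction */

/* Old and new downshift tests; x_e is the bit index of the input's MSB. */
static int downshift_old(int x_e)   { return WFRACBITS_S <  x_e; }
static int downshift_fixed(int x_e) { return WFRACBITS_S <= x_e; }

/* Nonzero exactly when the two tests give different answers. */
static int tests_disagree(int x_e)
{
    return downshift_old(x_e) != downshift_fixed(x_e);
}

/* Smallest and largest 32-bit inputs whose MSB is at bit x_e (0..31). */
static uint32_t range_lo(int x_e) { return (uint32_t)1 << x_e; }
static uint32_t range_hi(int x_e) { return (((uint32_t)1 << x_e) << 1) - 1; }
```

The tests disagree only for x_e == 27, so under these assumptions the
affected 32-bit inputs are 0x08000000 through 0x0fffffff.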

I'm sending a kernel module to test this as a reply to this message.
There are also SPARC user-space test cases in the GCC bug entry.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
---
 include/math-emu/op-common.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mikael Pettersson - July 19, 2010, 10:12 p.m.
And here's the test module.

To illustrate the bug it uses math-emu to convert a series of
single-bit power-of-two ints to corresponding floats and back,
and prints the intermediate representations and the differences
between the original values and the final ones.  These ints all
have exact float representations so there should be no differences.
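
For reference, the same round trip can be sketched in user space with
the compiler's native conversions.  This is not the test module; it
assumes float is IEEE 754 binary32, and the function names are
hypothetical.  It only shows what a correct int -> float -> int round
trip looks like for these values.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Return the raw bit pattern of a float (assumed to be binary32). */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Convert x to float and back; store the float's bits and the value
 * converted back, and return the difference from the original. */
static int64_t roundtrip_diff(int32_t x, uint32_t *bits, int64_t *back)
{
    float f = (float)x;
    int64_t b = (int64_t)f;   /* 64-bit target: always in range here */

    if (bits)
        *bits = float_bits(f);
    if (back)
        *back = b;
    return b - (int64_t)x;
}

/* Print one line in the same layout as the module's kernel messages. */
static void report(int32_t x)
{
    uint32_t bits;
    int64_t back;
    int64_t diff = roundtrip_diff(x, &bits, &back);

    printf("test_itof: 0x%08x -> 0x%08x -> 0x%08x, diff %lld\n",
           (unsigned)x, (unsigned)bits, (unsigned)back, (long long)diff);
}
```

Calling report(0x08000000) on such a platform prints a difference of 0,
with 0x4d000000 as the intermediate float representation.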

But for 0x08000000 the conversion goes wrong.  The test case
then converts a few more numbers between this one and the next
power of two (0x10000000), which did convert correctly.

With an unpatched kernel you get the following kernel messages
after insmod:

mathemu_test: init
test_itof: 0x40000000 -> 0x4e800000 -> 0x40000000, diff 0
test_itof: 0x20000000 -> 0x4e000000 -> 0x20000000, diff 0
test_itof: 0x10000000 -> 0x4d800000 -> 0x10000000, diff 0
test_itof: 0x08000000 -> 0x4d800000 -> 0x10000000, diff 134217728
test_itof: 0x04000000 -> 0x4c800000 -> 0x04000000, diff 0
test_itof: 0x02000000 -> 0x4c000000 -> 0x02000000, diff 0
test_itof: 0x01000000 -> 0x4b800000 -> 0x01000000, diff 0
test_itof: 0x00800000 -> 0x4b000000 -> 0x00800000, diff 0
test_itof: 0x0f000000 -> 0x4de00000 -> 0x1c000000, diff 218103808
test_itof: 0x0e000000 -> 0x4dc00000 -> 0x18000000, diff 167772160
test_itof: 0x0d000000 -> 0x4da00000 -> 0x14000000, diff 117440512
test_itof: 0x0c000000 -> 0x4d800000 -> 0x10000000, diff 67108864
test_itof: 0x0b000000 -> 0x4de00000 -> 0x1c000000, diff 285212672
test_itof: 0x0a000000 -> 0x4dc00000 -> 0x18000000, diff 234881024
test_itof: 0x09000000 -> 0x4da00000 -> 0x14000000, diff 184549376
test_itof: 0x08000000 -> 0x4d800000 -> 0x10000000, diff 134217728
test_itof: 0x07000000 -> 0x4ce00000 -> 0x07000000, diff 0

With the patch applied, you instead get this:

mathemu_test: init
test_itof: 0x40000000 -> 0x4e800000 -> 0x40000000, diff 0
test_itof: 0x20000000 -> 0x4e000000 -> 0x20000000, diff 0
test_itof: 0x10000000 -> 0x4d800000 -> 0x10000000, diff 0
test_itof: 0x08000000 -> 0x4d000000 -> 0x08000000, diff 0
test_itof: 0x04000000 -> 0x4c800000 -> 0x04000000, diff 0
test_itof: 0x02000000 -> 0x4c000000 -> 0x02000000, diff 0
test_itof: 0x01000000 -> 0x4b800000 -> 0x01000000, diff 0
test_itof: 0x00800000 -> 0x4b000000 -> 0x00800000, diff 0
test_itof: 0x0f000000 -> 0x4d700000 -> 0x0f000000, diff 0
test_itof: 0x0e000000 -> 0x4d600000 -> 0x0e000000, diff 0
test_itof: 0x0d000000 -> 0x4d500000 -> 0x0d000000, diff 0
test_itof: 0x0c000000 -> 0x4d400000 -> 0x0c000000, diff 0
test_itof: 0x0b000000 -> 0x4d300000 -> 0x0b000000, diff 0
test_itof: 0x0a000000 -> 0x4d200000 -> 0x0a000000, diff 0
test_itof: 0x09000000 -> 0x4d100000 -> 0x09000000, diff 0
test_itof: 0x08000000 -> 0x4d000000 -> 0x08000000, diff 0
test_itof: 0x07000000 -> 0x4ce00000 -> 0x07000000, diff 0
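
The middle column of these logs is a raw IEEE single-precision bit
pattern.  A small decoder makes the lines easier to read; the sketch
below handles only normal, positive values whose unbiased exponent is
between 23 and 62, which covers every value shown above.  The struct
and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct sp_fields {
    unsigned sign;         /* 1 for negative values */
    int      exponent;     /* unbiased exponent */
    uint32_t significand;  /* 24 bits, implicit leading 1 restored */
};

/* Split a normal binary32 bit pattern into its fields. */
static struct sp_fields sp_decode(uint32_t bits)
{
    struct sp_fields f;

    f.sign = bits >> 31;
    f.exponent = (int)((bits >> 23) & 0xff) - 127;
    f.significand = (bits & 0x7fffff) | 0x800000;
    return f;
}

/* Integer value of a positive normal single with 23 <= exponent <= 62. */
static uint64_t sp_to_u64(uint32_t bits)
{
    struct sp_fields f = sp_decode(bits);

    assert(f.sign == 0 && f.exponent >= 23 && f.exponent <= 62);
    return (uint64_t)f.significand << (f.exponent - 23);
}
```

For example, 0x4d800000 decodes to exponent 28 with an empty fraction,
that is 0x10000000, whereas the patched result 0x4d000000 decodes to
exponent 27, that is 0x08000000.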

Unfortunately it seems difficult to write a generic module
which uses math-emu:
- <math-emu/soft-fp.h> includes <asm/sfp-machine.h>,
  but only a handful of archs have it
- <asm/sfp-machine.h> isn't always self-contained and may depend
  on various $arch-specific declarations being present

The given test module works on sparc64 and ppc64, where it uses
the kernel's sfp-machine.h, and on x86 where it uses a stub
sfp-machine.h supplied by itself.  I tried to cross-compile it
for alpha, but that failed due to its sfp-machine.h not being
self-contained.  I didn't try sh or s390.

/Mikael
David Miller - July 19, 2010, 10:12 p.m.
Thanks for fixing this Mikael:

Acked-by: David S. Miller <davem@davemloft.net>

Has anyone done an audit to compare the copies of math-emu in glibc, gcc,
and the Linux kernel so that we don't have bugs living in some places
but not others?

These sources really need to be consolidated somehow.

Martin Schwidefsky - July 20, 2010, 7:34 a.m.
On Tue, 20 Jul 2010 00:12:02 +0200
Mikael Pettersson <mikpe@it.uu.se> wrote:

> Unfortunately it seems difficult to write a generic module
> which uses math-emu:
> - <math-emu/soft-fp.h> includes <asm/sfp-machine.h>,
>   but only a handful of archs have it
> - <asm/sfp-machine.h> isn't always self-contained and may depend
>   on various $arch-specific declarations being present
> 
> The given test module works on sparc64 and ppc64, where it uses
> the kernel's sfp-machine.h, and on x86 where it uses a stub
> sfp-machine.h supplied by itself.  I tried to cross-compile it
> for alpha, but that failed due to its sfp-machine.h not being
> self-contained.  I didn't try sh or s390.

It would be a challenge to try this on s390. The math emulation code is
only used for really old 31-bit machines. Starting with the G5, the FPU
can do IEEE 754, so I would say the math emulation code is irrelevant for
s390 by now.
Mikael Pettersson - July 20, 2010, 1:35 p.m.
I forgot to mention that this needs to be backported to older kernels,
so can the maintainer who picks this up please add

	Cc: stable@kernel.org

Thanks,

/Mikael

David Miller - July 21, 2010, 1:45 a.m.
From: Mikael Pettersson <mikpe@it.uu.se>
Date: Mon, 19 Jul 2010 23:58:42 +0200

> The kernel's math-emu code contains a macro _FP_FROM_INT() which is
> used to convert an integer to a raw normalized floating-point value.
> It does this basically in three steps:
 ...
> The fix is simple: the exponent comparison used to determine if the
> fraction should be downshifted must be "<=" not "<".
> 
> I'm sending a kernel module to test this as a reply to this message.
> There are also SPARC user-space test cases in the GCC bug entry.
> 
> Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>

Applied and I'll make sure this gets into -stable too.

Thanks!

Patch

diff -rupN linux-2.6.35-rc5/include/math-emu/op-common.h linux-2.6.35-rc5.mathemu-FP_FROM_INT-fraction-downshift-condition/include/math-emu/op-common.h
--- linux-2.6.35-rc5/include/math-emu/op-common.h	2010-05-17 19:51:32.000000000 +0200
+++ linux-2.6.35-rc5.mathemu-FP_FROM_INT-fraction-downshift-condition/include/math-emu/op-common.h	2010-07-18 22:33:46.000000000 +0200
@@ -799,7 +799,7 @@  do {									\
 		X##_e -= (_FP_W_TYPE_SIZE - rsize);			\
 	X##_e = rsize - X##_e - 1;					\
 									\
-	if (_FP_FRACBITS_##fs < rsize && _FP_WFRACBITS_##fs < X##_e)	\
+	if (_FP_FRACBITS_##fs < rsize && _FP_WFRACBITS_##fs <= X##_e)	\
 	  __FP_FRAC_SRS_1(ur_, (X##_e - _FP_WFRACBITS_##fs + 1), rsize);\
 	_FP_FRAC_DISASSEMBLE_##wc(X, ur_, rsize);			\
 	if ((_FP_WFRACBITS_##fs - X##_e - 1) > 0)			\