Message ID | 7FDC38C0-90D4-4A32-8FD5-ED65DC469055@imgtec.com |
---|---|
State | New |
Headers | show |
On 12/21/2016 11:54 AM, Yunqiang Su wrote: > By this patch, I add a build-time option ` --with-unfused-madd4=yes/no’, > and runtime option -m(no-)unfused-madd4, > to disable generate madd.fmt instructions. Your patch also needs a documentation change so that the new command-line option is listed in the GCC manual with other MIPS target options. -Sandra
Sandra Loosemore <sandra@codesourcery.com> writes: > On 12/21/2016 11:54 AM, Yunqiang Su wrote: > > By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', > > and runtime option -m(no-)unfused-madd4, > > to disable generate madd.fmt instructions. > > Your patch also needs a documentation change so that the new > command-line option is listed in the GCC manual with other MIPS target > options. Any opinions on option names to control this? Is it best to target the specific feature that is non-compliant on loongson or apply a general -mfix-loongson type option? I'm not sure I have a strong opinion either way but there do seem to be multiple possible variants. Thanks, Matthew
Matthew Fortune <Matthew.Fortune@imgtec.com> writes: > Sandra Loosemore <sandra@codesourcery.com> writes: >> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >> > By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', >> > and runtime option -m(no-)unfused-madd4, >> > to disable generate madd.fmt instructions. >> >> Your patch also needs a documentation change so that the new >> command-line option is listed in the GCC manual with other MIPS target >> options. > > Any opinions on option names to control this? Is it best to target the specific > feature that is non-compliant on loongson or apply a general -mfix-loongson > type option? > > I'm not sure I have a strong opinion either way but there do seem to be > multiple possible variants. Wasn't sure from this thread whether Loongson simply had a fused implementation (without intermediate rounding) or whether the instructions gave numerically incorrect results for some inputs. It sounds from a later thread like it's generating incorrect results, is that right? If so, then FWIW I agree an -mfix option would be more consistent. E.g. one of the -mfix-vr4120 errata was an incorrect integer division result and one of the -mfix-sb1 errata was an incorrect single-precision float division result. The latter case could have been handled by an option to disable DIV.S and DIV.PS, but the -mfix option gave more control. If instead the problem is that the instructions are fused then that's also what the original MIPS 4 parts did, so maybe an option to control fusedness would make sense. Thanks, Richard
> 在 2016年12月22日,23:31,Richard Sandiford <rdsandiford@googlemail.com> 写道: > > Matthew Fortune <Matthew.Fortune@imgtec.com> writes: >> Sandra Loosemore <sandra@codesourcery.com> writes: >>> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >>>> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', >>>> and runtime option -m(no-)unfused-madd4, >>>> to disable generate madd.fmt instructions. >>> >>> Your patch also needs a documentation change so that the new >>> command-line option is listed in the GCC manual with other MIPS target >>> options. >> >> Any opinions on option names to control this? Is it best to target the specific >> feature that is non-compliant on loongson or apply a general -mfix-loongson >> type option? >> >> I'm not sure I have a strong opinion either way but there do seem to be >> multiple possible variants. > > Wasn't sure from this thread whether Loongson simply had a fused > implementation (without intermediate rounding) or whether the > instructions gave numerically incorrect results for some inputs. I test to define ISA_HAS_FUSED_MADD4 true and define ISA_HAS_UNFUSED_MADD4 false, and try to build a test case. With ISA_HAS_FUSED_MADD4, the result is about 1e-17, and with ISA_HAS_UNFUSED_MADD4, the result is about 1e-17, both of the are incorrect (the expect value is 0). The test case is #include <stdio.h> double a = 0.6; double b = 0.4; double c = 0.6; double d = 0.4; int main(void) { double x = a * b - c * d; printf("%le\n", x); return 0; } > It sounds from a later thread like it's generating incorrect results, > is that right? If so, then FWIW I agree an -mfix option would be more > consistent. E.g. one of the -mfix-vr4120 errata was an incorrect > integer division result and one of the -mfix-sb1 errata was an incorrect > single-precision float division result. The latter case could have been > handled by an option to disable DIV.S and DIV.PS, but the -mfix option > gave more control. > > If instead the problem is that the instructions are fused then that's > also what the original MIPS 4 parts did, so maybe an option to control > fusedness would make sense. The result to thread it fused or unfused, is different, while neither of them is correct. > > Thanks, > Richard
> 在 2016年12月22日,23:48,Yunqiang Su <Yunqiang.Su@imgtec.com> 写道: > >> >> 在 2016年12月22日,23:31,Richard Sandiford <rdsandiford@googlemail.com> 写道: >> >> Matthew Fortune <Matthew.Fortune@imgtec.com> writes: >>> Sandra Loosemore <sandra@codesourcery.com> writes: >>>> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >>>>> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', >>>>> and runtime option -m(no-)unfused-madd4, >>>>> to disable generate madd.fmt instructions. >>>> >>>> Your patch also needs a documentation change so that the new >>>> command-line option is listed in the GCC manual with other MIPS target >>>> options. >>> >>> Any opinions on option names to control this? Is it best to target the specific >>> feature that is non-compliant on loongson or apply a general -mfix-loongson >>> type option? >>> >>> I'm not sure I have a strong opinion either way but there do seem to be >>> multiple possible variants. >> >> Wasn't sure from this thread whether Loongson simply had a fused >> implementation (without intermediate rounding) or whether the >> instructions gave numerically incorrect results for some inputs. > > I test to define ISA_HAS_FUSED_MADD4 true and > define ISA_HAS_UNFUSED_MADD4 false, and try to build a test case. > With ISA_HAS_FUSED_MADD4, the result is about 1e-17, > and with ISA_HAS_UNFUSED_MADD4, the result is about 1e-17, > both of the are incorrect (the expect value is 0). > > The test case is > > #include <stdio.h> > > double a = 0.6; > double b = 0.4; > double c = 0.6; > double d = 0.4; > > int main(void) > { > double x = a * b - c * d; > printf("%le\n", x); > return 0; > } > > >> It sounds from a later thread like it's generating incorrect results, >> is that right? If so, then FWIW I agree an -mfix option would be more >> consistent. E.g. one of the -mfix-vr4120 errata was an incorrect >> integer division result and one of the -mfix-sb1 errata was an incorrect >> single-precision float division result. The latter case could have been >> handled by an option to disable DIV.S and DIV.PS, but the -mfix option >> gave more control. >> >> If instead the problem is that the instructions are fused then that's >> also what the original MIPS 4 parts did, so maybe an option to control >> fusedness would make sense. > > The result to thread it fused or unfused, is different, while neither of them > is correct. ohh, the result are same, and neither is correct. both of them are 1.332268e-17. > >> >> Thanks, >> Richard
Yunqiang Su <Yunqiang.Su@imgtec.com> writes: >> 在 2016年12月22日,23:48,Yunqiang Su <Yunqiang.Su@imgtec.com> 写道: >> >>> >>> 在 2016年12月22日,23:31,Richard Sandiford >>> <rdsandiford@googlemail.com> 写道: >>> >>> Matthew Fortune <Matthew.Fortune@imgtec.com> writes: >>>> Sandra Loosemore <sandra@codesourcery.com> writes: >>>>> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >>>>>> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', >>>>>> and runtime option -m(no-)unfused-madd4, >>>>>> to disable generate madd.fmt instructions. >>>>> >>>>> Your patch also needs a documentation change so that the new >>>>> command-line option is listed in the GCC manual with other MIPS target >>>>> options. >>>> >>>> Any opinions on option names to control this? Is it best to target >>>> the specific >>>> feature that is non-compliant on loongson or apply a general -mfix-loongson >>>> type option? >>>> >>>> I'm not sure I have a strong opinion either way but there do seem to be >>>> multiple possible variants. >>> >>> Wasn't sure from this thread whether Loongson simply had a fused >>> implementation (without intermediate rounding) or whether the >>> instructions gave numerically incorrect results for some inputs. >> >> I test to define ISA_HAS_FUSED_MADD4 true and >> define ISA_HAS_UNFUSED_MADD4 false, and try to build a test case. >> With ISA_HAS_FUSED_MADD4, the result is about 1e-17, >> and with ISA_HAS_UNFUSED_MADD4, the result is about 1e-17, >> both of the are incorrect (the expect value is 0). >> >> The test case is >> >> #include <stdio.h> >> >> double a = 0.6; >> double b = 0.4; >> double c = 0.6; >> double d = 0.4; >> >> int main(void) >> { >> double x = a * b - c * d; >> printf("%le\n", x); >> return 0; >> } >> >> >>> It sounds from a later thread like it's generating incorrect results, >>> is that right? If so, then FWIW I agree an -mfix option would be more >>> consistent. E.g. one of the -mfix-vr4120 errata was an incorrect >>> integer division result and one of the -mfix-sb1 errata was an incorrect >>> single-precision float division result. The latter case could have been >>> handled by an option to disable DIV.S and DIV.PS, but the -mfix option >>> gave more control. >>> >>> If instead the problem is that the instructions are fused then that's >>> also what the original MIPS 4 parts did, so maybe an option to control >>> fusedness would make sense. >> >> The result to thread it fused or unfused, is different, while neither of them >> is correct. > > ohh, the result are same, and neither is correct. > both of them are 1.332268e-17. That's the expected result for an implementation in which the subtraction is fused with the first multiplication without intermediate rounding. The second 0.4 * 0.6 isn't exactly representable and rounds down, to a value slightly less than 0.24. Then the fused operation subtracts this value from the exact result of the first 0.4 * 0.6 (0.24), giving a value slightly greater than 0. I see Paul Hua's patch does add Loongson to the list of targets with a fused implementation (should have checked earlier, sorry). So I think after that patch we would do the right thing. (In particular, -ffp-contract=off would then disable the fusing.) Thanks, Richard
> 在 2016年12月23日,00:18,Richard Sandiford <rdsandiford@googlemail.com> 写道: > > Yunqiang Su <Yunqiang.Su@imgtec.com> writes: >>> 在 2016年12月22日,23:48,Yunqiang Su <Yunqiang.Su@imgtec.com> 写道: >>> >>>> >>>> 在 2016年12月22日,23:31,Richard Sandiford >>>> <rdsandiford@googlemail.com> 写道: >>>> >>>> Matthew Fortune <Matthew.Fortune@imgtec.com> writes: >>>>> Sandra Loosemore <sandra@codesourcery.com> writes: >>>>>> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >>>>>>> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', >>>>>>> and runtime option -m(no-)unfused-madd4, >>>>>>> to disable generate madd.fmt instructions. >>>>>> >>>>>> Your patch also needs a documentation change so that the new >>>>>> command-line option is listed in the GCC manual with other MIPS target >>>>>> options. >>>>> >>>>> Any opinions on option names to control this? Is it best to target >>>>> the specific >>>>> feature that is non-compliant on loongson or apply a general -mfix-loongson >>>>> type option? >>>>> >>>>> I'm not sure I have a strong opinion either way but there do seem to be >>>>> multiple possible variants. >>>> >>>> Wasn't sure from this thread whether Loongson simply had a fused >>>> implementation (without intermediate rounding) or whether the >>>> instructions gave numerically incorrect results for some inputs. >>> >>> I test to define ISA_HAS_FUSED_MADD4 true and >>> define ISA_HAS_UNFUSED_MADD4 false, and try to build a test case. >>> With ISA_HAS_FUSED_MADD4, the result is about 1e-17, >>> and with ISA_HAS_UNFUSED_MADD4, the result is about 1e-17, >>> both of the are incorrect (the expect value is 0). >>> >>> The test case is >>> >>> #include <stdio.h> >>> >>> double a = 0.6; >>> double b = 0.4; >>> double c = 0.6; >>> double d = 0.4; >>> >>> int main(void) >>> { >>> double x = a * b - c * d; >>> printf("%le\n", x); >>> return 0; >>> } >>> >>> >>>> It sounds from a later thread like it's generating incorrect results, >>>> is that right? If so, then FWIW I agree an -mfix option would be more >>>> consistent. E.g. one of the -mfix-vr4120 errata was an incorrect >>>> integer division result and one of the -mfix-sb1 errata was an incorrect >>>> single-precision float division result. The latter case could have been >>>> handled by an option to disable DIV.S and DIV.PS, but the -mfix option >>>> gave more control. >>>> >>>> If instead the problem is that the instructions are fused then that's >>>> also what the original MIPS 4 parts did, so maybe an option to control >>>> fusedness would make sense. >>> >>> The result to thread it fused or unfused, is different, while neither of them >>> is correct. >> >> ohh, the result are same, and neither is correct. >> both of them are 1.332268e-17. > > That's the expected result for an implementation in which the subtraction > is fused with the first multiplication without intermediate rounding. > The second 0.4 * 0.6 isn't exactly representable and rounds down, to a > value slightly less than 0.24. Then the fused operation subtracts this > value from the exact result of the first 0.4 * 0.6 (0.24), giving a > value slightly greater than 0. I see. But I think 1.332268e-17 is too big than expected. 1-e17 for double is totally unacceptable. > > I see Paul Hua's patch does add Loongson to the list of targets > with a fused implementation (should have checked earlier, sorry). > So I think after that patch we would do the right thing. (In particular, > -ffp-contract=off would then disable the fusing.) will -ffp-contract=off disable some other optimization? If so, I don’t think that will be an ideal choice for distributions, like Debian. > > Thanks, > Richard
On aarch64 target the result are 1.332268e-17. On x86 with fma target the result are also 1.332268e-17. so, I don't think the Loongson's madd.fmt/msub.fmt is incorrect. We should do something for usage of fused madd, the all things has been tested an fedora21 remix for loongson(1). 1, gcc:add loongson to the list of targets with a fused implementation. 2, glibc:__fma() function,we should use madd.fmt like aarch64. see patch: http://www.loongnix.org/cgit/glibc-2.20/commit/?id=14023742e6ef571b61439d0d7bb7939e663fe624 3, kernel: the emulation when a float exception taken. (1),http://www.loongnix.org/index.php/Loongnix-20161130%E7%89%88%E6%9C%AC%E5%9C%A8Fedora21_remix%E7%B3%BB%E7%BB%9F%E4%B8%8A%E5%8F%91%E5%B8%83 Thanks, Paul On Fri, Dec 23, 2016 at 12:38 AM, Yunqiang Su <Yunqiang.Su@imgtec.com> wrote: > >> 在 2016年12月23日,00:18,Richard Sandiford <rdsandiford@googlemail.com> 写道: >> >> Yunqiang Su <Yunqiang.Su@imgtec.com> writes: >>>> 在 2016年12月22日,23:48,Yunqiang Su <Yunqiang.Su@imgtec.com> 写道: >>>> >>>>> >>>>> 在 2016年12月22日,23:31,Richard Sandiford >>>>> <rdsandiford@googlemail.com> 写道: >>>>> >>>>> Matthew Fortune <Matthew.Fortune@imgtec.com> writes: >>>>>> Sandra Loosemore <sandra@codesourcery.com> writes: >>>>>>> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >>>>>>>> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', >>>>>>>> and runtime option -m(no-)unfused-madd4, >>>>>>>> to disable generate madd.fmt instructions. >>>>>>> >>>>>>> Your patch also needs a documentation change so that the new >>>>>>> command-line option is listed in the GCC manual with other MIPS target >>>>>>> options. >>>>>> >>>>>> Any opinions on option names to control this? Is it best to target >>>>>> the specific >>>>>> feature that is non-compliant on loongson or apply a general -mfix-loongson >>>>>> type option? >>>>>> >>>>>> I'm not sure I have a strong opinion either way but there do seem to be >>>>>> multiple possible variants. >>>>> >>>>> Wasn't sure from this thread whether Loongson simply had a fused >>>>> implementation (without intermediate rounding) or whether the >>>>> instructions gave numerically incorrect results for some inputs. >>>> >>>> I test to define ISA_HAS_FUSED_MADD4 true and >>>> define ISA_HAS_UNFUSED_MADD4 false, and try to build a test case. >>>> With ISA_HAS_FUSED_MADD4, the result is about 1e-17, >>>> and with ISA_HAS_UNFUSED_MADD4, the result is about 1e-17, >>>> both of the are incorrect (the expect value is 0). >>>> >>>> The test case is >>>> >>>> #include <stdio.h> >>>> >>>> double a = 0.6; >>>> double b = 0.4; >>>> double c = 0.6; >>>> double d = 0.4; >>>> >>>> int main(void) >>>> { >>>> double x = a * b - c * d; >>>> printf("%le\n", x); >>>> return 0; >>>> } >>>> >>>> >>>>> It sounds from a later thread like it's generating incorrect results, >>>>> is that right? If so, then FWIW I agree an -mfix option would be more >>>>> consistent. E.g. one of the -mfix-vr4120 errata was an incorrect >>>>> integer division result and one of the -mfix-sb1 errata was an incorrect >>>>> single-precision float division result. The latter case could have been >>>>> handled by an option to disable DIV.S and DIV.PS, but the -mfix option >>>>> gave more control. >>>>> >>>>> If instead the problem is that the instructions are fused then that's >>>>> also what the original MIPS 4 parts did, so maybe an option to control >>>>> fusedness would make sense. >>>> >>>> The result to thread it fused or unfused, is different, while neither of them >>>> is correct. >>> >>> ohh, the result are same, and neither is correct. >>> both of them are 1.332268e-17. >> >> That's the expected result for an implementation in which the subtraction >> is fused with the first multiplication without intermediate rounding. >> The second 0.4 * 0.6 isn't exactly representable and rounds down, to a >> value slightly less than 0.24. Then the fused operation subtracts this >> value from the exact result of the first 0.4 * 0.6 (0.24), giving a >> value slightly greater than 0. > > I see. But I think 1.332268e-17 is too big than expected. > 1-e17 for double is totally unacceptable. > >> >> I see Paul Hua's patch does add Loongson to the list of targets >> with a fused implementation (should have checked earlier, sorry). >> So I think after that patch we would do the right thing. (In particular, >> -ffp-contract=off would then disable the fusing.) > > will -ffp-contract=off disable some other optimization? > If so, I don’t think that will be an ideal choice for distributions, like Debian. > >> >> Thanks, >> Richard >
> 在 2016年12月23日,10:47,Paul Hua <paul.hua.gm@gmail.com> 写道: > > On aarch64 target the result are 1.332268e-17. > On x86 with fma target the result are also 1.332268e-17. > > so, I don't think the Loongson's madd.fmt/msub.fmt is incorrect. OMG. Will this behavior make some app wrong working abnormal? > > We should do something for usage of fused madd, the all things has > been tested an fedora21 remix for loongson(1). We met lots of ftbfs on Debian with madd.fmt enabled. So maybe FMA should not be used for normal applications at all? and for generic linux distributions FMA shouldn’t be enabled by default. > > 1, gcc:add loongson to the list of targets with a fused implementation. If so, I think we should do this. > 2, glibc:__fma() function,we should use madd.fmt like aarch64. > see patch: http://www.loongnix.org/cgit/glibc-2.20/commit/?id=14023742e6ef571b61439d0d7bb7939e663fe624 Your patch shouldn’t be apply to upstream, as you don’t consider old CPUs without FMA. In fact all MIPS <= MIPSr5 don’t have FMA. > 3, kernel: the emulation when a float exception taken. The big problem is that Loongson use the same encode for (unfused) madd.fmt to (fused) madd.fmt. We cannot trap this in kernel. > > (1),http://www.loongnix.org/index.php/Loongnix-20161130%E7%89%88%E6%9C%AC%E5%9C%A8Fedora21_remix%E7%B3%BB%E7%BB%9F%E4%B8%8A%E5%8F%91%E5%B8%83 > > Thanks, > Paul > > On Fri, Dec 23, 2016 at 12:38 AM, Yunqiang Su <Yunqiang.Su@imgtec.com> wrote: >> >>> 在 2016年12月23日,00:18,Richard Sandiford <rdsandiford@googlemail.com> 写道: >>> >>> Yunqiang Su <Yunqiang.Su@imgtec.com> writes: >>>>> 在 2016年12月22日,23:48,Yunqiang Su <Yunqiang.Su@imgtec.com> 写道: >>>>> >>>>>> >>>>>> 在 2016年12月22日,23:31,Richard Sandiford >>>>>> <rdsandiford@googlemail.com> 写道: >>>>>> >>>>>> Matthew Fortune <Matthew.Fortune@imgtec.com> writes: >>>>>>> Sandra Loosemore <sandra@codesourcery.com> writes: >>>>>>>> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >>>>>>>>> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', >>>>>>>>> and runtime option -m(no-)unfused-madd4, >>>>>>>>> to disable generate madd.fmt instructions. >>>>>>>> >>>>>>>> Your patch also needs a documentation change so that the new >>>>>>>> command-line option is listed in the GCC manual with other MIPS target >>>>>>>> options. >>>>>>> >>>>>>> Any opinions on option names to control this? Is it best to target >>>>>>> the specific >>>>>>> feature that is non-compliant on loongson or apply a general -mfix-loongson >>>>>>> type option? >>>>>>> >>>>>>> I'm not sure I have a strong opinion either way but there do seem to be >>>>>>> multiple possible variants. >>>>>> >>>>>> Wasn't sure from this thread whether Loongson simply had a fused >>>>>> implementation (without intermediate rounding) or whether the >>>>>> instructions gave numerically incorrect results for some inputs. >>>>> >>>>> I test to define ISA_HAS_FUSED_MADD4 true and >>>>> define ISA_HAS_UNFUSED_MADD4 false, and try to build a test case. >>>>> With ISA_HAS_FUSED_MADD4, the result is about 1e-17, >>>>> and with ISA_HAS_UNFUSED_MADD4, the result is about 1e-17, >>>>> both of the are incorrect (the expect value is 0). >>>>> >>>>> The test case is >>>>> >>>>> #include <stdio.h> >>>>> >>>>> double a = 0.6; >>>>> double b = 0.4; >>>>> double c = 0.6; >>>>> double d = 0.4; >>>>> >>>>> int main(void) >>>>> { >>>>> double x = a * b - c * d; >>>>> printf("%le\n", x); >>>>> return 0; >>>>> } >>>>> >>>>> >>>>>> It sounds from a later thread like it's generating incorrect results, >>>>>> is that right? If so, then FWIW I agree an -mfix option would be more >>>>>> consistent. E.g. one of the -mfix-vr4120 errata was an incorrect >>>>>> integer division result and one of the -mfix-sb1 errata was an incorrect >>>>>> single-precision float division result. The latter case could have been >>>>>> handled by an option to disable DIV.S and DIV.PS, but the -mfix option >>>>>> gave more control. >>>>>> >>>>>> If instead the problem is that the instructions are fused then that's >>>>>> also what the original MIPS 4 parts did, so maybe an option to control >>>>>> fusedness would make sense. >>>>> >>>>> The result to thread it fused or unfused, is different, while neither of them >>>>> is correct. >>>> >>>> ohh, the result are same, and neither is correct. >>>> both of them are 1.332268e-17. >>> >>> That's the expected result for an implementation in which the subtraction >>> is fused with the first multiplication without intermediate rounding. >>> The second 0.4 * 0.6 isn't exactly representable and rounds down, to a >>> value slightly less than 0.24. Then the fused operation subtracts this >>> value from the exact result of the first 0.4 * 0.6 (0.24), giving a >>> value slightly greater than 0. >> >> I see. But I think 1.332268e-17 is too big than expected. >> 1-e17 for double is totally unacceptable. >> >>> >>> I see Paul Hua's patch does add Loongson to the list of targets >>> with a fused implementation (should have checked earlier, sorry). >>> So I think after that patch we would do the right thing. (In particular, >>> -ffp-contract=off would then disable the fusing.) >> >> will -ffp-contract=off disable some other optimization? >> If so, I don’t think that will be an ideal choice for distributions, like Debian. >> >>> >>> Thanks, >>> Richard >>
> 在 2016年12月22日,23:31,Richard Sandiford <rdsandiford@googlemail.com> 写道: > > Matthew Fortune <Matthew.Fortune@imgtec.com> writes: >> Sandra Loosemore <sandra@codesourcery.com> writes: >>> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >>>> By this patch, I add a build-time option ` --with-unfused-madd4=yes/no', >>>> and runtime option -m(no-)unfused-madd4, >>>> to disable generate madd.fmt instructions. >>> >>> Your patch also needs a documentation change so that the new >>> command-line option is listed in the GCC manual with other MIPS target >>> options. >> >> Any opinions on option names to control this? Is it best to target the specific >> feature that is non-compliant on loongson or apply a general -mfix-loongson >> type option? >> >> I'm not sure I have a strong opinion either way but there do seem to be >> multiple possible variants. > > Wasn't sure from this thread whether Loongson simply had a fused > implementation (without intermediate rounding) or whether the I test some cases on Loongson 3A 1000 with glibc’s test: math/test-double. -march=mips64r2 1. set ISA_HAS_FUSED_MADD4 1 set ISA_HAS_UNFUSED_MADD4 0 56 errors occurred, which is the same with amd64 with avx2 enabled. 2. set ISA_HAS_FUSED_MADD4 0 set ISA_HAS_UNFUSED_MADD4 0 no error happens 3. set ISA_HAS_FUSED_MADD4 0 set ISA_HAS_UNFUSED_MADD4 1 816 errors occurred. So it seems Loongson’s madd.fmt is are fused. > instructions gave numerically incorrect results for some inputs. > It sounds from a later thread like it's generating incorrect results, > is that right? If so, then FWIW I agree an -mfix option would be more > consistent. E.g. one of the -mfix-vr4120 errata was an incorrect > integer division result and one of the -mfix-sb1 errata was an incorrect > single-precision float division result. The latter case could have been > handled by an option to disable DIV.S and DIV.PS, but the -mfix option > gave more control. > > If instead the problem is that the instructions are fused then that's > also what the original MIPS 4 parts did, so maybe an option to control > fusedness would make sense. Since for binary Linux distributions, like Debian, we need to generate generic output, we will need an option to disable madd.fmt at all. Besides this, should we have another options, like -mmadd-is-fused -mno-madd-is-fused ? Ohh, maybe we should have options like these? build-time —with-madd=fused/unfused/no runtime -mmadd=fused/unfused/no > > Thanks, > Richard
On Fri, 23 Dec 2016, Yunqiang Su wrote: > > 3, kernel: the emulation when a float exception taken. > > The big problem is that Loongson use the same encode for (unfused) > madd.fmt to (fused) madd.fmt. We cannot trap this in kernel. Why is that a problem? Just add a setting like `cpu_has_fused_madd' to <asm/cpu-features.h>, switch processing in the emulator accordingly and initialise the setting early on in bootstrap as processor features are determined. We'd have to do it likewise for the R8000 MIPS IV CPU Richard referred to previously if we had full support for this processor (unlikely to happen). Cc-ing <linux-mips@linux-mips.org> in case you want to discuss it further with the kernel developers; arguably it's a kernel bug already that the FPU as perceived by user software has different semantics on FPU Loongson hardware depending on whether or not the `nofpu' kernel parameter has been set (I don't know if denormals are handled by Loongson FPU hardware or trap with an Unimplemented Operation exception, but if the former, then this inconsistency applies for such data in the regular FPU mode as well). HTH, Maciej
diff --git a/src/gcc/config.gcc b/src/gcc/config.gcc index 1b7da0e..3a30b44 100644 --- a/src/gcc/config.gcc +++ b/src/gcc/config.gcc @@ -3991,7 +3991,7 @@ case "${target}" in ;; mips*-*-*) - supported_defaults="abi arch arch_32 arch_64 float fpu nan fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci" + supported_defaults="abi arch arch_32 arch_64 float fpu nan fp_32 odd_spreg_32 unfused_madd4 tune tune_32 tune_64 divide llsc mips-plt synci" case ${with_float} in "" | soft | hard) @@ -4048,6 +4048,19 @@ case "${target}" in exit 1 ;; esac + + case ${with_unfused_madd4} in + "" | yes) + with_unfused_madd4="unfused-madd4" + ;; + no) + with_unfused_madd4="no-unfused-madd4" + ;; + *) + echo "Unknown unfused_madd4 type used in --with-unfused-madd4=$with_unfused_madd4" 1>&2 + exit 1 + ;; + esac case ${with_abi} in "" | 32 | o64 | n32 | 64 | eabi) @@ -4547,7 +4560,7 @@ case ${target} in esac t= -all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls" +all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 unfused_madd4 divide llsc mips-plt synci tls" for option in $all_defaults do eval "val=\$with_"`echo $option | sed s/-/_/g` diff --git a/src/gcc/config/mips/mips.c b/src/gcc/config/mips/mips.c index 5af3d1e..236ab94 100644 --- a/src/gcc/config/mips/mips.c +++ b/src/gcc/config/mips/mips.c @@ -17992,6 +17992,16 @@ mips_option_override (void) will be produced. */ target_flags |= MASK_ODD_SPREG; } + + /* If neither -munfused-madd nor -mno-unfused-madd was given on the command + line, set MASK_UNFSUED_MADD based on the ISA. */ + if ((target_flags_explicit & MASK_UNFUSED_MADD4) == 0) + { + if (!ISA_HAS_UNFUSED_MADD4) + target_flags &= ~MASK_UNFUSED_MADD4; + else + target_flags |= MASK_UNFUSED_MADD4; + } if (!ISA_HAS_COMPACT_BRANCHES && mips_cb == MIPS_CB_ALWAYS) { diff --git a/src/gcc/config/mips/mips.h b/src/gcc/config/mips/mips.h index 763ca58..8c7d24e 100644 --- a/src/gcc/config/mips/mips.h +++ b/src/gcc/config/mips/mips.h @@ -892,6 +892,7 @@ struct mips_cpu_info { ":%{!msoft-float:%{!msingle-float:%{!mfp*:-mfp%(VALUE)}}}}" }, \ {"odd_spreg_32", "%{" OPT_ARCH32 ":%{!msoft-float:%{!msingle-float:" \ "%{!modd-spreg:%{!mno-odd-spreg:-m%(VALUE)}}}}}" }, \ + {"unfused_madd4", "%{!munfused-madd4:%{!mno-unfused-madd4:-m%(VALUE)}}" }, \ {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}" }, \ {"llsc", "%{!mllsc:%{!mno-llsc:-m%(VALUE)}}" }, \ {"mips-plt", "%{!mplt:%{!mno-plt:-m%(VALUE)}}" }, \ @@ -1089,7 +1090,7 @@ struct mips_cpu_info { /* ISA has 4 operand unfused madd instructions of the form 'd = [+-] (a * b [+-] c)'. */ -#define ISA_HAS_UNFUSED_MADD4 (ISA_HAS_FP4 && !TARGET_MIPS8000) +#define ISA_HAS_UNFUSED_MADD4 (ISA_HAS_FP4 && !TARGET_MIPS8000 && TARGET_UNFUSED_MADD4) /* ISA has 3 operand r6 fused madd instructions of the form 'c = c [+-] (a * b)'. */ diff --git a/src/gcc/config/mips/mips.opt b/src/gcc/config/mips/mips.opt index ebd67e4..a8c23f6 100644 --- a/src/gcc/config/mips/mips.opt +++ b/src/gcc/config/mips/mips.opt @@ -412,6 +412,10 @@ modd-spreg Target Report Mask(ODD_SPREG) Enable use of odd-numbered single-precision registers. +munfused-madd4 +Target Report Mask(UNFUSED_MADD4) +Enable unfused multiply-add/multiply-sub instruction, aka madd.fmt/msub.fmt. + mframe-header-opt Target Report Var(flag_frame_header_optimization) Optimization Optimize frame header.