Message ID | 20180213175129.GS5867@tucnak |
---|---|
State | New |
Headers | show |
Series | FIx endless match.pd recursion on cst1 + cst2 + cst3 (PR tree-optimization/84334) | expand |
On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote: >Hi! > >On the following testcase, we recurse infinitely, because >we have float re-association enabled, but also rounding-math, so >we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1 >but (cst2 + cst3) doesn't simplify and we try again and optimize >it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on >forever. If @0 is not a CONSTANT_CLASS_P, there is not a problem, >if it is, the code just checks if we can actually simplify the >operation between cst2 and cst3 into a constant. Is there a reason to try simplifying at all for constant @0? I'd rather not try to avoid all the complex code. Richard. >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > >2018-02-13 Jakub Jelinek <jakub@redhat.com> > > PR tree-optimization/84334 > * match.pd ((A +- CST1) +- CST2 -> A + CST3): If A is > also a CONSTANT_CLASS_P, only optimize if we can fold the > operation between CST1 and CST2 into a constant. > > * gcc.dg/pr84334.c: New test. > >--- gcc/match.pd.jj 2018-02-13 09:33:31.000000000 +0100 >+++ gcc/match.pd 2018-02-13 12:14:08.108314686 +0100 >@@ -1733,9 +1733,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > CONSTANT_CLASS_P@2) > /* If one of the types wraps, use that one. */ > (if (!ANY_INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_WRAPS (type)) >- (if (outer_op == PLUS_EXPR) >- (plus (view_convert @0) (inner_op @2 (view_convert @1))) >- (minus (view_convert @0) (neg_inner_op @2 (view_convert @1)))) >+ /* If all 3 captures are CONSTANT_CLASS_P, only optimize if we >+ can simplify @2 with @1 into a constant, otherwise we might recurse >+ forever. */ >+ (if (CONSTANT_CLASS_P (@0)) >+ (with { tree cst = fold_unary (VIEW_CONVERT_EXPR, type, @1); >+ if (cst && CONSTANT_CLASS_P (cst)) >+ cst = const_binop (outer_op == PLUS_EXPR >+ ? inner_op : neg_inner_op, type, >+ @2, cst); } >+ (if (cst) >+ (outer_op (view_convert @0) { cst; }))) >+ (if (outer_op == PLUS_EXPR) >+ (plus (view_convert @0) (inner_op @2 (view_convert @1))) >+ (minus (view_convert @0) (neg_inner_op @2 (view_convert @1))))) > (if (!ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) > || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) > (if (outer_op == PLUS_EXPR) >--- gcc/testsuite/gcc.dg/pr84334.c.jj 2018-02-13 12:18:12.765463667 >+0100 >+++ gcc/testsuite/gcc.dg/pr84334.c 2018-02-13 11:36:56.019632428 +0100 >@@ -0,0 +1,12 @@ >+/* PR tree-optimization/84334 */ >+/* { dg-do compile } */ >+/* { dg-options "-Ofast -frounding-math" } */ >+ >+float >+foo (void) >+{ >+ float a = 9.999999974752427078783512115478515625e-7f; >+ float b = 1.999999994950485415756702423095703125e-6f; >+ float c = 4.999999873689375817775726318359375e-6f; >+ return a + b + c; >+} > > Jakub
On Tue, 13 Feb 2018, Richard Biener wrote: > On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote: >> Hi! >> >> On the following testcase, we recurse infinitely, because >> we have float re-association enabled, but also rounding-math, so >> we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1 >> but (cst2 + cst3) doesn't simplify and we try again and optimize >> it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on >> forever. If @0 is not a CONSTANT_CLASS_P, there is not a problem, >> if it is, the code just checks if we can actually simplify the >> operation between cst2 and cst3 into a constant. > > Is there a reason to try simplifying at all for constant @0? Yes. cst2+cst3 might simplify (the operation happens to be exact and not require rounding), which leaves us with only one addition instead of 2. On the other hand, mixing -frounding-math with reassociation seems strange to me, and likely not worth optimizing for.
On Tue, Feb 13, 2018 at 03:28:25PM -0400, Marc Glisse wrote: > On Tue, 13 Feb 2018, Richard Biener wrote: > > > On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote: > > > Hi! > > > > > > On the following testcase, we recurse infinitely, because > > > we have float re-association enabled, but also rounding-math, so > > > we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1 > > > but (cst2 + cst3) doesn't simplify and we try again and optimize > > > it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on > > > forever. If @0 is not a CONSTANT_CLASS_P, there is not a problem, > > > if it is, the code just checks if we can actually simplify the > > > operation between cst2 and cst3 into a constant. > > > > Is there a reason to try simplifying at all for constant @0? > > Yes. cst2+cst3 might simplify (the operation happens to be exact and not > require rounding), which leaves us with only one addition instead of 2. Yeah, exactly, e.g. /* { dg-do compile } */ /* { dg-options "-Ofast -frounding-math" } */ float foo (void) { float a = 9.999999974752427078783512115478515625e-7f; float b = 1024.0f; float c = 2048.0f; return a + b + c; } would no longer be optimized into a single addition rather than 2. Jakub
On Tue, 13 Feb 2018, Marc Glisse wrote: > On Tue, 13 Feb 2018, Richard Biener wrote: > > > On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> > > wrote: > > > Hi! > > > > > > On the following testcase, we recurse infinitely, because > > > we have float re-association enabled, but also rounding-math, so > > > we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1 > > > but (cst2 + cst3) doesn't simplify and we try again and optimize > > > it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on > > > forever. If @0 is not a CONSTANT_CLASS_P, there is not a problem, > > > if it is, the code just checks if we can actually simplify the > > > operation between cst2 and cst3 into a constant. > > > > Is there a reason to try simplifying at all for constant @0? > > Yes. cst2+cst3 might simplify (the operation happens to be exact and not > require rounding), which leaves us with only one addition instead of 2. > > On the other hand, mixing -frounding-math with reassociation seems strange to > me, and likely not worth optimizing for. ./cc1 -quiet t.c -O -frounding-math -fassociative-math cc1: warning: -fassociative-math disabled; other options take precedence So _maybe_ we should disable these patterns for !flag_associative_math when dealing with FP? Richard.
On Wed, Feb 14, 2018 at 12:09:57PM +0100, Richard Biener wrote: > On Tue, 13 Feb 2018, Marc Glisse wrote: > > > On Tue, 13 Feb 2018, Richard Biener wrote: > > > > > On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> > > > wrote: > > > > Hi! > > > > > > > > On the following testcase, we recurse infinitely, because > > > > we have float re-association enabled, but also rounding-math, so > > > > we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1 > > > > but (cst2 + cst3) doesn't simplify and we try again and optimize > > > > it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on > > > > forever. If @0 is not a CONSTANT_CLASS_P, there is not a problem, > > > > if it is, the code just checks if we can actually simplify the > > > > operation between cst2 and cst3 into a constant. > > > > > > Is there a reason to try simplifying at all for constant @0? > > > > Yes. cst2+cst3 might simplify (the operation happens to be exact and not > > require rounding), which leaves us with only one addition instead of 2. > > > > On the other hand, mixing -frounding-math with reassociation seems strange to > > me, and likely not worth optimizing for. > > ./cc1 -quiet t.c -O -frounding-math -fassociative-math > cc1: warning: -fassociative-math disabled; other options take precedence You need ./cc1 -quiet t.c -O -fassociative-math -fno-trapping-math -fno-signed-zeros -frounding-math > So _maybe_ we should disable these patterns for !flag_associative_math > when dealing with FP? We do, this is in block with: /* We can't reassociate floating-point unless -fassociative-math or fixed-point plus or minus because of saturation to +-Inf. */ (if ((!FLOAT_TYPE_P (type) || flag_associative_math) && !FIXED_POINT_TYPE_P (type)) But that doesn't mean you can't request associative math and rounding math at the same time. Jakub
On Wed, 14 Feb 2018, Richard Biener wrote: > On Tue, 13 Feb 2018, Marc Glisse wrote: > >> On Tue, 13 Feb 2018, Richard Biener wrote: >> >>> On February 13, 2018 6:51:29 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> >>> wrote: >>>> Hi! >>>> >>>> On the following testcase, we recurse infinitely, because >>>> we have float re-association enabled, but also rounding-math, so >>>> we try to optimize (cst1 + cst2) + cst3 as (cst2 + cst3) + cst1 >>>> but (cst2 + cst3) doesn't simplify and we try again and optimize >>>> it as (cst3 + cst1) + cst2 and then (cst1 + cst2) + cst3 and so on >>>> forever. If @0 is not a CONSTANT_CLASS_P, there is not a problem, >>>> if it is, the code just checks if we can actually simplify the >>>> operation between cst2 and cst3 into a constant. >>> >>> Is there a reason to try simplifying at all for constant @0? >> >> Yes. cst2+cst3 might simplify (the operation happens to be exact and not >> require rounding), which leaves us with only one addition instead of 2. >> >> On the other hand, mixing -frounding-math with reassociation seems strange to >> me, and likely not worth optimizing for. > > ./cc1 -quiet t.c -O -frounding-math -fassociative-math > cc1: warning: -fassociative-math disabled; other options take precedence > > So _maybe_ we should disable these patterns for !flag_associative_math > when dealing with FP? There is (if ((!FLOAT_TYPE_P (type) || flag_associative_math) && !FIXED_POINT_TYPE_P (type)) above, which I think covers this transformation.
--- gcc/match.pd.jj 2018-02-13 09:33:31.000000000 +0100 +++ gcc/match.pd 2018-02-13 12:14:08.108314686 +0100 @@ -1733,9 +1733,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) CONSTANT_CLASS_P@2) /* If one of the types wraps, use that one. */ (if (!ANY_INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_WRAPS (type)) - (if (outer_op == PLUS_EXPR) - (plus (view_convert @0) (inner_op @2 (view_convert @1))) - (minus (view_convert @0) (neg_inner_op @2 (view_convert @1)))) + /* If all 3 captures are CONSTANT_CLASS_P, only optimize if we + can simplify @2 with @1 into a constant, otherwise we might recurse + forever. */ + (if (CONSTANT_CLASS_P (@0)) + (with { tree cst = fold_unary (VIEW_CONVERT_EXPR, type, @1); + if (cst && CONSTANT_CLASS_P (cst)) + cst = const_binop (outer_op == PLUS_EXPR + ? inner_op : neg_inner_op, type, + @2, cst); } + (if (cst) + (outer_op (view_convert @0) { cst; }))) + (if (outer_op == PLUS_EXPR) + (plus (view_convert @0) (inner_op @2 (view_convert @1))) + (minus (view_convert @0) (neg_inner_op @2 (view_convert @1))))) (if (!ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) (if (outer_op == PLUS_EXPR) --- gcc/testsuite/gcc.dg/pr84334.c.jj 2018-02-13 12:18:12.765463667 +0100 +++ gcc/testsuite/gcc.dg/pr84334.c 2018-02-13 11:36:56.019632428 +0100 @@ -0,0 +1,12 @@ +/* PR tree-optimization/84334 */ +/* { dg-do compile } */ +/* { dg-options "-Ofast -frounding-math" } */ + +float +foo (void) +{ + float a = 9.999999974752427078783512115478515625e-7f; + float b = 1.999999994950485415756702423095703125e-6f; + float c = 4.999999873689375817775726318359375e-6f; + return a + b + c; +}