Message ID: 1630633930-10334-1-git-send-email-apinski@marvell.com
State: New
Series: [aarch64] Fix target/95969: __builtin_aarch64_im_lane_boundsi interferes with gimple
apinski--- via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> From: Andrew Pinski <apinski@marvell.com>
>
> This patch adds simple folding of __builtin_aarch64_im_lane_boundsi where
> we are not going to error out.  It fixes the problem by the removal
> of the function from the IR.
>
> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64-builtins.c (aarch64_fold_builtin_lane_check):
> 	New function.
> 	(aarch64_general_fold_builtin): Handle AARCH64_SIMD_BUILTIN_LANE_CHECK.
> 	(aarch64_general_gimple_fold_builtin): Likewise.
> ---
>  gcc/config/aarch64/aarch64-builtins.c | 35 +++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
> index f6b41d9c200..d4414373aa4 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -29,6 +29,7 @@
>  #include "rtl.h"
>  #include "tree.h"
>  #include "gimple.h"
> +#include "ssa.h"
>  #include "memmodel.h"
>  #include "tm_p.h"
>  #include "expmed.h"
> @@ -2333,6 +2334,27 @@ aarch64_general_builtin_rsqrt (unsigned int fn)
>    return NULL_TREE;
>  }
>
> +/* Return true if the lane check can be removed as there is no
> +   error going to be emitted.  */
> +static bool
> +aarch64_fold_builtin_lane_check (tree arg0, tree arg1, tree arg2)
> +{
> +  if (TREE_CODE (arg0) != INTEGER_CST)
> +    return false;
> +  if (TREE_CODE (arg1) != INTEGER_CST)
> +    return false;
> +  if (TREE_CODE (arg2) != INTEGER_CST)
> +    return false;
> +
> +  auto totalsize = wi::to_widest (arg0);
> +  auto elementsize = wi::to_widest (arg1);
> +  if (totalsize == 0 || elementsize == 0)
> +    return false;
> +  auto lane = wi::to_widest (arg2);
> +  auto high = wi::udiv_trunc (totalsize, elementsize);
> +  return wi::ltu_p (lane, high);
> +}
> +
>  #undef VAR1
>  #define VAR1(T, N, MAP, FLAG, A) \
>    case AARCH64_SIMD_BUILTIN_##T##_##N##A:
> @@ -2353,6 +2375,11 @@ aarch64_general_fold_builtin (unsigned int fcode, tree type,
>        VAR1 (UNOP, floatv4si, 2, ALL, v4sf)
>        VAR1 (UNOP, floatv2di, 2, ALL, v2df)
>  	return fold_build1 (FLOAT_EXPR, type, args[0]);
> +      case AARCH64_SIMD_BUILTIN_LANE_CHECK:
> +	if (n_args == 3

Do we need this check?  If it's safe to rely on frontend testing
for aarch64_general_gimple_fold_builtin then hopefully it should
be here too.  I think an assert would be better.

(FTR, we have extra checking for SVE because the overloaded functions
don't have definite prototypes.)

> +	    && aarch64_fold_builtin_lane_check (args[0], args[1], args[2]))
> +	  return fold_convert (void_type_node, integer_zero_node);

Could this just be void_node instead?  VOID_CST is the tree constant
code for void_type_node.

It would be good to add the testcase, as a -fdump-tree-optimized test
that checks for a single instance of { = \*ptr}.

Thanks,
Richard
On Fri, Sep 3, 2021 at 2:42 AM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> apinski--- via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > From: Andrew Pinski <apinski@marvell.com>
> >
> > This patch adds simple folding of __builtin_aarch64_im_lane_boundsi where
> > we are not going to error out.  It fixes the problem by the removal
> > of the function from the IR.
> >
> > OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
> >
> > [...]
> >
> > +      case AARCH64_SIMD_BUILTIN_LANE_CHECK:
> > +	if (n_args == 3
>
> Do we need this check?  If it's safe to rely on frontend testing
> for aarch64_general_gimple_fold_builtin then hopefully it should
> be here too.  I think an assert would be better.

I added it just in case, I do find it interesting that it was passed
but never checked.

>
> (FTR, we have extra checking for SVE because the overloaded functions
> don't have definite prototypes.)
>
> > +	    && aarch64_fold_builtin_lane_check (args[0], args[1], args[2]))
> > +	  return fold_convert (void_type_node, integer_zero_node);
>
> Could this just be void_node instead?  VOID_CST is the tree constant
> code for void_type_node.

Oh I had missed there was such a thing, definitely better using that
instead of the above.

>
> It would be good to add the testcase, as a -fdump-tree-optimized test
> that checks for a single instance of { = \*ptr}.

Yes I was thinking about adding one but I will definitely add one now.

Thanks,
Andrew Pinski

>
> Thanks,
> Richard
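For reference, a -fdump-tree-optimized testcase along the lines Richard suggests might look roughly like this. This is a sketch only: the intrinsic, options, and scan pattern are guesses for illustration, not the testcase that was eventually committed.

```c
/* Hypothetical sketch of a gcc.target/aarch64 testcase -- not the one
   actually added to the testsuite.  */
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

#include <arm_neon.h>

int32x4_t
foo (int32x4_t v, int *ptr)
{
  /* vsetq_lane_s32 goes through the lane-check builtin; once the check
     is folded away, the load from ptr should appear exactly once in the
     optimized dump.  */
  return vsetq_lane_s32 (*ptr, v, 0);
}

/* { dg-final { scan-tree-dump-times " = \\*ptr" 1 "optimized" } } */
```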
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index f6b41d9c200..d4414373aa4 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -29,6 +29,7 @@
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
+#include "ssa.h"
 #include "memmodel.h"
 #include "tm_p.h"
 #include "expmed.h"
@@ -2333,6 +2334,27 @@ aarch64_general_builtin_rsqrt (unsigned int fn)
   return NULL_TREE;
 }
 
+/* Return true if the lane check can be removed as there is no
+   error going to be emitted.  */
+static bool
+aarch64_fold_builtin_lane_check (tree arg0, tree arg1, tree arg2)
+{
+  if (TREE_CODE (arg0) != INTEGER_CST)
+    return false;
+  if (TREE_CODE (arg1) != INTEGER_CST)
+    return false;
+  if (TREE_CODE (arg2) != INTEGER_CST)
+    return false;
+
+  auto totalsize = wi::to_widest (arg0);
+  auto elementsize = wi::to_widest (arg1);
+  if (totalsize == 0 || elementsize == 0)
+    return false;
+  auto lane = wi::to_widest (arg2);
+  auto high = wi::udiv_trunc (totalsize, elementsize);
+  return wi::ltu_p (lane, high);
+}
+
 #undef VAR1
 #define VAR1(T, N, MAP, FLAG, A) \
   case AARCH64_SIMD_BUILTIN_##T##_##N##A:
@@ -2353,6 +2375,11 @@ aarch64_general_fold_builtin (unsigned int fcode, tree type,
       VAR1 (UNOP, floatv4si, 2, ALL, v4sf)
       VAR1 (UNOP, floatv2di, 2, ALL, v2df)
 	return fold_build1 (FLOAT_EXPR, type, args[0]);
+      case AARCH64_SIMD_BUILTIN_LANE_CHECK:
+	if (n_args == 3
+	    && aarch64_fold_builtin_lane_check (args[0], args[1], args[2]))
+	  return fold_convert (void_type_node, integer_zero_node);
+	break;
     default:
       break;
     }
@@ -2440,6 +2467,14 @@ aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt)
 	  }
 	break;
       }
+    case AARCH64_SIMD_BUILTIN_LANE_CHECK:
+      if (aarch64_fold_builtin_lane_check (args[0], args[1], args[2]))
+	{
+	  unlink_stmt_vdef (stmt);
+	  release_defs (stmt);
+	  new_stmt = gimple_build_nop ();
+	}
+      break;
     default:
       break;
     }
From: Andrew Pinski <apinski@marvell.com>

This patch adds simple folding of __builtin_aarch64_im_lane_boundsi where
we are not going to error out.  It fixes the problem by the removal
of the function from the IR.

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.c (aarch64_fold_builtin_lane_check):
	New function.
	(aarch64_general_fold_builtin): Handle AARCH64_SIMD_BUILTIN_LANE_CHECK.
	(aarch64_general_gimple_fold_builtin): Likewise.
---
 gcc/config/aarch64/aarch64-builtins.c | 35 +++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)