Message ID | 5242CAEE.5010700@arm.com |
---|---|
State | New |
Headers | show |
Ping~ Thanks, Yufeng On 09/25/13 12:37, Yufeng Zhang wrote: > Hello, > > Please find the updated version of the patch in the attachment. It has > addressed the previous comments and also included some changes in order > to pass the bootstrapping on x86_64. > > It's also passed the regtest on arm-none-eabi and aarch64-none-elf. > > It will also fix the test failure as reported here: > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html > > OK for the trunk? > > Thanks, > Yufeng > > > gcc/ > > * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New > function. > (backtrace_base_for_ref): Call get_unwidened, check 'base_in' > again and set unwidend_p with true; call safe_to_multiply_p to > avoid > unsafe unwidened cases. > > gcc/testsuite/ > > * gcc.dg/tree-ssa/slsr-40.c: New test. > > > > On 09/11/13 13:39, Bill Schmidt wrote: >> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote: >>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> wrote: >>>> Hi, >>>> >>>> Following Bin's patch in >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch tweaks >>>> backtrace_base_for_ref () to strip of any widening conversion after the >>>> first TREE_CODE check fails. Without this patch, the test >>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as >>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the tree >>>> code can be nop_expr instead. >>>> >>>> Regtested on arm and aarch64; still bootstrapping x86_64. >>>> >>>> OK for the trunk if the x86_64 bootstrap succeeds? >>> >>> Please add a testcase. >> >> Also, the comment "Strip of" should read "Strip off". Otherwise I have >> no comments. >> >> Thanks, >> Bill >> >>> >>> Richard.
On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote: > Hello, > > Please find the updated version of the patch in the attachment. It has > addressed the previous comments and also included some changes in order to > pass the bootstrapping on x86_64. > > It's also passed the regtest on arm-none-eabi and aarch64-none-elf. > > It will also fix the test failure as reported here: > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html > > OK for the trunk? + where n is a 32-bit unsigned int and pointer are 64-bit long. In this + case, the gimple for (n - 1) is: + + _2 = n_1(D) + 4294967295; // 0xFFFFFFFF + + and it is wrong to multiply the large constant by 4 in the 64-bit space. */ + +static bool +safe_to_multiply_p (tree type, double_int cst) +{ + if (TYPE_UNSIGNED (type) + && ! double_int_fits_to_tree_p (signed_type_for (type), cst)) + return false; + + return true; +} This looks wrong. The only relevant check is as whether the multiplication overflows the original type as you miss the implicit truncation that happens. Which is something you don't know unless you know the value. It definitely isn't a property of a type and a constant but the property of two constants and a type. Or the predicate has a wrong name. The use of get_unwidened in this core routine looks like this is all happening in the wrong place and we should have picked up another candidate for this instead? I'm sure Bill will know more here. Richard. > Thanks, > Yufeng > > > gcc/ > > * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New > function. > (backtrace_base_for_ref): Call get_unwidened, check 'base_in' > again and set unwidend_p with true; call safe_to_multiply_p to avoid > unsafe unwidened cases. > > gcc/testsuite/ > > * gcc.dg/tree-ssa/slsr-40.c: New test. > > > > > On 09/11/13 13:39, Bill Schmidt wrote: >> >> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote: >>> >>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> >>> wrote: >>>> >>>> Hi, >>>> >>>> Following Bin's patch in >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch >>>> tweaks >>>> backtrace_base_for_ref () to strip of any widening conversion after the >>>> first TREE_CODE check fails. Without this patch, the test >>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as >>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the >>>> tree >>>> code can be nop_expr instead. >>>> >>>> Regtested on arm and aarch64; still bootstrapping x86_64. >>>> >>>> OK for the trunk if the x86_64 bootstrap succeeds? >>> >>> >>> Please add a testcase. >> >> >> Also, the comment "Strip of" should read "Strip off". Otherwise I have >> no comments. >> >> Thanks, >> Bill >> >>> >>> Richard.
On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote: > On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote: > > Hello, > > > > Please find the updated version of the patch in the attachment. It has > > addressed the previous comments and also included some changes in order to > > pass the bootstrapping on x86_64. > > > > It's also passed the regtest on arm-none-eabi and aarch64-none-elf. > > > > It will also fix the test failure as reported here: > > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html > > > > OK for the trunk? > > + where n is a 32-bit unsigned int and pointer are 64-bit long. In this > + case, the gimple for (n - 1) is: > + > + _2 = n_1(D) + 4294967295; // 0xFFFFFFFF > + > + and it is wrong to multiply the large constant by 4 in the 64-bit space. */ > + > +static bool > +safe_to_multiply_p (tree type, double_int cst) > +{ > + if (TYPE_UNSIGNED (type) > + && ! double_int_fits_to_tree_p (signed_type_for (type), cst)) > + return false; > + > + return true; > +} > > This looks wrong. The only relevant check is as whether the > multiplication overflows the original type as you miss the implicit > truncation that happens. Which is something you don't know > unless you know the value. It definitely isn't a property of a type > and a constant but the property of two constants and a type. > Or the predicate has a wrong name. > > The use of get_unwidened in this core routine looks like this is > all happening in the wrong place and we should have picked up > another candidate for this instead? I'm sure Bill will know more here. I'm not happy with how this patch is progressing. Without having looked too deeply, this might be better handled earlier when determining which casts are safe to use in building candidates. What you have here seems more like closing the barn door after the horse got out. Maybe that's the only solution, but it doesn't seem likely. Another problem is that your test case isn't testing anything except that the compiler doesn't crash. That isn't sufficient as a regression test. I'll spend some time looking at this to see if I can find a better approach. It might be a day or two before I can get to it. In addition to the included test case, are there any other cases you've found that I should be concerned with? Thanks, Bill > > Richard. > > > > > Thanks, > > Yufeng > > > > > > gcc/ > > > > * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New > > function. > > (backtrace_base_for_ref): Call get_unwidened, check 'base_in' > > again and set unwidend_p with true; call safe_to_multiply_p to avoid > > unsafe unwidened cases. > > > > gcc/testsuite/ > > > > * gcc.dg/tree-ssa/slsr-40.c: New test. > > > > > > > > > > On 09/11/13 13:39, Bill Schmidt wrote: > >> > >> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote: > >>> > >>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> > >>> wrote: > >>>> > >>>> Hi, > >>>> > >>>> Following Bin's patch in > >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch > >>>> tweaks > >>>> backtrace_base_for_ref () to strip of any widening conversion after the > >>>> first TREE_CODE check fails. Without this patch, the test > >>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as > >>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the > >>>> tree > >>>> code can be nop_expr instead. > >>>> > >>>> Regtested on arm and aarch64; still bootstrapping x86_64. > >>>> > >>>> OK for the trunk if the x86_64 bootstrap succeeds? > >>> > >>> > >>> Please add a testcase. > >> > >> > >> Also, the comment "Strip of" should read "Strip off". Otherwise I have > >> no comments. > >> > >> Thanks, > >> Bill > >> > >>> > >>> Richard. >
On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote: > On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote: > > On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote: > > > Hello, > > > > > > Please find the updated version of the patch in the attachment. It has > > > addressed the previous comments and also included some changes in order to > > > pass the bootstrapping on x86_64. > > > > > > It's also passed the regtest on arm-none-eabi and aarch64-none-elf. > > > > > > It will also fix the test failure as reported here: > > > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html > > > > > > OK for the trunk? > > > > + where n is a 32-bit unsigned int and pointer are 64-bit long. In this > > + case, the gimple for (n - 1) is: > > + > > + _2 = n_1(D) + 4294967295; // 0xFFFFFFFF > > + > > + and it is wrong to multiply the large constant by 4 in the 64-bit space. */ > > + > > +static bool > > +safe_to_multiply_p (tree type, double_int cst) > > +{ > > + if (TYPE_UNSIGNED (type) > > + && ! double_int_fits_to_tree_p (signed_type_for (type), cst)) > > + return false; > > + > > + return true; > > +} > > > > This looks wrong. The only relevant check is as whether the > > multiplication overflows the original type as you miss the implicit > > truncation that happens. Which is something you don't know > > unless you know the value. It definitely isn't a property of a type > > and a constant but the property of two constants and a type. > > Or the predicate has a wrong name. > > > > The use of get_unwidened in this core routine looks like this is > > all happening in the wrong place and we should have picked up > > another candidate for this instead? I'm sure Bill will know more here. > > I'm not happy with how this patch is progressing. Without having looked > too deeply, this might be better handled earlier when determining which > casts are safe to use in building candidates. What you have here seems > more like closing the barn door after the horse got out. Maybe that's > the only solution, but it doesn't seem likely. > > Another problem is that your test case isn't testing anything except > that the compiler doesn't crash. That isn't sufficient as a regression > test. Sorry, that was a pre-coffee comment. I would like also to see a test that verifies the expected gimple, though, not just that the test runs. > > I'll spend some time looking at this to see if I can find a better > approach. It might be a day or two before I can get to it. In addition > to the included test case, are there any other cases you've found that I > should be concerned with? > > Thanks, > Bill > > > > > Richard. > > > > > > > > > Thanks, > > > Yufeng > > > > > > > > > gcc/ > > > > > > * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New > > > function. > > > (backtrace_base_for_ref): Call get_unwidened, check 'base_in' > > > again and set unwidend_p with true; call safe_to_multiply_p to avoid > > > unsafe unwidened cases. > > > > > > gcc/testsuite/ > > > > > > * gcc.dg/tree-ssa/slsr-40.c: New test. > > > > > > > > > > > > > > > On 09/11/13 13:39, Bill Schmidt wrote: > > >> > > >> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote: > > >>> > > >>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> > > >>> wrote: > > >>>> > > >>>> Hi, > > >>>> > > >>>> Following Bin's patch in > > >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch > > >>>> tweaks > > >>>> backtrace_base_for_ref () to strip of any widening conversion after the > > >>>> first TREE_CODE check fails. Without this patch, the test > > >>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as > > >>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the > > >>>> tree > > >>>> code can be nop_expr instead. > > >>>> > > >>>> Regtested on arm and aarch64; still bootstrapping x86_64. > > >>>> > > >>>> OK for the trunk if the x86_64 bootstrap succeeds? > > >>> > > >>> > > >>> Please add a testcase. > > >> > > >> > > >> Also, the comment "Strip of" should read "Strip off". Otherwise I have > > >> no comments. > > >> > > >> Thanks, > > >> Bill > > >> > > >>> > > >>> Richard. > >
On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote: > On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote: > > On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote: > > > Hello, > > > > > > Please find the updated version of the patch in the attachment. It has > > > addressed the previous comments and also included some changes in order to > > > pass the bootstrapping on x86_64. > > > > > > It's also passed the regtest on arm-none-eabi and aarch64-none-elf. > > > > > > It will also fix the test failure as reported here: > > > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html > > > > > > OK for the trunk? > > > > + where n is a 32-bit unsigned int and pointer are 64-bit long. In this > > + case, the gimple for (n - 1) is: > > + > > + _2 = n_1(D) + 4294967295; // 0xFFFFFFFF > > + > > + and it is wrong to multiply the large constant by 4 in the 64-bit space. */ > > + > > +static bool > > +safe_to_multiply_p (tree type, double_int cst) > > +{ > > + if (TYPE_UNSIGNED (type) > > + && ! double_int_fits_to_tree_p (signed_type_for (type), cst)) > > + return false; > > + > > + return true; > > +} > > > > This looks wrong. The only relevant check is as whether the > > multiplication overflows the original type as you miss the implicit > > truncation that happens. Which is something you don't know > > unless you know the value. It definitely isn't a property of a type > > and a constant but the property of two constants and a type. > > Or the predicate has a wrong name. > > > > The use of get_unwidened in this core routine looks like this is > > all happening in the wrong place and we should have picked up > > another candidate for this instead? I'm sure Bill will know more here. > > I'm not happy with how this patch is progressing. Without having looked > too deeply, this might be better handled earlier when determining which > casts are safe to use in building candidates. What you have here seems > more like closing the barn door after the horse got out. Maybe that's > the only solution, but it doesn't seem likely. > > Another problem is that your test case isn't testing anything except > that the compiler doesn't crash. That isn't sufficient as a regression > test. > > I'll spend some time looking at this to see if I can find a better > approach. It might be a day or two before I can get to it. In addition > to the included test case, are there any other cases you've found that I > should be concerned with? To help me investigate this without having to build a cross compiler, could you please compile your test case (without the patch applied) using -fdump-tree-reassoc2 -fdump-tree-slsr-details and send me the generated dump files? Thanks, Bill > > Thanks, > Bill > > > > > Richard. > > > > > > > > > Thanks, > > > Yufeng > > > > > > > > > gcc/ > > > > > > * gimple-ssa-strength-reduction.c (safe_to_multiply_p): New > > > function. > > > (backtrace_base_for_ref): Call get_unwidened, check 'base_in' > > > again and set unwidend_p with true; call safe_to_multiply_p to avoid > > > unsafe unwidened cases. > > > > > > gcc/testsuite/ > > > > > > * gcc.dg/tree-ssa/slsr-40.c: New test. > > > > > > > > > > > > > > > On 09/11/13 13:39, Bill Schmidt wrote: > > >> > > >> On Wed, 2013-09-11 at 10:32 +0200, Richard Biener wrote: > > >>> > > >>> On Tue, Sep 10, 2013 at 5:53 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> > > >>> wrote: > > >>>> > > >>>> Hi, > > >>>> > > >>>> Following Bin's patch in > > >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00695.html, this patch > > >>>> tweaks > > >>>> backtrace_base_for_ref () to strip of any widening conversion after the > > >>>> first TREE_CODE check fails. Without this patch, the test > > >>>> (gcc.dg/tree-ssa/slsr-39.c) in Bin's patch will fail on AArch64, as > > >>>> backtrace_base_for_ref () will stop if not seeing an ssa_name since the > > >>>> tree > > >>>> code can be nop_expr instead. > > >>>> > > >>>> Regtested on arm and aarch64; still bootstrapping x86_64. > > >>>> > > >>>> OK for the trunk if the x86_64 bootstrap succeeds? > > >>> > > >>> > > >>> Please add a testcase. > > >> > > >> > > >> Also, the comment "Strip of" should read "Strip off". Otherwise I have > > >> no comments. > > >> > > >> Thanks, > > >> Bill > > >> > > >>> > > >>> Richard. > >
Hi Bill, Thank you for the review and the offer to help. On 10/01/13 15:36, Bill Schmidt wrote: > On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote: >> On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote: >>> On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> wrote: >>>> Hello, >>>> >>>> Please find the updated version of the patch in the attachment. It has >>>> addressed the previous comments and also included some changes in order to >>>> pass the bootstrapping on x86_64. >>>> >>>> It's also passed the regtest on arm-none-eabi and aarch64-none-elf. >>>> >>>> It will also fix the test failure as reported here: >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html >>>> >>>> OK for the trunk? >>> >>> + where n is a 32-bit unsigned int and pointer are 64-bit long. In this >>> + case, the gimple for (n - 1) is: >>> + >>> + _2 = n_1(D) + 4294967295; // 0xFFFFFFFF >>> + >>> + and it is wrong to multiply the large constant by 4 in the 64-bit space. */ >>> + >>> +static bool >>> +safe_to_multiply_p (tree type, double_int cst) >>> +{ >>> + if (TYPE_UNSIGNED (type) >>> +&& ! double_int_fits_to_tree_p (signed_type_for (type), cst)) >>> + return false; >>> + >>> + return true; >>> +} >>> >>> This looks wrong. The only relevant check is as whether the >>> multiplication overflows the original type as you miss the implicit >>> truncation that happens. Which is something you don't know >>> unless you know the value. It definitely isn't a property of a type >>> and a constant but the property of two constants and a type. >>> Or the predicate has a wrong name. >>> >>> The use of get_unwidened in this core routine looks like this is >>> all happening in the wrong place and we should have picked up >>> another candidate for this instead? I'm sure Bill will know more here. >> >> I'm not happy with how this patch is progressing. Without having looked >> too deeply, this might be better handled earlier when determining which >> casts are safe to use in building candidates. What you have here seems >> more like closing the barn door after the horse got out. Maybe that's >> the only solution, but it doesn't seem likely. >> >> Another problem is that your test case isn't testing anything except >> that the compiler doesn't crash. That isn't sufficient as a regression >> test. >> >> I'll spend some time looking at this to see if I can find a better >> approach. It might be a day or two before I can get to it. In addition >> to the included test case, are there any other cases you've found that I >> should be concerned with? > > To help me investigate this without having to build a cross compiler, > could you please compile your test case (without the patch applied) > using -fdump-tree-reassoc2 -fdump-tree-slsr-details and send me the > generated dump files? The issue is not specific to AArch64; please find the attached dumps generated from the x86-64 gcc by compiling gcc.dg/tree-ssa/slsr-39.c. W.r.t your comment in the other email about adding a test to verify the expected gimple, I think the existing test gcc.dg/tree-ssa/slsr-39.c is sufficient. The test currently fails on both AArch64 and x86-64, and presumably also fails on any other 64-bit target where pointer is 64-bit and int is 32-bit size. The patch I proposed is to fix this issue and gcc.dg/tree-ssa/slsr-39.c itself shall be a good regression test (with specific verification on gimple ir). The new test proposed in this patch is to regtest the issue my original patch has, which is a runtime failure due to incorrect optimization. I'll address other comments in separate emails. Thanks, Yufeng ;; Function foo (foo, funcdef_no=0, decl_uid=1722, symbol_order=0) ;; 1 loops found ;; ;; Loop 0 ;; header 0, latch 1 ;; depth 0, outer -1 ;; nodes: 0 1 2 ;; 2 succs { 1 } foo (int[50] * a2, int v1) { int j; long unsigned int _3; long unsigned int _4; int[50] * _6; int _11; int _12; int _13; <bb 2>: j_2 = v1_1(D) + 5; _3 = (long unsigned int) j_2; _4 = _3 * 200; _6 = a2_5(D) + _4; j_7 = v1_1(D) + 6; *_6[j_2] = j_2; *_6[j_7] = j_2; _11 = v1_1(D) + 4; _12 = *_6[_11]; _13 = _12 + 1; *_6[_11] = _13; return; } ;; Function foo (foo, funcdef_no=0, decl_uid=1722, symbol_order=0) ;; 1 loops found ;; ;; Loop 0 ;; header 0, latch 1 ;; depth 0, outer -1 ;; nodes: 0 1 2 ;; 2 succs { 1 } Strength reduction candidate vector: 1 [2] j_2 = v1_1(D) + 5; ADD : v1_1(D) + (5 * 1) : int basis: 0 dependent: 5 sibling: 0 next-interp: 0 dead-savings: 0 2 [2] _3 = (long unsigned int) j_2; ADD : v1_1(D) + (5 * 1) : long unsigned int basis: 0 dependent: 0 sibling: 0 next-interp: 0 dead-savings: 0 3 [2] _4 = _3 * 200; MULT : (v1_1(D) + 5) * 200 : long unsigned int basis: 0 dependent: 0 sibling: 0 next-interp: 0 dead-savings: 1 4 [2] _6 = a2_5(D) + _4; ADD : a2_5(D) + (1 * _4) : int[50] * basis: 0 dependent: 0 sibling: 0 next-interp: 0 dead-savings: 0 5 [2] j_7 = v1_1(D) + 6; ADD : v1_1(D) + (6 * 1) : int basis: 1 dependent: 8 sibling: 0 next-interp: 0 dead-savings: 0 6 [2] *_6[j_2] = j_2; REF : _6 + ((sizetype) j_2 * 4) + 0 : int[50] * basis: 0 dependent: 0 sibling: 0 next-interp: 0 dead-savings: 0 7 [2] *_6[j_7] = j_2; REF : _6 + ((sizetype) j_7 * 4) + 0 : int[50] * basis: 0 dependent: 0 sibling: 0 next-interp: 0 dead-savings: 0 8 [2] _11 = v1_1(D) + 4; ADD : v1_1(D) + (4 * 1) : int basis: 5 dependent: 0 sibling: 0 next-interp: 0 dead-savings: 0 9 [2] _12 = *_6[_11]; REF : _6 + ((sizetype) _11 * 4) + 0 : int[50] * basis: 0 dependent: 11 sibling: 0 next-interp: 0 dead-savings: 0 10 [2] _13 = _12 + 1; ADD : _12 + (1 * 1) : int basis: 0 dependent: 0 sibling: 0 next-interp: 0 dead-savings: 0 11 [2] *_6[_11] = _13; REF : _6 + ((sizetype) _11 * 4) + 0 : int[50] * basis: 9 dependent: 0 sibling: 0 next-interp: 0 dead-savings: 0 Strength reduction candidate chains: _6 -> 6 -> 11 -> 9 -> 7 v1_1(D) -> 1 -> 8 -> 5 -> 3 -> 2 a2_5(D) -> 4 _12 -> 10 Processing dependency tree rooted at 1. Processing dependency tree rooted at 9. foo (int[50] * a2, int v1) { int j; long unsigned int _3; long unsigned int _4; int[50] * _6; int _11; int _12; int _13; sizetype _15; sizetype _16; int[50] * _17; sizetype _18; sizetype _19; int[50] * _20; <bb 2>: j_2 = v1_1(D) + 5; _3 = (long unsigned int) j_2; _4 = _3 * 200; _6 = a2_5(D) + _4; j_7 = v1_1(D) + 6; *_6[j_2] = j_2; *_6[j_7] = j_2; _11 = v1_1(D) + 4; _15 = (sizetype) _11; _16 = _15 * 4; _17 = _6 + _16; _12 = MEM[(int[50] *)_17]; _13 = _12 + 1; _18 = (sizetype) _11; _19 = _18 * 4; _20 = _6 + _19; MEM[(int[50] *)_20] = _13; return; }
OK, thanks. The problem that you've encountered is that you are attempting to do something illegal. ;) (Bin's original patch is actually to blame for that, as well as me for not catching it then.) As your new test shows, it is unsafe to do the transformation in backtrace_base_for_ref when widening from an unsigned type, because the unsigned type has wrap semantics by default. (The actual test must be done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or removed by compile option -- see the comments with legal_cast_p and legal_cast_p_1 later in the module.) You cannot in general prove that the transformation is allowable for a specific constant, because you don't know that what you're adding it to won't cause an overflow that's handled incorrectly. I believe the correct fix for the unsigned-overflow case is to fail backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns false, where in_type is the type of the new *PBASE, and out_type is the widening type that you're looking through. So you can't just STRIP_NOPS, you have to check the cast for legitimacy for this transformation. This does not explain why backtrace_base_for_ref does not find all the opportunities on slsr-39.c. I don't immediately see what's preventing that. Note that the transformation is legal in that case because you are widening from a signed int to an unsigned int, which won't cause problems. You guys need to dig deeper into why those opportunities are missed when sizetype is larger than int. Let me know if you need help figuring it out. Thanks, Bill On Tue, 2013-10-01 at 16:06 +0100, Yufeng Zhang wrote: > Hi Bill, > > Thank you for the review and the offer to help. > > On 10/01/13 15:36, Bill Schmidt wrote: > > On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote: > >> On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote: > >>> On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> wrote: > >>>> Hello, > >>>> > >>>> Please find the updated version of the patch in the attachment. It has > >>>> addressed the previous comments and also included some changes in order to > >>>> pass the bootstrapping on x86_64. > >>>> > >>>> It's also passed the regtest on arm-none-eabi and aarch64-none-elf. > >>>> > >>>> It will also fix the test failure as reported here: > >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html > >>>> > >>>> OK for the trunk? > >>> > >>> + where n is a 32-bit unsigned int and pointer are 64-bit long. In this > >>> + case, the gimple for (n - 1) is: > >>> + > >>> + _2 = n_1(D) + 4294967295; // 0xFFFFFFFF > >>> + > >>> + and it is wrong to multiply the large constant by 4 in the 64-bit space. */ > >>> + > >>> +static bool > >>> +safe_to_multiply_p (tree type, double_int cst) > >>> +{ > >>> + if (TYPE_UNSIGNED (type) > >>> +&& ! double_int_fits_to_tree_p (signed_type_for (type), cst)) > >>> + return false; > >>> + > >>> + return true; > >>> +} > >>> > >>> This looks wrong. The only relevant check is as whether the > >>> multiplication overflows the original type as you miss the implicit > >>> truncation that happens. Which is something you don't know > >>> unless you know the value. It definitely isn't a property of a type > >>> and a constant but the property of two constants and a type. > >>> Or the predicate has a wrong name. > >>> > >>> The use of get_unwidened in this core routine looks like this is > >>> all happening in the wrong place and we should have picked up > >>> another candidate for this instead? I'm sure Bill will know more here. > >> > >> I'm not happy with how this patch is progressing. Without having looked > >> too deeply, this might be better handled earlier when determining which > >> casts are safe to use in building candidates. What you have here seems > >> more like closing the barn door after the horse got out. Maybe that's > >> the only solution, but it doesn't seem likely. > >> > >> Another problem is that your test case isn't testing anything except > >> that the compiler doesn't crash. That isn't sufficient as a regression > >> test. > >> > >> I'll spend some time looking at this to see if I can find a better > >> approach. It might be a day or two before I can get to it. In addition > >> to the included test case, are there any other cases you've found that I > >> should be concerned with? > > > > To help me investigate this without having to build a cross compiler, > > could you please compile your test case (without the patch applied) > > using -fdump-tree-reassoc2 -fdump-tree-slsr-details and send me the > > generated dump files? > > The issue is not specific to AArch64; please find the attached dumps > generated from the x86-64 gcc by compiling gcc.dg/tree-ssa/slsr-39.c. > > W.r.t your comment in the other email about adding a test to verify the > expected gimple, I think the existing test gcc.dg/tree-ssa/slsr-39.c is > sufficient. The test currently fails on both AArch64 and x86-64, and > presumably also fails on any other 64-bit target where pointer is 64-bit > and int is 32-bit size. The patch I proposed is to fix this issue and > gcc.dg/tree-ssa/slsr-39.c itself shall be a good regression test (with > specific verification on gimple ir). > > The new test proposed in this patch is to regtest the issue my original > patch has, which is a runtime failure due to incorrect optimization. > > I'll address other comments in separate emails. > > Thanks, > Yufeng
On Tue, 2013-10-01 at 11:56 -0500, Bill Schmidt wrote: > OK, thanks. The problem that you've encountered is that you are > attempting to do something illegal. ;) (Bin's original patch is > actually to blame for that, as well as me for not catching it then.) > > As your new test shows, it is unsafe to do the transformation in > backtrace_base_for_ref when widening from an unsigned type, because the > unsigned type has wrap semantics by default. (The actual test must be > done on TYPE_OVERFLOW_WRAPS since this wrap semantics can be added or > removed by compile option -- see the comments with legal_cast_p and > legal_cast_p_1 later in the module.) > > You cannot in general prove that the transformation is allowable for a > specific constant, because you don't know that what you're adding it to > won't cause an overflow that's handled incorrectly. > > I believe the correct fix for the unsigned-overflow case is to fail > backtrace_base_for_ref if legal_cast_p (in_type, out_type) returns > false, where in_type is the type of the new *PBASE, and out_type is the > widening type that you're looking through. So you can't just > STRIP_NOPS, you have to check the cast for legitimacy for this > transformation. > > This does not explain why backtrace_base_for_ref does not find all the > opportunities on slsr-39.c. I don't immediately see what's preventing > that. Note that the transformation is legal in that case because you > are widening from a signed int to an unsigned int, which won't cause > problems. You guys need to dig deeper into why those opportunities are > missed when sizetype is larger than int. Let me know if you need help > figuring it out. Sorry, I had to leave before and wanted to get this response back to you in case I didn't get back soon. I've looked at this some more, and your general approach should work ok once you get the legal_cast_p check in place where you do the get_unwidened call now. Once you know you have a legal widening, you don't have to worry about the safe_to_multiply_p stuff. I.e., you don't need the last two chunks in the patch to backtrace_base_for_ref, and you don't need the unwidened_p variable. It should all fall out properly by just restricting your unwidening to legal casts. Thanks, Bill > > Thanks, > Bill > > On Tue, 2013-10-01 at 16:06 +0100, Yufeng Zhang wrote: > > Hi Bill, > > > > Thank you for the review and the offer to help. > > > > On 10/01/13 15:36, Bill Schmidt wrote: > > > On Tue, 2013-10-01 at 08:17 -0500, Bill Schmidt wrote: > > >> On Tue, 2013-10-01 at 12:19 +0200, Richard Biener wrote: > > >>> On Wed, Sep 25, 2013 at 1:37 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> wrote: > > >>>> Hello, > > >>>> > > >>>> Please find the updated version of the patch in the attachment. It has > > >>>> addressed the previous comments and also included some changes in order to > > >>>> pass the bootstrapping on x86_64. > > >>>> > > >>>> It's also passed the regtest on arm-none-eabi and aarch64-none-elf. > > >>>> > > >>>> It will also fix the test failure as reported here: > > >>>> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01317.html > > >>>> > > >>>> OK for the trunk? > > >>> > > >>> + where n is a 32-bit unsigned int and pointer are 64-bit long. In this > > >>> + case, the gimple for (n - 1) is: > > >>> + > > >>> + _2 = n_1(D) + 4294967295; // 0xFFFFFFFF > > >>> + > > >>> + and it is wrong to multiply the large constant by 4 in the 64-bit space. */ > > >>> + > > >>> +static bool > > >>> +safe_to_multiply_p (tree type, double_int cst) > > >>> +{ > > >>> + if (TYPE_UNSIGNED (type) > > >>> +&& ! double_int_fits_to_tree_p (signed_type_for (type), cst)) > > >>> + return false; > > >>> + > > >>> + return true; > > >>> +} > > >>> > > >>> This looks wrong. The only relevant check is as whether the > > >>> multiplication overflows the original type as you miss the implicit > > >>> truncation that happens. Which is something you don't know > > >>> unless you know the value. It definitely isn't a property of a type > > >>> and a constant but the property of two constants and a type. > > >>> Or the predicate has a wrong name. > > >>> > > >>> The use of get_unwidened in this core routine looks like this is > > >>> all happening in the wrong place and we should have picked up > > >>> another candidate for this instead? I'm sure Bill will know more here. > > >> > > >> I'm not happy with how this patch is progressing. Without having looked > > >> too deeply, this might be better handled earlier when determining which > > >> casts are safe to use in building candidates. What you have here seems > > >> more like closing the barn door after the horse got out. Maybe that's > > >> the only solution, but it doesn't seem likely. > > >> > > >> Another problem is that your test case isn't testing anything except > > >> that the compiler doesn't crash. That isn't sufficient as a regression > > >> test. > > >> > > >> I'll spend some time looking at this to see if I can find a better > > >> approach. It might be a day or two before I can get to it. In addition > > >> to the included test case, are there any other cases you've found that I > > >> should be concerned with? > > > > > > To help me investigate this without having to build a cross compiler, > > > could you please compile your test case (without the patch applied) > > > using -fdump-tree-reassoc2 -fdump-tree-slsr-details and send me the > > > generated dump files? > > > > The issue is not specific to AArch64; please find the attached dumps > > generated from the x86-64 gcc by compiling gcc.dg/tree-ssa/slsr-39.c. > > > > W.r.t your comment in the other email about adding a test to verify the > > expected gimple, I think the existing test gcc.dg/tree-ssa/slsr-39.c is > > sufficient. The test currently fails on both AArch64 and x86-64, and > > presumably also fails on any other 64-bit target where pointer is 64-bit > > and int is 32-bit size. The patch I proposed is to fix this issue and > > gcc.dg/tree-ssa/slsr-39.c itself shall be a good regression test (with > > specific verification on gimple ir). > > > > The new test proposed in this patch is to regtest the issue my original > > patch has, which is a runtime failure due to incorrect optimization. > > > > I'll address other comments in separate emails. > > > > Thanks, > > Yufeng
diff --git a/gcc/gimple-ssa-strength-reduction.c b/gcc/gimple-ssa-strength-reduction.c index 8d48add..1c04382 100644 --- a/gcc/gimple-ssa-strength-reduction.c +++ b/gcc/gimple-ssa-strength-reduction.c @@ -750,6 +750,40 @@ slsr_process_phi (gimple phi, bool speed) add_cand_for_stmt (phi, c); } +/* Utility function for backtrace_base_for_ref. + + Given + + T2 = T2' + CST + RES = (wider_type) T2 * C3 + + where TYPE is TREE_TYPE (T2), this function returns false when it is + _not_ safe to carry out the following transformation. + + RES = (wider_type) T2' * C3 + (wider_type) CST * C3 + + One example unsafe case is: + + int array[40]; + array[n - 1] + + where n is a 32-bit unsigned int and pointer are 64-bit long. In this + case, the gimple for (n - 1) is: + + _2 = n_1(D) + 4294967295; // 0xFFFFFFFF + + and it is wrong to multiply the large constant by 4 in the 64-bit space. */ + +static bool +safe_to_multiply_p (tree type, double_int cst) +{ + if (TYPE_UNSIGNED (type) + && ! double_int_fits_to_tree_p (signed_type_for (type), cst)) + return false; + + return true; +} + /* Given PBASE which is a pointer to tree, look up the defining statement for it and check whether the candidate is in the form of: @@ -766,10 +800,19 @@ backtrace_base_for_ref (tree *pbase) { tree base_in = *pbase; slsr_cand_t base_cand; + bool unwidened_p = false; STRIP_NOPS (base_in); if (TREE_CODE (base_in) != SSA_NAME) - return tree_to_double_int (integer_zero_node); + { + /* Strip off widening conversion(s) to handle cases where + e.g. 'B' is widened from an 'int' in order to calculate + a 64-bit address. */ + base_in = get_unwidened (base_in, NULL_TREE); + if (TREE_CODE (base_in) != SSA_NAME) + return tree_to_double_int (integer_zero_node); + unwidened_p = true; + } base_cand = base_cand_from_table (base_in); @@ -777,7 +820,10 @@ backtrace_base_for_ref (tree *pbase) { if (base_cand->kind == CAND_ADD && base_cand->index.is_one () - && TREE_CODE (base_cand->stride) == INTEGER_CST) + && TREE_CODE (base_cand->stride) == INTEGER_CST + && (! unwidened_p + || safe_to_multiply_p (TREE_TYPE (base_cand->stride), + tree_to_double_int (base_cand->stride)))) { /* X = B + (1 * S), S is integer constant. */ *pbase = base_cand->base_expr; @@ -785,8 +831,11 @@ backtrace_base_for_ref (tree *pbase) } else if (base_cand->kind == CAND_ADD && TREE_CODE (base_cand->stride) == INTEGER_CST - && integer_onep (base_cand->stride)) - { + && integer_onep (base_cand->stride) + && (! unwidened_p + || safe_to_multiply_p (TREE_TYPE (base_cand->base_expr), + base_cand->index))) + { /* X = B + (i * S), S is integer one. */ *pbase = base_cand->base_expr; return base_cand->index; diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-40.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-40.c new file mode 100644 index 0000000..72726a3 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-40.c @@ -0,0 +1,27 @@ +/* Verify straight-line strength reduction for array + subscripting. + + elems[n-1] is reduced to elems + n * 4 + 0xffffffff * 4, only when + pointers are of the same size as that of int (assuming 4 bytes). */ + +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +struct data +{ + unsigned long elms[1]; +} gData; + +void __attribute__((noinline)) +foo (struct data *dst, unsigned int n) +{ + dst->elms[n - 1] &= 1; +} + +int +main () +{ + foo (&gData, 1); + return 0; +} +