
Relax check against commuting XOR and ASHIFTRT in combine.c

Message ID 53C80023.6000100@arm.com
State New
Headers show

Commit Message

Alan Lawrence July 17, 2014, 4:56 p.m. UTC
Ok, the attached tests are passing on x86_64-none-linux-gnu, aarch64-none-elf, 
arm-none-eabi, and a bunch of smaller platforms for which I've only built a 
stage 1 compiler (i.e. as far as necessary to assemble). That's with either 
change to simplify_shift_const_1.

As to the addition of "result_mode != shift_mode", or removing the whole check 
against XOR - I've now tested the latter:

bootstrapped on x86_64-none-linux-gnu, check-gcc and check-ada;
bootstrapped on arm-none-linux-gnueabihf;
bootstrapped on aarch64-none-linux-gnu;
cross-tested check-gcc on aarch64-none-elf;
cross-tested on arm-none-eabi;
(and Uros has bootstrapped on alpha and done a suite of tests, as per 
https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01236.html).

 From a perspective of paranoia, I'd lean towards adding "result_mode != 
shift_mode", but for neatness removing the whole check against XOR is nicer. So 
I'd defer to the maintainers as to whether one might be preferable to the 
other...(but my unproven suspicion is that the two are equivalent, and no case 
where result_mode != shift_mode is possible!)
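
For concreteness, the conservative variant would look roughly like the sketch
below -- based on the first of the two combine.c fragments quoted further down,
not necessarily the exact form that would be committed:

    /* Sketch only: keep the old check, but apply it only when the shift has
       been widened, i.e. when result_mode and shift_mode differ.  */
    if (CONST_INT_P (XEXP (varop, 1))
        /* We can't do this if we have (ashiftrt (xor)), the constant has its
           sign bit set in shift_mode, and shift_mode is different from
           result_mode.  */
        && !(code == ASHIFTRT && GET_CODE (varop) == XOR
             && result_mode != shift_mode
             && 0 > trunc_int_for_mode (INTVAL (XEXP (varop, 1)), shift_mode))
        && ...)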

--Alan

Alan Lawrence wrote:
> Thanks for the suggestions! I think I've got a reasonably platform-independent 
> testcase that scans the rtl dump, just trying it on a few more platforms now...
> 
> As to running on Alpha: bootstrap succeeds, and the regression testsuite doesn't 
> raise any issues (https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01236.html) 
> - and that's with a more aggressive patch that completely rolls back the 
> original r76965:
> 
> Index: combine.c
> ===================================================================
> --- combine.c   (revision 212523)
> +++ combine.c   (working copy)
> @@ -10218,9 +10218,6 @@
>              if (CONST_INT_P (XEXP (varop, 1))
>                  /* We can't do this if we have (ashiftrt (xor))  and the
>                     constant has its sign bit set in shift_mode.  */
> -             && !(code == ASHIFTRT && GET_CODE (varop) == XOR
> -                  && 0 > trunc_int_for_mode (INTVAL (XEXP (varop, 1)),
> -                                             shift_mode))
>                  && (new_rtx = simplify_const_binary_operation
>                      (code, result_mode,
>                       gen_int_mode (INTVAL (XEXP (varop, 1)), result_mode),
> @@ -10237,10 +10234,7 @@
>                 logical expression, make a new logical expression, and apply
>                 the inverse distributive law.  This also can't be done
>                 for some (ashiftrt (xor)).  */
> -         if (CONST_INT_P (XEXP (varop, 1))
> -            && !(code == ASHIFTRT && GET_CODE (varop) == XOR
> -                 && 0 > trunc_int_for_mode (INTVAL (XEXP (varop, 1)),
> -                                            shift_mode)))
> +         if (CONST_INT_P (XEXP (varop, 1)))
>                {
>                  rtx lhs = simplify_shift_const (NULL_RTX, code, shift_mode,
>                                                  XEXP (varop, 0), count);
> 
> I'm testing this version more widely but initial indications are good.
> 
> However, I've not succeeded in checking Ada on Alpha, as GCC's Ada frontend 
> requires an Ada compiler to bootstrap. So I have to ask: does anyone actually 
> use Ada on Alpha? (And if so, would they please be able to test the above patch?)
> 
> Moreover, I don't really see that we have much reason to believe the check
> against commuting is required even for Ada/Alpha. GCC's internals have changed
> substantially in the interim, with the Ada frontend no longer generating RTL
> directly, as we now have intervening GENERIC/GIMPLE tree stages. Unless there is
> a logical/bitwise explanation for why the commuting of ashiftrt and xor is
> unsafe, is the best explanation that the Ada frontend was generating RTL that
> may have looked OK at the time but that we would now consider dubious or malformed?
> 
> (E.g., these days I don't see how to produce an ashiftrt of one mode containing
> an XOR of another without an intervening sign_extend, zero_extend or subreg.)
> 
> --Alan
> 
> Jeff Law wrote:
>> On 06/30/14 13:05, Alan Lawrence wrote:
>>> combine.c includes a check which prevents
>>>
>>> (ashiftrt (xor A C2) C1)
>>>
>>> from being commuted to
>>>
>>> (xor (ashiftrt A C1) (ashiftrt C2 C1))
>>>
>>> for constants C1, C2 if C2 has its sign bit set.
>>>
>>> Specifically, this prevents (ashiftrt (not A) C1) from being commuted to
>>>
>>> (not (ashiftrt A C1))
>>>
>>> because the former is rewritten to (ashiftrt (xor A -1) C1) above, with
>>> a comment /* Make this fit the case below.  */ - which it no longer does.
>>>
>>> If result_mode == shift_mode, I can see no reason to prevent this
>>> normalisation (indeed, I'm not sure I can see any reason to prevent it
>>> even if result_mode != shift_mode - but I've not managed to produce such
>>> a case in my own testing, as there are always intervening subreg's or
>>> sign_extend's, or to build a toolchain on which to reproduce the
>>> original bug, so I'm being cautious). Hence this patch allows
>>> commutation if the two modes are equal.
>>>
>>> As an illustrative example, on AArch64, without this patch, compiling
>>> this snippet of the current arm_neon.h:
>>>
>>> typedef long long int int64x1_t __attribute__ ((__vector_size__ (8)));
>>> int64x1_t vcgez_s64 (int64x1_t a)
>>> {
>>>    return (int64x1_t) {a >= 0 ? -1 : 0};
>>> }
>>>
>>> produces a sequence involving a logical not (mvn) followed by asr (plus
>>> some inter-bank moves), as the combiner takes (ashiftrt (not x) 63),
>>> "simplifies", and fails to match the resulting RTL
>>>
>>> (set (reg:DI 78 [ D.2593 ])
>>>      (ashiftrt:DI (xor:DI (reg:DI 0 x0 [ aD.2565 ])
>>>              (const_int -1 [0xffffffffffffffff]))
>>>          (const_int 63 [0x3f])))
>>>
>>> However, with this patch, the combiner simplifies to
>>>
>>> (set (reg:DI 84 [ D.2604 ])
>>>      (neg:DI (ge:DI (reg:DI 32 v0 [ aD.2569 ])
>>>              (const_int 0 [0]))))
>>>
>>> which matches a pattern to output the desired cmge instruction.
>>>
>>> (For the curious: the check against commutation was added in January
>>> 2004, r76965, https://gcc.gnu.org/ml/gcc-patches/2004-01/msg03406.html -
>>> however, I've not been able to build an Ada toolchain of that era, or an
>>> Alpha/VMS toolchain, on which to reproduce the bug; if anyone has such
>>> and is willing and able to investigate, by all means!)
>>>
>>> Testing:
>>>
>>> aarch64-none-elf: check-gcc (cross-tested)
>>> x86_64-none-linux-gnu: bootstrapped; check-gcc, check-ada, check-g++
>>> arm-none-linux-gnueabi: bootstrapped
>>> arm-none-eabi: check-gcc (cross-tested)
>>>
>>>
>>> gcc/ChangeLog:
>>>
>>>      * combine.c (simplify_shift_const_1): Allow commuting ASHIFTRT and XOR
>>>      if same mode.
>> You'll want to use your sample code from above to create a testcase. 
>> You can either dump the RTL and search it, or scan the assembly code.
>>
>> Look in gcc/testsuite/gcc.target/arm for examples.
>>
>> Given the original problem which prompted Kenner to make this change was 
>> Ada related on the Alpha, you might ping rth and/or uros to see if they 
>> could do a "quick" regression of those platforms with Ada enabled.
>>
>> I'm naturally hesitant to approve since this was something Kenner pretty 
>> explicitly tried to avoid, AFAICT.  Kenner wasn't ever known for trying 
>> to paper over problems -- if he checked in a change like this, much more 
>> likely than not it was well thought out and analyzed.  He also probably 
>> knew combine.c better than anyone at that time (possibly even still true 
>> today).
>>
>>
>> Jeff
>>
>>
>>
> 
> 
> 
>

Comments

Alan Lawrence Aug. 20, 2014, 10:05 a.m. UTC | #1
Thanks to Arnaud for confirming that AdaCore does not have interest in the
Ada/Alpha combination (https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01832.html).

As per below, I've tested check-ada on x86_64-none-linux-gnu without problems.
Can I say, "ping"?  :)

Cheers, Alan



Alan Lawrence wrote:
> [snip]
Jeff Law Sept. 5, 2014, 6:06 p.m. UTC | #2
On 07/17/14 10:56, Alan Lawrence wrote:
> Ok, the attached tests are passing on x86_64-none-linux-gnu,
> aarch64-none-elf, arm-none-eabi, and a bunch of smaller platforms for
> which I've only built a stage 1 compiler (i.e. as far as necessary to
> assemble). That's with either change to simplify_shift_const_1.
>
> As to the addition of "result_mode != shift_mode", or removing the whole
> check against XOR - I've now tested the latter:
>
> bootstrapped on x86_64-none-linux-gnu, check-gcc and check-ada;
> bootstrapped on arm-none-linux-gnueabihf;
> bootstrapped on aarch64-none-linux-gnu;
> cross-tested check-gcc on aarch64-none-elf;
> cross-tested on arm-none-eabi;
> (and Uros has bootstrapped on alpha and done a suite of tests, as per
> https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01236.html).
>
>  From a perspective of paranoia, I'd lean towards adding "result_mode !=
> shift_mode", but for neatness removing the whole check against XOR is
> nicer. So I'd defer to the maintainers as to whether one might be
> preferable to the other...(but my unproven suspicion is that the two are
> equivalent, and no case where result_mode != shift_mode is possible!)
So first, whether or not someone cares about Alpha-VMS isn't the issue 
at hand.  It's whether or not the new code is correct. 
Similarly, the fact that the code generation paths are radically 
different now compared to 2004, and that we can't currently trigger the 
cases where the modes are different, isn't the issue either; again, it's 
whether or not your code is correct.

I think the key is to look at try_widen_shift_mode and realize that it 
can return a mode larger than the original mode of the operations. 
However, it only does so when it is presented with a case where it knows 
the sign bit being shifted in from the left will be the same as the sign 
bit in the original mode.
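
Put roughly (this is a sketch of the idea, not the actual try_widen_shift_mode 
source, and "orig_mode" here is just a stand-in for the operation's original 
result mode), the widening is only justified when the value being shifted is 
already a sign-extension from the original mode:

    /* Sketch: widening an ASHIFTRT from orig_mode (say SImode) to shift_mode
       (say DImode) is safe only if all the bits above orig_mode, plus the
       orig_mode sign bit itself, are known copies of that sign bit.  */
    if (num_sign_bit_copies (varop, shift_mode)
        > GET_MODE_BITSIZE (shift_mode) - GET_MODE_BITSIZE (orig_mode))
      /* OK to perform the shift in the wider shift_mode.  */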

In the case of an XOR whose constant has its sign bit set in shift_mode, 
that's not going to be the case.  We would violate the assumption made when we 
decided to widen the shift to shift_mode.

So your relaxation is safe when shift_mode == result_mode, but unsafe 
otherwise -- even though we don't currently have a testcase for the 
shift_mode != result_mode case, we don't want to break that.

So your updated patch is correct.  However, I would ask that you make 
one additional change.  Namely the comment before the two fragments of 
code you changed needs updating.  Something like

"... and the constant has its sign bit set in shift_mode and shift_mode
  is different than result_mode".

The second comment should get a similar clarification.

You should also include your testcase for a cursory review.

So with the inclusion of the testcase and updated comments, we should be 
good to go.  However, please post the final patch for that cursory 
review of the testcase.

jeff
Jeff Law Sept. 19, 2014, 5:38 a.m. UTC | #3
On 09/18/14 03:35, Alan Lawrence wrote:
> Moreover, I think we both agree that if result_mode==shift_mode, the
> transformation is correct. "Just" putting that check in, achieves
> what I'm trying for here, so I'd be happy to go with the attached
> patch and call it a day. However, I'm a little concerned about the
> other cases - i.e. where shift_mode is wider than result_mode.
Let's go ahead and get the attached patch installed.  I'm pretty sure 
it's correct and I know you want to see something move forward here.  We 
can iterate further if we want.

>
> If I understand correctly (and I'm not sure about that, let's see how
> far I get), this means we'll perform the shift in (say) DImode, when
> we're only really concerned about the lower (say) 32-bits (for an
> originally-SImode shift).
That's the class of cases I'm concerned about.


>   try_widen_shift_mode will in this case
> check that the result of the operation *inside* the shift (in our
> case an XOR) has 33 sign bit copies (via num_sign_bit_copies), i.e.
> that its *top* 32-bits are all equal to the original SImode sign bit.
> <count> of these bits may be fed into the top of the desired SImode
> result by the DImode shift. Right so far?
Correct.

>
> AFAICT, num_sign_bit_copies for an XOR, conservatively returns the
> minimum of the num_sign_bit_copies of its two operands. I'm not sure
> whether this is behaviour we should rely on in its callers, or for
> the sake of abstraction we should treat num_sign_bit_copies as a
> black box (which does what it says on the, erm, tin).
Treat it as a black box.  It returns the number of known sign bit 
copies.  There may be more, but never less.


>
> If the former, doesn't having num_sign_bit_copies >= the difference
> in size between result_mode and shift_mode, of both operands to the
> XOR, guarantee safety of the commutation (whether the constant is
> positive or negative)? We could perform the shift (in the larger
> mode) on both of the XOR operands safely, then XOR together their
> lower parts.
I had convinced myself that when we flip the sign bit via the XOR and 
commute the XOR out, we invalidate the assumptions made when 
widening.  But I'm not so sure anymore.  Damn I hate changes made 
without suitable tests :(

I almost convinced myself the problem is in the adjustment of C2 in the 
widened case, but that's not a problem either.  At least not on paper.

>
> If, however, we want to play safe and ensure that we deal safely with
>  any XOR whose top (mode size difference + 1) bits were the same,
> then I think the restriction that the XOR constant is positive is
> neither necessary nor sufficient; rather (mirroring
> try_widen_shift_mode) I think we need that num_sign_bit_copies of the
> constant in shift_mode, is more than the size difference between
> result_mode and shift_mode.
But isn't that the same?  Isn't the only case where it isn't the same 
when the constant has bits set that are outside the mode of the other 
operand?

Hmm, what about (xor:QI A -1)?  In that case -1 will be represented with 
bits outside the precision of QImode.
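
That representation is also why the (not A) -> (xor A -1) form always tripped 
the old check: an all-ones mask is canonically the CONST_INT -1, whose sign 
bit reads as set in any shift_mode.  A small illustration, assuming the usual 
trunc_int_for_mode behaviour:

    /* Illustration only: trunc_int_for_mode sign-extends from the given mode,
       so an all-ones QImode value comes back as -1 ...  */
    HOST_WIDE_INT c = trunc_int_for_mode (0xff, QImode);  /* c == -1 */
    /* ... hence "0 > trunc_int_for_mode (INTVAL (...), shift_mode)" was true
       for the XOR mask produced from a NOT, whatever shift_mode is.  */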

>
> Hmmm. I might try that patch at some point, I think it is the right
> check to make. (Meta-comment: this would be *so*much* easier if we
> could write unit tests more easily!) In the meantime I'd be happy to
> settle for the attached...
No argument on the unit testing comment.  It's a major failing in the 
design of GCC that we can't easily build a unit testing framework.

Jeff
Alan Lawrence Sept. 22, 2014, 11:16 a.m. UTC | #4
Ok thanks Jeff. In that case I think I should draw this to the attention of the 
AArch64 maintainers to check the testsuite updates are OK before I commit...?

Methinks it may be possible to get further, or at least increase our confidence, 
if I "mock" out try_widen_shift_mode, and/or try injecting some dubious RTL from 
a builtin, although this'll only give a momentary snapshot of behaviour. I may 
or may not have time to look into this though ;)...

Cheers, Alan

Jeff Law wrote:
> [snip]
Jeff Law Sept. 22, 2014, 5:02 p.m. UTC | #5
On 09/22/14 05:16, Alan Lawrence wrote:
> Ok thanks Jeff. In that case I think I should draw this to the attention
> of the AArch64 maintainers to check the testsuite updates are OK before
> I commit...?
Can't hurt.

>
> Methinks it may be possible to get further, or at least increase our
> confidence, if I "mock" out try_widen_shift_mode, and/or try injecting
> some dubious RTL from a builtin, although this'll only give a momentary
> snapshot of behaviour. I may or may not have time to look into this
> though ;)...
Yea, it's something I'd pondered as well, though I tend to inject the 
RTL I want directly in the debugger.  The downside is doing so doesn't 
ensure various tables are updated properly, and I vaguely recall a 
per-pseudo table for the combiner's nonzero_bits, sign_bit_copies and 
friends.


jeff
Marcus Shawcroft Sept. 23, 2014, 11:32 a.m. UTC | #6
On 22 September 2014 12:16, Alan Lawrence <alan.lawrence@arm.com> wrote:
> Ok thanks Jeff. In that case I think I should draw this to the attention of
> the AArch64 maintainers to check the testsuite updates are OK before I
> commit...?

OK with me.

Cheers
/Marcus

Patch

diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..90e64fd10dc358f10ad03a90041605bc3ccb7011
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_1.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile {target sparc64*-*-* aarch64*-*-* x86_64-*-* powerpc64*-*-*} } */
+/* { dg-options "-O2 -fdump-rtl-combine-all" } */
+
+typedef long long int int64_t;
+
+int64_t
+foo (int64_t a)
+{
+  return (~a) >> 63;
+}
+
+/* The combine phase will try to combine not & ashiftrt, and
+   combine_simplify_rtx should transform (ashiftrt (not x) 63)
+   to (not (ashiftrt x 63)) and then to (neg (ge x 0)). We look for
+   the *attempt* to match this RTL pattern, regardless of whether an
+   actual insn may be found on the platform.  */
+/* { dg-final { scan-rtl-dump "\\(neg:DI \\(ge:DI" "combine" } } */
+/* { dg-final { cleanup-rtl-dump "combine" } } */
diff --git a/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..fd6827caed230ea5dd2d6ec4431b11bf826531ea
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/combine_ashiftrt_2.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile {target arm*-*-* i?86-*-* powerpc-*-* sparc-*-*} } */
+/* { dg-options "-O2 -fdump-rtl-combine-all" } */
+
+typedef long int32_t;
+
+int32_t
+foo (int32_t a)
+{
+  return (~a) >> 31;
+}
+
+/* The combine phase will try to combine not & ashiftrt, and
+   combine_simplify_rtx should transform (ashiftrt (not x) 31)
+   to (not (ashiftrt x 31)) and then to (neg (ge x 0)). We look for
+   the *attempt* to match this RTL pattern, regardless of whether an
+   actual insn may be found on the platform.  */
+/* { dg-final { scan-rtl-dump "\\(neg:SI \\(ge:SI" "combine" } } */
+/* { dg-final { cleanup-rtl-dump "combine" } } */