diff mbox

[RFA,PR,tree-optimization/59749] Fix recently introduced ree bug

Message ID 52D38607.4090700@redhat.com
State New
Headers show

Commit Message

Jeff Law Jan. 13, 2014, 6:21 a.m. UTC
On 01/10/14 14:52, Jakub Jelinek wrote:
> There is one thing I still worry about, if some target has
> an insn to say sign extend or zero extend a short memory load
> into HARD_REGNO_NREGS () > 1 register, but the other involved
> register has the only one (or fewer) hard registers available to it.
> Consider registers SImode hard registers 0, 1, 2, 3:
>    (set (reg:SI 3) (something:SI))
>    (set (reg:HI 0) (expression:HI))
>    (set (reg:SI 2) (sign_extend:SI (reg:HI 0)))
>    (set (reg:DI 0) (sign_extend:DI (reg:HI 0)))
>    (use (reg:SI 3))
> we transform this into:
>    (set (reg:SI 3) (something:SI))
>    (set (reg:SI 2) (sign_extend:SI (expression:HI)))
>    (set (reg:SI 0) (reg:HI 2))
>    (set (reg:DI 0) (sign_extend:DI (reg:HI 0)))
>    (use (reg:SI 3))
> first (well, the middle is then pending in copy list), and next:
>    (set (reg:SI 3) (something))
>    (set (reg:DI 2) (sign_extend:DI (expression:HI)))
>    (set (reg:DI 0) (reg:DI 2))
>    (use (reg:SI 3))
> but that looks wrong, because the second instruction would now clobber
> (reg:SI 3).  Dunno if we have such an target and thus if it is possible
> to construct a testcase.
No need to construct a testcase, there's a few that trip the condition 
in the existing testsuite :-)

Basically I just put a check in combine_set_extension to detect when 
widening of the result of the reaching def requires more hard registers 
than it previously needed and ran the testsuite.



> So, I'd say the handling of the second extend should notice that
> it is actually extending load into a different register and bail out
> if it would need more hard registers than it needed previously, or
> something similar.
Yes, like in the attached patch?  OK for the trunk?
commit 1313449102ac8d62e36818d8660ef2e897bd59e3
Author: Jeff Law <law@redhat.com>
Date:   Fri Jan 10 14:31:15 2014 -0700

    	PR tree-optimization/59747
    	* ree.c (find_and_remove_re): Properly handle case where a second
    	eliminated extension requires widening a copy created for elimination
    	of a prior extension.
    	(combine_set_extension): Ensure that the number of hard regs needed
    	for a destination register does not change when we widen it.
    
    	PR tree-optimization/59747
    	* gcc.c-torture/execute/pr59747.c: New test.

Comments

Jakub Jelinek Jan. 13, 2014, 7:34 a.m. UTC | #1
On Sun, Jan 12, 2014 at 11:21:59PM -0700, Jeff Law wrote:
> --- a/gcc/ree.c
> +++ b/gcc/ree.c
> @@ -297,6 +297,13 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
>    else
>      new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
>  
> +  /* We're going to be widening the result of DEF_INSN, ensure that doing so
> +     doesn't change the number of hard registers needed for the result.  */
> +  if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
> +      != HARD_REGNO_NREGS (REGNO (SET_SRC (*orig_set)),

Note you can use orig_src instead of SET_SRC (*orig_set) here.

> +			   GET_MODE (SET_DEST (*orig_set))))
> +	return false;
> +
>    /* Merge constants by directly moving the constant into the register under
>       some conditions.  Recall that RTL constants are sign-extended.  */
>    if (GET_CODE (orig_src) == CONST_INT

Are you sure the above is needed even for the
REGNO (new_reg) == REGNO (SET_DEST (*orig_set))
&& REGNO (new_reg) == REGNO (orig_src) case?
I mean in that case no copy insn is going to be scheduled right now, nor has
been previously scheduled, so we are back to what the code did before
r206418.  I can imagine it can be a problem, but doesn't have to be.

(set (reg:SI 3) (something:SI))
(set (reg:SI 2) (expression:SI))	// def_insn
(use (reg:SI 3))
(set (reg:DI 3) (sign_extend:DI (reg:SI 2)))

So, perhaps if we wanted to handle the HARD_REGNO_NREGS != HARD_REGNO_NREGS
case when all 3 REGNOs are the same, we'd need to limit it to the case where
cand->insn and curr_insn are in the same bb, DF_INSN_LUID of curr_insn
is smaller than DF_INSN_LUID of cand->insn and the extra hard regs aren't
used between the two.  Perhaps not worth it?

BTW, I'm surprised to hear that it triggers in the testsuite already (for
the 3 REGNOs the same case, or different?), is that on x86_64 or i?86?
Do you have an example?  I'm surprised that we'd have post reload a pattern
that extends into multiple hard registers.

	Jakub
Jeff Law Jan. 15, 2014, 5:56 p.m. UTC | #2
On 01/13/14 00:34, Jakub Jelinek wrote:
> On Sun, Jan 12, 2014 at 11:21:59PM -0700, Jeff Law wrote:
>> --- a/gcc/ree.c
>> +++ b/gcc/ree.c
>> @@ -297,6 +297,13 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
>>     else
>>       new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
>>
>> +  /* We're going to be widening the result of DEF_INSN, ensure that doing so
>> +     doesn't change the number of hard registers needed for the result.  */
>> +  if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
>> +      != HARD_REGNO_NREGS (REGNO (SET_SRC (*orig_set)),
>
> Note you can use orig_src instead of SET_SRC (*orig_set) here.
Yea.  A little manual CSE never hurts.  Fixed.


>
>> +			   GET_MODE (SET_DEST (*orig_set))))
>> +	return false;
>> +
>>     /* Merge constants by directly moving the constant into the register under
>>        some conditions.  Recall that RTL constants are sign-extended.  */
>>     if (GET_CODE (orig_src) == CONST_INT
>
> Are you sure the above is needed even for the
> REGNO (new_reg) == REGNO (SET_DEST (*orig_set))
> && REGNO (new_reg) == REGNO (orig_src) case?
It wasn't clear when I was reviewing the code, so I took the 
conservative approach of always rejecting when the # hard regs changes 
as a result of widening.

>
> (set (reg:SI 3) (something:SI))
> (set (reg:SI 2) (expression:SI))	// def_insn
> (use (reg:SI 3))
> (set (reg:DI 3) (sign_extend:DI (reg:SI 2)))
>
> So, perhaps if we wanted to handle the HARD_REGNO_NREGS != HARD_REGNO_NREGS
> case when all 3 REGNOs are the same, we'd need to limit it to the case where
> cand->insn and curr_insn are in the same bb, DF_INSN_LUID of curr_insn
> is smaller than DF_INSN_LUID of cand->insn and the extra hard regs aren't
> used between the two.  Perhaps not worth it?
I doubt it's worth it.  A size change here is pretty unusual.

>
> BTW, I'm surprised to hear that it triggers in the testsuite already (for
> the 3 REGNOs the same case, or different?), is that on x86_64 or i?86?
> Do you have an example?  I'm surprised that we'd have post reload a pattern
> that extends into multiple hard registers.
There were only a couple on x86_64, both related to handling vector 
regs.  They would actually fail later, but they tripped the changing 
size check.

Here's one example:

(insn 16 15 17 2 (set (reg:V4SI 21 xmm0 [99])
         (vec_select:V4SI (reg:V8SI 22 xmm1 [orig:85 vect__3.443 ] [85])
             (parallel [
                     (const_int 4 [0x4])
                     (const_int 5 [0x5])
                     (const_int 6 [0x6])
                     (const_int 7 [0x7])
                 ]))) j.c:26 1926 {vec_extract_hi_v8si}
      (nil))
(insn 17 16 18 2 (set (reg:V4DI 21 xmm0 [orig:98 vect__4.444 ] [98])
         (sign_extend:V4DI (reg:V4SI 21 xmm0 [99]))) j.c:26 2583 
{avx2_sign_extendv4siv4di2}
      (nil))



Changing (reg:V4SI xmm0) to (reg:V4DI xmm0) changes the number of hard 
regs and thus tripped the check.

The transformation would actually fail later when it tried to actually 
combine the extension and vec_select into this insn:

(insn 16 15 17 2 (set (reg:V4DI 21 xmm0)
         (sign_extend:V4DI (vec_select:V4SI (reg:V8SI 22 xmm1 [orig:85 
vect__3.443 ] [85])
                 (parallel [
                         (const_int 4 [0x4])
                         (const_int 5 [0x5])
                         (const_int 6 [0x6])
                         (const_int 7 [0x7])
                     ])))) j.c:26 -1

I guess I could run a -m32 multilib test and see if I can get it to 
trigger in on something that will create a valid insn.


Jeff
Jakub Jelinek Jan. 15, 2014, 5:59 p.m. UTC | #3
On Wed, Jan 15, 2014 at 10:56:35AM -0700, Jeff Law wrote:
> On 01/13/14 00:34, Jakub Jelinek wrote:
> >On Sun, Jan 12, 2014 at 11:21:59PM -0700, Jeff Law wrote:
> >>--- a/gcc/ree.c
> >>+++ b/gcc/ree.c
> >>@@ -297,6 +297,13 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
> >>    else
> >>      new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
> >>
> >>+  /* We're going to be widening the result of DEF_INSN, ensure that doing so
> >>+     doesn't change the number of hard registers needed for the result.  */
> >>+  if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
> >>+      != HARD_REGNO_NREGS (REGNO (SET_SRC (*orig_set)),
> >
> >Note you can use orig_src instead of SET_SRC (*orig_set) here.
> Yea.  A little manual CSE never hurts.  Fixed.

The patch is ok with that fix then.  Thanks.

	Jakub
Eric Botcazou Jan. 16, 2014, 11:52 a.m. UTC | #4
> Yes, like in the attached patch?  OK for the trunk?

Unfortunately this broke again bootstrap with RTL checking enabled on x86-64:

/home/eric/svn/gcc/libgcc/libgcc2.c: In function '__negdi2':
/home/eric/svn/gcc/libgcc/libgcc2.c:71:1: internal compiler error: RTL check: 
expected code 'reg', have 'ne' in rhs_regno, at rtl.h:1125
 }
 ^
0x9cb813 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, 
char const*)
        /home/eric/svn/gcc/gcc/rtl.c:773
0x146a6ab rhs_regno
        /home/eric/svn/gcc/gcc/rtl.h:1125
0x146a6ab combine_set_extension
        /home/eric/svn/gcc/gcc/ree.c:303
0x146a6ab merge_def_and_ext
        /home/eric/svn/gcc/gcc/ree.c:658
0x146bfbd combine_reaching_defs
        /home/eric/svn/gcc/gcc/ree.c:786
0x146d149 find_and_remove_re
        /home/eric/svn/gcc/gcc/ree.c:993
0x146d149 rest_of_handle_ree
        /home/eric/svn/gcc/gcc/ree.c:1064
0x146d149 execute
        /home/eric/svn/gcc/gcc/ree.c:1103
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
make[5]: *** [_negdi2.o] Error 1
Jeff Law Jan. 16, 2014, 4:43 p.m. UTC | #5
On 01/16/14 04:52, Eric Botcazou wrote:
>> Yes, like in the attached patch?  OK for the trunk?
>
> Unfortunately this broke again bootstrap with RTL checking enabled on x86-64:
>
> /home/eric/svn/gcc/libgcc/libgcc2.c: In function '__negdi2':
> /home/eric/svn/gcc/libgcc/libgcc2.c:71:1: internal compiler error: RTL check:
> expected code 'reg', have 'ne' in rhs_regno, at rtl.h:1125
I'm on it.  Looks like a stupidity leak on my part.

jeff
Jeff Law Jan. 16, 2014, 6:56 p.m. UTC | #6
On 01/16/14 04:52, Eric Botcazou wrote:
>> Yes, like in the attached patch?  OK for the trunk?
>
> Unfortunately this broke again bootstrap with RTL checking enabled on x86-64:
>
> /home/eric/svn/gcc/libgcc/libgcc2.c: In function '__negdi2':
> /home/eric/svn/gcc/libgcc/libgcc2.c:71:1: internal compiler error: RTL check:
> expected code 'reg', have 'ne' in rhs_regno, at rtl.h:1125
>   }
Patch testing in progress.  Should be wrapped up before I finish today.

jeff
diff mbox

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c554609..a82e23c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -5,6 +5,13 @@ 
 	occurs before the extension when optimizing extensions with
 	different source and destination hard registers.
 
+	PR tree-optimization/59747
+	* ree.c (find_and_remove_re): Properly handle case where a second
+	eliminated extension requires widening a copy created for elimination
+	of a prior extension.
+	(combine_set_extension): Ensure that the number of hard regs needed
+	for a destination register does not change when we widen it.
+
 2014-01-10  Jan Hubicka  <jh@suse.cz>
 
 	PR ipa/58585
diff --git a/gcc/ree.c b/gcc/ree.c
index 63cc8cc..3ee97cd 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -297,6 +297,13 @@  combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set)
   else
     new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
 
+  /* We're going to be widening the result of DEF_INSN, ensure that doing so
+     doesn't change the number of hard registers needed for the result.  */
+  if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
+      != HARD_REGNO_NREGS (REGNO (SET_SRC (*orig_set)),
+			   GET_MODE (SET_DEST (*orig_set))))
+	return false;
+
   /* Merge constants by directly moving the constant into the register under
      some conditions.  Recall that RTL constants are sign-extended.  */
   if (GET_CODE (orig_src) == CONST_INT
@@ -1017,11 +1024,20 @@  find_and_remove_re (void)
   for (unsigned int i = 0; i < reinsn_copy_list.length (); i += 2)
     {
       rtx curr_insn = reinsn_copy_list[i];
+      rtx def_insn = reinsn_copy_list[i + 1];
+
+      /* Use the mode of the destination of the defining insn
+	 for the mode of the copy.  This is necessary if the
+	 defining insn was used to eliminate a second extension
+	 that was wider than the first.  */
+      rtx sub_rtx = *get_sub_rtx (def_insn);
       rtx pat = PATTERN (curr_insn);
-      rtx new_reg = gen_rtx_REG (GET_MODE (SET_DEST (pat)),
+      rtx new_dst = gen_rtx_REG (GET_MODE (SET_DEST (sub_rtx)),
 				 REGNO (XEXP (SET_SRC (pat), 0)));
-      rtx set = gen_rtx_SET (VOIDmode, new_reg, SET_DEST (pat));
-      emit_insn_after (set, reinsn_copy_list[i + 1]);
+      rtx new_src = gen_rtx_REG (GET_MODE (SET_DEST (sub_rtx)),
+				 REGNO (SET_DEST (pat)));
+      rtx set = gen_rtx_SET (VOIDmode, new_dst, new_src);
+      emit_insn_after (set, def_insn);
     }
 
   /* Delete all useless extensions here in one sweep.  */
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index f40d56e..a603952 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -3,6 +3,9 @@ 
 	PR middle-end/59743
 	* gcc.c-torture/compile/pr59743.c: New test.
 
+	PR tree-optimization/59747
+	* gcc.c-torture/execute/pr59747.c: New test.
+
 2014-01-10  Jan Hubicka  <jh@suse.cz>
 
 	PR ipa/58585
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr59747.c b/gcc/testsuite/gcc.c-torture/execute/pr59747.c
new file mode 100644
index 0000000..d45a908
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr59747.c
@@ -0,0 +1,27 @@ 
+extern void abort (void);
+extern void exit (int);
+
+int a[6], b, c = 1, d;
+short e;
+
+int __attribute__ ((noinline))
+fn1 (int p)
+{
+  b = a[p];
+}
+
+int
+main ()
+{
+  if (sizeof (long long) != 8)
+    exit (0);
+
+  a[0] = 1;
+  if (c)
+    e--;
+  d = e;
+  long long f = e;
+  if (fn1 ((f >> 56) & 1) != 0)
+    abort ();
+  exit (0);
+}