diff mbox

Change replace_rtx if from is a REG (PR target/70245, take 2)

Message ID 20160317111658.GI3017@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek March 17, 2016, 11:16 a.m. UTC
Hi!

On Wed, Mar 16, 2016 at 05:48:33PM -0500, Segher Boessenkool wrote:
> On Wed, Mar 16, 2016 at 01:59:29PM +0100, Bernd Schmidt wrote:
> > On 03/16/2016 01:22 PM, Jakub Jelinek wrote:
> > >So, this is what we've converged to on IRC and passed bootstrap/regtest
> > >on x86_64-linux and i686-linux.  Is this ok for trunk?
> > 
> > The explanation was a bit confusing at first, but I think this looks 
> > reasonable. The assert worries me, but triggering it would indicate a 
> > potential miscompilation, so I think it is preferrable to have it.
> 
> This caused PR70261.  Maybe it's just the assert.

Unfortunately it is not just the assert, and it is on various targets.
On powerpc64, one issue is during combiner, where we most likely don't
strictly care if we replace all, but can't replace one with bad mode and
can run into that case, and then during prologue threading, where
the rs6000 backend for whatever strange reason I haven't understood
really wants pointer equality instead of REGNO comparison (even when the
modes match), one (reg:DI 12) should be replaced, another (reg:DI 12)
should not.

Thus, I've reverted the patch (kept the testcase), and after some
discussions on IRC bootstrapped/regtested on x86_64-linux and i686-linux
following version, which right now should change behavior just for the i?86
case and nothing else, so shouldn't break other targets.
I believe at least the epiphany and sh peepholes that use replace_rtx will
want similar treatment, but will leave testing of that to their maintainers.

Ok for trunk?

2016-03-17  Jakub Jelinek  <jakub@redhat.com>

	PR target/70245
	* rtl.h (replace_rtx): Add ALL_REGS argument.
	* rtlanal.c (replace_rtx): Likewise.  If true, use REGNO
	equality and assert mode is the same, instead of just rtx pointer
	equality.
	* config/i386/i386.md (mov + arithmetics with load peephole): Pass
	true as ALL_REGS argument to replace_rtx.



	Jakub

Comments

Bernd Schmidt March 17, 2016, 11:26 a.m. UTC | #1
On 03/17/2016 12:16 PM, Jakub Jelinek wrote:

> Thus, I've reverted the patch (kept the testcase), and after some
> discussions on IRC bootstrapped/regtested on x86_64-linux and i686-linux
> following version, which right now should change behavior just for the i?86
> case and nothing else, so shouldn't break other targets.
> I believe at least the epiphany and sh peepholes that use replace_rtx will
> want similar treatment, but will leave testing of that to their maintainers.
>
> Ok for trunk?

Ok.


Bernd
Alan Modra March 17, 2016, 12:37 p.m. UTC | #2
On Thu, Mar 17, 2016 at 12:16:58PM +0100, Jakub Jelinek wrote:
> the rs6000 backend for whatever strange reason I haven't understood
> really wants pointer equality instead of REGNO comparison (even when the
> modes match), one (reg:DI 12) should be replaced, another (reg:DI 12)
> should not.

By the look of what you posted in the bugzilla, the pattern is the
parallel emitted by rs6000_emit_savres_rtx.  In that parallel, the
stack memory locations for register saves are described relative to
whatever frame_reg_rtx is in use, which may be r12.
rs6000_frame_related wants to translate the frame_reg_rtx into stack
pointer plus offset for debug info.

The parallel matches save_gpregs_<mode>_r12 and similar in rs6000.md,
which emit a call to an out-of-line register save function.  This
function actually takes r12 as a parameter, hence the (use (reg:P 12))
in the pattern.

rs6000_frame_related probably should just be replacing individual
SETs in the parallel using simplify_replace_rtx.  Especially since
after calling replace_rtx, it already iterates over them to simplify.
Jakub Jelinek March 17, 2016, 12:44 p.m. UTC | #3
On Thu, Mar 17, 2016 at 11:07:03PM +1030, Alan Modra wrote:
> On Thu, Mar 17, 2016 at 12:16:58PM +0100, Jakub Jelinek wrote:
> > the rs6000 backend for whatever strange reason I haven't understood
> > really wants pointer equality instead of REGNO comparison (even when the
> > modes match), one (reg:DI 12) should be replaced, another (reg:DI 12)
> > should not.
> 
> By the look of what you posted in the bugzilla, the pattern is the
> parallel emitted by rs6000_emit_savres_rtx.  In that parallel, the
> stack memory locations for register saves are described relative to
> whatever frame_reg_rtx is in use, which may be r12.
> rs6000_frame_related wants to translate the frame_reg_rtx into stack
> pointer plus offset for debug info.
> 
> The parallel matches save_gpregs_<mode>_r12 and similar in rs6000.md,
> which emit a call to an out-of-line register save function.  This
> function actually takes r12 as a parameter, hence the (use (reg:P 12))
> in the pattern.
> 
> rs6000_frame_related probably should just be replacing individual
> SETs in the parallel using simplify_replace_rtx.  Especially since
> after calling replace_rtx, it already iterates over them to simplify.

That was one thing, and another thing was during combiner, where it replaced
just subset of the registers and left others in (e.g. with different mode).

And on aarch64 trying to replace flags reg with CC_NZ mode when there is CC
mode (or vice versa?).  I bet for GCC 7 we want to analyze all uses of
replace_rtx for what exactly we want.

	Jakub
Jakub Jelinek March 17, 2016, 1:42 p.m. UTC | #4
On Thu, Mar 17, 2016 at 01:32:15PM +0000, Joern Rennecke wrote:
> > Note that during register allocation / reload, REGNO equivalence is generally wrong,
> as it fails to distinguish between the frame pointer and a hard register that has the
> same regno as the frame pointer which has been created as a register allocation for
> a pseudo register when the frame pointer is being eliminated.
> lra_get_elimination_hard_regno actually gets this wrong (as well as the interfaces -
> there is no rtx value available, so it'd be very hard to fix), which means that targets
> can't use safely lra with frame pointer elimination and matching constraints, unless
> they use a separate soft frame pointer.
> If you want to decide automagically if you'd run into an ambiguity with register
> elimination, you have to check if either side of the comparison appears in
> regs_eliminate_1[i].from for 0 <= i < ARRAY_SIZE(regs_eliminate_1) .
> Unfortunately, although rarely encountered, it is possible to use the frame pointer
> in a mode other than Pmode, and once cleanup_subreg_operands is done with such
> a reference, an equality check with frame_pointer_rtx will fail.

Sure.  Which is why the trunk now uses the old behavior of replace_rtx
unless you ask for different one through extra argument.

In epiphany.md, I see
(define_peephole2
  [(parallel [(set (match_operand:WMODE 0 "gpr_operand" "")
                   (match_operand:WMODE 1 "" ""))
              (clobber (match_operand 8 "cc_operand"))])
   (match_operand 2 "" "")
   (set (match_operand:WMODE2 3 "gpr_operand" "")
        (match_operand:WMODE2 9 "gpr_operand" ""))
   (set (match_dup 3)
        (if_then_else:WMODE2 (match_operator 5 "proper_comparison_operator"
                               [(match_operand 6 "cc_operand")
                                (match_operand 7 "const0_operand")])
                             (match_operand:WMODE2 4 "nonmemory_operand" "")
                             (match_dup 3)))]
  "REGNO (operands[0]) == REGNO (operands[9])
   && peep2_reg_dead_p (3, operands[0])
   && !reg_set_p (operands[0], operands[2])
   && !reg_set_p (operands[3], operands[2])
   && !reg_overlap_mentioned_p (operands[3], operands[2])"
  [(parallel [(set (match_dup 10) (match_dup 1))
              (clobber (match_dup 8))])
   (match_dup 2)
   (set (match_dup 3)
        (if_then_else:WMODE2 (match_dup 5) (match_dup 4) (match_dup 3)))]
{
  operands[10] = simplify_gen_subreg (<WMODE:MODE>mode, operands[3],
                                      <WMODE2:MODE>mode, 0);
  replace_rtx (operands[2], operands[9], operands[3]);
  replace_rtx (operands[2], operands[0], operands[10]);
  gcc_assert (!reg_overlap_mentioned_p (operands[0], operands[2]));
})
and peephole2s are after RA, and I bet you rely on the replacements to be done no matter
if there is pointer equality or just REGNO equality (and, can there be different modes
or just one?).  sh.md has something similar.

	Jakub
Oleg Endo March 19, 2016, 1:45 a.m. UTC | #5
On Thu, 2016-03-17 at 12:16 +0100, Jakub Jelinek wrote:

> Thus, I've reverted the patch (kept the testcase), and after some
> discussions on IRC bootstrapped/regtested on x86_64-linux and i686
> -linux following version, which right now should change behavior just 
> for the i?86 case and nothing else, so shouldn't break other targets.
> I believe at least the epiphany and sh peepholes that use replace_rtx 
> will want similar treatment, but will leave testing of that to their
> maintainers.

As far as I can see, replace_rtx is used only in one pattern in sh.md:

(define_peephole2
  [(set (match_operand:SI 0 "general_movdst_operand" "")
	(match_operand:SI 1 "arith_reg_or_0_operand" ""))
   (set (match_operand:SI 2 "arith_reg_dest" "")
	(if_then_else:SI (match_operator 4 "equality_comparison_operator"
			   [(match_operand:SI 3 "arith_reg_operand" "")
			    (const_int 0)])
	 (match_dup 0)
	 (match_dup 2)))]
  "TARGET_SHMEDIA && peep2_reg_dead_p (2, operands[0])
   && (!REG_P (operands[1]) || GENERAL_REGISTER_P (REGNO (operands[1])))"
  [(set (match_dup 2)
	(if_then_else:SI (match_dup 4) (match_dup 1) (match_dup 2)))]
{
  replace_rtx (operands[4], operands[0], operands[1]);
})

This pattern is only for SH5/SH64, which I'm planning on removing after
GCC 6.  So no concerns or further actions required here.

Cheers,
Oleg
diff mbox

Patch

--- gcc/rtl.h.jj	2016-03-16 10:36:36.000000000 +0100
+++ gcc/rtl.h	2016-03-17 10:05:53.144242638 +0100
@@ -3011,7 +3011,7 @@  extern bool can_nonlocal_goto (const rtx
 extern void copy_reg_eh_region_note_forward (rtx, rtx_insn *, rtx);
 extern void copy_reg_eh_region_note_backward (rtx, rtx_insn *, rtx);
 extern int inequality_comparisons_p (const_rtx);
-extern rtx replace_rtx (rtx, rtx, rtx);
+extern rtx replace_rtx (rtx, rtx, rtx, bool = false);
 extern void replace_label (rtx *, rtx, rtx, bool);
 extern void replace_label_in_insn (rtx_insn *, rtx, rtx, bool);
 extern bool rtx_referenced_p (const_rtx, const_rtx);
--- gcc/rtlanal.c.jj	2016-03-17 08:59:37.000000000 +0100
+++ gcc/rtlanal.c	2016-03-17 10:16:42.870241836 +0100
@@ -2946,10 +2946,13 @@  inequality_comparisons_p (const_rtx x)
    not enter into CONST_DOUBLE for the replace.
 
    Note that copying is not done so X must not be shared unless all copies
-   are to be modified.  */
+   are to be modified.
+
+   ALL_REGS is true if we want to replace all REGs equal to FROM, not just
+   those pointer-equal ones.  */
 
 rtx
-replace_rtx (rtx x, rtx from, rtx to)
+replace_rtx (rtx x, rtx from, rtx to, bool all_regs)
 {
   int i, j;
   const char *fmt;
@@ -2961,9 +2964,17 @@  replace_rtx (rtx x, rtx from, rtx to)
   if (x == 0)
     return 0;
 
-  if (GET_CODE (x) == SUBREG)
+  if (all_regs
+      && REG_P (x)
+      && REG_P (from)
+      && REGNO (x) == REGNO (from))
+    {
+      gcc_assert (GET_MODE (x) == GET_MODE (from));
+      return to;
+    }
+  else if (GET_CODE (x) == SUBREG)
     {
-      rtx new_rtx = replace_rtx (SUBREG_REG (x), from, to);
+      rtx new_rtx = replace_rtx (SUBREG_REG (x), from, to, all_regs);
 
       if (CONST_INT_P (new_rtx))
 	{
@@ -2979,7 +2990,7 @@  replace_rtx (rtx x, rtx from, rtx to)
     }
   else if (GET_CODE (x) == ZERO_EXTEND)
     {
-      rtx new_rtx = replace_rtx (XEXP (x, 0), from, to);
+      rtx new_rtx = replace_rtx (XEXP (x, 0), from, to, all_regs);
 
       if (CONST_INT_P (new_rtx))
 	{
@@ -2997,10 +3008,11 @@  replace_rtx (rtx x, rtx from, rtx to)
   for (i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; i--)
     {
       if (fmt[i] == 'e')
-	XEXP (x, i) = replace_rtx (XEXP (x, i), from, to);
+	XEXP (x, i) = replace_rtx (XEXP (x, i), from, to, all_regs);
       else if (fmt[i] == 'E')
 	for (j = XVECLEN (x, i) - 1; j >= 0; j--)
-	  XVECEXP (x, i, j) = replace_rtx (XVECEXP (x, i, j), from, to);
+	  XVECEXP (x, i, j) = replace_rtx (XVECEXP (x, i, j),
+					   from, to, all_regs);
     }
 
   return x;
--- gcc/config/i386/i386.md.jj	2016-03-16 10:36:36.000000000 +0100
+++ gcc/config/i386/i386.md	2016-03-17 10:09:01.281636259 +0100
@@ -17885,7 +17885,7 @@  (define_peephole2
    (parallel [(set (match_dup 0)
                    (match_op_dup 3 [(match_dup 0) (match_dup 1)]))
               (clobber (reg:CC FLAGS_REG))])]
-  "operands[4] = replace_rtx (operands[2], operands[0], operands[1]);")
+  "operands[4] = replace_rtx (operands[2], operands[0], operands[1], true);")
 
 (define_peephole2
   [(set (match_operand 0 "register_operand")