Fix PR 61225

> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Tuesday, December 09, 2014 5:29 AM
> To: Zhenqiang Chen
> Cc: Steven Bosscher; gcc-patches@gcc.gnu.org; Jakub Jelinek
> Subject: Re: [PATCH] Fix PR 61225
> 
> On 12/04/14 01:43, Zhenqiang Chen wrote:
> >>> > >
> >>> > >          Part of PR rtl-optimization/61225
> >>> > >          * combine.c (refer_same_reg_p): New function.
> >>> > >          (combine_instructions): Handle I1 -> I2 -> I3; I2 ->
insn.
> >>> > >          (try_combine): Add one more parameter TO_COMBINED_INSN,
> >>> > > which
> >> >is
> >>> > >          used to create a new insn parallel (TO_COMBINED_INSN,
I3).
> >>> > >
> >>> > >testsuite/ChangeLog:
> >>> > >2014-08-04  Zhenqiang Chen<zhenqiang.chen@linaro.org>
> >>> > >
> >>> > >          * gcc.target/i386/pr61225.c: New test.
> THanks for the updates and clarifications.  Just a few minor things and
while
> it's a bit of a hack, I'll approve:
> 
> 
> 
> > +
> > +/* A is a compare (reg1, 0) and B is SINGLE_SET which SET_SRC is reg2.
> > +   It returns TRUE, if reg1 == reg2, and no other refer of reg1
> > +   except A and B.  */
> > +
> > +static bool
> > +refer_same_reg_p (rtx_insn *a, rtx_insn *b)
> > +{
> > +  rtx seta = single_set (a);
> > +  rtx setb = single_set (b);
> > +
> > +  if (BLOCK_FOR_INSN (a) != BLOCK_FOR_INSN (b)
> > +      || !seta
> > +      || !setb)
> > +    return false;
> > 
> > +  if (GET_CODE (SET_SRC (seta)) != COMPARE
> > +      || GET_MODE_CLASS (GET_MODE (SET_DEST (seta))) != MODE_CC
> > +      || !REG_P (XEXP (SET_SRC (seta), 0))
> > +      || XEXP (SET_SRC (seta), 1) != const0_rtx
> > +      || !REG_P (SET_SRC (setb))
> > +      || REGNO (SET_SRC (setb)) != REGNO (XEXP (SET_SRC (seta), 0)))
> > +    return false;
> Do you need to verify SETA and SETB satisfy single_set?  Or has that
> already been done elsewhere?

A is NEXT_INSN (insn)
B is prev_nonnote_nondebug_insn (insn),

For I1 -> I2 -> B; I2 -> A;
LOG_LINK can make sure I1 and I2 are single_set, but not A and B. And I did
found codes in function try_combine, which can make sure B (or I3) is
single_set. 

So I think the check can skip failed cases at early stage.

> The name refer_same_reg_p seems wrong -- your function is verifying the
> underlying RTL store as well as the existence of a a dependency between
> the insns.  Can you try to come up with a better name?

Change it to can_reuse_cc_set_p.

> Please use CONST0_RTX (mode)  IIRC that'll allow this to work regardless
> of the size of the modes relative to the host word size.

Updated. 

> > +
> > +  if (DF_REG_USE_COUNT (REGNO (SET_SRC (setb))) > 2)
> > +    {
> > +      df_ref use;
> > +      rtx insn;
> > +      unsigned int i = REGNO (SET_SRC (setb));
> > +
> > +      for (use = DF_REG_USE_CHAIN (i); use; use = DF_REF_NEXT_REG
(use))
> > +        {
> > +	  insn = DF_REF_INSN (use);
> > +	  if (insn != a && insn != b && !(NOTE_P (insn) || DEBUG_INSN_P
> (insn)))
> > +	    return false;
> > +	}
> > +    }
> > +
> > +  return true;
> > +}
> Is this fragment really needed?  Does it ever trigger?  I'd think that
> for > 2 uses punting would be fine.  Do we really commonly have cases
> with > 2 uses, but where they're all in SETA and SETB?

The check is to make sure the correctness.  Here is a case,

int 
f1 (int *x)
{
  int t = --*x;
  if (!t)
    foo (x);
  return t;
}

  _4 = *x_3(D);
  _5 = _4 + -1;
  *x_3(D) = _5;
  # DEBUG t => _5
  if (_5 == 0)
   ...
  <bb 4>:
  return _5;

"_5" is used in "return _5". So we can not remove "_5 = _4 + -1".

> >   		  }
> >   	      }
> >
> > +	  /* Try to combine a compare insn that sets CC
> > +	     with a preceding insn that can set CC, and maybe with its
> > +	     logical predecessor as well.
> > +	     We need this special code because data flow connections
> > +	     do not get entered in LOG_LINKS.  */
> > +	  if ((prev = prev_nonnote_nondebug_insn (insn)) != NULL_RTX
> > +	      && refer_same_reg_p (insn, prev)
> > +	      && max_combine >= 4)
> > +	    {
> > +		struct insn_link *next1;
> > +		FOR_EACH_LOG_LINK (next1, prev)
> > +		  {
> > +		    rtx_insn *link1 = next1->insn;
> > +		    if (NOTE_P (link1))
> > +		      continue;
> > +		    /* I1 -> I2 -> I3; I2 -> insn;
> > +		       output parallel (insn, I3).  */
> > +		    FOR_EACH_LOG_LINK (nextlinks, link1)
> > +		      if ((next = try_combine (prev, link1,
> > +					       nextlinks->insn, NULL,
> > +					       &new_direct_jump_p,
> > +					       last_combined_insn, insn)) !=
0)
> > +
> > +			{
> > +			  delete_insn (insn);
> > +			  insn = next;
> > +			  statistics_counter_event (cfun, "four-insn
combine",
> 1);
> > +			  goto retry;
> > +			}
> > +		    /* I2 -> I3; I2 -> insn
> > +		       output next = parallel (insn, I3).  */
> > +		    if ((next = try_combine (prev, link1,
> > +					     NULL, NULL,
> > +					     &new_direct_jump_p,
> > +					     last_combined_insn, insn)) !=
0)
> > +
> > +		      {
> > +			delete_insn (insn);
> > +			insn = next;
> > +			statistics_counter_event (cfun, "three-insn
combine",
> 1);
> > +			goto retry;
> > +		      }
> > +		  }
> > +	    }
> So you've got two new combine cases here, but I think the testcase only
> tests one of them.  Can you include a testcase for both of hte major
> paths above (I1->I2->I3; I2->insn and I2->I3; I2->INSN)

pr61225.c is the case to cover I1->I2->I3; I2->insn.

For I2 -> I3; I2 -> insn, I tried my test cases and found peephole2 can also
handle them. So I removed the code from the patch.

Here is the final patch.
Bootstrap and no make check regression on X86-64.

ChangeLog:
2014-11-09  Zhenqiang Chen  <zhenqiang.chen@linaro.org>

	Part of PR rtl-optimization/61225
	* combine.c (can_reuse_cc_set_p): New function.
	(combine_instructions): Handle I1 -> I2 -> I3; I2 -> insn.
	(try_combine): Add one more parameter TO_COMBINED_INSN, which
	is used to create a new insn parallel (TO_COMBINED_INSN, I3).

testsuite/ChangeLog:
2014-11-09  Zhenqiang Chen<zhenqiang.chen@linaro.org>

	* gcc.target/i386/pr61225.c: New test.

+/* { dg-final { cleanup-rtl-dump "combine" } } */

Fix PR 61225

Commit Message

Comments

Patch