
[PING] Fix PR rtl-optimization/pr60663

Message ID 20140410161004.GG1817@tucnak.redhat.com
State New

Commit Message

Jakub Jelinek April 10, 2014, 4:10 p.m. UTC
On Tue, Apr 01, 2014 at 11:41:12AM +0800, Zhenqiang Chen wrote:
> Ping?
> 
> Bootstrapped with no make check regressions on X86-64.
> 
> Bootstrapped on ARM. The ARM regression test shows some new PASSes and
> FAILs in the debug info checks for gcc.dg/guality/pr36728-1.c and
> gcc.dg/guality/pr36728-2.c, since the register allocation result differs
> with the patch. There are no real new FAILs due to the patch.

> > --- a/gcc/cse.c
> > +++ b/gcc/cse.c
> > @@ -4280,6 +4280,19 @@ find_sets_in_insn (rtx insn, struct set **psets)
> >                 ;
> >               else if (GET_CODE (SET_SRC (y)) == CALL)
> >                 ;
> > +             else if (GET_CODE (SET_SRC (y)) == ASM_OPERANDS)
> > +               {
> > +                 if (i + 1 < lim)
> > +                   {
> > +                     rtx n = XVECEXP (x, 0, i + 1);
> > +                 /* For inline asm with multiple outputs, we cannot
> > +                        handle each SET separately.  See PR60663.  */
> > +                     if (GET_CODE (n) == SET
> > +                         && GET_CODE (SET_SRC (n)) == ASM_OPERANDS)
> > +                       break;
> > +                   }
> > +                 sets[n_sets++].rtl = y;
> > +               }
> >               else
> >                 sets[n_sets++].rtl = y;
> >             }

This doesn't look like a correct fix.  First of all, it will not handle
many of the cases where we should not break the inline asm apart,
e.g. if you have:
int
foo (void)
{
  unsigned i;
  asm ("%0" : "=r" (i) : : "r5");
  return i;
}
then it will still happily drop the clobber on the floor.
Moreover, e.g. volatile asm or asm goto, which is an even stronger reason
not to fiddle with the insn too much, isn't handled by not adding the sets
in find_sets_in_insn, but rather by just setting the src_volatile flag.
So, I'd say a better fix than this is something like the following patch
(untested yet, but it fixes the testcase).

Or, fix up the insane arm costs for ASM_OPERANDS:
    case ASM_OPERANDS:
      /* Just a guess.  Cost one insn per input.  */
      *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x));
      return true;
I don't think this heuristic is even close to reality most of the
time; more importantly, for no inputs, just outputs and clobbers, it means
*cost = 0; and that is why ARM is supposedly the only target now where CSE
thinks it is worthwhile to break all inline asms without inputs apart.
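
A minimal sketch of one possible adjustment, untested and purely illustrative,
would be to keep the per-input guess but add a floor of one insn, so an asm
with only outputs and clobbers never looks free:

    case ASM_OPERANDS:
      /* Sketch: cost one insn per input, but never less than one insn,
         so an asm with only outputs and clobbers does not look free.  */
      *cost = COSTS_N_INSNS (MAX (1, ASM_OPERANDS_INPUT_LENGTH (x)));
      return true;

The exact heuristic is debatable; the point is only that an ASM_OPERANDS
PARALLEL should not be reported as having zero cost.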

2014-04-10  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/60663
	* cse.c (cse_insn): Set src_volatile on ASM_OPERANDS in
	PARALLEL.

	* gcc.target/arm/pr60663.c: New test.



	Jakub

Comments

Richard Henderson April 10, 2014, 7:27 p.m. UTC | #1
On 04/10/2014 09:10 AM, Jakub Jelinek wrote:
> 2014-04-10  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR rtl-optimization/60663
> 	* cse.c (cse_insn): Set src_volatile on ASM_OPERANDS in
> 	PARALLEL.
> 
> 	* gcc.target/arm/pr60663.c: New test.

Ok if it passes.

But you're right that the ARM backend needs that rtx_costs change too.


r~
Zhenqiang Chen April 11, 2014, 9:19 a.m. UTC | #2
On 11 April 2014 00:10, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Apr 01, 2014 at 11:41:12AM +0800, Zhenqiang Chen wrote:
>> Ping?
>>
>> Bootstrapped with no make check regressions on X86-64.
>>
>> Bootstrapped on ARM. The ARM regression test shows some new PASSes and
>> FAILs in the debug info checks for gcc.dg/guality/pr36728-1.c and
>> gcc.dg/guality/pr36728-2.c, since the register allocation result differs
>> with the patch. There are no real new FAILs due to the patch.
>
>> > --- a/gcc/cse.c
>> > +++ b/gcc/cse.c
>> > @@ -4280,6 +4280,19 @@ find_sets_in_insn (rtx insn, struct set **psets)
>> >                 ;
>> >               else if (GET_CODE (SET_SRC (y)) == CALL)
>> >                 ;
>> > +             else if (GET_CODE (SET_SRC (y)) == ASM_OPERANDS)
>> > +               {
>> > +                 if (i + 1 < lim)
>> > +                   {
>> > +                     rtx n = XVECEXP (x, 0, i + 1);
>> > +                 /* For inline asm with multiple outputs, we cannot
>> > +                        handle each SET separately.  See PR60663.  */
>> > +                     if (GET_CODE (n) == SET
>> > +                         && GET_CODE (SET_SRC (n)) == ASM_OPERANDS)
>> > +                       break;
>> > +                   }
>> > +                 sets[n_sets++].rtl = y;
>> > +               }
>> >               else
>> >                 sets[n_sets++].rtl = y;
>> >             }
>
> This doesn't look like a correct fix.  First of all, it will not handle
> many of the cases where we should not break the inline asm apart,
> e.g. if you have:
> int
> foo (void)
> {
>   unsigned i;
>   asm ("%0" : "=r" (i) : : "r5");
>   return i;
> }
> then it will still happily drop the clobber on the floor.
> Moreover, e.g. volatile asm or asm goto, which is an even stronger reason
> not to fiddle with the insn too much, isn't handled by not adding the sets
> in find_sets_in_insn, but rather by just setting the src_volatile flag.
> So, I'd say a better fix than this is something like the following patch
> (untested yet, but it fixes the testcase).

Thanks. The patch fixes the issue. I tested it on X86-64 and ARM;
there are no regressions.

> Or, fix up the insane arm costs for ASM_OPERANDS:
>     case ASM_OPERANDS:
>       /* Just a guess.  Cost one insn per input.  */
>       *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x));
>       return true;
> I don't think this heuristic is even close to reality most of the
> time; more importantly, for no inputs, just outputs and clobbers, it means
> *cost = 0; and that is why ARM is supposedly the only target now where CSE
> thinks it is worthwhile to break all inline asms without inputs apart.

I will send a patch to discuss this for the ARM backend.

Thanks!
-Zhenqiang

> 2014-04-10  Jakub Jelinek  <jakub@redhat.com>
>
>         PR rtl-optimization/60663
>         * cse.c (cse_insn): Set src_volatile on ASM_OPERANDS in
>         PARALLEL.
>
>         * gcc.target/arm/pr60663.c: New test.
>
> --- gcc/cse.c.jj        2014-03-12 10:13:41.000000000 +0100
> +++ gcc/cse.c   2014-04-10 17:21:27.517330918 +0200
> @@ -4642,6 +4642,13 @@ cse_insn (rtx insn)
>           && REGNO (dest) >= FIRST_PSEUDO_REGISTER)
>         sets[i].src_volatile = 1;
>
> +      /* Also do not record the result of a non-volatile inline asm with
> +        more than one result or with clobbers; we do not want CSE to
> +        break the inline asm apart.  */
> +      else if (GET_CODE (src) == ASM_OPERANDS
> +              && GET_CODE (x) == PARALLEL)
> +       sets[i].src_volatile = 1;
> +
>  #if 0
>        /* It is no longer clear why we used to do this, but it doesn't
>          appear to still be needed.  So let's try without it since this
> --- gcc/testsuite/gcc.target/arm/pr60663.c.jj   2014-04-10 17:30:04.392608591 +0200
> +++ gcc/testsuite/gcc.target/arm/pr60663.c      2014-04-10 17:29:25.000000000 +0200
> @@ -0,0 +1,11 @@
> +/* PR rtl-optimization/60663 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=armv7-a" } */
> +
> +int
> +foo (void)
> +{
> +  unsigned i, j;
> +  asm ("%0 %1" : "=r" (i), "=r" (j));
> +  return i;
> +}
>
>
>         Jakub
Jakub Jelinek April 11, 2014, 9:30 a.m. UTC | #3
On Fri, Apr 11, 2014 at 05:19:59PM +0800, Zhenqiang Chen wrote:
> > Or, fix up the insane arm costs for ASM_OPERANDS:
> >     case ASM_OPERANDS:
> >       /* Just a guess.  Cost one insn per input.  */
> >       *cost = COSTS_N_INSNS (ASM_OPERANDS_INPUT_LENGTH (x));
> >       return true;
> > I don't think this heuristic is even close to reality most of the
> > time; more importantly, for no inputs, just outputs and clobbers, it means
> > *cost = 0; and that is why ARM is supposedly the only target now where CSE
> > thinks it is worthwhile to break all inline asms without inputs apart.
> 
> I will send a patch to discuss this for the ARM backend.

AFAIK Kyrill is already working on such a patch, but it was agreed to put
it into stage1 only and eventually backport it for 4.9.1.

	Jakub

Patch

--- gcc/cse.c.jj	2014-03-12 10:13:41.000000000 +0100
+++ gcc/cse.c	2014-04-10 17:21:27.517330918 +0200
@@ -4642,6 +4642,13 @@  cse_insn (rtx insn)
 	  && REGNO (dest) >= FIRST_PSEUDO_REGISTER)
 	sets[i].src_volatile = 1;
 
+      /* Also do not record the result of a non-volatile inline asm with
+	 more than one result or with clobbers; we do not want CSE to
+	 break the inline asm apart.  */
+      else if (GET_CODE (src) == ASM_OPERANDS
+	       && GET_CODE (x) == PARALLEL)
+	sets[i].src_volatile = 1;
+
 #if 0
       /* It is no longer clear why we used to do this, but it doesn't
 	 appear to still be needed.  So let's try without it since this
--- gcc/testsuite/gcc.target/arm/pr60663.c.jj	2014-04-10 17:30:04.392608591 +0200
+++ gcc/testsuite/gcc.target/arm/pr60663.c	2014-04-10 17:29:25.000000000 +0200
@@ -0,0 +1,11 @@ 
+/* PR rtl-optimization/60663 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (void)
+{
+  unsigned i, j;
+  asm ("%0 %1" : "=r" (i), "=r" (j));
+  return i;
+}
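
A possible way to check the fix locally (the cross-compiler name below is an
assumption, not something taken from the thread) is to compile the new test
with an ARM cross compiler and look at the first CSE dump:

    arm-linux-gnueabi-gcc -O2 -march=armv7-a -fdump-rtl-cse1 -S pr60663.c

In the resulting *.cse1 RTL dump, the asm should remain a single PARALLEL
containing both SETs, i.e. CSE no longer splits the inline asm apart.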