diff mbox

[rs6000] Fix PR target/71733, ICE with -mcpu=power9 -mno-vsx

Message ID ed8225d9-e937-30fd-ab80-648f557cc0ff@vnet.ibm.com
State New
Headers show

Commit Message

Peter Bergner July 6, 2016, 2:26 a.m. UTC
The following patch fixes a bug where we do not disable POWER9 vector dform
addressing when we compile for POWER9 but without VSX support.  This manifested
itself with us trying to use dform addressing with altivec loads/stores
which is illegal, leading to an ICE.

This has bootstrapped and regtested with no regessions.  Ok for trunk?

This also affects the FSF 6 branch, ok there too, assuming bootstrap and
regtesting complete cleanly?

Peter

gcc/
	* config/rs6000/rs6000.c (rs6000_option_override_internal): Disable
	-mpower9-dform-vector when disabling -mpower9-vector.

gcc/testsuite/
	* gcc.target/powerpc/pr71733.c: New test.

Comments

David Edelsohn July 6, 2016, 5:53 p.m. UTC | #1
On Tue, Jul 5, 2016 at 10:26 PM, Peter Bergner <bergner@vnet.ibm.com> wrote:
> The following patch fixes a bug where we do not disable POWER9 vector dform
> addressing when we compile for POWER9 but without VSX support.  This manifested
> itself with us trying to use dform addressing with altivec loads/stores
> which is illegal, leading to an ICE.

Peter,

DFORM definitely should be disabled without VSX, but the patch seems
incomplete.  If VSX and DFORM are enabled, and GCC chooses an Altivec
instruction alternative in a pattern, what is to prevent the
generation of a DFORM address?

Thanks, David
Peter Bergner July 6, 2016, 6:02 p.m. UTC | #2
On 7/6/16 12:53 PM, David Edelsohn wrote:
> On Tue, Jul 5, 2016 at 10:26 PM, Peter Bergner <bergner@vnet.ibm.com> wrote:
>> The following patch fixes a bug where we do not disable POWER9 vector dform
>> addressing when we compile for POWER9 but without VSX support.  This manifested
>> itself with us trying to use dform addressing with altivec loads/stores
>> which is illegal, leading to an ICE.
>
> Peter,
>
> DFORM definitely should be disabled without VSX, but the patch seems
> incomplete.  If VSX and DFORM are enabled, and GCC chooses an Altivec
> instruction alternative in a pattern, what is to prevent the
> generation of a DFORM address?

That's a good question.  I'm currently attempting to find out why we
seem to think reg+offset is ok.  With -mcpu=power8 -mno-vsx, we use
reg+reg addressing right from expand.  With -mcpu=power9 -mno-vsx,
we use reg+offset right from expand.  I'll see where the disconnect
is happening.

Peter
Michael Meissner July 6, 2016, 7:19 p.m. UTC | #3
On Tue, Jul 05, 2016 at 09:26:50PM -0500, Peter Bergner wrote:
> The following patch fixes a bug where we do not disable POWER9 vector dform
> addressing when we compile for POWER9 but without VSX support.  This manifested
> itself with us trying to use dform addressing with altivec loads/stores
> which is illegal, leading to an ICE.
> 
> This has bootstrapped and regtested with no regessions.  Ok for trunk?
> 
> This also affects the FSF 6 branch, ok there too, assuming bootstrap and
> regtesting complete cleanly?
> 
> Peter
> 
> gcc/
> 	* config/rs6000/rs6000.c (rs6000_option_override_internal): Disable
> 	-mpower9-dform-vector when disabling -mpower9-vector.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/pr71733.c: New test.
> 
> 
> Index: gcc/config/rs6000/rs6000.c
> ===================================================================
> --- gcc/config/rs6000/rs6000.c	(revision 237945)
> +++ gcc/config/rs6000/rs6000.c	(working copy)
> @@ -4303,7 +4303,8 @@ rs6000_option_override_internal (bool gl
>      {
>        if (rs6000_isa_flags_explicit & OPTION_MASK_P8_VECTOR)
>  	error ("-mpower9-vector requires -mpower8-vector");
> -      rs6000_isa_flags &= ~OPTION_MASK_P9_VECTOR;
> +      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR
> +			    | OPTION_MASK_P9_DFORM_VECTOR);
>      }

Note, this should be

+      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR_SCALAR
+			    | OPTION_MASK_P9_DFORM_VECTOR);

However, we probably need to add all of the other options that depend on VSX.
Peter Bergner July 6, 2016, 10:01 p.m. UTC | #4
On 7/6/16 2:19 PM, Michael Meissner wrote:
> On Tue, Jul 05, 2016 at 09:26:50PM -0500, Peter Bergner wrote:
>> -      rs6000_isa_flags &= ~OPTION_MASK_P9_VECTOR;
>> +      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR
>> +			    | OPTION_MASK_P9_DFORM_VECTOR);
>>      }
>
> Note, this should be
>
> +      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR_SCALAR
> +			    | OPTION_MASK_P9_DFORM_VECTOR);

I think you mean the following, since we have to disable -mpower9-vector
too:

-      rs6000_isa_flags &= ~OPTION_MASK_P9_VECTOR;
+      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR
+			    | OPTION_MASK_P9_DFORM_SCALAR
+			    | OPTION_MASK_P9_DFORM_VECTOR);

I had thought about adding the dform scalar flag, but it was already
correctly disabled and I wasn't sure whether we could have the p9
dform scalar without the vector part.  Probably not, so consider
the patch above as the latest.


> However, we probably need to add all of the other options that depend on VSX.

Yes, there is a cascade affect on the disabling of options when you
explicitly disable something.  It'd be nice if this was somehow
automated, where we have some table showing the dependencies of
the options and the compiler just follows the table disabling things
that depend on something that has been disabled.  Could be hard to
make that dependency list though, given how many we have.

Peter
Michael Meissner July 6, 2016, 11:29 p.m. UTC | #5
On Wed, Jul 06, 2016 at 05:01:38PM -0500, Peter Bergner wrote:
> On 7/6/16 2:19 PM, Michael Meissner wrote:
> >On Tue, Jul 05, 2016 at 09:26:50PM -0500, Peter Bergner wrote:
> >>-      rs6000_isa_flags &= ~OPTION_MASK_P9_VECTOR;
> >>+      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR
> >>+			    | OPTION_MASK_P9_DFORM_VECTOR);
> >>     }
> >
> >Note, this should be
> >
> >+      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR_SCALAR
> >+			    | OPTION_MASK_P9_DFORM_VECTOR);
> 
> I think you mean the following, since we have to disable -mpower9-vector
> too:
> 
> -      rs6000_isa_flags &= ~OPTION_MASK_P9_VECTOR;
> +      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR
> +			    | OPTION_MASK_P9_DFORM_SCALAR
> +			    | OPTION_MASK_P9_DFORM_VECTOR);
> 
> I had thought about adding the dform scalar flag, but it was already
> correctly disabled and I wasn't sure whether we could have the p9
> dform scalar without the vector part.  Probably not, so consider
> the patch above as the latest.

Yes, you can have P9 dform scalar without P9 dform vector.

> 
> >However, we probably need to add all of the other options that depend on VSX.
> 
> Yes, there is a cascade affect on the disabling of options when you
> explicitly disable something.  It'd be nice if this was somehow
> automated, where we have some table showing the dependencies of
> the options and the compiler just follows the table disabling things
> that depend on something that has been disabled.  Could be hard to
> make that dependency list though, given how many we have.

Yep.  I'm thinking we need masks in rs6000-cpus.def of the options to turn off
if -mno-vsx (and even worse -mno-altivec).
Peter Bergner July 9, 2016, 2:31 p.m. UTC | #6
On 7/6/16 6:29 PM, Michael Meissner wrote:
> On Wed, Jul 06, 2016 at 05:01:38PM -0500, Peter Bergner wrote:
>> I had thought about adding the dform scalar flag, but it was already
>> correctly disabled and I wasn't sure whether we could have the p9
>> dform scalar without the vector part.  Probably not, so consider
>> the patch above as the latest.
>
> Yes, you can have P9 dform scalar without P9 dform vector.

So you're saying my original patch is the correct change?  Ie:

-      rs6000_isa_flags &= ~OPTION_MASK_P9_VECTOR;
+      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR
+			    | OPTION_MASK_P9_DFORM_VECTOR);


Peter
Peter Bergner July 9, 2016, 3:26 p.m. UTC | #7
On 7/6/16 12:53 PM, David Edelsohn wrote:
> On Tue, Jul 5, 2016 at 10:26 PM, Peter Bergner <bergner@vnet.ibm.com> wrote:
>> The following patch fixes a bug where we do not disable POWER9 vector dform
>> addressing when we compile for POWER9 but without VSX support.  This manifested
>> itself with us trying to use dform addressing with altivec loads/stores
>> which is illegal, leading to an ICE.
> 
> Peter,
> 
> DFORM definitely should be disabled without VSX, but the patch seems
> incomplete.  If VSX and DFORM are enabled, and GCC chooses an Altivec
> instruction alternative in a pattern, what is to prevent the
> generation of a DFORM address?

Adding Uli and Alan to the CC since there are some reload questions. :-)

So I dug into this a little more.  With -mcpu=power9 (ie, no -mno-vsx),
I could not get an altivec pattern generated, without resorting to using
an altivec builtin.  The altivec builtins also seem to force reg+reg
addressing, since hey, that's all they support.  So I couldn't create
a test case to answer your question of what if VSX and DFORM are enabled
and we have an altivec pattern with a dform address.  Maybe there should
be a test case that generates one of the altivec load/store patterns,
but I couldn't do it.

To try and answer your question, I went back to using -mcpu=power9 -mno-vsx
with unpatched trunk (ie, without my patch to disable vector dform when
power9 vector is disabled).  Looking at the following simple load test
case, it seems reload is capable of fixing up dform addresses when they're
used in an altivec pattern:

bergner@genoa:~/gcc/BUGS/sawdey/altivec$ cat vec_load.i
typedef __attribute__((altivec(vector__))) __attribute__((aligned(16))) unsigned char vec_t;
vec_t
foo (vec_t *dst)
{
  return dst[1];
}
bergner@genoa:~/gcc/BUGS/sawdey/altivec$ /home/bergner/gcc/build/gcc-fsf-mainline-debug-3/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-debug-3/gcc -S -O1 -mcpu=power9 -mno-vsx vec_load.i -fdump-rtl-all-all

This gives us the following just before reload:

(insn 11 6 12 2 (set (reg/i:V16QI 79 2)
        (mem:V16QI (plus:DI (reg:DI 3 3 [ dstD.2412 ])
                (const_int 16 [0x10])) [0 MEM[(vec_tD.2411 *)dst_2(D) + 16B]+0 S16 A128])) vec_load.i:6 1209 {*altivec_movv16qi}
     (expr_list:REG_DEAD (reg:DI 3 3 [ dstD.2412 ])
        (nil)))

During reload, we see a reload for input:

Spilling for insn 11.

Reloads for insn # 11
Reload 0: reload_in (DI) = (plus:DI (reg:DI 3 3 [ dstD.2412 ])
                                                    (const_int 16 [0x10]))
	BASE_REGS, RELOAD_FOR_INPUT (opnum = 1), inc by 16
	reload_in_reg: (plus:DI (reg:DI 3 3 [ dstD.2412 ])
                                                    (const_int 16 [0x10]))
	reload_reg_rtx: (reg:DI 3 3)
Insns emitted for these reloads:
(insn 18 16 11 2 (set (reg:DI 3 3)
        (plus:DI (reg:DI 3 3 [ dstD.2412 ])
            (const_int 16 [0x10]))) vec_load.i:6 75 {*adddi3}
     (nil))

...which fixes the address so it satisfies its constraints and leaves us
with the rtl we want:

(insn 18 6 11 2 (set (reg:DI 3 3)
        (plus:DI (reg:DI 3 3 [ dstD.2412 ])
            (const_int 16 [0x10]))) vec_load.i:6 75 {*adddi3}
     (nil))
(insn 11 18 12 2 (set (reg/i:V16QI 79 2)
        (mem:V16QI (reg:DI 3 3) [0 MEM[(vec_tD.2411 *)dst_2(D) + 16B]+0 S16 A128])) vec_load.i:6 1209 {*altivec_movv16qi}
     (nil))

So to answer your question, at least for the altivec load case, reload prevents
us from using reg+offset addressing.


The problem occurs when we have an altivec store, which is what the original
test case had:

bergner@genoa:~/gcc/BUGS/sawdey/altivec$ cat vec_store.i
typedef __attribute__((altivec(vector__))) __attribute__((aligned(16))) unsigned char vec_t;
void
foo (vec_t *dst, vec_t src)
{
  dst[1] = src;
}
bergner@genoa:~/gcc/BUGS/sawdey/altivec$ /home/bergner/gcc/build/gcc-fsf-mainline-debug-3/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-debug-3/gcc -S -O1 -mcpu=power9 -mno-vsx vec_store.i -fdump-rtl-all-all

This gives us the following just before reload:

(insn 7 4 0 2 (set (mem:V16QI (plus:DI (reg:DI 3 3 [ dstD.2412 ])
                (const_int 16 [0x10])) [0 MEM[(vec_tD.2411 *)dst_1(D) + 16B]+0 S16 A128])
        (reg:V16QI 79 2 [ srcD.2413 ])) vec_store.i:5 1209 {*altivec_movv16qi}
     (expr_list:REG_DEAD (reg:V16QI 79 2 [ srcD.2413 ])
        (expr_list:REG_DEAD (reg:DI 3 3 [ dstD.2412 ])
            (nil))))


During reload, we see a reload for output:

Spilling for insn 7.
Using reg 9 for reload 1
Spilling for insn 7.
Using reg 9 for reload 1

Reloads for insn # 7
Reload 0: reload_out (V16QI) = (mem:V16QI (plus:DI (reg:DI 3 3 [ dstD.2412 ])
                                                        (const_int 16 [0x10])) [0 MEM[(vec_tD.2411 *)dst_1(D) + 16B]+0 S16 A128])
        NO_REGS, RELOAD_FOR_OUTPUT (opnum = 0), optional
        reload_out_reg: (mem:V16QI (plus:DI (reg:DI 3 3 [ dstD.2412 ])
                                                        (const_int 16 [0x10])) [0 MEM[(vec_tD.2411 *)dst_1(D) + 16B]+0 S16 A128])
Reload 1: reload_in (V16QI) = (reg:V16QI 79 2 [ srcD.2413 ])
        GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1)
        reload_in_reg: (reg:V16QI 79 2 [ srcD.2413 ])
        reload_reg_rtx: (reg:V16QI 9 9)
Insns emitted for these reloads:
(insn 12 11 13 2 (set (mem/c:V16QI (plus:DI (reg/f:DI 1 1)
                (const_int -16 [0xfffffffffffffff0])) [0  S16 A128])
        (reg:V16QI 79 2 [ srcD.2413 ])) vec_store.i:5 -1
     (nil))

(insn 13 12 7 2 (set (reg:V16QI 9 9)
        (mem/c:V16QI (plus:DI (reg/f:DI 1 1)
                (const_int -16 [0xfffffffffffffff0])) [0  S16 A128])) vec_store.i:5 -1
     (nil))


The Reload 1 on the other hand is totally bogus.  I'd expect to see an input reload
for the reg+offset address.  Instead, it reloads the src altivec reg and really
creates a mess, since it's loading into a GPR and not a altivec register and
it's using a reg+offset address which is wrong here.

My guess is that we created the input reload for the src reg because
rs6000_legitimate_address_p() and quad_addredss_p() still say that a reg+offset
address for a V16QImode mode is ok, even though in this specific case, it
isn't since we're targeting an altivec store pattern...but they don't know
that.  So reload seems to say, if the address is ok, the src reg must need
reloading.  Did I get that right???

So how do we fix this?  It seems we need a way to disambiguate vector load/store
addresses that should allow reg+offset addresses (when they're targeting one of
the new power9 dform load/store patterns and when they should not allow reg+offset
addresses (when they're targeting older altivec and vsx load/stores that only
support reg+reg addressing).  Does anyone have an idea of how we can accomplish
that?

I'll note that I tried to modify rs6000_secondary_reload_memory to catch
the case where we are passing in an vector dform address with a regclass of
ALTIVEC_REGS, but that didn't seem to help.  That might be because
rs6000_legitimate_address_p() and quad_addredss_p() still say the reg+offset
address are ok.  Do we need to add an extra argument to those to tell them
to not allow reg+offset addressing when we know it's not allowed?

Peter
diff mbox

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 237945)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -4303,7 +4303,8 @@  rs6000_option_override_internal (bool gl
     {
       if (rs6000_isa_flags_explicit & OPTION_MASK_P8_VECTOR)
 	error ("-mpower9-vector requires -mpower8-vector");
-      rs6000_isa_flags &= ~OPTION_MASK_P9_VECTOR;
+      rs6000_isa_flags &= ~(OPTION_MASK_P9_VECTOR
+			    | OPTION_MASK_P9_DFORM_VECTOR);
     }
 
   /* There have been bugs with -mvsx-timode that don't show up with -mlra,
Index: gcc/testsuite/gcc.target/powerpc/pr71733.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr71733.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr71733.c	(working copy)
@@ -0,0 +1,14 @@ 
+/* Test for ICE arising from dform code generation with VSX disabled.  */
+
+/* { dg-do compile } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-O0 -mcpu=power9 -mno-vsx" } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } { "*" } { "" } } */
+
+typedef __attribute__((altivec(vector__))) unsigned char vec_t;
+vec_t
+foo (vec_t src)
+{
+  return src;
+}