diff mbox

[rs6000] Fix incorrect mode usage for vec_select

Message ID 68c028db-0b45-4145-6fe6-0a46bfde3ede@linux.vnet.ibm.com
State New
Headers show

Commit Message

Bill Schmidt March 8, 2017, 3:47 p.m. UTC
Hi,

As noted by Jakub in https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00183.html,
the PowerPC back end incorrectly uses vec_select with 2 elements for a mode
that has only one.  This is due to faulty mode iterator use:  V1TImode was
wrongly included in the VSX_LE mode iterator, but should instead have been
in the VSX_LE_128 mode iterator.

This patch fixes that, and with VSX_LE no longer including V1TImode, it is
now redundant with VSX_D, so that patch removes VSX_LE altogether.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this ok for trunk?

I am uncertain whether we should backport the fix to gcc 5 and 6, since,
although the code is technically incorrect, it works just fine.  The fix
is needed in trunk to permit the sanity checking that Jakub has proposed
for genrecog.

Thanks,
Bill


2017-03-08  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): Use rotate
	instead of vec_select for V1TImode.
	* conifg/rs6000/vsx.md (VSX_LE): Remove mode iterator that is no
	longer needed.
	(VSX_LE_128): Add V1TI to this mode iterator.
	(*vsx_le_perm_load_<mode>): Change to use VSX_D mode iterator.
	(*vsx_le_perm_store_<mode>): Likewise.
	(pre-reload splitter for VSX stores): Likewise.
	(post-reload splitter for VSX stores): Likewise.
	(*vsx_xxpermdi2_le_<mode>): Likewise.
	(*vsx_lxvd2x2_le_<mode>): Likewise.
	(*vsx_stxvd2x2_le_<mode>): Likewise.

Comments

Segher Boessenkool March 9, 2017, 8:31 p.m. UTC | #1
On Wed, Mar 08, 2017 at 09:47:32AM -0600, Bill Schmidt wrote:
> As noted by Jakub in https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00183.html,
> the PowerPC back end incorrectly uses vec_select with 2 elements for a mode
> that has only one.  This is due to faulty mode iterator use:  V1TImode was
> wrongly included in the VSX_LE mode iterator, but should instead have been
> in the VSX_LE_128 mode iterator.
> 
> This patch fixes that, and with VSX_LE no longer including V1TImode, it is
> now redundant with VSX_D, so that patch removes VSX_LE altogether.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
> Is this ok for trunk?

Yes, thanks.

> I am uncertain whether we should backport the fix to gcc 5 and 6, since,
> although the code is technically incorrect, it works just fine.  The fix
> is needed in trunk to permit the sanity checking that Jakub has proposed
> for genrecog.

Then let's not, not until we backport something that touches this code.
OTOH a backport of this is approved as well, if you prefer that.


Segher
Bill Schmidt Oct. 20, 2017, 9:41 p.m. UTC | #2
> On Mar 9, 2017, at 2:31 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> 
> On Wed, Mar 08, 2017 at 09:47:32AM -0600, Bill Schmidt wrote:
>> As noted by Jakub in https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00183.html,
>> the PowerPC back end incorrectly uses vec_select with 2 elements for a mode
>> that has only one.  This is due to faulty mode iterator use:  V1TImode was
>> wrongly included in the VSX_LE mode iterator, but should instead have been
>> in the VSX_LE_128 mode iterator.
>> 
>> This patch fixes that, and with VSX_LE no longer including V1TImode, it is
>> now redundant with VSX_D, so that patch removes VSX_LE altogether.
>> 
>> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
>> Is this ok for trunk?
> 
> Yes, thanks.
> 
>> I am uncertain whether we should backport the fix to gcc 5 and 6, since,
>> although the code is technically incorrect, it works just fine.  The fix
>> is needed in trunk to permit the sanity checking that Jakub has proposed
>> for genrecog.
> 
> Then let's not, not until we backport something that touches this code.
> OTOH a backport of this is approved as well, if you prefer that.

As this was rediscovered with PR81294, I've gone ahead with the backport for 6.
It does not apply to 5, as the faulty code doesn't exist there.

Thanks!
Bill

2017-10-20  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline
	2017-03-09  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): Use rotate
	instead of vec_select for V1TImode.
	* conifg/rs6000/vsx.md (VSX_LE): Remove mode iterator that is no
	longer needed.
	(VSX_LE_128): Add V1TI to this mode iterator.
	(*vsx_le_perm_load_<mode>): Change to use VSX_D mode iterator.
	(*vsx_le_perm_store_<mode>): Likewise.
	(pre-reload splitter for VSX stores): Likewise.
	(post-reload splitter for VSX stores): Likewise.
	(*vsx_xxpermdi2_le_<mode>): Likewise.
	(*vsx_lxvd2x2_le_<mode>): Likewise.
	(*vsx_stxvd2x2_le_<mode>): Likewise.


Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 253955)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -9591,7 +9591,7 @@ rs6000_gen_le_vsx_permute (rtx source, machine_mod
 {
   /* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point, and
      128-bit integers if they are allowed in VSX registers.  */
-  if (FLOAT128_VECTOR_P (mode) || mode == TImode)
+  if (FLOAT128_VECTOR_P (mode) || mode == TImode || mode == V1TImode)
     return gen_rtx_ROTATE (mode, source, GEN_INT (64));
   else
     {
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 253955)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -24,15 +24,12 @@
 ;; Iterator for the 2 64-bit vector types
 (define_mode_iterator VSX_D [V2DF V2DI])
 
-;; Iterator for the 2 64-bit vector types + 128-bit types that are loaded with
-;; lxvd2x to properly handle swapping words on little endian
-(define_mode_iterator VSX_LE [V2DF V2DI V1TI])
-
 ;; Mode iterator to handle swapping words on little endian for the 128-bit
 ;; types that goes in a single vector register.
 (define_mode_iterator VSX_LE_128 [(KF   "FLOAT128_VECTOR_P (KFmode)")
 				  (TF   "FLOAT128_VECTOR_P (TFmode)")
-				  (TI	"TARGET_VSX_TIMODE")])
+				  (TI	"TARGET_VSX_TIMODE")
+				  V1TI])
 
 ;; Iterator for the 2 32-bit vector types
 (define_mode_iterator VSX_W [V4SF V4SI])
@@ -300,8 +297,8 @@
 ;; The patterns for LE permuted loads and stores come before the general
 ;; VSX moves so they match first.
 (define_insn_and_split "*vsx_le_perm_load_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
-        (match_operand:VSX_LE 1 "memory_operand" "Z"))]
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSa>")
+        (match_operand:VSX_D 1 "memory_operand" "Z"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
@@ -414,8 +411,8 @@
    (set_attr "length" "8")])
 
 (define_insn "*vsx_le_perm_store_<mode>"
-  [(set (match_operand:VSX_LE 0 "memory_operand" "=Z")
-        (match_operand:VSX_LE 1 "vsx_register_operand" "+<VSa>"))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "=Z")
+        (match_operand:VSX_D 1 "vsx_register_operand" "+<VSa>"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
@@ -422,8 +419,8 @@
    (set_attr "length" "12")])
 
 (define_split
-  [(set (match_operand:VSX_LE 0 "memory_operand" "")
-        (match_operand:VSX_LE 1 "vsx_register_operand" ""))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "")
+        (match_operand:VSX_D 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && !reload_completed"
   [(set (match_dup 2)
         (vec_select:<MODE>
@@ -441,8 +438,8 @@
 ;; The post-reload split requires that we re-permute the source
 ;; register in case it is still live.
 (define_split
-  [(set (match_operand:VSX_LE 0 "memory_operand" "")
-        (match_operand:VSX_LE 1 "vsx_register_operand" ""))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "")
+        (match_operand:VSX_D 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && reload_completed"
   [(set (match_dup 1)
         (vec_select:<MODE>
@@ -2003,9 +2000,9 @@
 ;; xxpermdi for little endian loads and stores.  We need several of
 ;; these since the form of the PARALLEL differs by mode.
 (define_insn "*vsx_xxpermdi2_le_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
-        (vec_select:VSX_LE
-          (match_operand:VSX_LE 1 "vsx_register_operand" "<VSa>")
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSa>")
+        (vec_select:VSX_D
+          (match_operand:VSX_D 1 "vsx_register_operand" "<VSa>")
           (parallel [(const_int 1) (const_int 0)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxpermdi %x0,%x1,%x1,2"
@@ -2052,9 +2049,9 @@
 ;; lxvd2x for little endian loads.  We need several of
 ;; these since the form of the PARALLEL differs by mode.
 (define_insn "*vsx_lxvd2x2_le_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
-        (vec_select:VSX_LE
-          (match_operand:VSX_LE 1 "memory_operand" "Z")
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSa>")
+        (vec_select:VSX_D
+          (match_operand:VSX_D 1 "memory_operand" "Z")
           (parallel [(const_int 1) (const_int 0)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode) && !TARGET_P9_VECTOR"
   "lxvd2x %x0,%y1"
@@ -2101,9 +2098,9 @@
 ;; stxvd2x for little endian stores.  We need several of
 ;; these since the form of the PARALLEL differs by mode.
 (define_insn "*vsx_stxvd2x2_le_<mode>"
-  [(set (match_operand:VSX_LE 0 "memory_operand" "=Z")
-        (vec_select:VSX_LE
-          (match_operand:VSX_LE 1 "vsx_register_operand" "<VSa>")
+  [(set (match_operand:VSX_D 0 "memory_operand" "=Z")
+        (vec_select:VSX_D
+          (match_operand:VSX_D 1 "vsx_register_operand" "<VSa>")
           (parallel [(const_int 1) (const_int 0)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode) && !TARGET_P9_VECTOR"
   "stxvd2x %x1,%y0"

> 
> 
> Segher
>
Jakub Jelinek Oct. 20, 2017, 9:47 p.m. UTC | #3
On Fri, Oct 20, 2017 at 04:41:04PM -0500, Bill Schmidt wrote:
> As this was rediscovered with PR81294, I've gone ahead with the backport for 6.
> It does not apply to 5, as the faulty code doesn't exist there.

Well, 5 is closed anyway.

	Jakub
Bill Schmidt Oct. 20, 2017, 9:51 p.m. UTC | #4
On Oct 20, 2017, at 4:47 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Fri, Oct 20, 2017 at 04:41:04PM -0500, Bill Schmidt wrote:
>> As this was rediscovered with PR81294, I've gone ahead with the backport for 6.
>> It does not apply to 5, as the faulty code doesn't exist there.
> 
> Well, 5 is closed anyway.
> 
Right, of course -- I was still looking to see if there would be an easy patch for Doko
to carry if he so chose, but that's not looking good.

Bill
diff mbox

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 245952)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -10420,7 +10420,7 @@  rs6000_gen_le_vsx_permute (rtx source, machine_mod
 {
   /* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point, and
      128-bit integers if they are allowed in VSX registers.  */
-  if (FLOAT128_VECTOR_P (mode) || mode == TImode)
+  if (FLOAT128_VECTOR_P (mode) || mode == TImode || mode == V1TImode)
     return gen_rtx_ROTATE (mode, source, GEN_INT (64));
   else
     {
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 245952)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -27,15 +27,12 @@ 
 ;; Iterator for the 2 64-bit vector types
 (define_mode_iterator VSX_D [V2DF V2DI])
 
-;; Iterator for the 2 64-bit vector types + 128-bit types that are loaded with
-;; lxvd2x to properly handle swapping words on little endian
-(define_mode_iterator VSX_LE [V2DF V2DI V1TI])
-
 ;; Mode iterator to handle swapping words on little endian for the 128-bit
 ;; types that goes in a single vector register.
 (define_mode_iterator VSX_LE_128 [(KF   "FLOAT128_VECTOR_P (KFmode)")
 				  (TF   "FLOAT128_VECTOR_P (TFmode)")
-				  (TI	"TARGET_VSX_TIMODE")])
+				  (TI	"TARGET_VSX_TIMODE")
+				  V1TI])
 
 ;; Iterator for the 2 32-bit vector types
 (define_mode_iterator VSX_W [V4SF V4SI])
@@ -387,8 +384,8 @@ 
 ;; The patterns for LE permuted loads and stores come before the general
 ;; VSX moves so they match first.
 (define_insn_and_split "*vsx_le_perm_load_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
-        (match_operand:VSX_LE 1 "memory_operand" "Z"))]
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSa>")
+        (match_operand:VSX_D 1 "memory_operand" "Z"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
@@ -501,8 +498,8 @@ 
    (set_attr "length" "8")])
 
 (define_insn "*vsx_le_perm_store_<mode>"
-  [(set (match_operand:VSX_LE 0 "memory_operand" "=Z")
-        (match_operand:VSX_LE 1 "vsx_register_operand" "+<VSa>"))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "=Z")
+        (match_operand:VSX_D 1 "vsx_register_operand" "+<VSa>"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
@@ -509,8 +506,8 @@ 
    (set_attr "length" "12")])
 
 (define_split
-  [(set (match_operand:VSX_LE 0 "memory_operand" "")
-        (match_operand:VSX_LE 1 "vsx_register_operand" ""))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "")
+        (match_operand:VSX_D 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && !reload_completed"
   [(set (match_dup 2)
         (vec_select:<MODE>
@@ -528,8 +525,8 @@ 
 ;; The post-reload split requires that we re-permute the source
 ;; register in case it is still live.
 (define_split
-  [(set (match_operand:VSX_LE 0 "memory_operand" "")
-        (match_operand:VSX_LE 1 "vsx_register_operand" ""))]
+  [(set (match_operand:VSX_D 0 "memory_operand" "")
+        (match_operand:VSX_D 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && reload_completed"
   [(set (match_dup 1)
         (vec_select:<MODE>
@@ -2061,9 +2058,9 @@ 
 ;; xxpermdi for little endian loads and stores.  We need several of
 ;; these since the form of the PARALLEL differs by mode.
 (define_insn "*vsx_xxpermdi2_le_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
-        (vec_select:VSX_LE
-          (match_operand:VSX_LE 1 "vsx_register_operand" "<VSa>")
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSa>")
+        (vec_select:VSX_D
+          (match_operand:VSX_D 1 "vsx_register_operand" "<VSa>")
           (parallel [(const_int 1) (const_int 0)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxpermdi %x0,%x1,%x1,2"
@@ -2110,9 +2107,9 @@ 
 ;; lxvd2x for little endian loads.  We need several of
 ;; these since the form of the PARALLEL differs by mode.
 (define_insn "*vsx_lxvd2x2_le_<mode>"
-  [(set (match_operand:VSX_LE 0 "vsx_register_operand" "=<VSa>")
-        (vec_select:VSX_LE
-          (match_operand:VSX_LE 1 "memory_operand" "Z")
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSa>")
+        (vec_select:VSX_D
+          (match_operand:VSX_D 1 "memory_operand" "Z")
           (parallel [(const_int 1) (const_int 0)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode) && !TARGET_P9_VECTOR"
   "lxvd2x %x0,%y1"
@@ -2159,9 +2156,9 @@ 
 ;; stxvd2x for little endian stores.  We need several of
 ;; these since the form of the PARALLEL differs by mode.
 (define_insn "*vsx_stxvd2x2_le_<mode>"
-  [(set (match_operand:VSX_LE 0 "memory_operand" "=Z")
-        (vec_select:VSX_LE
-          (match_operand:VSX_LE 1 "vsx_register_operand" "<VSa>")
+  [(set (match_operand:VSX_D 0 "memory_operand" "=Z")
+        (vec_select:VSX_D
+          (match_operand:VSX_D 1 "vsx_register_operand" "<VSa>")
           (parallel [(const_int 1) (const_int 0)])))]
   "!BYTES_BIG_ENDIAN && VECTOR_MEM_VSX_P (<MODE>mode) && !TARGET_P9_VECTOR"
   "stxvd2x %x1,%y0"