diff mbox series

rs6000: Fix infinite loop building ghostscript and icu [PR93658]

Message ID 9591f6f2-e705-dc9e-b77b-2571481e4170@linux.ibm.com
State New
Headers show
Series rs6000: Fix infinite loop building ghostscript and icu [PR93658] | expand

Commit Message

Peter Bergner Feb. 20, 2020, 3:17 a.m. UTC
PR93658 shows a problem in rs6000_legitimate_address_p() where we erroneously
mark an Altivec address as being invalid, which causes LRA to go into an
infinite loop spilling the same address over and over again.  The problem
is, when VSX is enabled, the rs6000_vector_mem[<vector mode>], returns
VECTOR_VSX rather than VECTOR_ALTIVEC, causing us to not handle Altivec
load/store addresses correctly.  The following patch tests for both
and fixes the test case.

This passed bootstrap and regtesting on powerpc64le-linux and powerpc64-linux
(in both 32-bit and 64-bit modes) with no regressions.  Ok for trunk?
The same bug exists in FSF 9 anf FSF 8 branches.  Ok for those too after
some burn in on trunk and clean regtests on the backports?

Peter


gcc/
	PR target/93658
	* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Handle VSX
	vector modes.

gcc/testsuite/
	PR target/93658
	* gcc.target/powerpc/pr93658.c: New test.

Comments

Segher Boessenkool Feb. 20, 2020, 7:47 a.m. UTC | #1
On Wed, Feb 19, 2020 at 09:17:26PM -0600, Peter Bergner wrote:
> PR93658 shows a problem in rs6000_legitimate_address_p() where we erroneously
> mark an Altivec address as being invalid, which causes LRA to go into an
> infinite loop spilling the same address over and over again.  The problem
> is, when VSX is enabled, the rs6000_vector_mem[<vector mode>], returns
> VECTOR_VSX rather than VECTOR_ALTIVEC, causing us to not handle Altivec
> load/store addresses correctly.  The following patch tests for both
> and fixes the test case.

Ooh you found the problem, that was fast, great!

> This passed bootstrap and regtesting on powerpc64le-linux and powerpc64-linux
> (in both 32-bit and 64-bit modes) with no regressions.  Ok for trunk?
> The same bug exists in FSF 9 anf FSF 8 branches.  Ok for those too after
> some burn in on trunk and clean regtests on the backports?

Okay for all.  You may want to check it into 9 a bit faster than usual,
to meet the release schedule.  It should be perfectly safe enough for
that.  Do run the regstraps, of course ;-)

Thanks!


Segher


> gcc/
> 	PR target/93658
> 	* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Handle VSX
> 	vector modes.
> 
> gcc/testsuite/
> 	PR target/93658
> 	* gcc.target/powerpc/pr93658.c: New test.
Peter Bergner Feb. 20, 2020, 5:33 p.m. UTC | #2
On 2/20/20 1:47 AM, Segher Boessenkool wrote:
> On Wed, Feb 19, 2020 at 09:17:26PM -0600, Peter Bergner wrote:
>> This passed bootstrap and regtesting on powerpc64le-linux and powerpc64-linux
>> (in both 32-bit and 64-bit modes) with no regressions.  Ok for trunk?
>> The same bug exists in FSF 9 anf FSF 8 branches.  Ok for those too after
>> some burn in on trunk and clean regtests on the backports?
> 
> Okay for all.  You may want to check it into 9 a bit faster than usual,
> to meet the release schedule.  It should be perfectly safe enough for
> that.  Do run the regstraps, of course ;-)

Ok, I pushed the trunk fix now.  I'll kick off the release branch
backports now.

Jakub, I know you're getting the GCC 8.4 release ready.  Is this fix ok
for FSF 8 now or do you want me to wait until after 8.4 is out?


Peter
Richard Biener Feb. 21, 2020, 9:23 a.m. UTC | #3
On Thu, Feb 20, 2020 at 6:33 PM Peter Bergner <bergner@linux.ibm.com> wrote:
>
> On 2/20/20 1:47 AM, Segher Boessenkool wrote:
> > On Wed, Feb 19, 2020 at 09:17:26PM -0600, Peter Bergner wrote:
> >> This passed bootstrap and regtesting on powerpc64le-linux and powerpc64-linux
> >> (in both 32-bit and 64-bit modes) with no regressions.  Ok for trunk?
> >> The same bug exists in FSF 9 anf FSF 8 branches.  Ok for those too after
> >> some burn in on trunk and clean regtests on the backports?
> >
> > Okay for all.  You may want to check it into 9 a bit faster than usual,
> > to meet the release schedule.  It should be perfectly safe enough for
> > that.  Do run the regstraps, of course ;-)
>
> Ok, I pushed the trunk fix now.  I'll kick off the release branch
> backports now.
>
> Jakub, I know you're getting the GCC 8.4 release ready.  Is this fix ok
> for FSF 8 now or do you want me to wait until after 8.4 is out?

It's OK for 8.4.

Richard.

>
> Peter
>
>
Peter Bergner Feb. 23, 2020, 12:25 a.m. UTC | #4
On 2/20/20 11:33 AM, Peter Bergner wrote:
> Ok, I pushed the trunk fix now.  I'll kick off the release branch
> backports now.
> 
> Jakub, I know you're getting the GCC 8.4 release ready.  Is this fix ok
> for FSF 8 now or do you want me to wait until after 8.4 is out?

The backport of the PR93658 fix to GCC 8 & 9 showed a few testsuite regressions.
With Mike's help, we tracked those down to some fixes Mike had applied to
trunk, but did not backport.  With the following patch, PR93658 is fixed in
GCC 9 with no regressions.

GCC 8 is also regression free with the small update to the fragile insn
counting in vss-vector-6-le.c.  We made major changes to this test case
in trunk, but I think it's easier to just modify the count in the patch
hunk below.

Is this still ok for GCC 8 & 9?

Peter



rs6000: Fix infinite loop building ghostscript and icu [PR93658]

Fix rs6000_legitimate_address_p(), which erroneously marks a valid Altivec
address as being invalid, which causes LRA's process_address()  to go into
an infinite loop spilling the same address over and over again.
Include Mike's earlier commits that fix bugs this patch exposes.

	Backport from master
	2020-02-20  Peter Bergner  <bergner@linux.ibm.com>

	PR target/93658
	* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Handle VSX
	vector modes.

	* gcc.target/powerpc/pr93658.c: New test.

Fix PR 93568 (thinko)

	Backport from master
	2020-02-05  Michael Meissner  <meissner@linux.ibm.com>

	PR target/93568
	* config/rs6000/rs6000.c (get_vector_offset): Fix

Adjust how variable vector extraction is done.

	Backport from master
	2020-02-03  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (get_vector_offset): New helper function
	to calculate the offset in memory from the start of a vector of a
	particular element.  Add code to keep the element number in
	bounds if the element number is variable.
	(rs6000_adjust_vec_address): Move calculation of offset of the
	vector element to get_vector_offset.
	(rs6000_split_vec_extract_var): Do not do the initial AND of
	element here, move the code to get_vector_offset.

Fix bad code of vector extract of PC-relative address with variable element #.

	Backport from master
	2020-01-06  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
	Use 'Q' for doing vector extract from memory.
	(vsx_extract_v4sf_var): Use 'Q' for doing vector extract from
	memory.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Use 'Q' for
	doing vector extract from memory.
	(vsx_extract_<mode>_<VS_scalar>mode_var): Use 'Q' for doing vector
	extract from memory.

Only for GCC 8:

	* gcc.target/powerpc/vsx-vector-6-le.c: Update fragile insn count.


diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 87d60078bb0..1d93570e7a5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -7017,10 +7017,53 @@ rs6000_expand_vector_extract (rtx target, rtx vec, rtx elt)
     }
 }
 
+/* Return the offset within a memory object (MEM) of a vector type to a given
+   element within the vector (ELEMENT) with an element size (SCALAR_SIZE).  If
+   the element is constant, we return a constant integer.
+
+   Otherwise, we use a base register temporary to calculate the offset after
+   masking it to fit within the bounds of the vector and scaling it.  The
+   masking is required by the 64-bit ELF version 2 ABI for the vec_extract
+   built-in function.  */
+
+static rtx
+get_vector_offset (rtx mem, rtx element, rtx base_tmp, unsigned scalar_size)
+{
+  if (CONST_INT_P (element))
+    return GEN_INT (INTVAL (element) * scalar_size);
+
+  /* All insns should use the 'Q' constraint (address is a single register) if
+     the element number is not a constant.  */
+  gcc_assert (satisfies_constraint_Q (mem));
+
+  /* Mask the element to make sure the element number is between 0 and the
+     maximum number of elements - 1 so that we don't generate an address
+     outside the vector.  */
+  rtx num_ele_m1 = GEN_INT (GET_MODE_NUNITS (GET_MODE (mem)) - 1);
+  rtx and_op = gen_rtx_AND (Pmode, element, num_ele_m1);
+  emit_insn (gen_rtx_SET (base_tmp, and_op));
+
+  /* Shift the element to get the byte offset from the element number.  */
+  int shift = exact_log2 (scalar_size);
+  gcc_assert (shift >= 0);
+
+  if (shift > 0)
+    {
+      rtx shift_op = gen_rtx_ASHIFT (Pmode, base_tmp, GEN_INT (shift));
+      emit_insn (gen_rtx_SET (base_tmp, shift_op));
+    }
+
+  return base_tmp;
+}
+
 /* Adjust a memory address (MEM) of a vector type to point to a scalar field
    within the vector (ELEMENT) with a mode (SCALAR_MODE).  Use a base register
    temporary (BASE_TMP) to fixup the address.  Return the new memory address
-   that is valid for reads or writes to a given register (SCALAR_REG).  */
+   that is valid for reads or writes to a given register (SCALAR_REG).
+
+   This function is expected to be called after reload is completed when we are
+   splitting insns.  The temporary BASE_TMP might be set multiple times with
+   this code.  */
 
 rtx
 rs6000_adjust_vec_address (rtx scalar_reg,
@@ -7031,7 +7074,6 @@ rs6000_adjust_vec_address (rtx scalar_reg,
 {
   unsigned scalar_size = GET_MODE_SIZE (scalar_mode);
   rtx addr = XEXP (mem, 0);
-  rtx element_offset;
   rtx new_addr;
   bool valid_addr_p;
 
@@ -7040,26 +7082,7 @@ rs6000_adjust_vec_address (rtx scalar_reg,
 
   /* Calculate what we need to add to the address to get the element
      address.  */
-  if (CONST_INT_P (element))
-    element_offset = GEN_INT (INTVAL (element) * scalar_size);
-  else
-    {
-      int byte_shift = exact_log2 (scalar_size);
-      gcc_assert (byte_shift >= 0);
-
-      if (byte_shift == 0)
-	element_offset = element;
-
-      else
-	{
-	  if (TARGET_POWERPC64)
-	    emit_insn (gen_ashldi3 (base_tmp, element, GEN_INT (byte_shift)));
-	  else
-	    emit_insn (gen_ashlsi3 (base_tmp, element, GEN_INT (byte_shift)));
-
-	  element_offset = base_tmp;
-	}
-    }
+  rtx element_offset = get_vector_offset (mem, element, base_tmp, scalar_size);
 
   /* Create the new address pointing to the element within the vector.  If we
      are adding 0, we don't have to change the address.  */
@@ -7194,13 +7217,9 @@ rs6000_split_vec_extract_var (rtx dest, rtx src, rtx element, rtx tmp_gpr,
      systems.  */
   if (MEM_P (src))
     {
-      int num_elements = GET_MODE_NUNITS (mode);
-      rtx num_ele_m1 = GEN_INT (num_elements - 1);
-
-      emit_insn (gen_anddi3 (element, element, num_ele_m1));
-      gcc_assert (REG_P (tmp_gpr));
-      emit_move_insn (dest, rs6000_adjust_vec_address (dest, src, element,
-						       tmp_gpr, scalar_mode));
+      emit_move_insn (dest,
+		      rs6000_adjust_vec_address (dest, src, element, tmp_gpr,
+						 scalar_mode));
       return;
     }
 
@@ -9339,7 +9358,7 @@ rs6000_legitimate_address_p (machine_mode mode, rtx x, bool reg_ok_strict)
   bool quad_offset_p = mode_supports_dq_form (mode);
 
   /* If this is an unaligned stvx/ldvx type address, discard the outer AND.  */
-  if (VECTOR_MEM_ALTIVEC_P (mode)
+  if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
       && GET_CODE (x) == AND
       && CONST_INT_P (XEXP (x, 1))
       && INTVAL (XEXP (x, 1)) == -16)
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 607c0cd33f2..5973e0a399d 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3295,7 +3295,7 @@
 ;; Variable V2DI/V2DF extract
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=v,<VSa>,r")
-	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,m,m")
+	(unspec:<VS_scalar> [(match_operand:VSX_D 1 "input_operand" "v,Q,Q")
 			     (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 			    UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
@@ -3364,7 +3364,7 @@
 ;; Variable V4SF extract
 (define_insn_and_split "vsx_extract_v4sf_var"
   [(set (match_operand:SF 0 "gpc_reg_operand" "=ww,ww,?r")
-	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,m,m")
+	(unspec:SF [(match_operand:V4SF 1 "input_operand" "v,Q,Q")
 		    (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 		   UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,&b,&b"))
@@ -3724,7 +3724,7 @@
 (define_insn_and_split "vsx_extract_<mode>_var"
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(unspec:<VS_scalar>
-	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "wK,v,m")
+	 [(match_operand:VSX_EXTRACT_I 1 "input_operand" "wK,v,Q")
 	  (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	 UNSPEC_VSX_EXTRACT))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
@@ -3743,7 +3743,7 @@
   [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=r,r,r")
 	(zero_extend:<VS_scalar>
 	 (unspec:<VSX_EXTRACT_I:VS_scalar>
-	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "wK,v,m")
+	  [(match_operand:VSX_EXTRACT_I 1 "input_operand" "wK,v,Q")
 	   (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
 	  UNSPEC_VSX_EXTRACT)))
    (clobber (match_scratch:DI 3 "=r,r,&b"))
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93658.c b/gcc/testsuite/gcc.target/powerpc/pr93658.c
new file mode 100644
index 00000000000..0170d34d259
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr93658.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fstack-protector-strong -mcpu=power8" } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+
+/* PR93658: Failure compiling this test is an infinite loop in LRA's
+   process_address(), so set a short timeout limit.  */
+/* { dg-timeout 5 } */
+
+void bar();
+char b;
+void
+foo (void)
+{
+  char a;
+  int d = b;
+  char *e = &a;
+  while (d)
+    *e++ = --d;
+  bar ();
+}

Only for GCC 8:

diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
index fe7eeb12ff9..dc6ccb77d1f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-6-le.c
@@ -14,7 +14,7 @@
    their usage counts being stable.  Therefore, we just ensure at least one
    xxlor instruction was generated.  */
 /* { dg-final { scan-assembler "xxlor" } } */
-/* { dg-final { scan-assembler-times "xvcmpeqdp" 5 } } */
+/* { dg-final { scan-assembler-times "xvcmpeqdp" 6 } } */
 /* { dg-final { scan-assembler-times "xvcmpgtdp" 8 } } */
 /* { dg-final { scan-assembler-times "xvcmpgedp" 6 } } */
 /* { dg-final { scan-assembler-times "xvrdpim" 1 } } */
Segher Boessenkool Feb. 23, 2020, 12:49 a.m. UTC | #5
On Sat, Feb 22, 2020 at 06:25:43PM -0600, Peter Bergner wrote:
> On 2/20/20 11:33 AM, Peter Bergner wrote:
> > Ok, I pushed the trunk fix now.  I'll kick off the release branch
> > backports now.
> > 
> > Jakub, I know you're getting the GCC 8.4 release ready.  Is this fix ok
> > for FSF 8 now or do you want me to wait until after 8.4 is out?
> 
> The backport of the PR93658 fix to GCC 8 & 9 showed a few testsuite regressions.
> With Mike's help, we tracked those down to some fixes Mike had applied to
> trunk, but did not backport.  With the following patch, PR93658 is fixed in
> GCC 9 with no regressions.
> 
> GCC 8 is also regression free with the small update to the fragile insn
> counting in vss-vector-6-le.c.  We made major changes to this test case
> in trunk, but I think it's easier to just modify the count in the patch
> hunk below.
> 
> Is this still ok for GCC 8 & 9?

Please do each of those backports as separate commits (so that if there
is a problem with them, we can bisect it; it also should be easier to do
this way, even :-) )

(So backport the oldest first, etc.)

> rs6000: Fix infinite loop building ghostscript and icu [PR93658]
> 
> Fix rs6000_legitimate_address_p(), which erroneously marks a valid Altivec
> address as being invalid, which causes LRA's process_address()  to go into
> an infinite loop spilling the same address over and over again.
> Include Mike's earlier commits that fix bugs this patch exposes.
> 
> 	Backport from master
> 	2020-02-20  Peter Bergner  <bergner@linux.ibm.com>
> 
> 	PR target/93658
> 	* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Handle VSX
> 	vector modes.
> 
> 	* gcc.target/powerpc/pr93658.c: New test.
> 
> Fix PR 93568 (thinko)
> 
> 	Backport from master
> 	2020-02-05  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	PR target/93568
> 	* config/rs6000/rs6000.c (get_vector_offset): Fix

(This changelog was committed truncated like this...  If you can find a
fuller version, please use that, but don't bother too much).

> Adjust how variable vector extraction is done.
> 
> 	Backport from master
> 	2020-02-03  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (get_vector_offset): New helper function
> 	to calculate the offset in memory from the start of a vector of a
> 	particular element.  Add code to keep the element number in
> 	bounds if the element number is variable.
> 	(rs6000_adjust_vec_address): Move calculation of offset of the
> 	vector element to get_vector_offset.
> 	(rs6000_split_vec_extract_var): Do not do the initial AND of
> 	element here, move the code to get_vector_offset.
> 
> Fix bad code of vector extract of PC-relative address with variable element #.
> 
> 	Backport from master
> 	2020-01-06  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
> 	Use 'Q' for doing vector extract from memory.
> 	(vsx_extract_v4sf_var): Use 'Q' for doing vector extract from
> 	memory.
> 	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Use 'Q' for
> 	doing vector extract from memory.
> 	(vsx_extract_<mode>_<VS_scalar>mode_var): Use 'Q' for doing vector
> 	extract from memory.
> 
> Only for GCC 8:
> 
> 	* gcc.target/powerpc/vsx-vector-6-le.c: Update fragile insn count.

All those are okay to backport.  Thanks!


Segher
Peter Bergner Feb. 24, 2020, 4:19 a.m. UTC | #6
On 2/22/20 6:49 PM, Segher Boessenkool wrote:
> On Sat, Feb 22, 2020 at 06:25:43PM -0600, Peter Bergner wrote:
>> Is this still ok for GCC 8 & 9?
> 
> Please do each of those backports as separate commits (so that if there
> is a problem with them, we can bisect it; it also should be easier to do
> this way, even :-) )

Ok, done.  Although, I committed the get_vector_offset() addition and
typo fix together, since they are tied to each other.



>> 	PR target/93568
>> 	* config/rs6000/rs6000.c (get_vector_offset): Fix
> 
> (This changelog was committed truncated like this...  If you can find a
> fuller version, please use that, but don't bother too much).

I couldn't find any non-truncated text for the ChangeLog entry,
so I just wrote some verbiage for it myself.



> All those are okay to backport.  Thanks!

Committed and pushed as 3 separate commits to each release branch.
Thanks!

Peter
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index f34e1ba70c6..9910b27ed24 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -8808,7 +8808,7 @@  rs6000_legitimate_address_p (machine_mode mode, rtx x, bool reg_ok_strict)
   bool quad_offset_p = mode_supports_dq_form (mode);
 
   /* If this is an unaligned stvx/ldvx type address, discard the outer AND.  */
-  if (VECTOR_MEM_ALTIVEC_P (mode)
+  if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
       && GET_CODE (x) == AND
       && CONST_INT_P (XEXP (x, 1))
       && INTVAL (XEXP (x, 1)) == -16)
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93658.c b/gcc/testsuite/gcc.target/powerpc/pr93658.c
new file mode 100644
index 00000000000..0170d34d259
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr93658.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fstack-protector-strong -mcpu=power8" } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+
+/* PR93658: Failure compiling this test is an infinite loop in LRA's
+   process_address(), so set a short timeout limit.  */
+/* { dg-timeout 5 } */
+
+void bar();
+char b;
+void
+foo (void)
+{
+  char a;
+  int d = b;
+  char *e = &a;
+  while (d)
+    *e++ = --d;
+  bar ();
+}