Patchwork [ARM] fix ICE with NEON quad modes and immediate offsets

login
register
mail settings
Submitter Nathan Froyd
Date Dec. 2, 2010, 8:35 p.m.
Message ID <20101202203503.GV24280@codesourcery.com>
Download mbox | patch
Permalink /patch/74021/
State New
Headers show

Comments

Nathan Froyd - Dec. 2, 2010, 8:35 p.m.
The simple testcase below:

void neon_internal_error(int32x4_t *dst, char *src)
{
  *dst = *(int32x4_t *)(src+1008);
}

ICEs in change_address_1 during assembly output.  The offending call
originates in output_move_neon from the adjust_address call below:

	for (i = 0; i < nregs; i++)
	  {
	    /* We're only using DImode here because it's a convenient size.  */
	    ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i);
	    ops[1] = adjust_address (mem, DImode, 8 * i);
	    if (reg_overlap_mentioned_p (ops[0], mem))
	      {
		gcc_assert (overlap == -1);
		overlap = i;
	      }
	    else
	      {
		sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
		output_asm_insn (buff, ops);
	      }
	  }

when we attempt to decompose a NEON quad mode move into two double mode
moves.

The problem is that we want to generate the address [X:SI+1016]:DI,
which is, according to arm_legitimate_address_p, not a legitimate
address:

  if (TARGET_NEON
      && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode)))
    return (code == CONST_INT
	    && INTVAL (index) < 1016
	    && INTVAL (index) > -1024
	    && (INTVAL (index) & 3) == 0);

This test is not correct.  V{LD,ST}R accept offsets evenly divisible by
4 from -1020 to 1020.  The above test is attempting to ensure that
whenever we have an address X+OFFSET for a quad mode, we can safely form the
address X+OFFSET+8 for the second (double mode) half of the quad mode
value.  Hence the slightly smaller upper bound.  But for double modes,
we don't have any such restriction, and we can safely permit the full
addressing range supported by the hardware.

The patch below, therefore, splits the test in two, one for NEON double
modes and one for NEON quad modes, with appropriate ranges for each.

Tested on arm-none-eabi.  OK to commit?

-Nathan

gcc/
	* config/arm/arm.c (arm_legitimate_index_p): Split
	VALID_NEON_QREG_MODE and VALID_NEON_DREG_MODE cases.  Permit
	slightly larger constants in the latter case.
	(thumb2_legitimate_index_p): Likewise.

gcc/testsuite/
	* gcc.target/arm/neon-offset-1.c: New test.
Richard Earnshaw - Dec. 2, 2010, 11 p.m.
On 2 Dec 2010, at 20:35, "Nathan Froyd" <froydnj@codesourcery.com> wrote:

> The simple testcase below:
> 
> void neon_internal_error(int32x4_t *dst, char *src)
> {
>  *dst = *(int32x4_t *)(src+1008);
> }
> 
> ICEs in change_address_1 during assembly output.  The offending call
> originates in output_move_neon from the adjust_address call below:
> 
>    for (i = 0; i < nregs; i++)
>      {
>        /* We're only using DImode here because it's a convenient size.  */
>        ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i);
>        ops[1] = adjust_address (mem, DImode, 8 * i);
>        if (reg_overlap_mentioned_p (ops[0], mem))
>          {
>        gcc_assert (overlap == -1);
>        overlap = i;
>          }
>        else
>          {
>        sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
>        output_asm_insn (buff, ops);
>          }
>      }
> 
> when we attempt to decompose a NEON quad mode move into two double mode
> moves.
> 
> The problem is that we want to generate the address [X:SI+1016]:DI,
> which is, according to arm_legitimate_address_p, not a legitimate
> address:
> 
>  if (TARGET_NEON
>      && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode)))
>    return (code == CONST_INT
>        && INTVAL (index) < 1016
>        && INTVAL (index) > -1024
>        && (INTVAL (index) & 3) == 0);
> 
> This test is not correct.  V{LD,ST}R accept offsets evenly divisible by
> 4 from -1020 to 1020.  The above test is attempting to ensure that
> whenever we have an address X+OFFSET for a quad mode, we can safely form the
> address X+OFFSET+8 for the second (double mode) half of the quad mode
> value.  Hence the slightly smaller upper bound.  But for double modes,
> we don't have any such restriction, and we can safely permit the full
> addressing range supported by the hardware.
> 
> The patch below, therefore, splits the test in two, one for NEON double
> modes and one for NEON quad modes, with appropriate ranges for each.
> 
> Tested on arm-none-eabi.  OK to commit?
> 
> -Nathan
> 
> gcc/
>    * config/arm/arm.c (arm_legitimate_index_p): Split
>    VALID_NEON_QREG_MODE and VALID_NEON_DREG_MODE cases.  Permit
>    slightly larger constants in the latter case.
>    (thumb2_legitimate_index_p): Likewise.
> 
> gcc/testsuite/
>    * gcc.target/arm/neon-offset-1.c: New test.

Ok. 

R.
Nathan Froyd - Dec. 3, 2010, 12:27 a.m.
On Thu, Dec 02, 2010 at 11:00:31PM +0000, Richard Earnshaw wrote:
> On 2 Dec 2010, at 20:35, "Nathan Froyd" <froydnj@codesourcery.com> wrote:
> > gcc/
> >    * config/arm/arm.c (arm_legitimate_index_p): Split
> >    VALID_NEON_QREG_MODE and VALID_NEON_DREG_MODE cases.  Permit
> >    slightly larger constants in the latter case.
> >    (thumb2_legitimate_index_p): Likewise.
> > 
> > gcc/testsuite/
> >    * gcc.target/arm/neon-offset-1.c: New test.
> 
> Ok. 

Thanks.  Is this OK for branches, too, or just mainline?

-Nathan
Richard Earnshaw - Dec. 6, 2010, 1:58 p.m.
On Thu, 2010-12-02 at 19:27 -0500, Nathan Froyd wrote:
> On Thu, Dec 02, 2010 at 11:00:31PM +0000, Richard Earnshaw wrote:
> > On 2 Dec 2010, at 20:35, "Nathan Froyd" <froydnj@codesourcery.com> wrote:
> > > gcc/
> > >    * config/arm/arm.c (arm_legitimate_index_p): Split
> > >    VALID_NEON_QREG_MODE and VALID_NEON_DREG_MODE cases.  Permit
> > >    slightly larger constants in the latter case.
> > >    (thumb2_legitimate_index_p): Likewise.
> > > 
> > > gcc/testsuite/
> > >    * gcc.target/arm/neon-offset-1.c: New test.
> > 
> > Ok. 
> 
> Thanks.  Is this OK for branches, too, or just mainline?
> 
> -Nathan

If the bug can be reproduced on the branch, then yes, this is ok there
too.  If not, then please stick to mainline only.

R.

Patch

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 167383)
+++ gcc/config/arm/arm.c	(working copy)
@@ -5649,13 +5649,25 @@  arm_legitimate_index_p (enum machine_mod
 	    && INTVAL (index) > -1024
 	    && (INTVAL (index) & 3) == 0);
 
-  if (TARGET_NEON
-      && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode)))
+  /* For quad modes, we restrict the constant offset to be slightly less
+     than what the instruction format permits.  We do this because for
+     quad mode moves, we will actually decompose them into two separate
+     double-mode reads or writes.  INDEX must therefore be a valid
+     (double-mode) offset and so should INDEX+8.  */
+  if (TARGET_NEON && VALID_NEON_QREG_MODE (mode))
     return (code == CONST_INT
 	    && INTVAL (index) < 1016
 	    && INTVAL (index) > -1024
 	    && (INTVAL (index) & 3) == 0);
 
+  /* We have no such constraint on double mode offsets, so we permit the
+     full range of the instruction format.  */
+  if (TARGET_NEON && VALID_NEON_DREG_MODE (mode))
+    return (code == CONST_INT
+	    && INTVAL (index) < 1024
+	    && INTVAL (index) > -1024
+	    && (INTVAL (index) & 3) == 0);
+
   if (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (mode))
     return (code == CONST_INT
 	    && INTVAL (index) < 1024
@@ -5769,13 +5781,25 @@  thumb2_legitimate_index_p (enum machine_
 		&& (INTVAL (index) & 3) == 0);
     }
 
-  if (TARGET_NEON
-      && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode)))
+  /* For quad modes, we restrict the constant offset to be slightly less
+     than what the instruction format permits.  We do this because for
+     quad mode moves, we will actually decompose them into two separate
+     double-mode reads or writes.  INDEX must therefore be a valid
+     (double-mode) offset and so should INDEX+8.  */
+  if (TARGET_NEON && VALID_NEON_QREG_MODE (mode))
     return (code == CONST_INT
 	    && INTVAL (index) < 1016
 	    && INTVAL (index) > -1024
 	    && (INTVAL (index) & 3) == 0);
 
+  /* We have no such constraint on double mode offsets, so we permit the
+     full range of the instruction format.  */
+  if (TARGET_NEON && VALID_NEON_DREG_MODE (mode))
+    return (code == CONST_INT
+	    && INTVAL (index) < 1024
+	    && INTVAL (index) > -1024
+	    && (INTVAL (index) & 3) == 0);
+
   if (arm_address_register_rtx_p (index, strict_p)
       && (GET_MODE_SIZE (mode) <= 4))
     return 1;
Index: gcc/testsuite/gcc.target/arm/neon-offset-1.c
===================================================================
--- gcc/testsuite/gcc.target/arm/neon-offset-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/neon-offset-1.c	(revision 0)
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O1" } */
+/* { dg-add-options arm_neon } */
+
+#include <arm_neon.h>
+
+void neon_internal_error(int32x4_t *dst, char *src)
+{
+  *dst = *(int32x4_t *)(src+1008);
+}