Message ID | 20101202203503.GV24280@codesourcery.com |
---|---|
State | New |
Headers | show |
On 2 Dec 2010, at 20:35, "Nathan Froyd" <froydnj@codesourcery.com> wrote: > The simple testcase below: > > void neon_internal_error(int32x4_t *dst, char *src) > { > *dst = *(int32x4_t *)(src+1008); > } > > ICEs in change_address_1 during assembly output. The offending call > originates in output_move_neon from the adjust_address call below: > > for (i = 0; i < nregs; i++) > { > /* We're only using DImode here because it's a convenient size. */ > ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i); > ops[1] = adjust_address (mem, DImode, 8 * i); > if (reg_overlap_mentioned_p (ops[0], mem)) > { > gcc_assert (overlap == -1); > overlap = i; > } > else > { > sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st"); > output_asm_insn (buff, ops); > } > } > > when we attempt to decompose a NEON quad mode move into two double mode > moves. > > The problem is that we want to generate the address [X:SI+1016]:DI, > which is, according to arm_legitimate_address_p, not a legitimate > address: > > if (TARGET_NEON > && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode))) > return (code == CONST_INT > && INTVAL (index) < 1016 > && INTVAL (index) > -1024 > && (INTVAL (index) & 3) == 0); > > This test is not correct. V{LD,ST}R accept offsets evenly divisible by > 4 from -1020 to 1020. The above test is attempting to ensure that > whenever we have an address X+OFFSET for a quad mode, we can safely form the > address X+OFFSET+8 for the second (double mode) half of the quad mode > value. Hence the slightly smaller upper bound. But for double modes, > we don't have any such restriction, and we can safely permit the full > addressing range supported by the hardware. > > The patch below, therefore, splits the test in two, one for NEON double > modes and one for NEON quad modes, with appropriate ranges for each. > > Tested on arm-none-eabi. OK to commit? > > -Nathan > > gcc/ > * config/arm/arm.c (arm_legitimate_index_p): Split > VALID_NEON_QREG_MODE and VALID_NEON_DREG_MODE cases. Permit > slightly larger constants in the latter case. > (thumb2_legitimate_index_p): Likewise. > > gcc/testsuite/ > * gcc.target/arm/neon-offset-1.c: New test. Ok. R.
On Thu, Dec 02, 2010 at 11:00:31PM +0000, Richard Earnshaw wrote: > On 2 Dec 2010, at 20:35, "Nathan Froyd" <froydnj@codesourcery.com> wrote: > > gcc/ > > * config/arm/arm.c (arm_legitimate_index_p): Split > > VALID_NEON_QREG_MODE and VALID_NEON_DREG_MODE cases. Permit > > slightly larger constants in the latter case. > > (thumb2_legitimate_index_p): Likewise. > > > > gcc/testsuite/ > > * gcc.target/arm/neon-offset-1.c: New test. > > Ok. Thanks. Is this OK for branches, too, or just mainline? -Nathan
On Thu, 2010-12-02 at 19:27 -0500, Nathan Froyd wrote: > On Thu, Dec 02, 2010 at 11:00:31PM +0000, Richard Earnshaw wrote: > > On 2 Dec 2010, at 20:35, "Nathan Froyd" <froydnj@codesourcery.com> wrote: > > > gcc/ > > > * config/arm/arm.c (arm_legitimate_index_p): Split > > > VALID_NEON_QREG_MODE and VALID_NEON_DREG_MODE cases. Permit > > > slightly larger constants in the latter case. > > > (thumb2_legitimate_index_p): Likewise. > > > > > > gcc/testsuite/ > > > * gcc.target/arm/neon-offset-1.c: New test. > > > > Ok. > > Thanks. Is this OK for branches, too, or just mainline? > > -Nathan If the bug can be reproduced on the branch, then yes, this is ok there too. If not, then please stick to mainline only. R.
Index: gcc/config/arm/arm.c =================================================================== --- gcc/config/arm/arm.c (revision 167383) +++ gcc/config/arm/arm.c (working copy) @@ -5649,13 +5649,25 @@ arm_legitimate_index_p (enum machine_mod && INTVAL (index) > -1024 && (INTVAL (index) & 3) == 0); - if (TARGET_NEON - && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode))) + /* For quad modes, we restrict the constant offset to be slightly less + than what the instruction format permits. We do this because for + quad mode moves, we will actually decompose them into two separate + double-mode reads or writes. INDEX must therefore be a valid + (double-mode) offset and so should INDEX+8. */ + if (TARGET_NEON && VALID_NEON_QREG_MODE (mode)) return (code == CONST_INT && INTVAL (index) < 1016 && INTVAL (index) > -1024 && (INTVAL (index) & 3) == 0); + /* We have no such constraint on double mode offsets, so we permit the + full range of the instruction format. */ + if (TARGET_NEON && VALID_NEON_DREG_MODE (mode)) + return (code == CONST_INT + && INTVAL (index) < 1024 + && INTVAL (index) > -1024 + && (INTVAL (index) & 3) == 0); + if (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (mode)) return (code == CONST_INT && INTVAL (index) < 1024 @@ -5769,13 +5781,25 @@ thumb2_legitimate_index_p (enum machine_ && (INTVAL (index) & 3) == 0); } - if (TARGET_NEON - && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode))) + /* For quad modes, we restrict the constant offset to be slightly less + than what the instruction format permits. We do this because for + quad mode moves, we will actually decompose them into two separate + double-mode reads or writes. INDEX must therefore be a valid + (double-mode) offset and so should INDEX+8. */ + if (TARGET_NEON && VALID_NEON_QREG_MODE (mode)) return (code == CONST_INT && INTVAL (index) < 1016 && INTVAL (index) > -1024 && (INTVAL (index) & 3) == 0); + /* We have no such constraint on double mode offsets, so we permit the + full range of the instruction format. */ + if (TARGET_NEON && VALID_NEON_DREG_MODE (mode)) + return (code == CONST_INT + && INTVAL (index) < 1024 + && INTVAL (index) > -1024 + && (INTVAL (index) & 3) == 0); + if (arm_address_register_rtx_p (index, strict_p) && (GET_MODE_SIZE (mode) <= 4)) return 1; Index: gcc/testsuite/gcc.target/arm/neon-offset-1.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-offset-1.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-offset-1.c (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O1" } */ +/* { dg-add-options arm_neon } */ + +#include <arm_neon.h> + +void neon_internal_error(int32x4_t *dst, char *src) +{ + *dst = *(int32x4_t *)(src+1008); +}