[committed] aarch64: Fix address mode for vec_concat pattern [PR100305]

Message ID mpto8dybb5q.fsf@arm.com
State New
Series [committed] aarch64: Fix address mode for vec_concat pattern [PR100305]

Commit Message

Richard Sandiford April 28, 2021, 4:56 p.m. UTC
The load_pair_lanes<mode> patterns match a vec_concat of two
adjacent 64-bit memory locations as a single 128-bit load.
The Utq constraint made sure that the address was suitable
for a 128-bit vector, but this meant that it allowed some
addresses that aren't valid for the 64-bit element mode.

Two obvious fixes were:

(1) Continue to accept addresses that aren't valid for the element
    modes.  This would mean changing the mode of operands[1] before
    printing it.  It would also mean using a custom predicate instead
    of the current memory_operand.

(2) Restrict addresses to the intersection of those that are valid
    element and vector addresses.

The problem with (1) is that, as well as being more complicated,
it doesn't deal with the fact that we still have a memory_operand
for the second element.  If we encourage the first operand to be
outside the range of a normal element memory_operand, we'll have
to reload the second operand to make it valid.  This reload will
often be dead code, but will be kept around because the RTL
pattern makes it look as though the second element address
is still needed.

This patch therefore does (2) instead.
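
To illustrate what "intersection" means for (2), here is a minimal
sketch in C.  It is not GCC code: offset_ok_for_size, old_utq_ok and
new_utq_ok are hypothetical names, and the model only covers the two
base-plus-immediate forms of the AArch64 LDR/STR (SIMD&FP)
instructions (unscaled signed 9-bit, and unsigned 12-bit scaled by the
access size).  The real aarch64_legitimate_address_p handles many more
address forms.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model (an assumption, not GCC's logic): is OFFSET a valid
   base+immediate offset for an access of SIZE bytes (8 for a D
   register, 16 for a Q register)?  */
static inline bool
offset_ok_for_size (int64_t offset, int64_t size)
{
  /* Unscaled signed 9-bit form (LDUR/STUR).  */
  if (offset >= -256 && offset <= 255)
    return true;

  /* Unsigned 12-bit immediate, scaled by the access size.  */
  return offset >= 0 && offset % size == 0 && offset / size <= 4095;
}

/* Model of the old check: only the 128-bit vector mode is tested.  */
static inline bool
old_utq_ok (int64_t offset)
{
  return offset_ok_for_size (offset, 16);
}

/* Model of the new check: the offset must be valid for both the
   64-bit element mode and the 128-bit vector mode.  */
static inline bool
new_utq_ok (int64_t offset)
{
  return offset_ok_for_size (offset, 8) && offset_ok_for_size (offset, 16);
}
```

In this model an offset of 32768 bytes (x[N] relative to the start of
x in the new test, with N == 4096 and 8-byte doubles) passes
old_utq_ok, since 32768 / 16 = 2048, but fails new_utq_ok, since
32768 / 8 = 4096 exceeds the 12-bit field.  The offset that the
compiler actually sees depends on the frame layout, so this is only
indicative.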

As mentioned in the PR notes, I think we have a general problem
with the way that the aarch64 port deals with paired addresses.
There's nothing to guarantee that the two addresses will be
reloaded in a way that keeps them “obviously” adjacent, so the
rtx_equal_p conditions could fail if something rechecked them
later.

For this particular pattern, I think it would be better to teach
simplify-rtx.c to fold the vec_concat to a normal vector memory
reference, to remove any suggestion that targets should try to
match the unsimplified form.  That obviously wouldn't be suitable
for backports though.

Tested on aarch64-linux-gnu.  Pushed to trunk so far; will backport
to GCC 11 and GCC 10 too.

Richard


gcc/
	PR target/100305
	* config/aarch64/constraints.md (Utq): Require the address to
	be valid for both the element mode and for V2DImode.

gcc/testsuite/
	PR target/100305
	* gcc.c-torture/compile/pr100305.c: New test.
---
 gcc/config/aarch64/constraints.md              |  2 ++
 gcc/testsuite/gcc.c-torture/compile/pr100305.c | 13 +++++++++++++
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr100305.c

Patch

diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index fd3e925e9a3..3b49b452119 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -327,6 +327,8 @@  (define_relaxed_memory_constraint "Utq"
   "@internal
    An address valid for loading or storing a 128-bit AdvSIMD register"
   (and (match_code "mem")
+       (match_test "aarch64_legitimate_address_p (GET_MODE (op),
+						  XEXP (op, 0), 1)")
        (match_test "aarch64_legitimate_address_p (V2DImode,
 						  XEXP (op, 0), 1)")))
 
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr100305.c b/gcc/testsuite/gcc.c-torture/compile/pr100305.c
new file mode 100644
index 00000000000..e098b90b589
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr100305.c
@@ -0,0 +1,13 @@ 
+/* { dg-options "-O" } */
+
+typedef double v2df __attribute__((vector_size(16)));
+
+#define N 4096
+void consume (void *);
+v2df
+foo (void)
+{
+  double x[N+2];
+  consume (x);
+  return (v2df) { x[N], x[N + 1] };
+}