[committed] match.pd: Don't optimize vector X + (X << C) -> X * (1 + (1 << C)) if there is no mult support [PR99544]

Message ID 20210313080158.GD231854@tucnak

Commit Message

Jakub Jelinek March 13, 2021, 8:01 a.m. UTC
Hi!

On aarch64, for example, the target has V2DImode addition and shift-by-scalar
optabs but no V2DImode multiply.  The following testcase ICEs because this
simplification is applied after the last vector lowering pass, so the
resulting vector multiplication is never lowered.  But even when it is applied
before lowering, turning the expression into a multiplication is not an
improvement: the multiplication has to be scalarized, while the original
addition and shift can be done in vectors.
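
To make the two forms concrete, here is a small standalone sketch using the
GCC/Clang vector extension.  It is not part of the patch; the names v2du,
shift_add, mult_form and check_forms are made up for illustration, and the
two 64-bit lanes only approximate what V2DImode holds.

```c
#include <assert.h>
#include <stdint.h>

/* Two unsigned 64-bit lanes (hypothetical stand-in for a V2DImode value).
   Unsigned lanes are used so that overflow wraps.  */
typedef uint64_t v2du __attribute__ ((vector_size (16)));

/* The form kept for vectors without a multiply optab: x + (x << c).
   Requires c < 64.  */
static v2du
shift_add (v2du x, unsigned c)
{
  v2du cnt = { c, c };
  return x + (x << cnt);
}

/* The form the simplification would produce: x * (1 + (1 << c)).
   Requires c < 64.  */
static v2du
mult_form (v2du x, unsigned c)
{
  uint64_t k = 1 + ((uint64_t) 1 << c);
  v2du m = { k, k };
  return x * m;
}

/* Check that both forms give the same result in every lane.  */
static void
check_forms (uint64_t a, uint64_t b, unsigned c)
{
  v2du x = { a, b };
  v2du s = shift_add (x, c);
  v2du m = mult_form (x, c);
  assert (s[0] == m[0] && s[1] == m[1]);
}
```

Both functions compute the same values modulo 2^64; the difference is only in
which operations the target has to provide.  With c == 2 this is the
multiplication by 5 from the testcase.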

It would be nice if we added expansion support for vector multiplication
by uniform constants using shifts and additions like we have for scalar
multiplication, but that is something that can be done in stage1.
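
As a rough idea of what such an expansion could compute, the sketch below
multiplies every lane by a uniform constant using one shift and one addition
per set bit of the constant.  This is only a naive binary decomposition for
illustration, not how GCC's scalar expansion chooses its sequences; v2du and
mul_uniform_const are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Two unsigned 64-bit lanes (GCC/Clang vector extension).  */
typedef uint64_t v2du __attribute__ ((vector_size (16)));

/* Multiply each lane of x by the uniform constant k, modulo 2^64,
   using only vector shifts and additions.  */
static v2du
mul_uniform_const (v2du x, uint64_t k)
{
  v2du acc = { 0, 0 };
  for (unsigned bit = 0; bit < 64; bit++)
    if ((k >> bit) & 1)
      {
	/* Add x * 2^bit for every bit that is set in k.  */
	v2du cnt = { bit, bit };
	acc += x << cnt;
      }
  return acc;
}
```

For k == 5 this reduces to x + (x << 2), i.e. the shift-and-add form the
simplification starts from.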

Bootstrapped/regtested on aarch64-linux, x86_64-linux and i686-linux,
acked by Richi in the PR, committed to trunk.

2021-03-13  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/99544
	* match.pd (X + (X << C) -> X * (1 + (1 << C))): Don't simplify
	if for vector types multiplication can't be done in type's mode.

	* gcc.dg/gomp/pr99544.c: New test.



	Jakub

Patch

--- gcc/match.pd.jj	2021-02-25 10:22:39.740401251 +0100
+++ gcc/match.pd	2021-03-12 11:51:08.375897831 +0100
@@ -2788,7 +2788,10 @@  (define_operator_list COND_TERNARY
  (plus:c @0 (lshift:s @0 INTEGER_CST@1))
   (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
        && tree_fits_uhwi_p (@1)
-       && tree_to_uhwi (@1) < element_precision (type))
+       && tree_to_uhwi (@1) < element_precision (type)
+       && (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+	   || optab_handler (smul_optab,
+			     TYPE_MODE (type)) != CODE_FOR_nothing))
    (with { tree t = type;
 	   if (!TYPE_OVERFLOW_WRAPS (t)) t = unsigned_type_for (t);
 	   wide_int w = wi::set_bit_in_zero (tree_to_uhwi (@1),
@@ -2804,7 +2807,10 @@  (define_operator_list COND_TERNARY
        && tree_fits_uhwi_p (@1)
        && tree_to_uhwi (@1) < element_precision (type)
        && tree_fits_uhwi_p (@2)
-       && tree_to_uhwi (@2) < element_precision (type))
+       && tree_to_uhwi (@2) < element_precision (type)
+       && (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+	   || optab_handler (smul_optab,
+			     TYPE_MODE (type)) != CODE_FOR_nothing))
    (with { tree t = type;
 	   if (!TYPE_OVERFLOW_WRAPS (t)) t = unsigned_type_for (t);
 	   unsigned int prec = element_precision (type);
--- gcc/testsuite/gcc.dg/gomp/pr99544.c.jj	2021-03-12 11:48:06.551906424 +0100
+++ gcc/testsuite/gcc.dg/gomp/pr99544.c	2021-03-12 11:49:32.020961796 +0100
@@ -0,0 +1,13 @@ 
+/* PR tree-optimization/99544 */
+/* { dg-do compile } */
+/* { dg-options "-Os -fopenmp" } */
+
+long
+foo (long a, long b, long c)
+{
+  long d, e;
+  #pragma omp teams distribute parallel for simd firstprivate (a, b, c) lastprivate(e)
+  for (d = a; d < b; d++)
+    e = c + d * 5;
+  return e;
+}