PR91166 - Unfolded ZIPs of constants

Message ID CAAgBjM=yhvx93aAZE8Q9Y2fkY5AQrvujJKDoCBc3eKPYmROUoQ@mail.gmail.com
State New

Commit Message

Prathamesh Kulkarni July 17, 2019, 11:55 a.m. UTC
Hi,
The attached patch tries to fix PR91166.
Does it look OK?
Bootstrap+test in progress on aarch64-linux-gnu and x86_64-unknown-linux-gnu.

Thanks,
Prathamesh
2019-07-17  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	PR middle-end/91166
	* match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
	(define_predicates): Add entry for uniform_vector_p.

testsuite/
	* gcc.target/aarch64/sve/pr91166.c: New test.
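
The fold rests on a simple observation: when both inputs of a two-input permute are the same vector and every lane of that vector holds the same value, any selector reproduces the input. A minimal C sketch of that reasoning follows. `model_vec_perm`, `model_uniform_p` and `perm_is_identity` are hypothetical helpers, not GCC functions, and the selector semantics used here (each index taken modulo twice the lane count into the concatenation of both inputs) are an assumption about VEC_PERM_EXPR rather than something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a two-input permute (assumed semantics, not GCC code):
   out[i] = concat (a, b)[sel[i] mod 2n].  */
static void
model_vec_perm (const double *a, const double *b, const unsigned *sel,
                double *out, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      size_t idx = sel[i] % (2 * n);
      out[i] = idx < n ? a[idx] : b[idx - n];
    }
}

/* Return true if every element of V equals V[0].  */
static bool
model_uniform_p (const double *v, size_t n)
{
  for (size_t i = 1; i < n; ++i)
    if (v[i] != v[0])
      return false;
  return true;
}

/* Return true if permuting (V, V) with SEL reproduces V exactly.  */
static bool
perm_is_identity (const double *v, const unsigned *sel, size_t n)
{
  double out[16];
  assert (n <= 16);
  model_vec_perm (v, v, sel, out, n);
  for (size_t i = 0; i < n; ++i)
    if (out[i] != v[i])
      return false;
  return true;
}
```

With a zip-style selector such as {0, 4, 1, 5}, `perm_is_identity` holds for {2.5, 2.5, 2.5, 2.5} but not for {1, 2, 3, 4}, which is why the pattern has to require a uniform input and not just identical operands.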

Comments

Richard Sandiford July 19, 2019, 12:42 p.m. UTC | #1
Not really my area, but FWIW...

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 4a7aa0185d8..2ad98c28fd8 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
>     integer_valued_real_p
>     integer_pow2p
>     uniform_integer_cst_p
> -   HONOR_NANS)
> +   HONOR_NANS
> +   uniform_vector_p)
>  
>  /* Operator lists.  */
>  (define_operator_list tcc_comparison
> @@ -5568,3 +5569,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>           { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
>         (if (changed)
>          (vec_perm { op0; } { op1; } { op2; }))))))))))
> +
> +/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element.  */
> +(simplify
> + (vec_perm (vec_duplicate@0 @1) @0 @2)
> + { @0; })
> +
> +(simplify
> + (vec_perm uniform_vector_p@0 @0 @1)
> + { @0; }) 

No need for the curly braces here, can use "@0" as the target of
the simplification.

It'd probably be worth using (match ...) to define a new predicate
that handles (vec_duplicate ...), VECTOR_CST and CONSTRUCTOR,
calling into uniform_vector_p for the latter two.

Thanks,
Richard
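
The shape of the suggested predicate can be sketched in plain C as a toy model. Everything here is hypothetical: `toy_kind`, `toy_vec`, `toy_uniform_vector_p` and `toy_vec_same_elem_p` are illustrative stand-ins, not GCC types or functions. The sketch only shows the dispatch described above, where a duplicate is accepted outright and the constant and constructor forms defer to a uniformity check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three tree shapes named above.  */
enum toy_kind
{
  TOY_VEC_DUPLICATE,
  TOY_VECTOR_CST,
  TOY_CONSTRUCTOR,
  TOY_OTHER
};

struct toy_vec
{
  enum toy_kind kind;
  const double *elts;   /* Element values; unused for TOY_VEC_DUPLICATE.  */
  size_t nelts;
};

/* Toy analogue of a uniformity check: all listed elements are equal.  */
static bool
toy_uniform_vector_p (const struct toy_vec *v)
{
  assert (v->elts != NULL || v->nelts == 0);
  if (v->nelts == 0)
    return false;
  for (size_t i = 1; i < v->nelts; ++i)
    if (v->elts[i] != v->elts[0])
      return false;
  return true;
}

/* Toy analogue of the combined predicate.  */
static bool
toy_vec_same_elem_p (const struct toy_vec *v)
{
  switch (v->kind)
    {
    case TOY_VEC_DUPLICATE:
      return true;                      /* Uniform by construction.  */
    case TOY_VECTOR_CST:
    case TOY_CONSTRUCTOR:
      return toy_uniform_vector_p (v);  /* Needs an element-wise check.  */
    default:
      return false;
    }
}
```

Keeping the three cases behind one predicate means the simplification itself only has to be written once.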

> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
> new file mode 100644
> index 00000000000..42654be3b31
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
> +
> +void
> +f1 (double x[][4]) 
> +{
> +  for (int i = 0; i < 4; ++i)
> +    for (int j = 0; j < 4; ++j)
> +      x[i][j] = 0;
> +}
> +
> +void
> +f2 (double x[][4], double y)
> +{
> +  for (int i = 0; i < 4; ++i)
> +    for (int j = 0; j < 4; ++j)
> +      x[i][j] = y;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
Prathamesh Kulkarni July 23, 2019, 10:34 a.m. UTC | #2
On Fri, 19 Jul 2019 at 18:12, Richard Sandiford
<richard.sandiford@arm.com> wrote:
> No need for the curly braces here, can use "@0" as the target of
> the simplification.
>
> It'd probably be worth using (match ...) to define a new predicate
> that handles (vec_duplicate ...), VECTOR_CST and CONSTRUCTOR,
> calling into uniform_vector_p for the latter two.
Hi,
Thanks for the suggestions.
Does this version look OK?

Thanks,
Prathamesh

2019-07-23  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

	PR middle-end/91166
	* match.pd (vec_perm_expr(v, v, mask) -> v): New pattern.
	(define_predicates): Add entry for uniform_vector_p.
	(vec_same_elem_p): New match pattern.

testsuite/
	* gcc.target/aarch64/sve/pr91166.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index 4a7aa0185d8..f14670a7982 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
    integer_valued_real_p
    integer_pow2p
    uniform_integer_cst_p
-   HONOR_NANS)
+   HONOR_NANS
+   uniform_vector_p)
 
 /* Operator lists.  */
 (define_operator_list tcc_comparison
@@ -5568,3 +5569,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
          { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
        (if (changed)
         (vec_perm { op0; } { op1; } { op2; }))))))))))
+
+/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element.  */
+
+(match (vec_same_elem_p @0)
+ uniform_vector_p@0)
+
+(match (vec_same_elem_p @0)
+ (vec_duplicate @0))
+
+(simplify
+ (vec_perm (vec_same_elem_p@0 @1) @0 @2)
+ @0)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
new file mode 100644
index 00000000000..42654be3b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+void
+f1 (double x[][4]) 
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = 0;
+}
+
+void
+f2 (double x[][4], double y)
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */
Richard Biener July 23, 2019, 11:06 a.m. UTC | #3
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:

> Hi,
> Thanks for the suggestions.
> Does this version look OK?

Can you write

+(simplify
+ (vec_perm (vec_same_elem_p@0 @1) @0 @2)
+ @0)

as

 (vec_perm vec_same_elem_p@0 @0 @1)

?

Otherwise looks OK.

Thanks,
Richard.
 
Prathamesh Kulkarni July 23, 2019, 11:55 a.m. UTC | #4
On Tue, 23 Jul 2019 at 16:36, Richard Biener <rguenther@suse.de> wrote:
> Can you write
>
> +(simplify
> + (vec_perm (vec_same_elem_p@0 @1) @0 @2)
> + @0)
>
> as
>
>  (vec_perm vec_same_elem_p@0 @0 @1)
>
> ?
(simplify
 (vec_perm vec_same_elem_p@0 @0 @1)
 @0)

results in:
gimple-match.c: In function ‘bool gimple_simplify_VEC_PERM_EXPR(gimple_match_op*, gimple**, tree_node* (*)(tree), code_helper, tree, tree, tree, tree)’:
gimple-match.c:103031:36: error: cannot convert ‘tree_node* (*)(tree)’ {aka ‘tree_node* (*)(tree_node*)’} to ‘tree_node**’
   if (gimple_vec_same_elem_p (op0, valueize))
                                    ^~~~~~~~

because gimple_vec_same_elem_p has tree *res_ops as 2nd param and
we're passing valueize as 2nd arg.

Thanks,
Prathamesh
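
The mismatch described above can be pictured with two function shapes. This is a toy C sketch under stated assumptions: `toy_node`, `toy_tree`, `toy_valueize_fn` and both predicates are hypothetical stand-ins, and the only facts taken from the thread are that the capture-style predicate has an operand array as its second parameter while the call site passes only the tree and the valueize callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tree and valueize callback types.  */
struct toy_node
{
  int uniform;                /* Nonzero if this node is a uniform vector.  */
  struct toy_node *operand;   /* The duplicated element, if any.  */
};
typedef struct toy_node *toy_tree;
typedef toy_tree (*toy_valueize_fn) (toy_tree);

/* Shape assumed for a predicate declared with captures, as in
   (match (vec_same_elem_p @0) ...): matched operands come back
   through RES_OPS, so callers must supply that array.  */
static bool
toy_pred_with_captures (toy_tree t, toy_tree *res_ops,
                        toy_valueize_fn valueize)
{
  (void) valueize;
  assert (res_ops != NULL);
  if (t == NULL || !t->uniform)
    return false;
  res_ops[0] = t->operand;
  return true;
}

/* Shape assumed for a predicate declared without captures, as in
   (match vec_same_elem_p ...): a call with (tree, valueize) fits.  */
static bool
toy_pred_plain (toy_tree t, toy_valueize_fn valueize)
{
  (void) valueize;
  return t != NULL && t->uniform;
}

/* Trivial valueize callback for the sketch.  */
static toy_tree
toy_identity (toy_tree t)
{
  return t;
}
```

A use such as `vec_same_elem_p@0` supplies no operand list, so only the second shape matches the two-argument call that ends up in the generated code.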
Richard Biener July 23, 2019, 12:18 p.m. UTC | #5
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:

> (simplify
>  (vec_perm vec_same_elem_p@0 @0 @1)
>  @0)
> 
> results in:
> gimple-match.c: In function ‘bool
> gimple_simplify_VEC_PERM_EXPR(gimple_match_op*, gimple**, tree_node*
> (*)(tree), code_helper, tree, tree, tree, tree)’:
> gimple-match.c:103031:36: error: cannot convert ‘tree_node* (*)(tree)’
> {aka ‘tree_node* (*)(tree_node*)’} to ‘tree_node**’
>    if (gimple_vec_same_elem_p (op0, valueize))
>                                     ^~~~~~~~
> 
> because gimple_vec_same_elem_p has tree *res_ops as 2nd param and
> we're passing valueize as 2nd arg.

Ah, you need the

(match vec_same_elem_p
 @0
 (if (uniform_vector_p (@0))))

(match vec_same_elem_p
 (vec_duplicate @0))

form then.

Prathamesh Kulkarni July 23, 2019, 12:38 p.m. UTC | #6
On Tue, 23 Jul 2019 at 17:48, Richard Biener <rguenther@suse.de> wrote:
> Ah, you need the
>
> (match vec_same_elem_p
>  @0
>  (if (uniform_vector_p (@0))))
>
> (match vec_same_elem_p
>  (vec_duplicate @0))
>
> form then.
Thanks, that worked.
Is the attached patch OK to commit?

Thanks,
Prathamesh
Richard Biener July 23, 2019, 12:44 p.m. UTC | #7
On Tue, 23 Jul 2019, Prathamesh Kulkarni wrote:

> Thanks, that worked.
> Is the attached patch OK to commit?

Yes.

Thanks,
Richard.


Patch

diff --git a/gcc/match.pd b/gcc/match.pd
index 4a7aa0185d8..2ad98c28fd8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -36,7 +36,8 @@  along with GCC; see the file COPYING3.  If not see
    integer_valued_real_p
    integer_pow2p
    uniform_integer_cst_p
-   HONOR_NANS)
+   HONOR_NANS
+   uniform_vector_p)
 
 /* Operator lists.  */
 (define_operator_list tcc_comparison
@@ -5568,3 +5569,12 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
          { bitsize_int (at * tree_to_uhwi (TYPE_SIZE (TREE_TYPE (type)))); })
        (if (changed)
         (vec_perm { op0; } { op1; } { op2; }))))))))))
+
+/* VEC_PERM_EXPR (v, v, mask) -> v where v contains same element.  */
+(simplify
+ (vec_perm (vec_duplicate@0 @1) @0 @2)
+ { @0; })
+
+(simplify
+ (vec_perm uniform_vector_p@0 @0 @1)
+ { @0; }) 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
new file mode 100644
index 00000000000..42654be3b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr91166.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fdump-tree-optimized" } */
+
+void
+f1 (double x[][4]) 
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = 0;
+}
+
+void
+f2 (double x[][4], double y)
+{
+  for (int i = 0; i < 4; ++i)
+    for (int j = 0; j < 4; ++j)
+      x[i][j] = y;
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized"} } */