diff mbox series

bswap: Handle bswapping of pointers [PR96573]

Message ID 20210331193052.GT1179226@tucnak
State New
Headers show
Series bswap: Handle bswapping of pointers [PR96573] | expand

Commit Message

Jakub Jelinek March 31, 2021, 7:30 p.m. UTC
Hi!

In GCC8/9 we used to optimize this into a bswap, but we no longer do.
Handling byteswapping of pointers is easy, all we need is to allow them,
for the __builtin_bswap* we already use TYPE_PRECISION to determine
the precision and we cast the operand and result to the correct type
if they aren't uselessly convertible to what the builtin expects.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-31  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/96573
	* gimple-ssa-store-merging.c (init_symbolic_number): Handle
	also pointer types.

	* gcc.dg/pr96573.c: New test.


	Jakub

Comments

Richard Biener April 1, 2021, 7:49 a.m. UTC | #1
On Wed, 31 Mar 2021, Jakub Jelinek wrote:

> Hi!
> 
> In GCC8/9 we used to optimize this into a bswap, but we no longer do.
> Handling byteswapping of pointers is easy, all we need is to allow them,
> for the __builtin_bswap* we already use TYPE_PRECISION to determine
> the precision and we cast the operand and result to the correct type
> if they aren't uselessly convertible to what the builtin expects.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2021-03-31  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/96573
> 	* gimple-ssa-store-merging.c (init_symbolic_number): Handle
> 	also pointer types.
> 
> 	* gcc.dg/pr96573.c: New test.
> 
> --- gcc/gimple-ssa-store-merging.c.jj	2021-02-22 17:54:05.701798074 +0100
> +++ gcc/gimple-ssa-store-merging.c	2021-03-31 11:30:55.519845647 +0200
> @@ -333,7 +333,7 @@ init_symbolic_number (struct symbolic_nu
>  {
>    int size;
>  
> -  if (! INTEGRAL_TYPE_P (TREE_TYPE (src)))
> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (src)) && !POINTER_TYPE_P (TREE_TYPE (src)))
>      return false;
>  
>    n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
> --- gcc/testsuite/gcc.dg/pr96573.c.jj	2021-03-31 11:39:19.382122478 +0200
> +++ gcc/testsuite/gcc.dg/pr96573.c	2021-03-31 11:39:54.186727148 +0200
> @@ -0,0 +1,20 @@
> +/* PR tree-optimization/96573 */
> +/* { dg-do compile { target { lp64 || ilp32 } } } */
> +/* { dg-require-effective-target bswap } */
> +/* { dg-options "-O3 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump "__builtin_bswap" "optimized" } } */
> +
> +typedef __SIZE_TYPE__ size_t;
> +
> +void *
> +foo (void * const p)
> +{
> +  const size_t m = sizeof (p) - 1;
> +  const unsigned char * const o = (unsigned char*) &p;
> +  void *n;
> +  unsigned char * const q = (unsigned char *) &n;
> +  unsigned char i;
> +  for (i = 0; i <= m; ++i)
> +    q[m - i] = o[i];
> +  return n;
> +}
> 
> 	Jakub
> 
>
Alex Coplan April 1, 2021, 1:16 p.m. UTC | #2
Hi Jakub,

On 31/03/2021 21:30, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> In GCC8/9 we used to optimize this into a bswap, but we no longer do.
> Handling byteswapping of pointers is easy, all we need is to allow them,
> for the __builtin_bswap* we already use TYPE_PRECISION to determine
> the precision and we cast the operand and result to the correct type
> if they aren't uselessly convertible to what the builtin expects.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2021-03-31  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/96573
> 	* gimple-ssa-store-merging.c (init_symbolic_number): Handle
> 	also pointer types.
> 
> 	* gcc.dg/pr96573.c: New test.

FYI, I'm seeing the new test failing on aarch64:

PASS: gcc.dg/pr96573.c (test for excess errors)
FAIL: gcc.dg/pr96573.c scan-tree-dump optimized "__builtin_bswap"

Thanks,
Alex

> 
> --- gcc/gimple-ssa-store-merging.c.jj	2021-02-22 17:54:05.701798074 +0100
> +++ gcc/gimple-ssa-store-merging.c	2021-03-31 11:30:55.519845647 +0200
> @@ -333,7 +333,7 @@ init_symbolic_number (struct symbolic_nu
>  {
>    int size;
>  
> -  if (! INTEGRAL_TYPE_P (TREE_TYPE (src)))
> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (src)) && !POINTER_TYPE_P (TREE_TYPE (src)))
>      return false;
>  
>    n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
> --- gcc/testsuite/gcc.dg/pr96573.c.jj	2021-03-31 11:39:19.382122478 +0200
> +++ gcc/testsuite/gcc.dg/pr96573.c	2021-03-31 11:39:54.186727148 +0200
> @@ -0,0 +1,20 @@
> +/* PR tree-optimization/96573 */
> +/* { dg-do compile { target { lp64 || ilp32 } } } */
> +/* { dg-require-effective-target bswap } */
> +/* { dg-options "-O3 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump "__builtin_bswap" "optimized" } } */
> +
> +typedef __SIZE_TYPE__ size_t;
> +
> +void *
> +foo (void * const p)
> +{
> +  const size_t m = sizeof (p) - 1;
> +  const unsigned char * const o = (unsigned char*) &p;
> +  void *n;
> +  unsigned char * const q = (unsigned char *) &n;
> +  unsigned char i;
> +  for (i = 0; i <= m; ++i)
> +    q[m - i] = o[i];
> +  return n;
> +}
> 
> 	Jakub
>
Jakub Jelinek April 6, 2021, 10:31 a.m. UTC | #3
On Thu, Apr 01, 2021 at 02:16:55PM +0100, Alex Coplan via Gcc-patches wrote:
> FYI, I'm seeing the new test failing on aarch64:
> 
> PASS: gcc.dg/pr96573.c (test for excess errors)
> FAIL: gcc.dg/pr96573.c scan-tree-dump optimized "__builtin_bswap"

The vectorizer in the aarch64 case manages to emit a VEC_PERM_EXPR instead
(which is just as efficient).

So, do we want to go for the following (and/or perhaps also restrict the test to
a couple of targets where it works?  In my last distro build it failed only
on aarch64-linux, while armv7hl-linux-gnueabi and
{i686,x86_64,powerpc64le,s390x}-linux were fine)?

2021-04-06  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/96573
	* gcc.dg/pr96573.c: Instead of __builtin_bswap accept also
	VEC_PERM_EXPR with bswapping permutation.

--- gcc/testsuite/gcc.dg/pr96573.c.jj	2021-04-01 10:50:56.238629197 +0200
+++ gcc/testsuite/gcc.dg/pr96573.c	2021-04-06 12:20:16.314520746 +0200
@@ -2,7 +2,7 @@
 /* { dg-do compile { target { lp64 || ilp32 } } } */
 /* { dg-require-effective-target bswap } */
 /* { dg-options "-O3 -fdump-tree-optimized" } */
-/* { dg-final { scan-tree-dump "__builtin_bswap" "optimized" } } */
+/* { dg-final { scan-tree-dump "__builtin_bswap\|VEC_PERM_EXPR\[^\n\r]*7, 6, 5, 4, 3, 2, 1, 0" "optimized" } } */
 
 typedef __SIZE_TYPE__ size_t;
 


	Jakub
Richard Biener April 6, 2021, 10:35 a.m. UTC | #4
On Tue, 6 Apr 2021, Jakub Jelinek wrote:

> On Thu, Apr 01, 2021 at 02:16:55PM +0100, Alex Coplan via Gcc-patches wrote:
> > FYI, I'm seeing the new test failing on aarch64:
> > 
> > PASS: gcc.dg/pr96573.c (test for excess errors)
> > FAIL: gcc.dg/pr96573.c scan-tree-dump optimized "__builtin_bswap"
> 
> The vectorizer in the aarch64 case manages to emit a VEC_PERM_EXPR instead
> (which is just as efficient).
> 
> So, do we want to go for the following (and/or perhaps also restrict the test to
> a couple of targets where it works?  In my last distro build it failed only
> on aarch64-linux, while armv7hl-linux-gnueabi and
> {i686,x86_64,powerpc64le,s390x}-linux were fine)?

Works for me.

> 2021-04-06  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/96573
> 	* gcc.dg/pr96573.c: Instead of __builtin_bswap accept also
> 	VEC_PERM_EXPR with bswapping permutation.
> 
> --- gcc/testsuite/gcc.dg/pr96573.c.jj	2021-04-01 10:50:56.238629197 +0200
> +++ gcc/testsuite/gcc.dg/pr96573.c	2021-04-06 12:20:16.314520746 +0200
> @@ -2,7 +2,7 @@
>  /* { dg-do compile { target { lp64 || ilp32 } } } */
>  /* { dg-require-effective-target bswap } */
>  /* { dg-options "-O3 -fdump-tree-optimized" } */
> -/* { dg-final { scan-tree-dump "__builtin_bswap" "optimized" } } */
> +/* { dg-final { scan-tree-dump "__builtin_bswap\|VEC_PERM_EXPR\[^\n\r]*7, 6, 5, 4, 3, 2, 1, 0" "optimized" } } */
>  
>  typedef __SIZE_TYPE__ size_t;
>  
> 
> 
> 	Jakub
> 
>
diff mbox series

Patch

--- gcc/gimple-ssa-store-merging.c.jj	2021-02-22 17:54:05.701798074 +0100
+++ gcc/gimple-ssa-store-merging.c	2021-03-31 11:30:55.519845647 +0200
@@ -333,7 +333,7 @@  init_symbolic_number (struct symbolic_nu
 {
   int size;
 
-  if (! INTEGRAL_TYPE_P (TREE_TYPE (src)))
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (src)) && !POINTER_TYPE_P (TREE_TYPE (src)))
     return false;
 
   n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
--- gcc/testsuite/gcc.dg/pr96573.c.jj	2021-03-31 11:39:19.382122478 +0200
+++ gcc/testsuite/gcc.dg/pr96573.c	2021-03-31 11:39:54.186727148 +0200
@@ -0,0 +1,20 @@ 
+/* PR tree-optimization/96573 */
+/* { dg-do compile { target { lp64 || ilp32 } } } */
+/* { dg-require-effective-target bswap } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump "__builtin_bswap" "optimized" } } */
+
+typedef __SIZE_TYPE__ size_t;
+
+void *
+foo (void * const p)
+{
+  const size_t m = sizeof (p) - 1;
+  const unsigned char * const o = (unsigned char*) &p;
+  void *n;
+  unsigned char * const q = (unsigned char *) &n;
+  unsigned char i;
+  for (i = 0; i <= m; ++i)
+    q[m - i] = o[i];
+  return n;
+}