Patchwork [tree-ssa-math-opts] : Improve bswap detection

login
register
mail settings
Submitter Kai Tietz
Date June 27, 2011, 8:06 a.m.
Message ID <400461904.779176.1309161973367.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
Download mbox | patch
Permalink /patch/102120/
State New
Headers show

Comments

Kai Tietz - June 27, 2011, 8:06 a.m.
Hello,

this patch improves the bswap (32/64) detection and fixes an issue about shifted bit values out of type-size precision.

ChangeLog

2011-06-27  Kai Tietz  <ktietz@redhat.com>

        * tree-ssa-math-opts.c (do_shift_rotate): Zero bits
        out of type precision after operation.
        (find_bswap): Take for limit value the integer auto-
        promotion into account.

ChangeLog

2011-06-27  Kai Tietz  <ktietz@redhat.com>

        * gcc.dg/optimize-bswapdi-2.c: New test.

Bootstrapped and regression tested for x86_64-pc-linux-gnu. Ok for apply?

Regards,
Kai
Richard Guenther - June 27, 2011, 9:42 a.m.
On Mon, Jun 27, 2011 at 10:06 AM, Kai Tietz <ktietz@redhat.com> wrote:
> Hello,
>
> this patch improves the bswap (32/64) detection and fixes an issue about shifted bit values out of type-size precision.
>
> ChangeLog
>
> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>
>        * tree-ssa-math-opts.c (do_shift_rotate): Zero bits
>        out of type precision after operation.
>        (find_bswap): Take for limit value the integer auto-
>        promotion into account.
>
> ChangeLog
>
> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>
>        * gcc.dg/optimize-bswapdi-2.c: New test.
>
> Bootstrapped and regression tested for x86_64-pc-linux-gnu. Ok for apply?

The do_shift_rotate hunk is ok.  Where did you get the testcase from?
I can easily construct testcases that go via an intermediate SImode step
and the still won't be recongized.  Thus, is the testcase practically
relevant?  If so we should add 1 + log2 (byte-size) instead of 3 to also
cover the SImode -> HImode intermediate case.

Thanks,
Richard.

> Regards,
> Kai
>
Kai Tietz - June 27, 2011, 9:50 a.m.
2011/6/27 Richard Guenther <richard.guenther@gmail.com>:
> On Mon, Jun 27, 2011 at 10:06 AM, Kai Tietz <ktietz@redhat.com> wrote:
>> Hello,
>>
>> this patch improves the bswap (32/64) detection and fixes an issue about shifted bit values out of type-size precision.
>>
>> ChangeLog
>>
>> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>>
>>        * tree-ssa-math-opts.c (do_shift_rotate): Zero bits
>>        out of type precision after operation.
>>        (find_bswap): Take for limit value the integer auto-
>>        promotion into account.
>>
>> ChangeLog
>>
>> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>>
>>        * gcc.dg/optimize-bswapdi-2.c: New test.
>>
>> Bootstrapped and regression tested for x86_64-pc-linux-gnu. Ok for apply?
>
> The do_shift_rotate hunk is ok.  Where did you get the testcase from?
> I can easily construct testcases that go via an intermediate SImode step
> and the still won't be recongized.  Thus, is the testcase practically
> relevant?  If so we should add 1 + log2 (byte-size) instead of 3 to also
> cover the SImode -> HImode intermediate case.
>
> Thanks,
> Richard.

Ok, I can adjust it to use here 1 + 3 (for byte-size covering the
SImode -> HImode intermediate case).  I was seen a regression for
bswap detection by doing in forward-propagate a type-sinking on (type)
X op (type) Y (for X and Y with compatible type), and (type) X op CST
(for ((type) (type-x) CST) == CST case.  I will sent soon a patch for
this optimization related thing. By this I detected that bswap
algorithm didn't zero'ed upper bits out of size-range.

Ok with that change?

Regards,
Kai
Richard Guenther - June 27, 2011, 10:07 a.m.
On Mon, Jun 27, 2011 at 11:50 AM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/6/27 Richard Guenther <richard.guenther@gmail.com>:
>> On Mon, Jun 27, 2011 at 10:06 AM, Kai Tietz <ktietz@redhat.com> wrote:
>>> Hello,
>>>
>>> this patch improves the bswap (32/64) detection and fixes an issue about shifted bit values out of type-size precision.
>>>
>>> ChangeLog
>>>
>>> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>>>
>>>        * tree-ssa-math-opts.c (do_shift_rotate): Zero bits
>>>        out of type precision after operation.
>>>        (find_bswap): Take for limit value the integer auto-
>>>        promotion into account.
>>>
>>> ChangeLog
>>>
>>> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>>>
>>>        * gcc.dg/optimize-bswapdi-2.c: New test.
>>>
>>> Bootstrapped and regression tested for x86_64-pc-linux-gnu. Ok for apply?
>>
>> The do_shift_rotate hunk is ok.  Where did you get the testcase from?
>> I can easily construct testcases that go via an intermediate SImode step
>> and the still won't be recongized.  Thus, is the testcase practically
>> relevant?  If so we should add 1 + log2 (byte-size) instead of 3 to also
>> cover the SImode -> HImode intermediate case.
>>
>> Thanks,
>> Richard.
>
> Ok, I can adjust it to use here 1 + 3 (for byte-size covering the
> SImode -> HImode intermediate case).  I was seen a regression for
> bswap detection by doing in forward-propagate a type-sinking on (type)
> X op (type) Y (for X and Y with compatible type), and (type) X op CST
> (for ((type) (type-x) CST) == CST case.  I will sent soon a patch for
> this optimization related thing. By this I detected that bswap
> algorithm didn't zero'ed upper bits out of size-range.
>
> Ok with that change?

Ok with using 1 + ceil_log2 (TREE_INT_CST_LOW (TYPE_SIZE_UNIT (...))),
CSE that TREE_INT_CST_LOW.

Thanks,
Richard.

> Regards,
> Kai
>

Patch

Index: gcc-head/gcc/testsuite/gcc.dg/optimize-bswapdi-2.c
===================================================================
--- /dev/null
+++ gcc-head/gcc/testsuite/gcc.dg/optimize-bswapdi-2.c
@@ -0,0 +1,26 @@ 
+/* { dg-do compile { target arm*-*-* alpha*-*-* ia64*-*-* x86_64-*-* s390x-*-* powerpc*-*-* rs6000-*-* } } */
+/* { dg-require-effective-target stdint_types } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2 -fdump-tree-bswap" } */
+
+#include <stdint.h>
+
+/* A variant via unsigned short.  */
+
+uint64_t
+swap64_c (uint64_t x)
+{
+  uint16_t a0 = x >> 48;
+  uint16_t a1 = x >> 32;
+  uint16_t a2 = x >> 16;
+  uint16_t a3 = x;
+
+  return ((uint64_t) (((a0 >> 8) & 0xff) | ((a0 << 8) & 0xff00)))
+	| ((uint64_t) (((a1 >> 8) & 0xff) | ((a1 << 8) & 0xff00)) << 16)
+	| ((uint64_t) (((a2 >> 8) & 0xff) | ((a2 << 8) & 0xff00)) << 32)
+	| ((uint64_t) (((a3 >> 8) & 0xff) | ((a3 << 8) & 0xff00)) << 48);
+}
+
+
+/* { dg-final { scan-tree-dump-times "64 bit bswap implementation found at" 1 "bswap" } } */
+/* { dg-final { cleanup-tree-dump "bswap" } } */
Index: gcc-head/gcc/tree-ssa-math-opts.c
===================================================================
--- gcc-head.orig/gcc/tree-ssa-math-opts.c
+++ gcc-head/gcc/tree-ssa-math-opts.c
@@ -1543,6 +1543,9 @@  do_shift_rotate (enum tree_code code,
     default:
       return false;
     }
+  /* Zero unused bits for size.  */
+  if (n->size < (int)sizeof (HOST_WIDEST_INT))
+    n->n &= ((unsigned HOST_WIDEST_INT)1 << (n->size * BITS_PER_UNIT)) - 1;
   return true;
 }
 
@@ -1743,12 +1746,12 @@  find_bswap (gimple stmt)
 
   /* The last parameter determines the depth search limit.  It usually
      correlates directly to the number of bytes to be touched.  We
-     increase that number by one here in order to also cover signed ->
-     unsigned conversions of the src operand as can be seen in
-     libgcc.  */
+     increase that number by three  here in order to also
+     cover signed -> unsigned converions of the src operand as can be seen
+     in libgcc, and for initial shift/and operation of the src operand.  */
   source_expr =  find_bswap_1 (stmt, &n,
 			       TREE_INT_CST_LOW (
-				 TYPE_SIZE_UNIT (gimple_expr_type (stmt))) + 1);
+				 TYPE_SIZE_UNIT (gimple_expr_type (stmt))) + 3);
 
   if (!source_expr)
     return NULL_TREE;