From patchwork Thu Oct 14 15:58:18 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dmitry Melnik <dm@ispras.ru>
X-Patchwork-Id: 67838
Return-Path: 
 <gcc-patches-return-275438-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	by ozlabs.org (Postfix) with SMTP id 91DAAB70E2
	for <incoming@patchwork.ozlabs.org>;
	Fri, 15 Oct 2010 02:58:36 +1100 (EST)
Received: (qmail 29323 invoked by alias); 14 Oct 2010 15:58:30 -0000
Received: (qmail 29186 invoked by uid 22791); 14 Oct 2010 15:58:27 -0000
X-SWARE-Spam-Status: No, hits=-0.9 required=5.0	tests=AWL, BAYES_00,
	T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from smtp.ispras.ru (HELO smtp.ispras.ru) (83.149.198.201) by
	sourceware.org (qpsmtpd/0.43rc1) with ESMTP;
	Thu, 14 Oct 2010 15:58:21 +0000
Received: from ispserv.ispras.ru (ispserv.ispras.ru [83.149.198.72])	by
	smtp.ispras.ru (Postfix) with ESMTP id A906C5D4122	for
	<gcc-patches@gcc.gnu.org>; Thu, 14 Oct 2010 19:56:38 +0400 (MSD)
Received: from [10.10.3.58] (winnie.ispras.ru [83.149.198.236])	by
	ispserv.ispras.ru (Postfix) with ESMTP id 68DD23FC48	for
	<gcc-patches@gcc.gnu.org>; Thu, 14 Oct 2010 19:58:19 +0400 (MSD)
Message-ID: <4CB7289A.7080608@ispras.ru>
Date: Thu, 14 Oct 2010 19:58:18 +0400
From: Dmitry Melnik <dm@ispras.ru>
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US;
	rv:1.9.1.10) Gecko/20100528 Thunderbird/3.0.5
MIME-Version: 1.0
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fold integer divisions to arithmetic shifts
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

We've found that in Webkit code the following pattern is frequently used to
implement arithmetic shift right by 4 bits:

        r = (v&  (~15)) / 16;

This is done to comply with C99 standard, which defines arithmetic
shifts in C as implementation-dependent.
Unfortunately, GCC can't transform this code back directly into an
arithmetic shift, but it rather expresses division through arithmetic
shift as follows:

        unsigned r, v = ...;
        t = (v&&  (~15));
        if( t>  0 )
          r = t>>  4;
        else
          r = (t + 15)>>  4;

To fix this, we wrote a new transformation to fold-const.c.
It recognizes such pattern as an expression for arithmetic shift.

Original source code:

      int foo(int a)
      {
        int x = (a&  (~15)) / 16;
        return x;
      }

Original output:

        bic     r0, r0, #15
        add     r3, r0, #15
        cmp     r0, #0
        movlt   r0, r3
        mov     r0, r0, asr #4

Output with new transformation:

        mov     r0, r0, asr #4


We have tested the patch on Webkit. It have shown very small
(about 0.5%) speedup on SunSpider JavaScript test, and 12Kb
(0.16%) code size reduction of the stripped 7.5Mb shared library
binary.
Bootstrapped and regtested on x86-64-linux.

OK for trunk?
---
Best regards,
   Dmitry

2010-10-13  Dmitry Melnik  <dm@ispras.ru>

gcc/
	* fold-const.c (fold_binary_loc): New transformation.

gcc/testsuite/
	* gcc.target/arm/20101013-1.c: New test.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 8146920..de0f350 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -11599,6 +11599,22 @@ fold_binary_loc (location_t loc,
       return NULL_TREE;

     case TRUNC_DIV_EXPR:
+      /* Optimize (X & (-A)) / A where A is a power of 2,
+        to X >> log2(A) */
+      if (TREE_CODE (arg0) == BIT_AND_EXPR
+         && !TYPE_UNSIGNED (type)
+         && integer_pow2p (arg1) && tree_int_cst_sgn (arg1) > 0)
+       {
+         tree sum = fold_binary_loc (loc, PLUS_EXPR, TREE_TYPE (arg1),
+                                     arg1, TREE_OPERAND (arg0, 1));
+         if (sum && integer_zerop (sum))
+           return fold_build2_loc (loc, RSHIFT_EXPR, type,
+                         TREE_OPERAND (arg0, 0),
+                         build_int_cst (NULL_TREE,
+                               exact_log2 (TREE_INT_CST_LOW (arg1))));
+       }
+
+      /* Fall thru */
     case FLOOR_DIV_EXPR:
       /* Simplify A / (B << N) where A and B are positive and B is
         a power of 2, to A >> (N + log2(B)).  */
diff --git a/gcc/testsuite/gcc.target/arm/20101013-1.c b/gcc/testsuite/gcc.target/arm/20101013-1.c
new file mode 100644
index 0000000..34b19d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/20101013-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int foo(int a)
+{
+  int x = (a & (~15)) / 16;
+  return x;
+}
+
+/* { dg-final { scan-assembler-not "cmp" } } */