From patchwork Mon Mar 25 07:15:35 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bin Cheng <bin.cheng@arm.com>
X-Patchwork-Id: 230576
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <mailto:gcc-patches-unsubscribe-##L=##H@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: "Bin Cheng" <bin.cheng@arm.com>
To: <gcc-patches@gcc.gnu.org>
Subject: FW: [PATCH GCC]Relax the probability condition in CE pass when
	optimizing for code size
Date: Mon, 25 Mar 2013 15:15:35 +0800
Message-ID: <000d01ce2928$8ca50a20$a5ef1e60$@cheng@arm.com>

The original message went to the wrong list (gcc@gcc.gnu.org); forwarding
it to gcc-patches.

-----Original Message-----
From: Bin Cheng [mailto:bin.cheng@arm.com] 
Sent: Monday, March 25, 2013 3:01 PM
To: gcc@gcc.gnu.org
Subject: [PATCH GCC]Relax the probability condition in CE pass when
optimizing for code size

Hi,
The CE (if-conversion) pass has been adapted to take the probability of the
then/else branches into account, so the transformation is now done only when
it is estimated to be profitable.
The problem is that this change affects both performance and code size, and
it causes size regressions in many cases (especially in C libraries such as
Newlib).
This patch therefore relaxes the probability condition when we are
optimizing for size.
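
To make the effect of the relaxation concrete, here is a rough standalone
sketch of the budget computation. It is a simplified model and not the code
in ifcvt.c: PROB_BASE stands in for REG_BR_PROB_BASE (the value 10000 is
assumed here), scaled_budget and block_is_cheap are hypothetical helper
names, and the sketch assumes instruction costs are scaled by the same base
before being compared with the budget.

```c
#include <stdbool.h>

/* Stand-in for GCC's REG_BR_PROB_BASE; the value is an assumption made
   for this illustration.  */
#define PROB_BASE 10000

/* Hypothetical helper: the scaled cost budget for speculating a block.
   SCALE is the branch probability in units of PROB_BASE and MAX_COST is
   the unscaled budget.  */
long
scaled_budget (int scale, int max_cost, bool for_size, bool after_combine)
{
  /* Fudge factor that makes speculation look slightly more profitable.  */
  scale += PROB_BASE / 8;

  /* The relaxation: ignore the branch probability when optimizing for
     size after the combine pass.  */
  if (for_size && after_combine)
    scale = PROB_BASE;

  return (long) max_cost * scale;
}

/* Hypothetical helper: true if a block whose instructions cost TOTAL_COST
   altogether fits within the scaled budget.  */
bool
block_is_cheap (int total_cost, int scale, int max_cost,
                bool for_size, bool after_combine)
{
  return (long) total_cost * PROB_BASE
         < scaled_budget (scale, max_cost, for_size, after_combine);
}
```

Under these assumptions, a branch with scale 2000 and max_cost 8 gets a
budget of 26000 when optimizing for speed and 80000 with the relaxation, so
a block of total cost 4 (40000 after scaling) is rejected in the first case
and accepted in the second.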

Below is an example from Newlib:

unsigned int strlen (const char *);
void * realloc (void * __r, unsigned int __size);
void * memcpy (void *, const void *, unsigned int);
int argz_add(char **argz, unsigned int *argz_len, const char *str)
{
  int len_to_add = 0;
  unsigned int last = *argz_len;

  if (str == ((void *)0))
    return 0;

  len_to_add = strlen(str) + 1;
  *argz_len += len_to_add;

  if(!(*argz = (char *)realloc(*argz, *argz_len)))
    return 12;

  memcpy(*argz + last, str, len_to_add);
  return 0;
}

The assembly generated with -Os for cortex-m0 looks like:

argz_add:
	push	{r0, r1, r2, r4, r5, r6, r7, lr}
	mov	r6, r0
	mov	r7, r1
	mov	r4, r2
	ldr	r5, [r1]
	beq	.L3
	mov	r0, r2
	bl	strlen
	add	r0, r0, #1
	add	r1, r0, r5
	str	r0, [sp, #4]
	str	r1, [r7]
	ldr	r0, [r6]
	bl	realloc
	mov	r3, #12
	str	r0, [r6]
	cmp	r0, #0
	beq	.L2
	add	r0, r0, r5
	mov	r1, r4
	ldr	r2, [sp, #4]
	bl	memcpy
	mov	r3, #0
	b	.L2
.L3:
	mov	r3, r2
.L2:
	mov	r0, r3

With this patch, the branch/mov instructions around .L3 can be if-converted.
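
As a C-level illustration of that kind of conversion, the sketch below shows
a cheap assignment sitting behind a branch and the same logic with the
assignment speculated ahead of the branch. It is only one plausible shape:
the function and parameter names are hypothetical, and the RTL that the pass
actually produces for argz_add may differ.

```c
/* Before: the cheap assignment lives in its own block, so the other
   path needs an extra jump to skip over it (like "b .L2" above).  */
int
result_branchy (int str_is_null, int work_result)
{
  int r;

  if (str_is_null)
    goto cheap;
  r = work_result;
  goto done;
cheap:
  r = 0;
done:
  return r;
}

/* After: the cheap assignment is executed unconditionally up front, so
   the separate block and the jump around it are no longer needed.  */
int
result_speculated (int str_is_null, int work_result)
{
  int r = 0;

  if (!str_is_null)
    r = work_result;
  return r;
}
```

Both functions return the same value for every input; the second form simply
trades one speculated move for one fewer branch.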

While working on this I observed that passes before combine might interfere
with the CE pass, so the relaxation is enabled only for ce2/ce3, which run
after the combine pass.

Tested on x86 and Thumb-2, both with normal optimization and with -Os. Is it
OK for trunk?


2013-03-25  Bin Cheng  <bin.cheng@arm.com>

	* ifcvt.c (ifcvt_after_combine): New static variable.
	(cheap_bb_rtx_cost_p): Set scale to REG_BR_PROB_BASE when optimizing
	for size.
	(rest_of_handle_if_conversion, rest_of_handle_if_after_combine):
	Clear/set the variable ifcvt_after_combine.

Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 197029)
+++ gcc/ifcvt.c	(working copy)
@@ -67,6 +67,9 @@
 
 #define NULL_BLOCK	((basic_block) NULL)
 
+/* TRUE if after combine pass.  */
+static bool ifcvt_after_combine;
+
 /* # of IF-THEN or IF-THEN-ELSE blocks we looked at  */
 static int num_possible_if_blocks;
 
@@ -144,8 +147,14 @@ cheap_bb_rtx_cost_p (const_basic_block bb, int sca
   /* Our branch probability/scaling factors are just estimates and don't
      account for cases where we can get speculation for free and other
      secondary benefits.  So we fudge the scale factor to make speculating
-     appear a little more profitable.  */
+     appear a little more profitable when optimizing for performance.  */
   scale += REG_BR_PROB_BASE / 8;
+
+  /* Set the scale to REG_BR_PROB_BASE to be more aggressive when
+     optimizing for size and after combine pass.  */
+  if (!optimize_function_for_speed_p (cfun) && ifcvt_after_combine)
+    scale = REG_BR_PROB_BASE;
+
   max_cost *= scale;
 
   while (1)
@@ -4445,6 +4454,7 @@ gate_handle_if_conversion (void)
 static unsigned int
 rest_of_handle_if_conversion (void)
 {
+  ifcvt_after_combine = false;
   if (flag_if_conversion)
     {
       if (dump_file)
@@ -4494,6 +4504,7 @@ gate_handle_if_after_combine (void)
 static unsigned int
 rest_of_handle_if_after_combine (void)
 {
+  ifcvt_after_combine = true;
   if_convert ();
   return 0;
 }