Patchwork [ira] Miss checks in split_live_ranges_for_shrink_wrap

Submitter Zhenqiang Chen
Date Sept. 1, 2014, 8:13 a.m.
Message ID <000001cfc5bc$9876f4d0$c964de70$@arm.com>
Permalink /patch/384702/
State New

Comments

Zhenqiang Chen - Sept. 1, 2014, 8:13 a.m.
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org]
> On Behalf Of Jeff Law
> Sent: Saturday, August 30, 2014 4:54 AM
> To: Zhenqiang Chen; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, ira] Miss checks in split_live_ranges_for_shrink_wrap
> 
> On 08/13/14 20:55, Zhenqiang Chen wrote:
> > Hi,
> >
> > Function split_live_ranges_for_shrink_wrap has code
> >
> >    if (!flag_shrink_wrap)
> >      return false;
> >
> > But flag_shrink_wrap is TRUE by default when optimize > 0 even if the
> > port does not support shrink-wrap. To make sure shrink-wrap is
> > enabled, "HAVE_simple_return" must be defined and "HAVE_simple_return"
> > must be TRUE.
> >
> > Please refer to function.c and shrink-wrap.c for how shrink-wrap is
> > enabled in thread_prologue_and_epilogue_insns.
> >
> > To make the check easy, the patch defines a macro,
> > SUPPORT_SHRINK_WRAP_P, and replaces the uses in ira.c and ifcvt.c.
> >
> > Bootstrapped with no make check regressions on x86-64.
> >
> > OK for trunk?
> >
> > Thanks!
> > -Zhenqiang
> >
> > ChangeLog:
> > 2014-08-14  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >          * shrink-wrap.h: #define SUPPORT_SHRINK_WRAP_P.
> >          * ira.c: #include "shrink-wrap.h"
> >          (split_live_ranges_for_shrink_wrap): Use SUPPORT_SHRINK_WRAP_P.
> >          * ifcvt.c: #include "shrink-wrap.h"
> >          (dead_or_predicable): Use SUPPORT_SHRINK_WRAP_P.
> So what's the motivation behind this patch?   I can probably guess the
> motivation, but I might guess wrong.  Since you know the motivation, it's
> best if you just tell everyone what it is.

To split the live range of a register, split_live_ranges_for_shrink_wrap
introduces additional register copies. If later optimizations cannot remove
those copies, they cause code size and performance regressions. My code size
tests on ARM THUMB1 show many regressions due to the additional register
copies. Shrink-wrap is not enabled for ARM THUMB1, so I think
split_live_ranges_for_shrink_wrap should not be called.
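
To illustrate the gating described above, here is a minimal standalone
sketch, not the GCC sources. It assumes a port that does not define
HAVE_simple_return (as I understand to be the case for THUMB1).
would_split_live_ranges is a hypothetical stand-in for the early-exit check
in split_live_ranges_for_shrink_wrap, and flag_shrink_wrap is modelled as a
plain int rather than the real option variable.

```c
#include <stdbool.h>

/* Stand-in for GCC's -fshrink-wrap option variable.  It is on by default
   when optimizing, whether or not the port supports shrink-wrapping.  */
int flag_shrink_wrap = 1;

/* A port that supports shrink-wrapping would define HAVE_simple_return.
   It is deliberately left undefined here to model a port without
   support (assumption for this sketch).  */
/* #define HAVE_simple_return 1 */

#ifdef HAVE_simple_return
#define SHRINK_WRAPPING_ENABLED (flag_shrink_wrap && HAVE_simple_return)
#else
#define SHRINK_WRAPPING_ENABLED false
#endif

/* Hypothetical stand-in for the check at the top of
   split_live_ranges_for_shrink_wrap: returns true only when the pass
   would go on to split live ranges.  */
bool
would_split_live_ranges (void)
{
  if (!SHRINK_WRAPPING_ENABLED)
    return false;   /* No shrink-wrap support: add no register copies.  */
  return true;
}
```

With only "if (!flag_shrink_wrap)" as the guard, the function above would
return true on such a port, because flag_shrink_wrap alone says nothing about
target support.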

> >
> > testsuite/ChangeLog:
> > 2014-08-14  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >          * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.
> Testcase wasn't included in the patchkit.
> 
> From a pure bikeshedding standpoint "SUPPORT_SHRINK_WRAP_P" seems
> poorly named.  SHRINK_WRAPPING_ENABLED seems like a better name to
> me.
> 
> Can you repost with the testcase included, name change and basic
> rationale behind why you want to make this change.  I'm pretty sure
> it'll be OK at that point.

Thanks. Patch is updated according to your comments.

-Zhenqiang

ChangeLog:
2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>

        * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED.
        * ira.c: #include "shrink-wrap.h"
        (split_live_ranges_for_shrink_wrap): Use SHRINK_WRAPPING_ENABLED.
        * ifcvt.c: #include "shrink-wrap.h"
        (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED.

testsuite/ChangeLog:
2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>

        * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.

+/* { dg-final { cleanup-rtl-dump "ira" } } */
Jeff Law - Sept. 5, 2014, 4:45 a.m.
On 09/01/14 02:13, Zhenqiang Chen wrote:
>
> To split live-range of register, split_live_ranges_for_shrink_wrap will
> introduce additional register copies. If such copies can not be optimized by
> later optimizations, it will lead to code size and performance regression.
> My tests on ARM THUMB1 code size show lots of regressions due to additional
> register copies. Shrink-wrap is not enabled for ARM THUMB1, so I think
> split_live_ranges_for_shrink_wrap should not be called.
So has anyone looked at why IRA ends up selecting different registers 
for the source/dest of these copies?   Odds are it's just an artifact of 
the heuristics in use, but I'd like to make sure there isn't something 
inherently wrong happening in IRA that's causing it to not tie the 
source/dest of those copies.



> ChangeLog:
> 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>          * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED.
>          * ira.c: #include "shrink-wrap.h"
>          (split_live_ranges_for_shrink_wrap): Use SHRINK_WRAPPING_ENABLED.
>          * ifcvt.c: #include "shrink-wrap.h"
>          (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED.
>
> testsuite/ChangeLog:
> 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>          * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.
Thanks.  OK for the trunk.

As noted above, it may be worth spending a little time looking at the
regressions without this patch installed to see why IRA isn't doing a 
good job of tying the source/dest of these copies together -- perhaps 
there's something that's been overlooked and fixing it may be beneficial.

jeff
Zhenqiang Chen - Sept. 9, 2014, 6:23 a.m.
> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Friday, September 05, 2014 12:45 PM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, ira] Miss checks in split_live_ranges_for_shrink_wrap
> 
> On 09/01/14 02:13, Zhenqiang Chen wrote:
> >
> > To split live-range of register, split_live_ranges_for_shrink_wrap
> > will introduce additional register copies. If such copies can not be
> > optimized by later optimizations, it will lead to code size and
> > performance regression.
> > My tests on ARM THUMB1 code size show lots of regressions due to
> > additional register copies. Shrink-wrap is not enabled for ARM THUMB1,
> > so I think split_live_ranges_for_shrink_wrap should not be called.
> So has anyone looked at why IRA ends up selecting different registers
> for the source/dest of these copies?   Odds are it's just an artifact of
> the heuristics in use, but I'd like to make sure there isn't something
> inherently wrong happening in IRA that's causing it to not tie the
> source/dest of those copies.
> 
> 
> 
> > ChangeLog:
> > 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >          * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED.
> >          * ira.c: #include "shrink-wrap.h"
> >          (split_live_ranges_for_shrink_wrap): Use SHRINK_WRAPPING_ENABLED.
> >          * ifcvt.c: #include "shrink-wrap.h"
> >          (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED.
> >
> > testsuite/ChangeLog:
> > 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >          * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.
> Thanks.  OK for the trunk.

Thanks. The patch is installed @r215041.
 
> As noted above, it may be worth spending a little time looking at the
> regressions without this patch installed to see why IRA isn't doing a good
> job of tying the source/dest of these copies together -- perhaps there's
> something that's been overlooked and fixing it may be beneficial.

I have investigated it. Compared with 4.8, the allocation order and conflict
cost might be the root cause. I have filed a bug: PR63210.

Thanks!
-Zhenqiang 
 
> jeff
>

Patch

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 94b96f3..d2af0f9 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -42,6 +42,7 @@ 
 #include "df.h"
 #include "vec.h"
 #include "dbgcnt.h"
+#include "shrink-wrap.h"
 
 #ifndef HAVE_conditional_move
 #define HAVE_conditional_move 0
@@ -4287,14 +4288,13 @@  dead_or_predicable (basic_block test_bb, basic_block merge_bb,
 	if (NONDEBUG_INSN_P (insn))
 	  df_simulate_find_defs (insn, merge_set);
 
-#ifdef HAVE_simple_return
       /* If shrink-wrapping, disable this optimization when test_bb is
 	 the first basic block and merge_bb exits.  The idea is to not
 	 move code setting up a return register as that may clobber a
 	 register used to pass function parameters, which then must be
 	 saved in caller-saved regs.  A caller-saved reg requires the
 	 prologue, killing a shrink-wrap opportunity.  */
-      if ((flag_shrink_wrap && HAVE_simple_return && !epilogue_completed)
+      if ((SHRINK_WRAPPING_ENABLED && !epilogue_completed)
 	  && ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb == test_bb
 	  && single_succ_p (new_dest)
 	  && single_succ (new_dest) == EXIT_BLOCK_PTR_FOR_FN (cfun)
@@ -4341,7 +4341,6 @@  dead_or_predicable (basic_block test_bb, basic_block merge_bb,
 	    }
 	  BITMAP_FREE (return_regs);
 	}
-#endif
     }
 
  no_body:
diff --git a/gcc/ira.c b/gcc/ira.c
index 7c18496..f4140e4 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -392,6 +392,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "lra.h"
 #include "dce.h"
 #include "dbgcnt.h"
+#include "shrink-wrap.h"
 
 struct target_ira default_target_ira;
 struct target_ira_int default_target_ira_int;
@@ -4781,7 +4782,7 @@  split_live_ranges_for_shrink_wrap (void)
   bitmap_head need_new, reachable;
   vec<basic_block> queue;
 
-  if (!flag_shrink_wrap)
+  if (!SHRINK_WRAPPING_ENABLED)
     return false;
 
   bitmap_initialize (&need_new, 0);
diff --git a/gcc/shrink-wrap.h b/gcc/shrink-wrap.h
index 66bd26d..afcfec3 100644
--- a/gcc/shrink-wrap.h
+++ b/gcc/shrink-wrap.h
@@ -46,6 +46,9 @@  extern edge get_unconverted_simple_return (edge, bitmap_head,
 extern void convert_to_simple_return (edge entry_edge, edge orig_entry_edge,
 				      bitmap_head bb_flags, rtx returnjump,
 				      vec<edge> unconverted_simple_returns);
+#define SHRINK_WRAPPING_ENABLED (flag_shrink_wrap && HAVE_simple_return)
+#else
+#define SHRINK_WRAPPING_ENABLED false
 #endif
 
 #endif  /* GCC_SHRINK_WRAP_H  */
diff --git a/gcc/testsuite/gcc.target/arm/split-live-ranges-for-shrink-wrap.c b/gcc/testsuite/gcc.target/arm/split-live-ranges-for-shrink-wrap.c
new file mode 100644
index 0000000..e36000b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/split-live-ranges-for-shrink-wrap.c
@@ -0,0 +1,14 @@ 
+/* { dg-do assemble } */
+/* { dg-options "-mthumb -Os -fdump-rtl-ira " }  */
+/* { dg-require-effective-target arm_thumb1_ok } */
+
+int foo (char *, char *, int);
+int test (int d, char * out, char *in, int len)
+{
+  if (out != in)
+    foo (out, in, len);
+  return 0;
+}
+/* { dg-final { object-size text <= 20 } } */
+/* { dg-final { scan-rtl-dump-not "Split live-range of register" "ira" } } */