Patchwork [1/4] Add the -floop-if-convert flag.

Submitter Sebastian Pop
Date July 7, 2010, 8:22 p.m.
Message ID <1278534137-22733-2-git-send-email-sebpop@gmail.com>
Permalink /patch/58162/
State New

Comments

Sebastian Pop - July 7, 2010, 8:22 p.m.
	* common.opt (floop-if-convert): New flag.
	* doc/invoke.texi (floop-if-convert): Documented.
	* tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
	when flag_loop_if_convert is set.
---
 gcc/common.opt      |    4 ++++
 gcc/doc/invoke.texi |   10 ++++++++--
 gcc/tree-if-conv.c  |    3 ++-
 3 files changed, 14 insertions(+), 3 deletions(-)
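
For readers unfamiliar with the transformation this flag gates, the following is a minimal hand-written sketch in C of what "converting conditional jumps in innermost loops to branchless equivalents" means. It is an illustration only, not output from GCC: the pass works on GIMPLE rather than on source code, and the function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: the innermost loop body contains a conditional jump,
   which gives the loop more than one basic block.  */
void
clamp_negative_branchy (int *a, size_t n)
{
  assert (a != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (a[i] < 0)
      a[i] = 0;
}

/* Hand-written approximation of the if-converted form: the branch is
   replaced by a select, so the loop body is straight-line code.
   Note that the store to a[i] now happens on every iteration.  */
void
clamp_negative_if_converted (int *a, size_t n)
{
  assert (a != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    a[i] = (a[i] < 0) ? 0 : a[i];
}
```

Both functions compute the same result; the second form has no control flow inside the loop body, which is the shape the auto-vectorizer handles more easily.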
Richard Guenther - July 7, 2010, 9:39 p.m.
On Wed, Jul 7, 2010 at 10:22 PM, Sebastian Pop <sebpop@gmail.com> wrote:
>        * common.opt (floop-if-convert): New flag.
>        * doc/invoke.texi (floop-if-convert): Documented.
>        * tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
>        when flag_loop_if_convert is set.
> ---
>  gcc/common.opt      |    4 ++++
>  gcc/doc/invoke.texi |   10 ++++++++--
>  gcc/tree-if-conv.c  |    3 ++-
>  3 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 6ca787a..2a5d391 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -653,6 +653,10 @@ fif-conversion2
>  Common Report Var(flag_if_conversion2) Optimization
>  Perform conversion of conditional jumps to conditional execution
>
> +floop-if-convert
> +Common Report Var(flag_loop_if_convert) Optimization
> +Convert conditional jumps in innermost loops to branchless equivalents
> +

Init(0) is missing.  Also, can you name it -ftree-loop-if-conversion instead,
for consistency with -ftree-loop-* and -fif-conversion?

There is still no way to disable if-conversion when the vectorizer is enabled,
so I guess a tri-state flag (-1 by default, 0 for disabled, 1 for enabled)
would be more useful (as you wanted it for debugging in the first place).

Richard.
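
The tri-state suggestion above could look roughly like the following standalone sketch. It assumes the flag defaults to -1 (unset) and is 0 or 1 only when the user passes the option explicitly. The function name, the enum, and the parameters are hypothetical stand-ins for GCC's option variables, chosen so that the logic can be compiled and checked outside the compiler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding of the tri-state flag.  */
enum { IF_CONVERT_UNSET = -1, IF_CONVERT_OFF = 0, IF_CONVERT_ON = 1 };

/* Sketch of a gate: `vectorize` stands in for flag_tree_vectorize and
   `loop_if_convert` for the new flag.  */
static bool
gate_if_conversion_sketch (int vectorize, int loop_if_convert)
{
  assert (loop_if_convert >= IF_CONVERT_UNSET
          && loop_if_convert <= IF_CONVERT_ON);

  /* Explicitly enabled: run the pass even without the vectorizer.  */
  if (loop_if_convert == IF_CONVERT_ON)
    return true;

  /* Explicitly disabled: never run, even if the vectorizer is on.  */
  if (loop_if_convert == IF_CONVERT_OFF)
    return false;

  /* Unset: keep the old behavior and follow the vectorizer.  */
  return vectorize != 0;
}
```

With this shape, the default behavior stays the same as before the patch, while an explicit "off" setting makes it possible to disable if-conversion with the vectorizer still enabled, which the two-state version cannot express.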

>  ; -finhibit-size-directive inhibits output of .size for ELF.
>  ; This is used only for compiling crtstuff.c,
>  ; and it may be extended to other effects
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index d70f130..da14bf9 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
>  -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
>  -fforward-propagate -ffunction-sections @gol
> --fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
> +-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
>  -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
>  -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
>  -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
> @@ -352,7 +352,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fira-loop-pressure -fno-ira-share-save-slots @gol
>  -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
>  -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
> --floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
> +-floop-block -floop-if-convert -floop-interchange -floop-strip-mine @gol
>  -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
>  -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
>  -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
> @@ -6883,6 +6883,12 @@ profitable to parallelize the loops.
>  Compare the results of several data dependence analyzers.  This option
>  is used for debugging the data dependence analyzers.
>
> +@item -floop-if-convert
> +Attempt to transform conditional jumps in the innermost loops to
> +branch-less equivalents.  The intent is to remove control-flow from
> +the innermost loops in order to improve the ability of the
> +auto-vectorization pass to handle these loops.
> +
>  @item -ftree-loop-distribution
>  Perform loop distribution.  This flag can improve cache performance on
>  big loop bodies and allow further loop optimizations, like
> diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
> index 8d5d226..ad106d7 100644
> --- a/gcc/tree-if-conv.c
> +++ b/gcc/tree-if-conv.c
> @@ -1242,7 +1242,8 @@ main_tree_if_conversion (void)
>  static bool
>  gate_tree_if_conversion (void)
>  {
> -  return flag_tree_vectorize != 0;
> +  return flag_tree_vectorize
> +    || flag_loop_if_convert;
>  }
>
>  struct gimple_opt_pass pass_if_conversion =
> --
> 1.7.0.4
>
>
Sebastian Pop - July 7, 2010, 9:48 p.m.
On Wed, Jul 7, 2010 at 16:39, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, Jul 7, 2010 at 10:22 PM, Sebastian Pop <sebpop@gmail.com> wrote:
>>        * common.opt (floop-if-convert): New flag.
>>        * doc/invoke.texi (floop-if-convert): Documented.
>>        * tree-if-conv.c (gate_tree_if_conversion): Enable if-conversion
>>        when flag_loop_if_convert is set.
>> ---
>>  gcc/common.opt      |    4 ++++
>>  gcc/doc/invoke.texi |   10 ++++++++--
>>  gcc/tree-if-conv.c  |    3 ++-
>>  3 files changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 6ca787a..2a5d391 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -653,6 +653,10 @@ fif-conversion2
>>  Common Report Var(flag_if_conversion2) Optimization
>>  Perform conversion of conditional jumps to conditional execution
>>
>> +floop-if-convert
>> +Common Report Var(flag_loop_if_convert) Optimization
>> +Convert conditional jumps in innermost loops to branchless equivalents
>> +
>
> Init(0) is missing.  Also, can you name it -ftree-loop-if-conversion instead,
> for consistency with -ftree-loop-* and -fif-conversion?
>

OK, I will do this, although I do not like the "tree" in the flag names:
I see no reason to expose the internals of GCC to GCC users.

> There is still no way to disable if-conversion when the vectorizer is enabled,
> so I guess a tri-state flag (-1 by default, 0 for disabled, 1 for enabled)
> would be more useful (as you wanted it for debugging in the first place).
>

Good idea.
I will post an updated patch with these modifications.

Sebastian

Patch

diff --git a/gcc/common.opt b/gcc/common.opt
index 6ca787a..2a5d391 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -653,6 +653,10 @@  fif-conversion2
 Common Report Var(flag_if_conversion2) Optimization
 Perform conversion of conditional jumps to conditional execution
 
+floop-if-convert
+Common Report Var(flag_loop_if_convert) Optimization
+Convert conditional jumps in innermost loops to branchless equivalents
+
 ; -finhibit-size-directive inhibits output of .size for ELF.
 ; This is used only for compiling crtstuff.c,
 ; and it may be extended to other effects
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d70f130..da14bf9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -342,7 +342,7 @@  Objective-C and Objective-C++ Dialects}.
 -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
 -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
 -fforward-propagate -ffunction-sections @gol
--fgcse -fgcse-after-reload -fgcse-las -fgcse-lm @gol
+-fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol
 -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol
 -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
 -finline-small-functions -fipa-cp -fipa-cp-clone -fipa-matrix-reorg -fipa-pta @gol
@@ -352,7 +352,7 @@  Objective-C and Objective-C++ Dialects}.
 -fira-loop-pressure -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
 -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
--floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
+-floop-block -floop-if-convert -floop-interchange -floop-strip-mine @gol
 -floop-parallelize-all -flto -flto-compression-level -flto-report -fltrans @gol
 -fltrans-output-list -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
 -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
@@ -6883,6 +6883,12 @@  profitable to parallelize the loops.
 Compare the results of several data dependence analyzers.  This option
 is used for debugging the data dependence analyzers.
 
+@item -floop-if-convert
+Attempt to transform conditional jumps in the innermost loops to
+branch-less equivalents.  The intent is to remove control-flow from
+the innermost loops in order to improve the ability of the
+auto-vectorization pass to handle these loops.
+
 @item -ftree-loop-distribution
 Perform loop distribution.  This flag can improve cache performance on
 big loop bodies and allow further loop optimizations, like
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 8d5d226..ad106d7 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1242,7 +1242,8 @@  main_tree_if_conversion (void)
 static bool
 gate_tree_if_conversion (void)
 {
-  return flag_tree_vectorize != 0;
+  return flag_tree_vectorize
+    || flag_loop_if_convert;
 }
 
 struct gimple_opt_pass pass_if_conversion =