diff mbox

[sched-deps] Generalise usage of macro fusion to work on any two insns

Message ID 53BE4804.4010709@arm.com
State New
Headers show

Commit Message

Kyrylo Tkachov July 10, 2014, 8 a.m. UTC
On 30/06/14 21:39, Jeff Law wrote:
> On 06/27/14 02:29, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This patch generalises the TARGET_MACRO_FUSION_PAIR_P hook usage to work
>> on more than just
>> compares and conditional branches for which it was initially designed
>> for (for x86).
>>
>> There are some instructions in arm and aarch64 that can be fused
>> together when they're back to back in the instruction stream and I'd
>> like to use this hook to keep them together.
>>
>> I'll post an implementation of TARGET_MACRO_FUSION_PAIR_P for arm and
>> aarch64 shortly...
>>
>> Bootstrapped and tested on x86, aarch64-none-linux-gnu and
>> arm-none-linux-gnueabihf.
>>
>> Ok for trunk?
>>
>> 2014-06-27  Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
>>                       Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>       * sched-deps.c (try_group_insn): Generalise macro fusion hook usage
>>       to any two insns.  Update comment.
> Isn't this going to end up calling the x86 specific macro_fusion_pair_p
> with a lot more insns than that function was previously prepared to handle?
>
> In particular I'm concerned that the 2nd argument is going to be a
> non-jumping insn a lot more often.  Of particular concern is this code:
>
>
>     test_if = SET_SRC (pc_set (condjmp));
>     cond = XEXP (test_if, 0);
>     ccode = GET_CODE (cond);
>
> if CONDJMP is not a JUMP_INSN, pc_set is going to return NULL and XEXP
> (test_if, 0) will then fault.
>
> I realize you bootstrapped on x86, but I suspect that whatever tuning
> you need to enable to really exercise this code wasn't on.
>
> I think you can deal with this by putting
>
> if (!any_condjump_p (condjmp)) at the start of the x86 specific
> macro_fusion_pair_p is sufficient to address this issue.  It also
> ensures that we don't do a lot of unnecessary work in that function.
>
>   From a general code structure standpoint, can you avoid this kind of
> structure:
>
> if (any_condjmp_p (insn))
>     {
>       ...
>       goto succ;
>     }
> else
>     {
>       ...
>       goto succ
>     }
> return
>
> succ:
>
> Can you structure so that you return for all the cases where you don't
> want to set SCHED_GROUP_P from each arm?  Or go ahead and duplicate the
> SCHED_GROUP_P setting in each arm of the conditional.

Hi Jeff,

Thanks for the pointers, I've reworked the patch and it does look 
cleaner. I've made sure to run the x86 bootstrap with Haswell tuning and 
instrumented the code to make sure that the x86 macro fusion code was 
being exercised and it passed that fine.

How's that?

2014-07-10  Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
             Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * sched-deps.c (try_group_insn): Generalise macro fusion hook usage
     to any two insns.  Update comment.
     * config/i386/i386.c (ix86_macro_fusion_pair_p): Reject 2nd arguments
     that are not conditional jumps.


>
> jeff
>
>

Comments

Maxim Kuvyrkov July 10, 2014, 9:53 p.m. UTC | #1
On Jul 10, 2014, at 8:00 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:

> 
> On 30/06/14 21:39, Jeff Law wrote:
>> On 06/27/14 02:29, Kyrill Tkachov wrote:
>>> Hi all,
>>> 
>>> This patch generalises the TARGET_MACRO_FUSION_PAIR_P hook usage to work
>>> on more than just
>>> compares and conditional branches for which it was initially designed
>>> for (for x86).
>>> 
>>> There are some instructions in arm and aarch64 that can be fused
>>> together when they're back to back in the instruction stream and I'd
>>> like to use this hook to keep them together.
>>> 
>>> I'll post an implementation of TARGET_MACRO_FUSION_PAIR_P for arm and
>>> aarch64 shortly...
>>> 
>>> Bootstrapped and tested on x86, aarch64-none-linux-gnu and
>>> arm-none-linux-gnueabihf.
>>> 
>>> Ok for trunk?

The patch looks good to me, but some cleanup suggestions below.

> commit e36b8977738dbe3f63445199710ca627ab37e243
> Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
> Date:   Fri Jun 13 11:41:41 2014 +0100
> 
>     [sched-deps] Generalise macro fusion hook usage
> 
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 8046c67..7dd2ce5 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -25817,6 +25817,9 @@ ix86_macro_fusion_pair_p (rtx condgen, rtx condjmp)
>    rtx compare_set = NULL_RTX, test_if, cond;
>    rtx alu_set = NULL_RTX, addr = NULL_RTX;
>  
> +  if (!any_condjump_p (condjmp))
> +    return false;
> +
>    if (get_attr_type (condgen) != TYPE_TEST
>        && get_attr_type (condgen) != TYPE_ICMP
>        && get_attr_type (condgen) != TYPE_INCDEC
> diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
> index 7cafc8b..c01a8a6 100644
> --- a/gcc/sched-deps.c
> +++ b/gcc/sched-deps.c
> @@ -2820,35 +2820,48 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, rtx insn)
>      sched_deps_info->finish_rhs ();
>  }
>  
> -/* Try to group comparison and the following conditional jump INSN if
> -   they're already adjacent. This is to prevent scheduler from scheduling
> -   them apart.  */
> +/* Try to group two fuseable insns together to prevent scheduler
> +   from scheduling them apart.  */
>  
>  static void
>  try_group_insn (rtx insn)

Please rename try_group_insn to sched_macro_fuse_insns.  The call is predicated to try_group_insn is predicated on targetm.sched.macro_fusion_p, so this code will not be used for any other kinds of fusion -- might as well just state that in the name,.

>  {
> -  unsigned int condreg1, condreg2;
> -  rtx cc_reg_1;
>    rtx prev;
>  
> -  if (!any_condjump_p (insn))
> +  if (!targetm.sched.macro_fusion_p ())
>      return;

This is a no-op since there is a check on the upper level.  Please remove.

>  
> -  targetm.fixed_condition_code_regs (&condreg1, &condreg2);
> -  cc_reg_1 = gen_rtx_REG (CCmode, condreg1);
> -  prev = prev_nonnote_nondebug_insn (insn);
> -  if (!reg_referenced_p (cc_reg_1, PATTERN (insn))
> -      || !prev
> -      || !modified_in_p (cc_reg_1, prev))
> -    return;
> +  if (any_condjump_p (insn))
> +    {
> +      unsigned int condreg1, condreg2;
> +      rtx cc_reg_1;
> +      targetm.fixed_condition_code_regs (&condreg1, &condreg2);
> +      cc_reg_1 = gen_rtx_REG (CCmode, condreg1);
> +      prev = prev_nonnote_nondebug_insn (insn);
> +      if (!reg_referenced_p (cc_reg_1, PATTERN (insn))
> +	  || !prev
> +	  || !modified_in_p (cc_reg_1, prev))
> +	return;
>  
> -  /* Different microarchitectures support macro fusions for different
> -     combinations of insn pairs.  */
> -  if (!targetm.sched.macro_fusion_pair_p
> -      || !targetm.sched.macro_fusion_pair_p (prev, insn))
> -    return;
> +      if (targetm.sched.macro_fusion_pair_p (prev, insn))
> +        SCHED_GROUP_P (insn) = 1;
> +    }
> +  else
> +    {
> +      rtx insn_set = single_set (insn);
> +
> +      prev = prev_nonnote_nondebug_insn (insn);
> +      if (prev
> +          && insn_set
> +          && single_set (prev)
> +          && modified_in_p (SET_DEST (insn_set), prev)

Invert the check (as done in the upper if-clause) and cut it here.  Then you can use a single unified

if (targetm.sched.macro_fusion_pair_p (prev, insn))
  SCHED_GROUP_P (insn) = 1;

as the final statement of the function.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org
diff mbox

Patch

commit e36b8977738dbe3f63445199710ca627ab37e243
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Fri Jun 13 11:41:41 2014 +0100

    [sched-deps] Generalise macro fusion hook usage

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8046c67..7dd2ce5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -25817,6 +25817,9 @@  ix86_macro_fusion_pair_p (rtx condgen, rtx condjmp)
   rtx compare_set = NULL_RTX, test_if, cond;
   rtx alu_set = NULL_RTX, addr = NULL_RTX;
 
+  if (!any_condjump_p (condjmp))
+    return false;
+
   if (get_attr_type (condgen) != TYPE_TEST
       && get_attr_type (condgen) != TYPE_ICMP
       && get_attr_type (condgen) != TYPE_INCDEC
diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index 7cafc8b..c01a8a6 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -2820,35 +2820,48 @@  sched_analyze_2 (struct deps_desc *deps, rtx x, rtx insn)
     sched_deps_info->finish_rhs ();
 }
 
-/* Try to group comparison and the following conditional jump INSN if
-   they're already adjacent. This is to prevent scheduler from scheduling
-   them apart.  */
+/* Try to group two fuseable insns together to prevent scheduler
+   from scheduling them apart.  */
 
 static void
 try_group_insn (rtx insn)
 {
-  unsigned int condreg1, condreg2;
-  rtx cc_reg_1;
   rtx prev;
 
-  if (!any_condjump_p (insn))
+  if (!targetm.sched.macro_fusion_p ())
     return;
 
-  targetm.fixed_condition_code_regs (&condreg1, &condreg2);
-  cc_reg_1 = gen_rtx_REG (CCmode, condreg1);
-  prev = prev_nonnote_nondebug_insn (insn);
-  if (!reg_referenced_p (cc_reg_1, PATTERN (insn))
-      || !prev
-      || !modified_in_p (cc_reg_1, prev))
-    return;
+  if (any_condjump_p (insn))
+    {
+      unsigned int condreg1, condreg2;
+      rtx cc_reg_1;
+      targetm.fixed_condition_code_regs (&condreg1, &condreg2);
+      cc_reg_1 = gen_rtx_REG (CCmode, condreg1);
+      prev = prev_nonnote_nondebug_insn (insn);
+      if (!reg_referenced_p (cc_reg_1, PATTERN (insn))
+	  || !prev
+	  || !modified_in_p (cc_reg_1, prev))
+	return;
 
-  /* Different microarchitectures support macro fusions for different
-     combinations of insn pairs.  */
-  if (!targetm.sched.macro_fusion_pair_p
-      || !targetm.sched.macro_fusion_pair_p (prev, insn))
-    return;
+      if (targetm.sched.macro_fusion_pair_p (prev, insn))
+        SCHED_GROUP_P (insn) = 1;
+    }
+  else
+    {
+      rtx insn_set = single_set (insn);
+
+      prev = prev_nonnote_nondebug_insn (insn);
+      if (prev
+          && insn_set
+          && single_set (prev)
+          && modified_in_p (SET_DEST (insn_set), prev)
+          && targetm.sched.macro_fusion_pair_p (prev, insn))
+        {
+            SCHED_GROUP_P (insn) = 1;
+        }
+
+    }
 
-  SCHED_GROUP_P (insn) = 1;
 }
 
 /* Analyze an INSN with pattern X to find all dependencies.  */