Message ID | 5B97C539.1020009@arm.com |
---|---|
State | New |
Headers | show |
Series | Schedule by INSN_COST in case of tie | expand |
On Tue, 11 Sep 2018, 14:38 Vlad Lazar, <vlad.lazar@arm.com> wrote: > Hi. > > This patch makes the scheduler prefer instructions with higher cost if two > given instructions are equally good. > Issuing more restricted instructions first is particularly useful on > in-order cores because it increases the > number of dual issue opportunities. > > For example, on AArch64, instead of: > > add x11, x2, 96 > mov x4, x2 > mov w10, 1 > ldrh w5, [x0] > ldrh w13, [x0, 2] > ldrh w9, [x0, 4] > ldrh w12, [x0, 6] > b .L759 > > Generate: > > ldrh w5, [x0] > add x11, x2, 96 > ldrh w13, [x0, 2] > mov x4, x2 > ldrh w9, [x0, 4] > mov w10, 1 > ldrh w12, [x0, 6] > b .L759 > > Bootstrapped and regtested on aarch64-none-linux-gnu and there are no > regressions. > Ok for trunk? > This to me feels like the wrong approach as it feels like you are assuming INSN_COST is latency in some way ? Surely, we shouldn't be introducing INSN_COST based stuff into the scheduler. Have you investigated using TARGET_SCHED_ADJUST_COST (IIRC, look for the right name in the internals documents) and such hooks that come from the scheduler rather than trying to massage INSN_COST into the target independent parts of the scheduler ? Ramana > > Thanks, > Vlad > > gcc/ > Changelog for gcc/Changelog > 2018-09-11 Vlad Lazar <vlad.lazar@arm.com> > > * haifa-sched.c (rank_for_schedule): Schedule by INSN_COST. > (rfs_decision): New scheduling decision. >
Hi Ramana, On 11/09/18 15:55, Ramana Radhakrishnan wrote: > On Tue, 11 Sep 2018, 14:38 Vlad Lazar, <vlad.lazar@arm.com> wrote: > > > Hi. > > > > This patch makes the scheduler prefer instructions with higher cost if two > > given instructions are equally good. > > Issuing more restricted instructions first is particularly useful on > > in-order cores because it increases the > > number of dual issue opportunities. > > > > For example, on AArch64, instead of: > > > > add x11, x2, 96 > > mov x4, x2 > > mov w10, 1 > > ldrh w5, [x0] > > ldrh w13, [x0, 2] > > ldrh w9, [x0, 4] > > ldrh w12, [x0, 6] > > b .L759 > > > > Generate: > > > > ldrh w5, [x0] > > add x11, x2, 96 > > ldrh w13, [x0, 2] > > mov x4, x2 > > ldrh w9, [x0, 4] > > mov w10, 1 > > ldrh w12, [x0, 6] > > b .L759 > > > > Bootstrapped and regtested on aarch64-none-linux-gnu and there are no > > regressions. > > Ok for trunk? > > > > This to me feels like the wrong approach as it feels like you are assuming > INSN_COST is latency in some way ? Surely, we shouldn't be introducing > INSN_COST based stuff into the scheduler. > > Have you investigated using TARGET_SCHED_ADJUST_COST (IIRC, look for the > right name in the internals documents) and such hooks that come from the > scheduler rather than trying to massage INSN_COST into the target > independent parts of the scheduler ? > In the context of haifa-sched.c, INSN_COST is the latency cost. It is not the rtx_cost of the insn, as used by combine and others. So this approach looks reasonable to me (though I haven't done a deep review). Thanks, Kyrill > Ramana > > > > > > Thanks, > > Vlad > > > > gcc/ > > Changelog for gcc/Changelog > > 2018-09-11 Vlad Lazar <vlad.lazar@arm.com> > > > > * haifa-sched.c (rank_for_schedule): Schedule by INSN_COST. > > (rfs_decision): New scheduling decision. > >
> > > This to me feels like the wrong approach as it feels like you are assuming > > INSN_COST is latency in some way ? Surely, we shouldn't be introducing > > INSN_COST based stuff into the scheduler. > > > > Have you investigated using TARGET_SCHED_ADJUST_COST (IIRC, look for the > > right name in the internals documents) and such hooks that come from the > > scheduler rather than trying to massage INSN_COST into the target > > independent parts of the scheduler ? > > > > In the context of haifa-sched.c, INSN_COST is the latency cost. > It is not the rtx_cost of the insn, as used by combine and others. Ah, I was conflating rtx_cost with INSN_COST, sorry about the noise. Ramana > So this approach looks reasonable to me (though I haven't done a deep review). > > Thanks, > Kyrill > > > Ramana > > > > > > > > > > Thanks, > > > Vlad > > > > > > gcc/ > > > Changelog for gcc/Changelog > > > 2018-09-11 Vlad Lazar <vlad.lazar@arm.com> > > > > > > * haifa-sched.c (rank_for_schedule): Schedule by INSN_COST. > > > (rfs_decision): New scheduling decision. > > > >
On 9/11/18 9:00 AM, Kyrill Tkachov wrote: > Hi Ramana, > > On 11/09/18 15:55, Ramana Radhakrishnan wrote: >> On Tue, 11 Sep 2018, 14:38 Vlad Lazar, <vlad.lazar@arm.com> wrote: >> >> > Hi. >> > >> > This patch makes the scheduler prefer instructions with higher cost >> if two >> > given instructions are equally good. >> > Issuing more restricted instructions first is particularly useful on >> > in-order cores because it increases the >> > number of dual issue opportunities. >> > >> > For example, on AArch64, instead of: >> > >> > add x11, x2, 96 >> > mov x4, x2 >> > mov w10, 1 >> > ldrh w5, [x0] >> > ldrh w13, [x0, 2] >> > ldrh w9, [x0, 4] >> > ldrh w12, [x0, 6] >> > b .L759 >> > >> > Generate: >> > >> > ldrh w5, [x0] >> > add x11, x2, 96 >> > ldrh w13, [x0, 2] >> > mov x4, x2 >> > ldrh w9, [x0, 4] >> > mov w10, 1 >> > ldrh w12, [x0, 6] >> > b .L759 >> > >> > Bootstrapped and regtested on aarch64-none-linux-gnu and there are no >> > regressions. >> > Ok for trunk? >> > >> >> This to me feels like the wrong approach as it feels like you are >> assuming >> INSN_COST is latency in some way ? Surely, we shouldn't be introducing >> INSN_COST based stuff into the scheduler. >> >> Have you investigated using TARGET_SCHED_ADJUST_COST (IIRC, look for the >> right name in the internals documents) and such hooks that come from the >> scheduler rather than trying to massage INSN_COST into the target >> independent parts of the scheduler ? >> > > In the context of haifa-sched.c, INSN_COST is the latency cost. > It is not the rtx_cost of the insn, as used by combine and others. > So this approach looks reasonable to me (though I haven't done a deep > review). It looks reasonable to me as well. Essentially it's a new tie-breaker if everything else is equal. jeff
On 9/11/18 7:38 AM, Vlad Lazar wrote: > Hi. > > This patch makes the scheduler prefer instructions with higher cost if > two given instructions are equally good. > Issuing more restricted instructions first is particularly useful on > in-order cores because it increases the > number of dual issue opportunities. > > For example, on AArch64, instead of: > > add x11, x2, 96 > mov x4, x2 > mov w10, 1 > ldrh w5, [x0] > ldrh w13, [x0, 2] > ldrh w9, [x0, 4] > ldrh w12, [x0, 6] > b .L759 > > Generate: > > ldrh w5, [x0] > add x11, x2, 96 > ldrh w13, [x0, 2] > mov x4, x2 > ldrh w9, [x0, 4] > mov w10, 1 > ldrh w12, [x0, 6] > b .L759 > > Bootstrapped and regtested on aarch64-none-linux-gnu and there are no > regressions. > Ok for trunk? > > Thanks, > Vlad > > gcc/ > Changelog for gcc/Changelog > 2018-09-11 Vlad Lazar <vlad.lazar@arm.com> > > * haifa-sched.c (rank_for_schedule): Schedule by INSN_COST. > (rfs_decision): New scheduling decision. OK. jeff
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 4f0221f6f43..3095e0375b5 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -2542,7 +2542,7 @@ enum rfs_decision { RFS_SCHED_GROUP, RFS_PRESSURE_DELAY, RFS_PRESSURE_TICK, RFS_FEEDS_BACKTRACK_INSN, RFS_PRIORITY, RFS_SPECULATION, RFS_SCHED_RANK, RFS_LAST_INSN, RFS_PRESSURE_INDEX, - RFS_DEP_COUNT, RFS_TIE, RFS_FUSION, RFS_N }; + RFS_DEP_COUNT, RFS_TIE, RFS_FUSION, RFS_COST, RFS_N }; /* Corresponding strings for print outs. */ static const char *rfs_str[RFS_N] = { @@ -2550,7 +2550,7 @@ static const char *rfs_str[RFS_N] = { "RFS_SCHED_GROUP", "RFS_PRESSURE_DELAY", "RFS_PRESSURE_TICK", "RFS_FEEDS_BACKTRACK_INSN", "RFS_PRIORITY", "RFS_SPECULATION", "RFS_SCHED_RANK", "RFS_LAST_INSN", "RFS_PRESSURE_INDEX", - "RFS_DEP_COUNT", "RFS_TIE", "RFS_FUSION" }; + "RFS_DEP_COUNT", "RFS_TIE", "RFS_FUSION", "RFS_COST" }; /* Statistical breakdown of rank_for_schedule decisions. */ struct rank_for_schedule_stats_t { unsigned stats[RFS_N]; }; @@ -2803,6 +2803,14 @@ rank_for_schedule (const void *x, const void *y) if (flag_sched_dep_count_heuristic && val != 0) return rfs_result (RFS_DEP_COUNT, val, tmp, tmp2); + /* Sort by INSN_COST rather than INSN_LUID. This means that instructions + which take longer to execute are prioritised and it leads to more + dual-issue opportunities on in-order cores which have this feature. */ + + if (INSN_COST (tmp) != INSN_COST (tmp2)) + return rfs_result (RFS_COST, INSN_COST (tmp2) - INSN_COST (tmp), + tmp, tmp2); + /* If insns are equally good, sort by INSN_LUID (original insn order), so that we make the sort stable. This minimizes instruction movement, thus minimizing sched's effect on debugging and cross-jumping. */