Message ID | 076a3744-f608-6f31-7244-2bf7ab06cdb1@yahoo.co.jp |
---|---|
State | New |
Series | ifcvt.cc: Prevent excessive if-conversion for conditional moves |
Hi,

> On optimizing for speed, default_noce_conversion_profitable_p() allows
> plenty of headroom, so this patch has little impact.
>
> Also, if the target-specific cost estimate is accurate or allows for
> margins, the impact should be similarly small.

I believe this part of ifcvt does/did not use the costing on purpose.
It will generally convert more sequences than other paths that compare
before and after costs since we just count the number of converted
insns comparing them against the "branch costs". Similar to rtx costs
they are kind of relative to a single insn but AFAIK it's not used
consistently everywhere. All the major platforms have low branch costs
nowadays (0 or 1?) thus we won't emit too many conditional moves here.

In general I agree that we should compare costs everywhere and not just
count (the costing should include the branch costs as well) but this would
be a major overhaul. For your case (assuming xtensa), could you not
tune xtensa_branch_cost? It is currently 3 allowing up to 4 conditional
moves to be generated. optimize_function_for_speed_p is already being
passed to the hook so you could make use of that and decrease branch
costs when optimizing for size only.

Regards
 Robin
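The count-against-branch-cost check described above can be illustrated with a small standalone model. This is only a sketch, not GCC code: `toy_branch_cost`, `toy_max_cond_moves` and `toy_count_allows_conversion` are hypothetical names, the "limit = branch cost + 1" rule is inferred from the "3 allowing up to 4" remark in the mail, and the size-tuned value of 1 is purely illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a target branch-cost hook.  The value 3 for
   speed is the figure quoted in the thread; the value 1 for size is an
   illustrative assumption, not what any port actually uses.  */
static int
toy_branch_cost (bool speed_p)
{
  return speed_p ? 3 : 1;
}

/* Assumed rule: up to (branch cost + 1) conditional moves are allowed,
   matching "currently 3 allowing up to 4" from the mail.  */
static int
toy_max_cond_moves (bool speed_p)
{
  int cost = toy_branch_cost (speed_p);
  assert (cost >= 0);
  return cost + 1;
}

/* Count-based check: no before/after cost comparison, only the number
   of conditional moves measured against the branch-cost limit.  */
static bool
toy_count_allows_conversion (int n_cond_moves, bool speed_p)
{
  return n_cond_moves <= toy_max_cond_moves (speed_p);
}
```

In this model, lowering the branch cost for size is the only lever that shrinks the number of conditional moves emitted, which is the tuning suggested in the mail; the cost of the converted sequence itself is never consulted.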
On 2023/01/11 17:02, Robin Dapp wrote:
> Hi,

Hi!

>
>> On optimizing for speed, default_noce_conversion_profitable_p() allows
>> plenty of headroom, so this patch has little impact.
>>
>> Also, if the target-specific cost estimate is accurate or allows for
>> margins, the impact should be similarly small.
> I believe this part of ifcvt does/did not use the costing on purpose.
> It will generally convert more sequences than other paths that compare
> before and after costs since we just count the number of converted
> insns comparing them against the "branch costs". Similar to rtx costs
> they are kind of relative to a single insn but AFAIK it's not used
> consistently everywhere. All the major platforms have low branch costs
> nowadays (0 or 1?) thus we won't emit too many conditional moves here.
>
> In general I agree that we should compare costs everywhere and not just
> count (the costing should include the branch costs as well) but this would
> be a major overhaul. For your case (assuming xtensa), could you not
> tune xtensa_branch_cost? It is currently 3 allowing up to 4 conditional
> moves to be generated. optimize_function_for_speed_p is already being
> passed to the hook so you could make use of that and decrease branch
> costs when optimizing for size only.
>
> Regards
>  Robin

Thank you for your detailed explanation.

In my case (for Xtensa), the cost of branching isn't really an issue.

The actual problem, I think, is the cost of the sequence itself before
and after conversion. ifcvt's internal estimation is based on
PATTERN(insn), so the instruction lengths (the "length" attribute)
associated with insns are not well reflected. This is especially
noticeable when optimizing for size, where the original cost is
overestimated.

Currently, in addition to the patch, I have implemented the following
code, and I can confirm that it works reasonably well (fine-tuning is
still required).
/* Return true if the instruction sequence seq is a good candidate as a
   replacement for the if-convertible sequence described in if_info.  */

static bool
xtensa_noce_conversion_profitable_p (rtx_insn *seq,
                                     struct noce_if_info *if_info)
{
  unsigned int cost, original_cost;
  bool speed_p;
  rtx_insn *insn;

  speed_p = if_info->speed_p;  /* of TEST_BB */

  /* Estimate the cost for the replacing sequence.  */
  cost = 0;
  for (insn = seq; insn; insn = NEXT_INSN (insn))
    if (active_insn_p (insn))
      cost += xtensa_insn_cost (insn, speed_p);

  /* Short circuit and margins if optimizing for speed.  */
  if (speed_p)
    return cost <= if_info->max_seq_cost;

  /* Estimate the cost for the original sequence if optimizing for
     size.  */
  original_cost = xtensa_insn_cost (if_info->jump, speed_p);
  speed_p = optimize_bb_for_speed_p (if_info->then_bb);
  FOR_BB_INSNS (if_info->then_bb, insn)
    if (active_insn_p (insn))
      original_cost += xtensa_insn_cost (insn, speed_p);
  if (if_info->else_bb)
    {
      speed_p = optimize_bb_for_speed_p (if_info->else_bb);
      FOR_BB_INSNS (if_info->else_bb, insn)
        if (active_insn_p (insn))
          original_cost += xtensa_insn_cost (insn, speed_p);
    }

  return cost <= original_cost;
}
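The decision made by the hook above can be tried outside the compiler with a small standalone model. This is a sketch under stated assumptions, not the Xtensa implementation: `struct toy_if_info`, `toy_sum` and `toy_conversion_profitable_p` are hypothetical names, and per-insn costs are passed in as plain arrays instead of being computed from insns by `xtensa_insn_cost`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct noce_if_info: every insn
   is represented only by its precomputed cost.  */
struct toy_if_info
{
  bool speed_p;               /* optimizing the test block for speed?  */
  unsigned max_seq_cost;      /* margin used when optimizing for speed  */
  unsigned jump_cost;         /* cost of the conditional jump  */
  const unsigned *then_costs; /* costs of the insns in the THEN block  */
  size_t then_len;
  const unsigned *else_costs; /* costs of the ELSE block, or NULL  */
  size_t else_len;
};

/* Sum LEN per-insn costs.  */
static unsigned
toy_sum (const unsigned *costs, size_t len)
{
  unsigned total = 0;
  for (size_t i = 0; i < len; i++)
    total += costs[i];
  return total;
}

/* For speed, compare the converted sequence against the allowed margin.
   For size, compare it against the jump plus the original blocks.  */
static bool
toy_conversion_profitable_p (const unsigned *seq_costs, size_t seq_len,
                             const struct toy_if_info *info)
{
  assert (info != NULL);
  unsigned cost = toy_sum (seq_costs, seq_len);

  if (info->speed_p)
    return cost <= info->max_seq_cost;

  unsigned original_cost
    = info->jump_cost + toy_sum (info->then_costs, info->then_len);
  if (info->else_costs)
    original_cost += toy_sum (info->else_costs, info->else_len);

  return cost <= original_cost;
}
```

With a jump of cost 3 and a THEN block of two insns costing 2 each, the model accepts a converted sequence of total cost 6 when optimizing for size and rejects one of total cost 9, while the speed path accepts both as long as they fit within `max_seq_cost`.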
On 1/10/23 21:20, Takayuki 'January June' Suwa via Gcc-patches wrote:
> Currently, cond_move_process_if_block() does the conversion without
> balancing the cost of the converted sequence with the original one, but
> this should be checked by calling targetm.noce_conversion_profitable_p().
>
> Doing so allows us to provide a way based on the target-specific cost
> estimate, to prevent unwanted size growth due to excessive conditional
> moves on optimizing for size.
>
> On optimizing for speed, default_noce_conversion_profitable_p() allows
> plenty of headroom, so this patch has little impact.
>
> Also, if the target-specific cost estimate is accurate or allows for
> margins, the impact should be similarly small.
>
> gcc/ChangeLog:
>
> 	* ifcvt.cc (cond_move_process_if_block):
> 	Consider the result of targetm.noce_conversion_profitable_p()
> 	when replacing the original sequence with the converted one.

This is OK for gcc-14 when stage1 opens. The only way I see including
it in gcc-13 would be if it fixes a regression.

jeff
On 1/10/23 21:20, Takayuki 'January June' Suwa via Gcc-patches wrote:
> Currently, cond_move_process_if_block() does the conversion without
> balancing the cost of the converted sequence with the original one, but
> this should be checked by calling targetm.noce_conversion_profitable_p().
>
> Doing so allows us to provide a way based on the target-specific cost
> estimate, to prevent unwanted size growth due to excessive conditional
> moves on optimizing for size.
>
> On optimizing for speed, default_noce_conversion_profitable_p() allows
> plenty of headroom, so this patch has little impact.
>
> Also, if the target-specific cost estimate is accurate or allows for
> margins, the impact should be similarly small.
>
> gcc/ChangeLog:
>
> 	* ifcvt.cc (cond_move_process_if_block):
> 	Consider the result of targetm.noce_conversion_profitable_p()
> 	when replacing the original sequence with the converted one.

Thanks. I pushed this to the trunk.

Jeff
diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 008796838f7..a896e14bb3c 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -4350,7 +4350,7 @@ cond_move_process_if_block (struct noce_if_info *if_info)
       goto done;
     }
   seq = end_ifcvt_sequence (if_info);
-  if (!seq)
+  if (!seq || !targetm.noce_conversion_profitable_p (seq, if_info))
     goto done;
 
   loc_insn = first_active_insn (then_bb);