
[pushed,RA] : Improve cost calculation of pseudos with equivalences

Message ID fe599ae1-2e91-1273-4876-5051bf2e42b1@redhat.com
State New

Commit Message

Vladimir Makarov Sept. 14, 2023, 3:28 p.m. UTC
I've committed the following patch.  The reason for this patch is 
explained in its commit message.

The patch was successfully bootstrapped and tested on x86-64, aarch64, 
and ppc64le.

Comments

Jeff Law Sept. 17, 2023, 1:49 p.m. UTC | #1
On 9/14/23 09:28, Vladimir Makarov via Gcc-patches wrote:
> I've committed the following patch.  The reason for this patch is 
> explained in its commit message.
> 
> The patch was successfully bootstrapped and tested on x86-64, aarch64, 
> and ppc64le.
> 
> 
> ra-equiv-cost.patch
> 
> commit 3c834d85f2ec42c60995c2b678196a06cb744959
> Author: Vladimir N. Makarov <vmakarov@redhat.com>
> Date:   Thu Sep 14 10:26:48 2023 -0400
> 
>      [RA]: Improve cost calculation of pseudos with equivalences
>      
>      RISCV target developers reported that RA can spill pseudo used in a
>      loop although there are enough registers to assign.  It happens when
>      the pseudo has an equivalence outside the loop and the equivalence is
>      not merged into insns using the pseudo.  IRA sets up that memory cost
>      to zero when the pseudo has an equivalence and it means that the
>      pseudo will be probably spilled.  This approach worked well for i686
>      (different approaches were benchmarked long time ago on spec2k).
>      Although common sense says that the code is wrong and this was
>      confirmed by RISCV developers.
>      
>      I've tried the following patch on I7-9700k and it improved spec17 fp
>      by 1.5% (21.1 vs 20.8) although spec17 int is a bit worse by 0.45%
>      (8.54 vs 8.58).  The average generated code size is practically the
>      same (0.001% difference).
>      
>      In the future we probably need to try more sophisticated cost
>      calculation which should take into account that the equiv can not be
>      combined in usage insns and the costs of reloads because of this.
>      
>      gcc/ChangeLog:
>      
>              * ira-costs.cc (find_costs_and_classes): Decrease memory cost
>              by equiv savings.
Thanks for diving into this!

What's rather strange is when I do an A/B test with this patch on RISC-V 
it appears to be a pretty consistent loss for integer code.  This would 
seem to match your findings on x86 as well.

I still need to dig into it more deeply, but I see higher ALU as well as 
higher load/store traffic.  The load/store traffic in the one case I've 
looked at so far (omnetpp) appears to be prologue/epilogue related. 
Essentially we're using an additional callee saved register on paths 
that don't trigger at runtime.

Jeff

Patch

commit 3c834d85f2ec42c60995c2b678196a06cb744959
Author: Vladimir N. Makarov <vmakarov@redhat.com>
Date:   Thu Sep 14 10:26:48 2023 -0400

    [RA]: Improve cost calculation of pseudos with equivalences
    
    RISCV target developers reported that RA can spill pseudo used in a
    loop although there are enough registers to assign.  It happens when
    the pseudo has an equivalence outside the loop and the equivalence is
    not merged into insns using the pseudo.  IRA sets up that memory cost
    to zero when the pseudo has an equivalence and it means that the
    pseudo will be probably spilled.  This approach worked well for i686
    (different approaches were benchmarked long time ago on spec2k).
    Although common sense says that the code is wrong and this was
    confirmed by RISCV developers.
    
    I've tried the following patch on I7-9700k and it improved spec17 fp
    by 1.5% (21.1 vs 20.8) although spec17 int is a bit worse by 0.45%
    (8.54 vs 8.58).  The average generated code size is practically the
    same (0.001% difference).
    
    In the future we probably need to try more sophisticated cost
    calculation which should take into account that the equiv can not be
    combined in usage insns and the costs of reloads because of this.
    
    gcc/ChangeLog:
    
            * ira-costs.cc (find_costs_and_classes): Decrease memory cost
            by equiv savings.

diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc
index d9e700e8947..8c93ace5094 100644
--- a/gcc/ira-costs.cc
+++ b/gcc/ira-costs.cc
@@ -1947,15 +1947,8 @@  find_costs_and_classes (FILE *dump_file)
 	    }
 	  if (i >= first_moveable_pseudo && i < last_moveable_pseudo)
 	    i_mem_cost = 0;
-	  else if (equiv_savings < 0)
-	    i_mem_cost = -equiv_savings;
-	  else if (equiv_savings > 0)
-	    {
-	      i_mem_cost = 0;
-	      for (k = cost_classes_ptr->num - 1; k >= 0; k--)
-		i_costs[k] += equiv_savings;
-	    }
-
+	  else
+	    i_mem_cost -= equiv_savings;
 	  best_cost = (1 << (HOST_BITS_PER_INT - 2)) - 1;
 	  best = ALL_REGS;
 	  alt_class = NO_REGS;
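
For readers comparing the two rules in the hunk above, the following is a minimal standalone sketch, not code from ira-costs.cc. The struct PseudoCosts and the functions apply_equiv_old, apply_equiv_new, and memory_is_cheapest are hypothetical names introduced only for illustration. The sketch leaves out the moveable-pseudo branch, which the patch does not change, and memory_is_cheapest is a deliberately simplified stand-in for the class selection that follows in find_costs_and_classes.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the per-pseudo values the hunk touches:
// i_mem_cost and the i_costs[] array (one entry per cost class).
struct PseudoCosts {
  int mem_cost;
  std::vector<int> class_costs;
};

// Rule removed by the patch: a positive equiv_savings forces the memory
// cost to zero and makes every register class more expensive; a negative
// one overwrites the memory cost; zero leaves everything unchanged.
void apply_equiv_old(PseudoCosts &p, int equiv_savings) {
  if (equiv_savings < 0) {
    p.mem_cost = -equiv_savings;
  } else if (equiv_savings > 0) {
    p.mem_cost = 0;
    for (int &c : p.class_costs)
      c += equiv_savings;
  }
}

// Rule added by the patch: the memory cost is only reduced by the
// savings, and the register class costs are left alone.
void apply_equiv_new(PseudoCosts &p, int equiv_savings) {
  p.mem_cost -= equiv_savings;
}

// Simplifying assumption: treat "memory is strictly cheaper than the
// cheapest register class" as a proxy for "likely to be spilled".
bool memory_is_cheapest(const PseudoCosts &p) {
  assert(!p.class_costs.empty());
  int best_reg = *std::min_element(p.class_costs.begin(), p.class_costs.end());
  return p.mem_cost < best_reg;
}
```

With made-up inputs of memory cost 4000, class costs {0, 2000}, and equiv_savings 1000, the old rule produces memory cost 0 and class costs {1000, 3000}, so memory looks cheapest. The new rule produces memory cost 3000 with the class costs unchanged, so a register class still looks cheaper. This matches the commit message's description of pseudos with equivalences being pushed toward spilling under the old rule.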