Make the default rtx_costs of MULT/DIV variants consistent.

Message ID 003901d8742b$a7715db0$f6541910$@nextmovesoftware.com
State New

Commit Message

Roger Sayle May 30, 2022, 1:46 p.m. UTC
GCC's middle-end provides a default cost model for RTL expressions, for
backends that don't specify their own instruction timings.  It can be
summarized as: multiplications cost COSTS_N_INSNS(4), divisions cost
COSTS_N_INSNS(7), and all other operations cost COSTS_N_INSNS(1).
This patch tweaks that definition so that fused-multiply-add (FMA) and
high-part multiplications cost the same as regular multiplications or,
more importantly, aren't (by default) considered less expensive.  Likewise,
the saturating forms of multiplication and division cost the same as the
regular variants.  These values can always be changed by the target, but
the goal is to avoid RTL expansion substituting a suitable operation with
its saturating equivalent because it (accidentally) looks much cheaper.
For example, PR 89845 is about implementing division/modulus via highpart
multiply, which may accidentally look extremely cheap.
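
To make the effect concrete, here is a minimal toy model of the default
rule described above.  It is a sketch, not GCC source: the toy_code enum
and both function names are hypothetical, the multiplier 4 for
multiplication is taken from the summary above, the COSTS_N_INSNS
definition is assumed to match the scaling in rtl.h, and "factor" stands
for the mode size in words.

```c
/* Toy model only; names and the simplified switch are illustrative.  */
#define COSTS_N_INSNS(N) ((N) * 4)  /* assumed to mirror rtl.h */

enum toy_code
{
  TOY_PLUS,
  TOY_MULT,
  TOY_UMUL_HIGHPART,
  TOY_DIV,
  TOY_SS_DIV
};

/* Before the patch: only plain MULT and DIV are special-cased, so
   every other variant falls through to the cheap default.  */
static int
toy_cost_before (enum toy_code code, int factor)
{
  switch (code)
    {
    case TOY_MULT:
      return factor * factor * COSTS_N_INSNS (4);
    case TOY_DIV:
      return factor * factor * COSTS_N_INSNS (7);
    default:
      return factor * COSTS_N_INSNS (1);
    }
}

/* After the patch: the variants share the case labels of the
   operation they are derived from.  */
static int
toy_cost_after (enum toy_code code, int factor)
{
  switch (code)
    {
    case TOY_MULT:
    case TOY_UMUL_HIGHPART:
      return factor * factor * COSTS_N_INSNS (4);
    case TOY_DIV:
    case TOY_SS_DIV:
      return factor * factor * COSTS_N_INSNS (7);
    default:
      return factor * COSTS_N_INSNS (1);
    }
}
```

Under these assumptions a word-sized high-part multiply (factor 1) costs
4 units before the change and 16 after, the same as a plain multiply, so
it no longer looks four times cheaper by accident.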

I believe there should be no code generation changes for this patch,
but of course I'm happy to address any adverse changes on rare targets.
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-05-30  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * rtlanal.cc (rtx_cost) <MULT>: Treat FMA, SS_MULT, US_MULT,
        SMUL_HIGHPART and UMUL_HIGHPART as having the same cost as MULT.
        <DIV>: Likewise, SS_DIV and US_DIV have the same default as DIV.


Thanks in advance,
Roger
--

Comments

Jeff Law May 30, 2022, 2:49 p.m. UTC | #1
On 5/30/2022 7:46 AM, Roger Sayle wrote:
> GCC's middle-end provides a default cost model for RTL expressions, for
> backends that don't specify their own instruction timings, that can be
> summarized as multiplications are COSTS_N_INSNS(4), divisions are
> COSTS_N_INSNS(7) and all other operations are COSTS_N_INSNS(1).
> This patch tweaks the above definition so that fused-multiply-add
> (FMA) and high-part multiplications cost the same as regular
> multiplications, or more importantly aren't (by default) considered
> less expensive.  Likewise
> the saturating forms of multiplication and division cost the same as the
> regular variants.  These values can always be changed by the target, but
> the goal is to avoid RTL expansion substituting a suitable operation with
> its saturating equivalent because it (accidentally) looks much cheaper.
> For example, PR 89845 is about implementing division/modulus via highpart
> multiply, which may accidentally look extremely cheap.
>
> I believe there should be no code generation changes for this patch,
> but of course I'm happy to address any adverse changes on rare targets.
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
>
> 2022-05-30  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>          * rtlanal.cc (rtx_cost) <MULT>: Treat FMA, SS_MULT, US_MULT,
>          SMUL_HIGHPART and UMUL_HIGHPART as having the same cost as MULT.
>          <DIV>: Likewise, SS_DIV and US_DIV have the same default as DIV.
OK.

Jeff
Patch

diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 7c29682..d78cc60 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -4578,6 +4578,11 @@  rtx_cost (rtx x, machine_mode mode, enum rtx_code outer_code,
   switch (code)
     {
     case MULT:
+    case FMA:
+    case SS_MULT:
+    case US_MULT:
+    case SMUL_HIGHPART:
+    case UMUL_HIGHPART:
       /* Multiplication has time-complexity O(N*N), where N is the
 	 number of units (translated from digits) when using
 	 schoolbook long multiplication.  */
@@ -4587,6 +4592,8 @@  rtx_cost (rtx x, machine_mode mode, enum rtx_code outer_code,
     case UDIV:
     case MOD:
     case UMOD:
+    case SS_DIV:
+    case US_DIV:
       /* Similarly, complexity for schoolbook long division.  */
       total = factor * factor * COSTS_N_INSNS (7);
       break;