diff mbox series

[x86_64] Correct insn_cost of movabsq.

Message ID 007701daac5a$d94d0af0$8be720d0$@nextmovesoftware.com
State New
Headers show
Series [x86_64] Correct insn_cost of movabsq. | expand

Commit Message

Roger Sayle May 22, 2024, 3:15 p.m. UTC
This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
which considers an instruction loading a 64-bit constant to be significantly
cheaper than loading a 32-bit (or smaller) constant.

Consider the two functions:
unsigned long long foo() { return 0x0123456789abcdefULL; }
unsigned int bar() { return 10; }

and the corresponding lines from combine's dump file:
  insn_cost 1 for     #: r98:DI=0x123456789abcdef
  insn_cost 4 for     #: ax:SI=0xa

The same issue can be seen in -dP assembler output.
  movabsq $81985529216486895, %rax    # 5  [c=1 l=10]  *movdi_internal/4

The problem is that pattern_costs interpretation of rtx_costs contains
"return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for
example a register or small immediate constant) is considered special,
and equivalent to a single instruction, but all other values are treated
as verbatim.  Hence to make x86_64's 10-byte long movabsq instruction
slightly more expensive than a simple constant, rtx_costs needs to
return COSTS_N_INSNS(1)+1 and not 1.  With this change, the insn_cost
of movabsq is the intended value 5:
  insn_cost 5 for     #: r98:DI=0x123456789abcdef
and
  movabsq $81985529216486895, %rax    # 5  [c=5 l=10]  *movdi_internal/4


[I'd originally tried fixing this by adding a ix86_insn_cost target
hook, but the testsuite is very sensitive to the costing of insns].


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2024-05-22  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>:
        A CONST_INT that isn't x86_64_immediate_operand requires an extra
        (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.


Thanks in advance,
Roger
--

Comments

Uros Bizjak May 22, 2024, 3:29 p.m. UTC | #1
On Wed, May 22, 2024 at 5:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
> This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
> which considers an instruction loading a 64-bit constant to be significantly
> cheaper than loading a 32-bit (or smaller) constant.
>
> Consider the two functions:
> unsigned long long foo() { return 0x0123456789abcdefULL; }
> unsigned int bar() { return 10; }
>
> and the corresponding lines from combine's dump file:
>   insn_cost 1 for     #: r98:DI=0x123456789abcdef
>   insn_cost 4 for     #: ax:SI=0xa
>
> The same issue can be seen in -dP assembler output.
>   movabsq $81985529216486895, %rax    # 5  [c=1 l=10]  *movdi_internal/4
>
> The problem is that pattern_costs interpretation of rtx_costs contains
> "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for
> example a register or small immediate constant) is considered special,
> and equivalent to a single instruction, but all other values are treated
> as verbatim.  Hence to make x86_64's 10-byte long movabsq instruction
> slightly more expensive than a simple constant, rtx_costs needs to
> return COSTS_N_INSNS(1)+1 and not 1.  With this change, the insn_cost
> of movabsq is the intended value 5:
>   insn_cost 5 for     #: r98:DI=0x123456789abcdef
> and
>   movabsq $81985529216486895, %rax    # 5  [c=5 l=10]  *movdi_internal/4
>
>
> [I'd originally tried fixing this by adding a ix86_insn_cost target
> hook, but the testsuite is very sensitive to the costing of insns].
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2024-05-22  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>:
>         A CONST_INT that isn't x86_64_immediate_operand requires an extra
>         (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.

1 of 20,796

[x86_64 PATCH] Correct insn_cost of movabsq.

Inbox

Roger Sayle

5:15 PM (12 minutes ago)


to gcc-patches, me
This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
which considers an instruction loading a 64-bit constant to be significantly
cheaper than loading a 32-bit (or smaller) constant.

Consider the two functions:
unsigned long long foo() { return 0x0123456789abcdefULL; }
unsigned int bar() { return 10; }

and the corresponding lines from combine's dump file:
  insn_cost 1 for     #: r98:DI=0x123456789abcdef
  insn_cost 4 for     #: ax:SI=0xa

The same issue can be seen in -dP assembler output.
  movabsq $81985529216486895, %rax    # 5  [c=1 l=10]  *movdi_internal/4

The problem is that pattern_costs interpretation of rtx_costs contains
"return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for
example a register or small immediate constant) is considered special,
and equivalent to a single instruction, but all other values are treated
as verbatim.  Hence to make x86_64's 10-byte long movabsq instruction
slightly more expensive than a simple constant, rtx_costs needs to
return COSTS_N_INSNS(1)+1 and not 1.  With this change, the insn_cost
of movabsq is the intended value 5:
  insn_cost 5 for     #: r98:DI=0x123456789abcdef
and
  movabsq $81985529216486895, %rax    # 5  [c=5 l=10]  *movdi_internal/4


[I'd originally tried fixing this by adding a ix86_insn_cost target
hook, but the testsuite is very sensitive to the costing of insns].


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2024-05-22  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>:
        A CONST_INT that isn't x86_64_immediate_operand requires an extra
        (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.


Thanks in advance,
Roger
--


One attachment • Scanned by Gmail


Roger Sayle (nextmovesoftware.com), gcc-patches@gcc.gnu.org


On Wed, May 22, 2024 at 5:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
> This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
> which considers an instruction loading a 64-bit constant to be significantly
> cheaper than loading a 32-bit (or smaller) constant.
>
> Consider the two functions:
> unsigned long long foo() { return 0x0123456789abcdefULL; }
> unsigned int bar() { return 10; }
>
> and the corresponding lines from combine's dump file:
>   insn_cost 1 for     #: r98:DI=0x123456789abcdef
>   insn_cost 4 for     #: ax:SI=0xa
>
> The same issue can be seen in -dP assembler output.
>   movabsq $81985529216486895, %rax    # 5  [c=1 l=10]  *movdi_internal/4
>
> The problem is that pattern_costs interpretation of rtx_costs contains
> "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for
> example a register or small immediate constant) is considered special,
> and equivalent to a single instruction, but all other values are treated
> as verbatim.  Hence to make x86_64's 10-byte long movabsq instruction
> slightly more expensive than a simple constant, rtx_costs needs to
> return COSTS_N_INSNS(1)+1 and not 1.  With this change, the insn_cost
> of movabsq is the intended value 5:
>   insn_cost 5 for     #: r98:DI=0x123456789abcdef
> and
>   movabsq $81985529216486895, %rax    # 5  [c=5 l=10]  *movdi_internal/4
>
>
> [I'd originally tried fixing this by adding a ix86_insn_cost target
> hook, but the testsuite is very sensitive to the costing of insns].
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2024-05-22  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>:
>         A CONST_INT that isn't x86_64_immediate_operand requires an extra
>         (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.

OK, with a small comment added.

Thanks,
Uros.

> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index b4838b7..b4a9519 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -21569,7 +21569,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
>        if (x86_64_immediate_operand (x, VOIDmode))
>   *total = 0;
>       else
> - *total = 1;
> + *total = COSTS_N_INSNS (1) + 1;
>        return true;

Please add a small comment that this cost belongs to movabs.
Richard Biener May 22, 2024, 5:54 p.m. UTC | #2
> Am 22.05.2024 um 17:30 schrieb Uros Bizjak <ubizjak@gmail.com>:
> 
> On Wed, May 22, 2024 at 5:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>> 
>> This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
>> which considers an instruction loading a 64-bit constant to be significantly
>> cheaper than loading a 32-bit (or smaller) constant.
>> 
>> Consider the two functions:
>> unsigned long long foo() { return 0x0123456789abcdefULL; }
>> unsigned int bar() { return 10; }
>> 
>> and the corresponding lines from combine's dump file:
>>  insn_cost 1 for     #: r98:DI=0x123456789abcdef
>>  insn_cost 4 for     #: ax:SI=0xa
>> 
>> The same issue can be seen in -dP assembler output.
>>  movabsq $81985529216486895, %rax    # 5  [c=1 l=10]  *movdi_internal/4
>> 
>> The problem is that pattern_costs interpretation of rtx_costs contains
>> "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for
>> example a register or small immediate constant) is considered special,
>> and equivalent to a single instruction, but all other values are treated
>> as verbatim.

A zero cost is interpreted as „not implemented“ and assigned a cost of 1, assuming a COSTS_N_INSNS basing.
IMO a bit bogus but I didn’t dare to argue further with Segher.

Richard 


>>  Hence to make x86_64's 10-byte long movabsq instruction
>> slightly more expensive than a simple constant, rtx_costs needs to
>> return COSTS_N_INSNS(1)+1 and not 1.  With this change, the insn_cost
>> of movabsq is the intended value 5:
>>  insn_cost 5 for     #: r98:DI=0x123456789abcdef
>> and
>>  movabsq $81985529216486895, %rax    # 5  [c=5 l=10]  *movdi_internal/4
>> 
>> 
>> [I'd originally tried fixing this by adding a ix86_insn_cost target
>> hook, but the testsuite is very sensitive to the costing of insns].
>> 
>> 
>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> and make -k check, both with and without --target_board=unix{-m32}
>> with no new failures.  Ok for mainline?
>> 
>> 
>> 2024-05-22  Roger Sayle  <roger@nextmovesoftware.com>
>> 
>> gcc/ChangeLog
>>        * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>:
>>        A CONST_INT that isn't x86_64_immediate_operand requires an extra
>>        (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.
> 
> 1 of 20,796
> 
> [x86_64 PATCH] Correct insn_cost of movabsq.
> 
> Inbox
> 
> Roger Sayle
> 
> 5:15 PM (12 minutes ago)
> 
> 
> to gcc-patches, me
> This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
> which considers an instruction loading a 64-bit constant to be significantly
> cheaper than loading a 32-bit (or smaller) constant.
> 
> Consider the two functions:
> unsigned long long foo() { return 0x0123456789abcdefULL; }
> unsigned int bar() { return 10; }
> 
> and the corresponding lines from combine's dump file:
>  insn_cost 1 for     #: r98:DI=0x123456789abcdef
>  insn_cost 4 for     #: ax:SI=0xa
> 
> The same issue can be seen in -dP assembler output.
>  movabsq $81985529216486895, %rax    # 5  [c=1 l=10]  *movdi_internal/4
> 
> The problem is that pattern_costs interpretation of rtx_costs contains
> "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for
> example a register or small immediate constant) is considered special,
> and equivalent to a single instruction, but all other values are treated
> as verbatim.  Hence to make x86_64's 10-byte long movabsq instruction
> slightly more expensive than a simple constant, rtx_costs needs to
> return COSTS_N_INSNS(1)+1 and not 1.  With this change, the insn_cost
> of movabsq is the intended value 5:
>  insn_cost 5 for     #: r98:DI=0x123456789abcdef
> and
>  movabsq $81985529216486895, %rax    # 5  [c=5 l=10]  *movdi_internal/4
> 
> 
> [I'd originally tried fixing this by adding a ix86_insn_cost target
> hook, but the testsuite is very sensitive to the costing of insns].
> 
> 
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
> 
> 
> 2024-05-22  Roger Sayle  <roger@nextmovesoftware.com>
> 
> gcc/ChangeLog
>        * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>:
>        A CONST_INT that isn't x86_64_immediate_operand requires an extra
>        (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.
> 
> 
> Thanks in advance,
> Roger
> --
> 
> 
> One attachment • Scanned by Gmail
> 
> 
> Roger Sayle (nextmovesoftware.com), gcc-patches@gcc.gnu.org
> 
> 
>> On Wed, May 22, 2024 at 5:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>> 
>> This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
>> which considers an instruction loading a 64-bit constant to be significantly
>> cheaper than loading a 32-bit (or smaller) constant.
>> 
>> Consider the two functions:
>> unsigned long long foo() { return 0x0123456789abcdefULL; }
>> unsigned int bar() { return 10; }
>> 
>> and the corresponding lines from combine's dump file:
>>  insn_cost 1 for     #: r98:DI=0x123456789abcdef
>>  insn_cost 4 for     #: ax:SI=0xa
>> 
>> The same issue can be seen in -dP assembler output.
>>  movabsq $81985529216486895, %rax    # 5  [c=1 l=10]  *movdi_internal/4
>> 
>> The problem is that pattern_costs interpretation of rtx_costs contains
>> "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for
>> example a register or small immediate constant) is considered special,
>> and equivalent to a single instruction, but all other values are treated
>> as verbatim.  Hence to make x86_64's 10-byte long movabsq instruction
>> slightly more expensive than a simple constant, rtx_costs needs to
>> return COSTS_N_INSNS(1)+1 and not 1.  With this change, the insn_cost
>> of movabsq is the intended value 5:
>>  insn_cost 5 for     #: r98:DI=0x123456789abcdef
>> and
>>  movabsq $81985529216486895, %rax    # 5  [c=5 l=10]  *movdi_internal/4
>> 
>> 
>> [I'd originally tried fixing this by adding a ix86_insn_cost target
>> hook, but the testsuite is very sensitive to the costing of insns].
>> 
>> 
>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> and make -k check, both with and without --target_board=unix{-m32}
>> with no new failures.  Ok for mainline?
>> 
>> 
>> 2024-05-22  Roger Sayle  <roger@nextmovesoftware.com>
>> 
>> gcc/ChangeLog
>>        * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>:
>>        A CONST_INT that isn't x86_64_immediate_operand requires an extra
>>        (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1.
> 
> OK, with a small comment added.
> 
> Thanks,
> Uros.
> 
>> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
>> index b4838b7..b4a9519 100644
>> --- a/gcc/config/i386/i386.cc
>> +++ b/gcc/config/i386/i386.cc
>> @@ -21569,7 +21569,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
>>       if (x86_64_immediate_operand (x, VOIDmode))
>>  *total = 0;
>>      else
>> - *total = 1;
>> + *total = COSTS_N_INSNS (1) + 1;
>>       return true;
> 
> Please add a small comment that this cost belongs to movabs.
diff mbox series

Patch

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index b4838b7..b4a9519 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -21569,7 +21569,7 @@  ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
       if (x86_64_immediate_operand (x, VOIDmode))
 	*total = 0;
      else
-	*total = 1;
+	*total = COSTS_N_INSNS (1) + 1;
       return true;
 
     case CONST_DOUBLE: