Patchwork [i386] Allow cltd/cqto etc on modern CPUs

Submitter Xinliang David Li
Date Dec. 1, 2012, 5:50 a.m.
Message ID <CAAkRFZJv-CUKDju7QT2ngDFT50BpU6w7fwPQXuiZfAHycG74dw@mail.gmail.com>
Permalink /patch/203112/
State New

Comments

Xinliang David Li - Dec. 1, 2012, 5:50 a.m.
Compiling the following code with -O2:

typedef unsigned long ulong;
typedef __SIZE_TYPE__ size_t;
long woo_i(long a, long b) { return a/b; }

GCC generates:

.LFB0:
        .cfi_startproc
        movq    %rdi, %rdx
        movq    %rdi, %rax
        sarq    $63, %rdx
        idivq   %rsi
        ret

but both ICC and LLVM generate a smaller and faster version:

        movq      %rdi, %rax
        cqto
        idivq     %rsi
        ret

For reference, see
http://www.agner.org/optimize/instruction_tables.pdf.  On Pentium, the
latency of the sign-extension instruction (cltd) is 3 cycles, while on
modern CPUs it is a single uop with 1-cycle latency.
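
To make the equivalence of the two sequences concrete, here is a minimal C
sketch of the value each one leaves in %rdx before idivq. The function names
are hypothetical and only illustrate the idea; the sketch also assumes that
right-shifting a negative signed value is an arithmetic shift, which is what
GCC does but which the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* High half as produced by "movq %rdi, %rdx; sarq $63, %rdx".
   Assumes >> on a negative int64_t is an arithmetic shift (true for GCC). */
static int64_t high_half_sar(int64_t a)
{
    return a >> 63;
}

/* High half as cqto defines it: every bit of %rdx is a copy of the
   sign bit of %rax, so the result is either 0 or -1. */
static int64_t high_half_cqto(int64_t a)
{
    return a < 0 ? -1 : 0;
}
```

Both helpers return 0 for non-negative inputs and -1 for negative ones, which
is why the single cqto can replace the movq/sarq pair.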

The proposed patch below fixes the problem. Note that on Atom only the
CWD instruction is slow, with a 5-cycle latency; the rest of the sign
extension instructions are fast. The fix for Atom needs finer-grained
control and can be done separately.

Ok to install after testing?


2010-11-30  Xinliang David Li  <davidxl@google.com>

        * config/i386/i386.c: Allow sign extend instructions (cltd etc)
        on modern CPUs.


thanks,

David

Steven Bosscher - Dec. 2, 2012, 12:08 a.m.
On Sat, Dec 1, 2012 at 6:50 AM, Xinliang David Li wrote:
> 2010-11-30  Xinliang David Li  <>
>
>         * config/i386/i386.c: Allow sign extend instructions (cltd etc)
>         on modern CPUs.

You installed the patch without the ChangeLog entry...
(http://gcc.gnu.org/ml/gcc-cvs/2012-12/msg00027.html)

Ciao!
Steven

Xinliang David Li - Dec. 2, 2012, 1:12 a.m.
Fixed.

thanks,

David


Patch

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 193861)
+++ config/i386/i386.c (working copy)
@@ -1822,7 +1822,7 @@  static unsigned int initial_ix86_tune_fe
   m_K6,

   /* X86_TUNE_USE_CLTD */
-  ~(m_PENT | m_CORE2I7 | m_ATOM | m_K6 | m_GENERIC),
+  ~(m_PENT | m_ATOM | m_K6),

   /* X86_TUNE_USE_XCHGB: Use xchgb %rh,%rl instead of rolw/rorw $8,rx.  */
   m_PENT4,
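
As a rough illustration of what the one-line change does, the sketch below
models a tuning entry as a bitmask with one bit per processor, where a feature
is enabled for a CPU when that CPU's bit is set in the entry. The bit
assignments, the M_* macro names, and the tune_enabled helper are all
hypothetical and are not the definitions used in i386.c; only the shape of the
two masks is taken from the patch.

```c
/* Hypothetical processor bits; the real m_* values in i386.c differ. */
#define M_PENT     (1u << 0)
#define M_CORE2I7  (1u << 1)
#define M_ATOM     (1u << 2)
#define M_K6       (1u << 3)
#define M_GENERIC  (1u << 4)

/* X86_TUNE_USE_CLTD before and after the patch: the feature is on for
   every processor whose bit is NOT listed inside the ~( ... ). */
#define USE_CLTD_OLD (~(M_PENT | M_CORE2I7 | M_ATOM | M_K6 | M_GENERIC))
#define USE_CLTD_NEW (~(M_PENT | M_ATOM | M_K6))

/* Nonzero if the feature mask enables the feature for the given CPU bit. */
static int tune_enabled(unsigned feature_mask, unsigned cpu_bit)
{
    return (feature_mask & cpu_bit) != 0;
}
```

Under this model, dropping m_CORE2I7 and m_GENERIC from the excluded set turns
cltd/cqto on for those tunings, while Pentium, Atom and K6 stay excluded.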