[i386] : Correctly handle maximum size of stringop algorithm in decide_alg

Message ID CAFULd4YNwj+fT_=SKc06NLR0RxN-UGPpB2hT25N0niFvVkTCFg@mail.gmail.com
State New

Commit Message

Uros Bizjak June 2, 2014, 9:12 p.m. UTC
Hello!

A problem in the decide_alg subroutine was uncovered by the
i386/memcpy-[23] testcases when compiled with -march=corei7 -mtune=intel
-m32 [1]. Although the max size of the transfer was known, the memcpy
was not inlined, even though the testcases expect it to be.

The core of the problem can be seen in the definition of the 32-bit
intel_memcpy stringop alg:

  {libcall, {{11, loop, false}, {-1, rep_prefix_4_byte, false}}},

Please note that the last algorithm sets its maximum size to -1,
"unlimited". However, in decide_alg, the same number also signals that
no algorithm has set a size, so expected_size is never calculated. The
loop that sets the maximal size for a user-defined algorithm assumes
that "-1" belongs exclusively to libcall, which is not the case in the
above intel_memcpy definition:

      if (candidate != libcall && candidate && usable)
        max = algs->size[i].max;

When the last non-libcall algorithm sets its maximum to "-1" (aka
"unlimited"), this value fails the following test:

  if (max > 1 && (unsigned HOST_WIDE_INT) max >= max_size

and expected_size is never calculated.

The attached patch fixes this oversight, so that "-1" means unlimited
size and "0" means that the size was never set. The patch also considers
these two special values when choosing a maximum size for the dynamic
check.

2014-06-02  Uros Bizjak  <ubizjak@gmail.com>

    * config/i386/i386.c (decide_alg): Correctly handle maximum size of
    stringop algorithm.

The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}, also with
RUNTESTFLAGS="--target_board=unix/-march=corei7/-mtune=intel\{,-m32\}",
where it fixes both memcpy failures from [1].

[1] https://gcc.gnu.org/ml/gcc-testresults/2014-06/msg00127.html

Jan, can you please review the patch, to check if the logic is OK?

Uros.

Patch

Index: fuse-caller-save.c
===================================================================
--- fuse-caller-save.c	(revision 211112)
+++ fuse-caller-save.c	(working copy)
@@ -1,5 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fuse-caller-save" } */
+/* { dg-additional-options "-mregparm=1" { target ia32 } } */
+
 /* Testing -fuse-caller-save optimization option.  */
 
 static int __attribute__((noinline))
Index: sibcall-1.c
===================================================================
--- sibcall-1.c	(revision 211112)
+++ sibcall-1.c	(working copy)
@@ -1,5 +1,4 @@ 
-/* { dg-do compile } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 extern int (*foo)(int);
Index: sibcall-2.c
===================================================================
--- sibcall-2.c	(revision 211118)
+++ sibcall-2.c	(working copy)
@@ -1,5 +1,4 @@ 
-/* { dg-do compile { xfail { *-*-* } } } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 extern int doo1 (int);
@@ -13,4 +12,4 @@ 
   return (a < 0 ? doo1 : doo2) (a);
 }
 
-/* { dg-final { scan-assembler-not "call\[ \t\]*.%eax" } } */
+/* { dg-final { scan-assembler-not "call\[ \t\]*.%eax" { xfail *-*-* } } } */
Index: sibcall-3.c
===================================================================
--- sibcall-3.c	(revision 211118)
+++ sibcall-3.c	(working copy)
@@ -1,5 +1,4 @@ 
-/* { dg-do compile } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 extern 
Index: sibcall-4.c
===================================================================
--- sibcall-4.c	(revision 211118)
+++ sibcall-4.c	(working copy)
@@ -1,6 +1,5 @@ 
 /* Testcase for PR target/46219.  */
-/* { dg-do compile { xfail { *-*-* } } } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile  { target ia32 } } */
 /* { dg-options "-O2" } */
 
 typedef void (*dispatch_t)(long offset);
@@ -12,4 +11,4 @@ 
   dispatch[offset](offset);
 }
 
-/* { dg-final { scan-assembler-not "jmp\[ \t\]*.%eax" } } */
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*.%eax" { xfail *-*-* } } } */
Index: sibcall-5.c
===================================================================
--- sibcall-5.c	(revision 211112)
+++ sibcall-5.c	(working copy)
@@ -1,6 +1,5 @@ 
 /* Check that indirect sibcalls understand regparm.  */
-/* { dg-do run } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do run { target ia32 } } */
 /* { dg-options "-O2" } */
 
 extern void abort (void);
Index: sibcall-6.c
===================================================================
--- sibcall-6.c	(revision 211118)
+++ sibcall-6.c	(working copy)
@@ -1,5 +1,4 @@ 
-/* { dg-do compile } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 typedef void *ira_loop_tree_node_t;