diff mbox

[i386] : Correctly handle maximum size of stringop algorithm in decide_alg

Message ID CAFULd4bFSvUd9h1iVLo7a6t9j3O98Md72dB6LO0N_4GPGkRBTA@mail.gmail.com
State New
Headers show

Commit Message

Uros Bizjak June 9, 2014, 6:52 p.m. UTC
Ping.

On Mon, Jun 2, 2014 at 11:12 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
> A problem was uncovered by -march=corei7 -mtune=intel -m32 with
> i386/memcpy-[23] testcase in decide_alg subroutine [1]. Although the
> max size of the transfer was known, the memcpy was not inlined, as
> expected by the testcase.
>
> The core of the problem can be seen in the definition of 32bit
> intel_memcpy stringop alg:
>
>   {libcall, {{11, loop, false}, {-1, rep_prefix_4_byte, false}}},
>
> Please note that the last algorithm sets its maximum size to -1,
> "unlimited". However, in decide_alg, the same number also signals that
> no algorithm sets its size, so expected_size is never calculated. In
> the loop that sets maximal size for user defined algorithm, it is
> assumed that "-1" belongs exclusively to libcall, which is not the
> case in the above intel_memcpy definition:
>
>       if (candidate != libcall && candidate && usable)
>       max = algs->size[i].max;
>
> When the last non-libcall algorithm sets its maximum to "-1" (aka
> "unlimited"), this value fails following test:
>
>   if (max > 1 && (unsigned HOST_WIDE_INT) max >= max_size
>
> and expected_size is never calculated.
>
> Attached patch fixes this oversight, so "-1" means unlimited size and
> "0" means that size was never set. The patch also considers these two
> special values when choosing a maximum size for dynamic check.
>
> 2014-06-02  Uros Bizjak  <ubizjak@gmail.com>
>
>     * config/i386/i386.c (decide_alg): Correctly handle maximum size of
>     stringop algorithm.
>
> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
> {,-m32}, also with
> RUNTESTFLAGS="--target_board=unix/-march=corei7/-mtune=intel\{,-m32\}",
> where it fixes both memcpy failures from [1].
>
> [1] https://gcc.gnu.org/ml/gcc-testresults/2014-06/msg00127.html
>
> Jan, can you please review the patch, to check if the logic is OK?
>
> Uros.
diff mbox

Patch

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 211140)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@ 
+2014-06-02  Uros Bizjak  <ubizjak@gmail.com>
+
+	* config/i386/i386.c (decide_alg): Correctly handle maximum size of
+	stringop algorithm.
+
 2014-06-02  Marcus Shawcroft  <marcus.shawcroft@arm.com>
 
 	* config/aarch64/aarch64.md (set_fpcr): Drop ISB after FPCR write.
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 211140)
+++ config/i386/i386.c	(working copy)
@@ -23828,7 +23828,7 @@  decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT exp
 {
   const struct stringop_algs * algs;
   bool optimize_for_speed;
-  int max = -1;
+  int max = 0;
   const struct processor_costs *cost;
   int i;
   bool any_alg_usable_p = false;
@@ -23866,7 +23866,7 @@  decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT exp
   /* If expected size is not known but max size is small enough
      so inline version is a win, set expected size into
      the range.  */
-  if (max > 1 && (unsigned HOST_WIDE_INT) max >= max_size
+  if (((max > 1 && (unsigned HOST_WIDE_INT) max >= max_size) || max == -1)
       && expected_size == -1)
     expected_size = min_size / 2 + max_size / 2;
 
@@ -23955,7 +23955,7 @@  decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT exp
             *dynamic_check = 128;
           return loop_1_byte;
         }
-      if (max == -1)
+      if (max <= 0)
 	max = 4096;
       alg = decide_alg (count, max / 2, min_size, max_size, memset,
 			zero_memset, dynamic_check, noalign);