diff mbox

[v3] Optimize strchr to strlen

Message ID AM5PR0802MB2610611B8711E331DEA3A4D983C80@AM5PR0802MB2610.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Wilco Dijkstra Sept. 23, 2016, 2:07 p.m. UTC
After discussion (https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00718.html)
here is the latest version of the strchr patch.  This uses a gimple statement for
the pointer addition so the gsi_prev always points at the strlen call.

Optimize strchr (s, 0) to s + strlen (s).  strchr (s, 0) appears a common
idiom for finding the end of a string, however it is not a very efficient
way of doing so.  Strlen is a much simpler operation which is significantly
faster (eg. on x86 strlen is 50% faster for strings of 8 bytes and about
twice as fast as strchr on strings of 1KB).

OK for commit?

ChangeLog:
2016-09-23  Wilco Dijkstra  <wdijkstr@arm.com>

gcc/
	* gcc/gimple-fold.c (gimple_fold_builtin_strchr):
	New function to optimize strchr (s, 0) to strlen.
	(gimple_fold_builtin): Add BUILT_IN_STRCHR case.

testsuite/
	* gcc/testsuite/gcc.dg/strlenopt-20.c: Update test.
	* gcc/testsuite/gcc.dg/strlenopt-21.c: Likewise.
	* gcc/testsuite/gcc.dg/strlenopt-22.c: Likewise.
	* gcc/testsuite/gcc.dg/strlenopt-22g.c: Likewise.
	* gcc/testsuite/gcc.dg/strlenopt-26.c: Likewise.
	* gcc/testsuite/gcc.dg/strlenopt-5.c: Likewise.
	* gcc/testsuite/gcc.dg/strlenopt-7.c: Likewise.
	* gcc/testsuite/gcc.dg/strlenopt-9.c: Likewise.

--

Comments

Oleg Endo Sept. 23, 2016, 2:10 p.m. UTC | #1
On Fri, 2016-09-23 at 14:07 +0000, Wilco Dijkstra wrote:
> After discussion (https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00718
> .html)
> here is the latest version of the strchr patch.  This uses a gimple
> statement for
> the pointer addition so the gsi_prev always points at the strlen
> call.
> 
> Optimize strchr (s, 0) to s + strlen (s).  strchr (s, 0) appears a
> common
> idiom for finding the end of a string, however it is not a very
> efficient
> way of doing so.  Strlen is a much simpler operation which is
> significantly
> faster (eg. on x86 strlen is 50% faster for strings of 8 bytes and
> about
> twice as fast as strchr on strings of 1KB).
> 
> OK for commit?

Please add PR tree-optimization/61056 to the changelog so that it gets
linked in bugzilla.

Cheers,
Oleg
Richard Biener Sept. 27, 2016, 8:43 a.m. UTC | #2
On Fri, Sep 23, 2016 at 4:10 PM, Oleg Endo <oleg.endo@t-online.de> wrote:
> On Fri, 2016-09-23 at 14:07 +0000, Wilco Dijkstra wrote:
>> After discussion (https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00718
>> .html)
>> here is the latest version of the strchr patch.  This uses a gimple
>> statement for
>> the pointer addition so the gsi_prev always points at the strlen
>> call.
>>
>> Optimize strchr (s, 0) to s + strlen (s).  strchr (s, 0) appears a
>> common
>> idiom for finding the end of a string, however it is not a very
>> efficient
>> way of doing so.  Strlen is a much simpler operation which is
>> significantly
>> faster (eg. on x86 strlen is 50% faster for strings of 8 bytes and
>> about
>> twice as fast as strchr on strings of 1KB).
>>
>> OK for commit?
>
> Please add PR tree-optimization/61056 to the changelog so that it gets
> linked in bugzilla.

Yes.  Also you should strip gcc/  and gcc/testsuite from the filenames
in the changelog entry
as this entries are for gcc/ChangeLog and gcc/testsuite/ChangeLog.

Ok with that changelog changes.

Thanks,
Richard.

> Cheers,
> Oleg
Jason Merrill Sept. 28, 2016, 2:45 p.m. UTC | #3
I think this broke g++.dg/ext/builtin10.C.

Jason
Christophe Lyon Sept. 28, 2016, 5:49 p.m. UTC | #4
On 28 September 2016 at 16:45, Jason Merrill <jason@redhat.com> wrote:
> I think this broke g++.dg/ext/builtin10.C.
>
> Jason

It also broke:
  gc  gcc.c-torture/execute/builtins/strchr.c execution,  -O1
c.c-torture/execute/builtins/strchr.c (execution) on arm* and aarch64*

Christophe
Oleg Endo Sept. 29, 2016, 12:16 p.m. UTC | #5
On Fri, 2016-09-23 at 23:10 +0900, Oleg Endo wrote:
> On Fri, 2016-09-23 at 14:07 +0000, Wilco Dijkstra wrote:
> > 
> > After discussion (https://gcc.gnu.org/ml/gcc-patches/2016-09/msg007
> > 18
> > .html)
> > here is the latest version of the strchr patch.  This uses a gimple
> > statement for
> > the pointer addition so the gsi_prev always points at the strlen
> > call.
> > 
> > Optimize strchr (s, 0) to s + strlen (s).  strchr (s, 0) appears a
> > common
> > idiom for finding the end of a string, however it is not a very
> > efficient
> > way of doing so.  Strlen is a much simpler operation which is
> > significantly
> > faster (eg. on x86 strlen is 50% faster for strings of 8 bytes and
> > about
> > twice as fast as strchr on strings of 1KB).
> > 
> > OK for commit?

> Please add PR tree-optimization/61056 to the changelog so that it
> gets linked in bugzilla.

Notice that the "PR AAA/BBB" markers from the changelog should also be
included in the SVN commit message.  Otherwise the bugzilla commit hook
doesn't notice it, because it looks at the commit messages and not at
the contents of the ChangeLog files.

Cheers,
Oleg
diff mbox

Patch

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index addabb745efc73465f313d1d591aea383f7f4a13..ddf4cf0ae68ef6708377fdb1a2b45575d90da799 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1457,6 +1457,55 @@  gimple_fold_builtin_strncpy (gimple_stmt_iterator *gsi,
   return true;
 }
 
+/* Simplify strchr (str, 0) into str + strlen (str).
+   In general strlen is significantly faster than strchr
+   due to being a simpler operation.  */
+static bool
+gimple_fold_builtin_strchr (gimple_stmt_iterator *gsi)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  tree str = gimple_call_arg (stmt, 0);
+  tree c = gimple_call_arg (stmt, 1);
+  location_t loc = gimple_location (stmt);
+
+  if (optimize_function_for_size_p (cfun))
+    return false;
+
+  if (!integer_zerop (c) || !gimple_call_lhs (stmt))
+    return false;
+
+  tree len;
+  tree strlen_fn = builtin_decl_implicit (BUILT_IN_STRLEN);
+
+  if (!strlen_fn)
+    return false;
+
+  /* Create newstr = strlen (str).  */
+  gimple_seq stmts = NULL;
+  gimple *new_stmt = gimple_build_call (strlen_fn, 1, str);
+  gimple_set_location (new_stmt, loc);
+  if (gimple_in_ssa_p (cfun))
+    len = make_ssa_name (size_type_node);
+  else
+    len = create_tmp_reg (size_type_node);
+  gimple_call_set_lhs (new_stmt, len);
+  gimple_seq_add_stmt_without_update (&stmts, new_stmt);
+
+  /* Create (str p+ strlen (str)).  */
+  new_stmt = gimple_build_assign (gimple_call_lhs (stmt),
+				  POINTER_PLUS_EXPR, str, len);
+  gimple_seq_add_stmt_without_update (&stmts, new_stmt);
+  gsi_replace_with_seq_vops (gsi, stmts);
+  /* gsi now points at the assignment to the lhs, get a
+     stmt iterator to the strlen.
+     ???  We can't use gsi_for_stmt as that doesn't work when the
+     CFG isn't built yet.  */
+  gimple_stmt_iterator gsi2 = *gsi;
+  gsi_prev (&gsi2);
+  fold_stmt (&gsi2);
+  return true;
+}
+
 /* Simplify a call to the strcat builtin.  DST and SRC are the arguments
    to the call.
 
@@ -2898,6 +2947,8 @@  gimple_fold_builtin (gimple_stmt_iterator *gsi)
 					 gimple_call_arg (stmt, 1));
     case BUILT_IN_STRNCAT:
       return gimple_fold_builtin_strncat (gsi);
+    case BUILT_IN_STRCHR:
+      return gimple_fold_builtin_strchr (gsi);
     case BUILT_IN_FPUTS:
       return gimple_fold_builtin_fputs (gsi, gimple_call_arg (stmt, 0),
 					gimple_call_arg (stmt, 1), false);
diff --git a/gcc/testsuite/gcc.dg/strlenopt-20.c b/gcc/testsuite/gcc.dg/strlenopt-20.c
index a83e845c26d88e5acdcabf142f7b319136663488..7b483eaeac1aa47278111a92148a16f00b2aaa2d 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-20.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-20.c
@@ -86,9 +86,9 @@  main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "memcpy \\(" 4 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "strchr \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-21.c b/gcc/testsuite/gcc.dg/strlenopt-21.c
index e22fa9fca9ba14354db2cd5f602283b64bd8dcac..05b85a49dde0a7f5d269174fd4269e40be910dbd 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-21.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-21.c
@@ -57,9 +57,9 @@  main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "memcpy \\(" 3 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "strchr \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-22.c b/gcc/testsuite/gcc.dg/strlenopt-22.c
index aa55f5ebd6a2d4803ee9a7fd60fc538d86f47124..b4ef772f0e59252f10a5419ede6837b3c8ca8265 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-22.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-22.c
@@ -31,9 +31,9 @@  main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen \\(" 3 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 4 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "memcpy \\(" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "strchr \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-22g.c b/gcc/testsuite/gcc.dg/strlenopt-22g.c
index 3e2f8c4633939a0b5ff667fb3671e11cde8fe6ed..9c5d020588f3f338a9acf51d36facfea55810760 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-22g.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-22g.c
@@ -5,9 +5,9 @@ 
 #define USE_GNU
 #include "strlenopt-22.c"
 
-/* { dg-final { scan-tree-dump-times "strlen \\(" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "memcpy \\(" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "strchr \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 1 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-26.c b/gcc/testsuite/gcc.dg/strlenopt-26.c
index 4bd54bef540806ca90d2b9bcfc33eb531991c967..da2f465a5b5003fa5dca05f3a6ee00e97b98b5dd 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-26.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-26.c
@@ -21,4 +21,5 @@  main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-5.c b/gcc/testsuite/gcc.dg/strlenopt-5.c
index 1b006a93045599a75bdb10a39e86ffa59b475c83..a24aea44e8b00ff7b35a907aaa941b4c509642c4 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-5.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-5.c
@@ -48,9 +48,9 @@  main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen \\(" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "memcpy \\(" 2 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "strchr \\(" 2 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-7.c b/gcc/testsuite/gcc.dg/strlenopt-7.c
index 3ae1e2cb3f0d08c8b05107c4e65b67bdb39cf7ab..aa53d7e75254dfe56c93172afc49f95e5b7901e6 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-7.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-7.c
@@ -40,11 +40,11 @@  main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen \\(" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "memcpy \\(" 2 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "strchr \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "\\*r_\[0-9\]* = 0;" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "return 3;" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-9.c b/gcc/testsuite/gcc.dg/strlenopt-9.c
index f78defe6287cee9e9ed4779c70675e91c6541d40..e8ff1023d71268e2067993189d5f62eab37a16e5 100644
--- a/gcc/testsuite/gcc.dg/strlenopt-9.c
+++ b/gcc/testsuite/gcc.dg/strlenopt-9.c
@@ -98,10 +98,10 @@  main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "strlen \\(" 3 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 5 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "memcpy \\(" 6 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "strchr \\(" 3 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "stpcpy \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "return 4;" 1 "optimized" } } */