Patchwork Don't invalidate string length cache when not needed

login
register
mail settings
Submitter Marek Polacek
Date May 16, 2013, 2:07 p.m.
Message ID <20130516140744.GH14240@redhat.com>
Download mbox | patch
Permalink /patch/244331/
State New
Headers show

Comments

Marek Polacek - May 16, 2013, 2:07 p.m.
On Thu, May 16, 2013 at 03:38:47PM +0200, Jakub Jelinek wrote:
> Please add here a comment what it does and why, that if si->length
> is non-zero constant, we know that the character at that spot is
> not '\0' and when storing non-'\0' to that location, we can't affect
> size of any strings at all.  Therefore we do the gsi_next + return false
> to signal caller that it shouldn't invalidate anything.
> 
> Ok with that change, thanks.

Thanks, this what I'll commit later today if no objections.

2013-05-16  Marek Polacek  <polacek@redhat.com>

	* tree-ssa-strlen.c (handle_char_store): Don't invalidate
	cached length when doing non-zero store.

	* gcc.dg/strlenopt-25.c: New test.


	Marek
Jakub Jelinek - May 16, 2013, 2:18 p.m.
On Thu, May 16, 2013 at 04:07:44PM +0200, Marek Polacek wrote:
> --- gcc/tree-ssa-strlen.c.mp	2013-05-15 14:11:20.079707492 +0200
> +++ gcc/tree-ssa-strlen.c	2013-05-16 16:03:50.373504796 +0200
> @@ -1717,6 +1717,27 @@ handle_char_store (gimple_stmt_iterator
>  	    si->endptr = ssaname;
>  	  si->dont_invalidate = true;
>  	}
> +      /* If si->length is non-zero constant, we aren't overwriting '\0',
> +         and if we aren't storing '\0', we know that the length of the

The above line has 8 spaces instead of tab.  Also, please write
"that the length of the string and any other zero terminated string in
memory remains the same."

> +	 string remains the same.  In that case we move to the next
> +	 gimple statement and return to signal the caller that it shouldn't
> +	 invalidate anything.  
> +	 
> +	 This is benefical for cases like:
> +	 
> +	 char p[] = "foobar";
> +	 size_t len = strlen (p);
> +	 p[0] = 'X'

Missing ; at the end of the above line.

> +	 size_t len2 = strlen (p);

Also, you could make it clear that it affects any other strings.  Perhaps
	char p[20];
	void foo (char *q)
	{
	  strcpy (p, "foobar");
	  size_t len = strlen (p);	// This can be optimized into 6
	  size_t len2 = strlen (q);	// This has to be computed
	  p[0] = 'X';
	  size_t len3 = strlen (p);	// This can be optimized into 6
	  size_t len4 = strlen (q);	// This can be optimized into len2
	  bar (len, len2, len3, len4);
	}
As q could point to p, if we didn't do what your patch does on the p[0] = 'X';
store, then we'd need to invalidate the recorded length of the q string.
Similarly if there is p[0] = '\0' or p[0] = var.

	Jakub
Jakub Jelinek - May 16, 2013, 2:28 p.m.
On Thu, May 16, 2013 at 04:18:27PM +0200, Jakub Jelinek wrote:
> As q could point to p, if we didn't do what your patch does on the p[0] = 'X';
> store, then we'd need to invalidate the recorded length of the q string.
> Similarly if there is p[0] = '\0' or p[0] = var.

Ah, another thing while we are at it.
For p[0] = '\0'; case when p[0] is known to be '\0' already, we
remove it if:
              /* When storing '\0', the store can be removed
                 if we know it has been stored in the current function.  */
              if (!stmt_could_throw_p (stmt) && si->writable)
(and in that case don't invalidate anything either).  But the above
condition is false, we set si->writable (correct) and si->dont_invalidate,
while we could do instead of that the same what you do for non-zero store to
non-zero location, i.e. gsi_next (gsi); return false;.
Perhaps a testcase for that is:
size_t
bar (char *p, char *r)
{
  size_t len1 = strlen (r);
  char *q = strchr (p, '\0');
  *q = '\0';
  return len1 - strlen (r); // This strlen should be optimized into len1.
}

strlen (q) should be known to be zero at that point, but si->writable should
be false, we don't know if p doesn't point say into .rodata, and
stmt_could_throw_p probably should return true too.

	Jakub

Patch

--- gcc/tree-ssa-strlen.c.mp	2013-05-15 14:11:20.079707492 +0200
+++ gcc/tree-ssa-strlen.c	2013-05-16 16:03:50.373504796 +0200
@@ -1717,6 +1717,27 @@  handle_char_store (gimple_stmt_iterator
 	    si->endptr = ssaname;
 	  si->dont_invalidate = true;
 	}
+      /* If si->length is non-zero constant, we aren't overwriting '\0',
+         and if we aren't storing '\0', we know that the length of the
+	 string remains the same.  In that case we move to the next
+	 gimple statement and return to signal the caller that it shouldn't
+	 invalidate anything.  
+	 
+	 This is benefical for cases like:
+	 
+	 char p[] = "foobar";
+	 size_t len = strlen (p);
+	 p[0] = 'X'
+	 size_t len2 = strlen (p);
+	 
+	 where we should be able to optimize away the second strlen call.  */
+      else if (si != NULL && si->length != NULL_TREE
+	       && TREE_CODE (si->length) == INTEGER_CST
+	       && integer_nonzerop (gimple_assign_rhs1 (stmt)))
+	{
+	  gsi_next (gsi);
+	  return false;
+	}
     }
   else if (idx == 0 && initializer_zerop (gimple_assign_rhs1 (stmt)))
     {
--- gcc/testsuite/gcc.dg/strlenopt-25.c.mp	2013-05-15 17:15:18.702118637 +0200
+++ gcc/testsuite/gcc.dg/strlenopt-25.c	2013-05-15 18:26:27.881030317 +0200
@@ -0,0 +1,18 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+#include "strlenopt.h"
+
+int
+main ()
+{
+  char p[] = "foobar";
+  int len, len2;
+  len = strlen (p);
+  p[0] = 'O';
+  len2 = strlen (p);
+  return len - len2;
+}
+
+/* { dg-final { scan-tree-dump-times "strlen \\(" 0 "strlen" } } */
+/* { dg-final { cleanup-tree-dump "strlen" } } */