diff mbox series

[Middle-end,version,2] 2nd patch of PR78809 and PR83026

Message ID A4FB5DB3-1280-4155-A20F-B3D03E09D252@oracle.com
State New
Headers show
Series [Middle-end,version,2] 2nd patch of PR78809 and PR83026 | expand

Commit Message

Qing Zhao Dec. 21, 2017, 9:25 p.m. UTC
Hi, 

I updated my patch based on all your comments. 

the major changes are the following:

	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
            in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
            __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
	2. change the call to “get_range_strlen” with “compute_objsize”.
	3. add the missing case for equality checking with zero;
	4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3.
	5. update “uhwi” to “shwi” for where it needs;
	6. some other minor format changes.

the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue.

Okay for trunk?

thanks.

Qing

Please see the updated patch:

gcc/ChangeLog:

+2017-12-21  Qing Zhao  <qing.zhao@oracle.com>
+
+	PR middle-end/78809
+	PR middle-end/83026
+	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
+	and BUILT_IN_STRNCMP_EQ.
+	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
+	BUILT_IN_STRNCMP_EQ.
+	* tree-ssa-strlen.c (compute_string_length): New function.
+	(handle_builtin_string_cmp): New function to handle calls to
+	string compare functions.
+	(strlen_optimize_stmt): Add handling to builtin string compare
+	calls. 
+	* tree.c (build_common_builtin_nodes): Add new defines of
+	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
+
gcc/testsuite/ChangeLog

+2017-12-21  Qing Zhao  <qing.zhao@oracle.com> 
+
+	PR middle-end/78809
+	* gcc.dg/strcmpopt_2.c: New testcase.
+	* gcc.dg/strcmpopt_3.c: New testcase.
+
+	PR middle-end/83026
+	* gcc.dg/strcmpopt_3.c: New testcase.
+

---
 gcc/builtins.c                     |  33 ++++++
 gcc/builtins.def                   |   5 +
 gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++++++
 gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 ++++++
 gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 +++
 gcc/tree-ssa-strlen.c              | 215 +++++++++++++++++++++++++++++++++++++
 gcc/tree.c                         |   8 ++
 7 files changed, 375 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c

Comments

Jeff Law Jan. 12, 2018, 10:47 p.m. UTC | #1
On 12/21/2017 02:25 PM, Qing Zhao wrote:
> Hi, 
> 
> I updated my patch based on all your comments. 
> 
> the major changes are the following:
> 
> 	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
>             in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
>             __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
> 	2. change the call to “get_range_strlen” with “compute_objsize”.
Please read the big comment before compute_objsize.  If you are going to
use it to influence code generation or optimization, then you're most
likely doing something wrong.

compute_objsize can return an estimate in some circumstances.


> 	3. add the missing case for equality checking with zero;
> 	4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3.
> 	5. update “uhwi” to “shwi” for where it needs;
> 	6. some other minor format changes.
> 
> the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue.
> 
> Okay for trunk?
> 
> thanks.
> 
> Qing
> 
> Please see the updated patch:
> 
> gcc/ChangeLog:
> 
> +2017-12-21  Qing Zhao  <qing.zhao@oracle.com>
> +
> +	PR middle-end/78809
> +	PR middle-end/83026
> +	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> +	and BUILT_IN_STRNCMP_EQ.
> +	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> +	BUILT_IN_STRNCMP_EQ.
> +	* tree-ssa-strlen.c (compute_string_length): New function.
> +	(handle_builtin_string_cmp): New function to handle calls to
> +	string compare functions.
> +	(strlen_optimize_stmt): Add handling to builtin string compare
> +	calls. 
> +	* tree.c (build_common_builtin_nodes): Add new defines of
> +	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> +
> gcc/testsuite/ChangeLog
> 
> +2017-12-21  Qing Zhao  <qing.zhao@oracle.com> 
> +
> +	PR middle-end/78809
> +	* gcc.dg/strcmpopt_2.c: New testcase.
> +	* gcc.dg/strcmpopt_3.c: New testcase.
> +
> +	PR middle-end/83026
> +	* gcc.dg/strcmpopt_3.c: New testcase.
What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
ISTM that if you're going to introduce those new builtins, then you have
to audit all the code that runs between their introduction into the IL
and when you expand them to ensure they're handled properly.

All you're really doing is carrying along a status bit about what
tree-ssa-strlen did.  So you could possibly store that status bit somewhere.

The concern with both is that something later invalidates the analysis
you've done.  I'm having a hard time coming up with a case where this
could happen, but I'm definitely concerned about this possibility.
Though I feel it's more likely to happen if we store a status bit vs
using STRCMP_EQ STRNCMP_EQ.

[ For example, we have two calls with the same arguments, but one feeds
an equality test, the other does not.  If we store a status bit that one
could be transformed, but then we end up CSE-ing the two calls, the
status bit would be incorrect because one of the calls did not feed an
equality test.  Hmmm. ]




> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> index 94f20ef..57563ef 100644
> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c
> @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>    return false;
>  }
>  
> +/* Given an index to the strinfo vector, compute the string length for the
> +   corresponding string. Return -1 when unknown.  */
> + 
> +static HOST_WIDE_INT 
> +compute_string_length (int idx)
> +{
> +  HOST_WIDE_INT string_leni = -1; 
> +  gcc_assert (idx != 0);
> +
> +  if (idx < 0)
> +    string_leni = ~idx;
So it seems to me you should just
  return ~idx;

Then appropriately simplify the rest of the code.

> +
> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> +   equality test against zero:
> +
> +   A. When both arguments are constant strings and it's a strcmp:
> +      * if the length of the strings are NOT equal, we can safely fold the call
> +        to a non-zero value.
> +      * otherwise, do nothing now.
I'm guessing your comment needs a bit of work.  If both arguments are
constant strings, then we can just use the host str[n]cmp to fold the
str[n]cmp to a constant.  Presumably this is handled earlier :-)

So what I'm guessing is you're really referring to the case where the
lengths are known constants, even if the contents of the strings
themselves are not.  In that case if its an equality comparison, then
you can fold to a constant.  Otherwise we do nothing.  So I think the
comment needs updating here.






> +  
> +   B. When one of the arguments is constant string, try to replace the call with
> +   a __builtin_str(n)cmp_eq call where possible, i.e:
> +
> +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   constant string, C is a constant.
> +     if (C <= strlen(STR) && sizeof_array(s) > C)
> +       {
> +         replace this call with
> +         strncmp_eq (s, STR, C) (!)= 0
> +       }
> +     if (C > strlen(STR)
> +       {
> +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
> +         can handled by the following strcmp.
> +       }
> +
> +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   constant string.
> +     if  (sizeof_array(s) > strlen(STR))
> +       {
> +         replace this call with
> +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
So I'd hazard a guess that this comment has the same problem.  It's
mixing up the concept of a constant string and the concept of a
nonconstant string, but which has a known constant length.

First, if one of the arguments is a string constant, then it should be
easy to get the constant at expansion time with minimal effort.  It's
also the case that knowing if we're comparing the result against zero is
trivial to figure out at expansion time.  This would probably be a
reasonble thing to add to the expanders.



Your function comment for handle_builtin_string_cmp does not indicate
what the return value means.  From reading the code is appears to return
TRUE when it does nothing and FALSE when it optimizes the call in some
way.  Is that correct?  That seems inverted from the way we'd normally
write this stuff.



> +
> +  /* When one of args is constant string.  */
> +  tree var_string = NULL_TREE;
> +  HOST_WIDE_INT const_string_leni = -1;
> +  
> +  if (idx1)
> +    {
> +      const_string_leni = compute_string_length (idx1);
> +      var_string = arg2;
> +    } 
> +  else 
> +    {
> +      gcc_checking_assert (idx2);
> +      const_string_leni = compute_string_length (idx2);
> +      var_string = arg1;
> +    } 
> +
> +  if (const_string_leni < 0) 
> +    return true;
> + 
> +  unsigned HOST_WIDE_INT var_sizei = 0;
> +  /* try to determine the minimum size of the object pointed by var_string.  */
> +  tree size = compute_objsize (var_string, 2);
So this worries me as well.  compute_objsize is not supposed to be used
to influence code generation or optimization.  It is not guaranteed to
return an exact size, but instead may return an estimate!




I'd really like to hear Jakub or Richi chime in with their thoughts on
how this transforms one builtin into another and whether or not they
think that is wise.

jeff
Richard Biener Jan. 15, 2018, 8:48 a.m. UTC | #2
On Fri, 12 Jan 2018, Jeff Law wrote:

> On 12/21/2017 02:25 PM, Qing Zhao wrote:
> > Hi, 
> > 
> > I updated my patch based on all your comments. 
> > 
> > the major changes are the following:
> > 
> > 	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
> >             in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
> >             __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
> > 	2. change the call to “get_range_strlen” with “compute_objsize”.
> Please read the big comment before compute_objsize.  If you are going to
> use it to influence code generation or optimization, then you're most
> likely doing something wrong.
> 
> compute_objsize can return an estimate in some circumstances.
> 
> 
> > 	3. add the missing case for equality checking with zero;
> > 	4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3.
> > 	5. update “uhwi” to “shwi” for where it needs;
> > 	6. some other minor format changes.
> > 
> > the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue.
> > 
> > Okay for trunk?
> > 
> > thanks.
> > 
> > Qing
> > 
> > Please see the updated patch:
> > 
> > gcc/ChangeLog:
> > 
> > +2017-12-21  Qing Zhao  <qing.zhao@oracle.com>
> > +
> > +	PR middle-end/78809
> > +	PR middle-end/83026
> > +	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> > +	and BUILT_IN_STRNCMP_EQ.
> > +	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> > +	BUILT_IN_STRNCMP_EQ.
> > +	* tree-ssa-strlen.c (compute_string_length): New function.
> > +	(handle_builtin_string_cmp): New function to handle calls to
> > +	string compare functions.
> > +	(strlen_optimize_stmt): Add handling to builtin string compare
> > +	calls. 
> > +	* tree.c (build_common_builtin_nodes): Add new defines of
> > +	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> > +
> > gcc/testsuite/ChangeLog
> > 
> > +2017-12-21  Qing Zhao  <qing.zhao@oracle.com> 
> > +
> > +	PR middle-end/78809
> > +	* gcc.dg/strcmpopt_2.c: New testcase.
> > +	* gcc.dg/strcmpopt_3.c: New testcase.
> > +
> > +	PR middle-end/83026
> > +	* gcc.dg/strcmpopt_3.c: New testcase.
> What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
> ISTM that if you're going to introduce those new builtins, then you have
> to audit all the code that runs between their introduction into the IL
> and when you expand them to ensure they're handled properly.
> 
> All you're really doing is carrying along a status bit about what
> tree-ssa-strlen did.  So you could possibly store that status bit somewhere.
> 
> The concern with both is that something later invalidates the analysis
> you've done.  I'm having a hard time coming up with a case where this
> could happen, but I'm definitely concerned about this possibility.
> Though I feel it's more likely to happen if we store a status bit vs
> using STRCMP_EQ STRNCMP_EQ.
> 
> [ For example, we have two calls with the same arguments, but one feeds
> an equality test, the other does not.  If we store a status bit that one
> could be transformed, but then we end up CSE-ing the two calls, the
> status bit would be incorrect because one of the calls did not feed an
> equality test.  Hmmm. ]
> 
> 
> 
> 
> > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> > index 94f20ef..57563ef 100644
> > --- a/gcc/tree-ssa-strlen.c
> > +++ b/gcc/tree-ssa-strlen.c
> > @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
> >    return false;
> >  }
> >  
> > +/* Given an index to the strinfo vector, compute the string length for the
> > +   corresponding string. Return -1 when unknown.  */
> > + 
> > +static HOST_WIDE_INT 
> > +compute_string_length (int idx)
> > +{
> > +  HOST_WIDE_INT string_leni = -1; 
> > +  gcc_assert (idx != 0);
> > +
> > +  if (idx < 0)
> > +    string_leni = ~idx;
> So it seems to me you should just
>   return ~idx;
> 
> Then appropriately simplify the rest of the code.
> 
> > +
> > +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> > +   equality test against zero:
> > +
> > +   A. When both arguments are constant strings and it's a strcmp:
> > +      * if the length of the strings are NOT equal, we can safely fold the call
> > +        to a non-zero value.
> > +      * otherwise, do nothing now.
> I'm guessing your comment needs a bit of work.  If both arguments are
> constant strings, then we can just use the host str[n]cmp to fold the
> str[n]cmp to a constant.  Presumably this is handled earlier :-)
> 
> So what I'm guessing is you're really referring to the case where the
> lengths are known constants, even if the contents of the strings
> themselves are not.  In that case if its an equality comparison, then
> you can fold to a constant.  Otherwise we do nothing.  So I think the
> comment needs updating here.
> 
> 
> 
> 
> 
> 
> > +  
> > +   B. When one of the arguments is constant string, try to replace the call with
> > +   a __builtin_str(n)cmp_eq call where possible, i.e:
> > +
> > +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
> > +   constant string, C is a constant.
> > +     if (C <= strlen(STR) && sizeof_array(s) > C)
> > +       {
> > +         replace this call with
> > +         strncmp_eq (s, STR, C) (!)= 0
> > +       }
> > +     if (C > strlen(STR)
> > +       {
> > +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
> > +         can handled by the following strcmp.
> > +       }
> > +
> > +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
> > +   constant string.
> > +     if  (sizeof_array(s) > strlen(STR))
> > +       {
> > +         replace this call with
> > +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
> So I'd hazard a guess that this comment has the same problem.  It's
> mixing up the concept of a constant string and the concept of a
> nonconstant string, but which has a known constant length.
> 
> First, if one of the arguments is a string constant, then it should be
> easy to get the constant at expansion time with minimal effort.  It's
> also the case that knowing if we're comparing the result against zero is
> trivial to figure out at expansion time.  This would probably be a
> reasonble thing to add to the expanders.
> 
> 
> 
> Your function comment for handle_builtin_string_cmp does not indicate
> what the return value means.  From reading the code is appears to return
> TRUE when it does nothing and FALSE when it optimizes the call in some
> way.  Is that correct?  That seems inverted from the way we'd normally
> write this stuff.
> 
> 
> 
> > +
> > +  /* When one of args is constant string.  */
> > +  tree var_string = NULL_TREE;
> > +  HOST_WIDE_INT const_string_leni = -1;
> > +  
> > +  if (idx1)
> > +    {
> > +      const_string_leni = compute_string_length (idx1);
> > +      var_string = arg2;
> > +    } 
> > +  else 
> > +    {
> > +      gcc_checking_assert (idx2);
> > +      const_string_leni = compute_string_length (idx2);
> > +      var_string = arg1;
> > +    } 
> > +
> > +  if (const_string_leni < 0) 
> > +    return true;
> > + 
> > +  unsigned HOST_WIDE_INT var_sizei = 0;
> > +  /* try to determine the minimum size of the object pointed by var_string.  */
> > +  tree size = compute_objsize (var_string, 2);
> So this worries me as well.  compute_objsize is not supposed to be used
> to influence code generation or optimization.  It is not guaranteed to
> return an exact size, but instead may return an estimate!

Yes, compute_objsize cannot be used for optimizations.

> I'd really like to hear Jakub or Richi chime in with their thoughts on
> how this transforms one builtin into another and whether or not they
> think that is wise.

Given it follows how we handle memcmp () != 0 it's ok to trigger
inline expansion.  Didn't look into the patch whether it does this.
And yes, places that look at STR[N]CMP now also have to look at
STR[N]CMP_EQ.

We're now in stage4 so please queue this patch and ping it during
next stage1.

Thanks,
Richard.
Qing Zhao Jan. 25, 2018, 8:32 p.m. UTC | #3
Hi, Jeff,

Really sorry for my  delay. 

Your email on 1/12/2018 and Richard’s email on 1/15/2018,   were completely lost in my mailboxes until yesterday my colleague, Paolo Carlini, forwarded it to me. 

I really apologize for the late reply.

Please see my reply below:

> On Jan 12, 2018, at 4:47 PM, Jeff Law <law@redhat.com <mailto:law@redhat.com>> wrote:
>> 
>> 
>> 	1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq;
>>            in builtins.c,  when expanding the new __builtin_str(n)cmp_eq call, expand them first as
>>            __builtin_memcmp_eq, if Not succeed,  change the call back to __builtin_str(n)cmp.
>> 	2. change the call to “get_range_strlen” with “compute_objsize”.
> Please read the big comment before compute_objsize.  If you are going to
> use it to influence code generation or optimization, then you're most
> likely doing something wrong.
> 
> compute_objsize can return an estimate in some circumstances.

> On Jan 15, 2018, at 2:48 AM, Richard Biener <rguenther@suse.de <mailto:rguenther@suse.de>> wrote:
> 
> Yes, compute_objsize cannot be used for optimizations.
Yes, I just checked the compute_objsize again, you are right, it might return an estimated code size.


> On Jan 12, 2018, at 4:47 PM, Jeff Law <law@redhat.com <mailto:law@redhat.com>> wrote:

> What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
> ISTM that if you're going to introduce those new builtins, then you have
> to audit all the code that runs between their introduction into the IL
> and when you expand them to ensure they're handled properly.
> 
> All you're really doing is carrying along a status bit about what
> tree-ssa-strlen did.  So you could possibly store that status bit somewhere.
> 
> The concern with both is that something later invalidates the analysis
> you've done.  I'm having a hard time coming up with a case where this
> could happen, but I'm definitely concerned about this possibility.
> Though I feel it's more likely to happen if we store a status bit vs
> using STRCMP_EQ STRNCMP_EQ.
> 
> [ For example, we have two calls with the same arguments, but one feeds
> an equality test, the other does not.  If we store a status bit that one
> could be transformed, but then we end up CSE-ing the two calls, the
> status bit would be incorrect because one of the calls did not feed an
> equality test.  Hmmm. ]

See below.

>> +static HOST_WIDE_INT 
>> +compute_string_length (int idx)
>> +{
>> +  HOST_WIDE_INT string_leni = -1; 
>> +  gcc_assert (idx != 0);
>> +
>> +  if (idx < 0)
>> +    string_leni = ~idx;
> So it seems to me you should just
>  return ~idx;
> 
> Then appropriately simplify the rest of the code.

Okay, I will update my code per your suggestion. 
> 
>> +
>> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
>> +   equality test against zero:
>> +
>> +   A. When both arguments are constant strings and it's a strcmp:
>> +      * if the length of the strings are NOT equal, we can safely fold the call
>> +        to a non-zero value.
>> +      * otherwise, do nothing now.
> I'm guessing your comment needs a bit of work.  If both arguments are
> constant strings, then we can just use the host str[n]cmp to fold the
> str[n]cmp to a constant.  Presumably this is handled earlier :-)
> 
> So what I'm guessing is you're really referring to the case where the
> lengths are known constants, even if the contents of the strings
> themselves are not.  In that case if its an equality comparison, then
> you can fold to a constant.  Otherwise we do nothing.  So I think the
> comment needs updating here.

your understanding is correct, I will update the comments to make it more accurate.

>> +  
>> +   B. When one of the arguments is constant string, try to replace the call with
>> +   a __builtin_str(n)cmp_eq call where possible, i.e:
>> +
>> +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
>> +   constant string, C is a constant.
>> +     if (C <= strlen(STR) && sizeof_array(s) > C)
>> +       {
>> +         replace this call with
>> +         strncmp_eq (s, STR, C) (!)= 0
>> +       }
>> +     if (C > strlen(STR)
>> +       {
>> +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
>> +         can handled by the following strcmp.
>> +       }
>> +
>> +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
>> +   constant string.
>> +     if  (sizeof_array(s) > strlen(STR))
>> +       {
>> +         replace this call with
>> +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
> So I'd hazard a guess that this comment has the same problem.  It's
> mixing up the concept of a constant string and the concept of a
> nonconstant string, but which has a known constant length.
> 
> First, if one of the arguments is a string constant, then it should be
> easy to get the constant at expansion time with minimal effort.  It's
> also the case that knowing if we're comparing the result against zero is
> trivial to figure out at expansion time.  This would probably be a
> reasonble thing to add to the expanders.

Yes, should be a string with constant length.
I will update my comments to reflect this.

> Your function comment for handle_builtin_string_cmp does not indicate
> what the return value means.  From reading the code is appears to return
> TRUE when it does nothing and FALSE when it optimizes the call in some
> way.  Is that correct?

Yes.
>  That seems inverted from the way we'd normally
> write this stuff.

This is just following the handling of memcmp (please see the routine handle_builtin_memcmp),
try to be consistent with them. 

>> +
>> +  /* When one of args is constant string.  */
>> +  tree var_string = NULL_TREE;
>> +  HOST_WIDE_INT const_string_leni = -1;
>> +  
>> +  if (idx1)
>> +    {
>> +      const_string_leni = compute_string_length (idx1);
>> +      var_string = arg2;
>> +    } 
>> +  else 
>> +    {
>> +      gcc_checking_assert (idx2);
>> +      const_string_leni = compute_string_length (idx2);
>> +      var_string = arg1;
>> +    } 
>> +
>> +  if (const_string_leni < 0) 
>> +    return true;
>> + 
>> +  unsigned HOST_WIDE_INT var_sizei = 0;
>> +  /* try to determine the minimum size of the object pointed by var_string.  */
>> +  tree size = compute_objsize (var_string, 2);
> So this worries me as well.  compute_objsize is not supposed to be used
> to influence code generation or optimization.  It is not guaranteed to
> return an exact size, but instead may return an estimate!

As I mentioned in above:

Yes, I just checked the compute_objsize again, you are right, it might return an estimated code size.

there are 3 parts in  compute_objsize:
	A. call “compute_builtin_object_size” to get the code size;
	B.  If A failed, use value_RANGE info when the referenced object involves
   a non-constant offset in some range. 
	C. if both A and B failed, use the TYPE info to determine the code size.

So, From my understanding:
both A and C should provide accurate code size info, but B will NOT.
for my purpose, if I do NOT use B, then it will be return accurate code size, right?

> 
> 
> I'd really like to hear Jakub or Richi chime in with their thoughts on
> how this transforms one builtin into another and whether or not they think that is wise

On Jan 15, 2018, at 2:48 AM, Richard Biener <rguenther@suse.de <mailto:rguenther@suse.de>> wrote:
> Given it follows how we handle memcmp () != 0 it's ok to trigger inline expansion.
>  Didn't look into the patch whether it does this.
> And yes, places that look at STR[N]CMP now also have to look atSTR[N]CMP_EQ.

Based on both you and Richard’s above comments, my understanding is:

introducing STRCMP_EQ/STRNCMP_EQ follows the current handling of memcmp ()!=0 in 
GCC, should be fine. 
but I should check all the places that look at STRCMP/STRNCMP to include STRCMP_EQ/STRNCMP_EQ now.

Thanks.

Qing
Qing Zhao Jan. 25, 2018, 8:36 p.m. UTC | #4
> 
> We're now in stage4 so please queue this patch and ping it during
> next stage1.
> 

I will update my patch based on Jeff and your comments, and send it during next stage 1.

thanks.

Qing
Qing Zhao Feb. 7, 2018, 3:36 p.m. UTC | #5
Hi, this is the 3rd version for this patch.

the main change compared with 2nd version are:
	1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s
comment. Instead, add a new function “determine_min_objsize” for this purpose. This new
function calls “compute_builtin_object_size” to get the minimum objsize, however, due to 
the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t
know the exactly reason for this), inside “determine_min_objsize”, I have to add  more code
to catch up some simple SSA_NAME cases.

	2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new 
BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where 
BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked.

	3. some  format change and comments change per Jeff’s comment. 

let me know if you have any comments.

thanks a lot.

Qing

*********************************

2nd Patch for PR78009 
Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

 B.1. (PR83026) When the lengths of both arguments are constant and
      it's a strcmp:
    * if the lengths are NOT equal, we can safely fold the call
      to a non-zero value.
    * otherwise, do nothing now.

 B.2. (PR78809) When the length of one argument is constant, try to replace
 the call with a __builtin_str(n)cmp_eq call where possible, i.e:

 strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
 string with constant length, C is a constant.
   if (C <= strlen(STR) && sizeof_array(s) > C)
     {
       replace this call with
       __builtin_strncmp_eq (s, STR, C) (!)= 0
     }
   if (C > strlen(STR)
     {
       it can be safely treated as a call to strcmp (s, STR) (!)= 0
       can handled by the following strcmp.
     }

 strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
 string with constant length.
   if  (sizeof_array(s) > strlen(STR))
     {
       replace this call with
       __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
     }

 later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
 as __builtin_memcmp_eq, if the expansion does not succeed, change them back
 to call to __builtin_str(n)cmp.

adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026

bootstraped and tested on both X86 and Aarch64. no regression.

************************************************
gcc/ChangeLog

+2018-02-02  <qing.zhao@oracle.com>
+	
+	PR middle-end/78809
+	PR middle-end/83026
+	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
+	and BUILT_IN_STRNCMP_EQ.
+	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
+	BUILT_IN_STRNCMP_EQ.
+	* gimple-fold.c (gimple_fold_builtin_string_compare): Add the 
+	handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+	(gimple_fold_builtin): Likewise.
+	* tree-ssa-strlen.c (compute_string_length): New function.
+	(determine_min_obsize): New function.
+	(handle_builtin_string_cmp): New function to handle calls to
+	string compare functions.
+	(strlen_optimize_stmt): Add handling to builtin string compare
+	calls. 
+	* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
+	Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+	* tree.c (build_common_builtin_nodes): Add new defines of
+	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
+

gcc/testsuite/ChangeLog

+2018-02-02  <qing.zhao@oracle.com>
+
+	PR middle-end/78809
+	* gcc.dg/strcmpopt_2.c: New testcase.
+	* gcc.dg/strcmpopt_3.c: New testcase.
+
+	PR middle-end/83026
+	* gcc.dg/strcmpopt_3.c: New testcase.
> 
>> 
>> We're now in stage4 so please queue this patch and ping it during
>> next stage1.
>> 
> 
> I will update my patch based on Jeff and your comments, and send it during next stage 1.
> 
> thanks.
> 
> Qing
>
Qing Zhao May 22, 2018, 5:17 p.m. UTC | #6
Ping for the following patch sent in 3 months ago in the end of GCC8:

https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg184075.html

I have rebased the patch on the latest GCC9 thunk. 

bootstraped and tested on both X86 and Aarch64. no regression.

the following are more details:

========

Hi, this is the 3rd version for this patch.

the main change compared with 2nd version are:
	1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s
comment. Instead, add a new function “determine_min_objsize” for this purpose. This new
function calls “compute_builtin_object_size” to get the minimum objsize, however, due to 
the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t
know the exactly reason for this), inside “determine_min_objsize”, I have to add  more code
to catch up some simple SSA_NAME cases.

	2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new 
BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where 
BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked.

	3. some  format change and comments change per Jeff’s comment. 

let me know if you have any comments.

thanks a lot.

Qing

*********************************

2nd Patch for PR78009 
Patch for PR83026

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Inline strcmp with small constant strings

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
missing strlen optimization for strcmp of unequal strings

The design doc for PR78809 is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the second part of change of PR78809 and PR83026:

B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0

B.1. (PR83026) When the lengths of both arguments are constant and
     it's a strcmp:
   * if the lengths are NOT equal, we can safely fold the call
     to a non-zero value.
   * otherwise, do nothing now.

B.2. (PR78809) When the length of one argument is constant, try to replace
the call with a __builtin_str(n)cmp_eq call where possible, i.e:

strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
string with constant length, C is a constant.
  if (C <= strlen(STR) && sizeof_array(s) > C)
    {
      replace this call with
      __builtin_strncmp_eq (s, STR, C) (!)= 0
    }
  if (C > strlen(STR)
    {
      it can be safely treated as a call to strcmp (s, STR) (!)= 0
      can handled by the following strcmp.
    }

strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
string with constant length.
  if  (sizeof_array(s) > strlen(STR))
    {
      replace this call with
      __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
    }

later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
as __builtin_memcmp_eq, if the expansion does not succeed, change them back
to call to __builtin_str(n)cmp.

adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026

bootstraped and tested on both X86 and Aarch64. no regression.

************************************************
gcc/ChangeLog

+2018-05-21  <qing.zhao@oracle.com>
+	
+	PR middle-end/78809
+	PR middle-end/83026
+	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
+	and BUILT_IN_STRNCMP_EQ.
+	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
+	BUILT_IN_STRNCMP_EQ.
+	* gimple-fold.c (gimple_fold_builtin_string_compare): Add the 
+	handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+	(gimple_fold_builtin): Likewise.
+	* tree-ssa-strlen.c (compute_string_length): New function.
+	(determine_min_obsize): New function.
+	(handle_builtin_string_cmp): New function to handle calls to
+	string compare functions.
+	(strlen_optimize_stmt): Add handling to builtin string compare
+	calls. 
+	* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
+	Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
+	* tree.c (build_common_builtin_nodes): Add new defines of
+	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
+

gcc/testsuite/ChangeLog

+2018-05-21  <qing.zhao@oracle.com>
+
+	PR middle-end/78809
+	* gcc.dg/strcmpopt_2.c: New testcase.
+	* gcc.dg/strcmpopt_3.c: New testcase.
+
+	PR middle-end/83026
+	* gcc.dg/strcmpopt_3.c: New testcase.
Jeff Law May 25, 2018, 8:38 p.m. UTC | #7
On 02/07/2018 08:36 AM, Qing Zhao wrote:
> Hi, this is the 3rd version for this patch.
> 
> the main change compared with 2nd version are:
> 	1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s
> comment. Instead, add a new function “determine_min_objsize” for this purpose. This new
> function calls “compute_builtin_object_size” to get the minimum objsize, however, due to 
> the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t
> know the exactly reason for this), inside “determine_min_objsize”, I have to add  more code
> to catch up some simple SSA_NAME cases.
> 
> 	2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new 
> BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where 
> BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked.
> 
> 	3. some  format change and comments change per Jeff’s comment. 
> 
> let me know if you have any comments.
> 
> thanks a lot.
> 
> Qing
> 
> *********************************
> 
> 2nd Patch for PR78009 
> Patch for PR83026
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
> Inline strcmp with small constant strings
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
> missing strlen optimization for strcmp of unequal strings
> 
> The design doc for PR78809 is at:
> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
> 
> this patch is for the second part of change of PR78809 and PR83026:
> 
> B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
> 
>  B.1. (PR83026) When the lengths of both arguments are constant and
>       it's a strcmp:
>     * if the lengths are NOT equal, we can safely fold the call
>       to a non-zero value.
>     * otherwise, do nothing now.
> 
>  B.2. (PR78809) When the length of one argument is constant, try to replace
>  the call with a __builtin_str(n)cmp_eq call where possible, i.e:
> 
>  strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
>  string with constant length, C is a constant.
>    if (C <= strlen(STR) && sizeof_array(s) > C)
>      {
>        replace this call with
>        __builtin_strncmp_eq (s, STR, C) (!)= 0
>      }
>    if (C > strlen(STR)
>      {
>        it can be safely treated as a call to strcmp (s, STR) (!)= 0
>        can handled by the following strcmp.
>      }
> 
>  strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
>  string with constant length.
>    if  (sizeof_array(s) > strlen(STR))
>      {
>        replace this call with
>        __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
>      }
> 
>  later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
>  as __builtin_memcmp_eq, if the expansion does not succeed, change them back
>  to call to __builtin_str(n)cmp.
> 
> adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
> PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026
> 
> bootstraped and tested on both X86 and Aarch64. no regression.
> 
> ************************************************
> gcc/ChangeLog
> 
> +2018-02-02  <qing.zhao@oracle.com>
> +	
> +	PR middle-end/78809
> +	PR middle-end/83026
> +	* builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> +	and BUILT_IN_STRNCMP_EQ.
> +	* builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> +	BUILT_IN_STRNCMP_EQ.
> +	* gimple-fold.c (gimple_fold_builtin_string_compare): Add the 
> +	handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
> +	(gimple_fold_builtin): Likewise.
> +	* tree-ssa-strlen.c (compute_string_length): New function.
> +	(determine_min_obsize): New function.
> +	(handle_builtin_string_cmp): New function to handle calls to
> +	string compare functions.
> +	(strlen_optimize_stmt): Add handling to builtin string compare
> +	calls. 
> +	* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
> +	Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ.
> +	* tree.c (build_common_builtin_nodes): Add new defines of
> +	BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> +
> 
> gcc/testsuite/ChangeLog
> 
> +2018-02-02  <qing.zhao@oracle.com>
> +
> +	PR middle-end/78809
> +	* gcc.dg/strcmpopt_2.c: New testcase.
> +	* gcc.dg/strcmpopt_3.c: New testcase.
> +
> +	PR middle-end/83026
> +	* gcc.dg/strcmpopt_3.c: New testcase.
> 
> 
>
> +
> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> +   equality test against zero:
> +
> +   A. When the lengths of both arguments are constant and it's a strcmp:
> +      * if the lengths are NOT equal, we can safely fold the call
> +        to a non-zero value.
> +      * otherwise, do nothing now.
> +  
> +   B. When the length of one argument is constant, try to replace the call with
> +   a __builtin_str(n)cmp_eq call where possible, i.e:
> +
> +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   string with constant length , C is a constant.
> +     if (C <= strlen(STR) && sizeof_array(s) > C)
> +       {
> +         replace this call with
> +         strncmp_eq (s, STR, C) (!)= 0
> +       }
> +     if (C > strlen(STR)
> +       {
> +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
> +         can handled by the following strcmp.
> +       }
> +
> +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   string with constant length.
> +     if  (sizeof_array(s) > strlen(STR))
> +       {
> +         replace this call with
> +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
> +       }
> +
> +   Return false when the call is transformed, return true otherwise. 
> + */ 
> +
> +static bool
> +handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
So I originally thought you had the core logic wrong in the immediate
uses loop.  But it's actually the case that the return value is the
exact opposite of what I expected.

ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
was not transformed.

Can you fix that so it's not so confusing?

I think with that change we'll be good to go, but please repost for a
final looksie.

THanks,
Jeff
Qing Zhao May 30, 2018, 12:08 a.m. UTC | #8
Hi, Jeff,

Thanks a lot for your review and comments.

I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64.

please take a look at the patch again.

thanks.

Qing

> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote:

> So I originally thought you had the core logic wrong in the immediate
> uses loop.  But it's actually the case that the return value is the
> exact opposite of what I expected.
> 
> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
> was not transformed.
> 
> Can you fix that so it's not so confusing?
> 
> I think with that change we'll be good to go, but please repost for a
> final looksie.
> 
> THanks,
> Jeff
Qing Zhao May 31, 2018, 8:40 p.m. UTC | #9
Hi, 

I have committed the patch as revision 261039.

thanks.

Qing

> On May 29, 2018, at 7:08 PM, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> Hi, Jeff,
> 
> Thanks a lot for your review and comments.
> 
> I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64.
> 
> please take a look at the patch again.
> 
> thanks.
> 
> Qing
> 
>> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote:
> 
>> So I originally thought you had the core logic wrong in the immediate
>> uses loop.  But it's actually the case that the return value is the
>> exact opposite of what I expected.
>> 
>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
>> was not transformed.
>> 
>> Can you fix that so it's not so confusing?
>> 
>> I think with that change we'll be good to go, but please repost for a
>> final looksie.
>> 
>> THanks,
>> Jeff
Jeff Law June 23, 2018, 4:49 a.m. UTC | #10
On 05/29/2018 06:08 PM, Qing Zhao wrote:
> Hi, Jeff,
> 
> Thanks a lot for your review and comments.
> 
> I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64.
> 
> please take a look at the patch again.
> 
> thanks.
> 
> Qing
> 
>> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote:
>> So I originally thought you had the core logic wrong in the immediate
>> uses loop.  But it's actually the case that the return value is the
>> exact opposite of what I expected.
>>
>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
>> was not transformed.
>>
>> Can you fix that so it's not so confusing?
>>
>> I think with that change we'll be good to go, but please repost for a
>> final looksie.
>>
>> THanks,
>> Jeff
> 
> 
> 0001-2nd-Patch-for-PR78009.patch
> 
> 
> From 750f44ee0777d55b568f07e263babdedd532d315 Mon Sep 17 00:00:00 2001
> From: qing zhao <qing.zhao@oracle.com>
> Date: Tue, 29 May 2018 16:15:21 -0400
> Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
> Inline strcmp with small constant strings
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026
> missing strlen optimization for strcmp of unequal strings
> 
> The design doc for PR78809 is at:
> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
> 
> this patch is for the second part of change of PR78809 and PR83026:
> 
> B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
> 
>    B.1. (PR83026) When the lengths of both arguments are constant and
>         it's a strcmp:
>       * if the lengths are NOT equal, we can safely fold the call
>         to a non-zero value.
>       * otherwise, do nothing now.
> 
>    B.2. (PR78809) When the length of one argument is constant, try to replace
>    the call with a __builtin_str(n)cmp_eq call where possible, i.e:
> 
>    strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
>    string with constant length, C is a constant.
>      if (C <= strlen(STR) && sizeof_array(s) > C)
>        {
>          replace this call with
>          __builtin_strncmp_eq (s, STR, C) (!)= 0
>        }
>      if (C > strlen(STR)
>        {
>          it can be safely treated as a call to strcmp (s, STR) (!)= 0
>          can handled by the following strcmp.
>        }
> 
>    strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
>    string with constant length.
>      if  (sizeof_array(s) > strlen(STR))
>        {
>          replace this call with
>          __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
>        }
> 
>    later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
>    as __builtin_memcmp_eq, if the expansion does not succeed, change them back
>    to call to __builtin_str(n)cmp.
> 
> adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
> PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026
> 
> bootstraped and tested on both X86 and Aarch64. no regression.
> ---
>  gcc/builtins.c                     |  33 ++++
>  gcc/builtins.def                   |   5 +
>  gcc/gimple-fold.c                  |   5 +
>  gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++
>  gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 ++++
>  gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 ++
>  gcc/tree-ssa-strlen.c              | 304 ++++++++++++++++++++++++++++++++++---
>  gcc/tree-ssa-structalias.c         |   2 +
>  gcc/tree.c                         |   8 +
>  9 files changed, 454 insertions(+), 17 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
>  create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
>  create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c
Sorry for the long delay.  This needs a ChangeLog.  With the ChangeLog
it is OK for the trunk.

Jeff
Qing Zhao June 25, 2018, 3:28 p.m. UTC | #11
> On Jun 22, 2018, at 11:49 PM, Jeff Law <law@redhat.com> wrote:
> 
> On 05/29/2018 06:08 PM, Qing Zhao wrote:
>> Hi, Jeff,
>> 
>> Thanks a lot for your review and comments.
>> 
>> I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64.
>> 
>> please take a look at the patch again.
>> 
>> thanks.
>> 
>> Qing
>> 
>>> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote:
>>> So I originally thought you had the core logic wrong in the immediate
>>> uses loop.  But it's actually the case that the return value is the
>>> exact opposite of what I expected.
>>> 
>>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it
>>> was not transformed.
>>> 
>>> Can you fix that so it's not so confusing?
>>> 
>>> I think with that change we'll be good to go, but please repost for a
>>> final looksie.
>>> 
>>> THanks,
>>> Jeff
>> 
>> 
>> 0001-2nd-Patch-for-PR78009.patch
>> 
>> 
>> From 750f44ee0777d55b568f07e263babdedd532d315 Mon Sep 17 00:00:00 2001
>> From: qing zhao <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>>
>> Date: Tue, 29 May 2018 16:15:21 -0400
>> Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026
>> 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809>
>> Inline strcmp with small constant strings
>> 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026>
>> missing strlen optimization for strcmp of unequal strings
>> 
>> The design doc for PR78809 is at:
>> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html <https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html>
>> 
>> this patch is for the second part of change of PR78809 and PR83026:
>> 
>> B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
>> 
>>   B.1. (PR83026) When the lengths of both arguments are constant and
>>        it's a strcmp:
>>      * if the lengths are NOT equal, we can safely fold the call
>>        to a non-zero value.
>>      * otherwise, do nothing now.
>> 
>>   B.2. (PR78809) When the length of one argument is constant, try to replace
>>   the call with a __builtin_str(n)cmp_eq call where possible, i.e:
>> 
>>   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a
>>   string with constant length, C is a constant.
>>     if (C <= strlen(STR) && sizeof_array(s) > C)
>>       {
>>         replace this call with
>>         __builtin_strncmp_eq (s, STR, C) (!)= 0
>>       }
>>     if (C > strlen(STR)
>>       {
>>         it can be safely treated as a call to strcmp (s, STR) (!)= 0
>>         can handled by the following strcmp.
>>       }
>> 
>>   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a
>>   string with constant length.
>>     if  (sizeof_array(s) > strlen(STR))
>>       {
>>         replace this call with
>>         __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
>>       }
>> 
>>   later when expanding the new __builtin_str(n)cmp_eq calls, first expand them
>>   as __builtin_memcmp_eq, if the expansion does not succeed, change them back
>>   to call to __builtin_str(n)cmp.
>> 
>> adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of
>> PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026
>> 
>> bootstraped and tested on both X86 and Aarch64. no regression.
>> ---
>> gcc/builtins.c                     |  33 ++++
>> gcc/builtins.def                   |   5 +
>> gcc/gimple-fold.c                  |   5 +
>> gcc/testsuite/gcc.dg/strcmpopt_2.c |  67 ++++++++
>> gcc/testsuite/gcc.dg/strcmpopt_3.c |  31 ++++
>> gcc/testsuite/gcc.dg/strcmpopt_4.c |  16 ++
>> gcc/tree-ssa-strlen.c              | 304 ++++++++++++++++++++++++++++++++++---
>> gcc/tree-ssa-structalias.c         |   2 +
>> gcc/tree.c                         |   8 +
>> 9 files changed, 454 insertions(+), 17 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c
>> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c
> Sorry for the long delay.  This needs a ChangeLog.  With the ChangeLog
> it is OK for the trunk.

Hi, Jeff,

the patch had been committed with ChangeLogs as following:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261039 <https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261039>

thanks.

Qing

> 
> Jeff
diff mbox series

Patch

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 6b25253..a5f6885 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6953,12 +6953,45 @@  expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 	return target;
       break;
 
+    /* expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it 
+       back to a BUILT_IN_STRCMP. Remember to delete the 3rd paramater
+       when changing it to a strcmp call.  */
+    case BUILT_IN_STRCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* change this call back to a BUILT_IN_STRCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRCMP));
+
+      /* delete the last parameter.  */
+      unsigned int i;
+      vec<tree, va_gc> *arg_vec;
+      vec_alloc (arg_vec, 2);
+      for (i = 0; i < 2; i++)
+	arg_vec->quick_push (CALL_EXPR_ARG (exp, i));
+      exp = build_call_vec (TREE_TYPE (exp), CALL_EXPR_FN (exp), arg_vec);
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRCMP:
       target = expand_builtin_strcmp (exp, target);
       if (target)
 	return target;
       break;
 
+    /* expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it
+       back to a BUILT_IN_STRNCMP.  */
+    case BUILT_IN_STRNCMP_EQ:
+      target = expand_builtin_memcmp (exp, target, true);
+      if (target)
+	return target;
+
+      /* change it back to a BUILT_IN_STRNCMP.  */
+      TREE_OPERAND (exp, 1) 
+	= build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRNCMP));
+      /* FALLTHROUGH */
+
     case BUILT_IN_STRNCMP:
       target = expand_builtin_strncmp (exp, target, mode);
       if (target)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 2281021..9eb79fd 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -949,6 +949,11 @@  DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX, "__builtin_alloca_with_ali
    equality with zero.  */
 DEF_BUILTIN_STUB (BUILT_IN_MEMCMP_EQ, "__builtin_memcmp_eq")
 
+/* An internal version of strcmp/strncmp, used when the result is only 
+   tested for equality with zero.  */
+DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
+DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq")
+
 /* Object size checking builtins.  */
 DEF_GCC_BUILTIN	       (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_MEMCPY_CHK, "__memcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c
new file mode 100644
index 0000000..0131b8f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c
@@ -0,0 +1,67 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+char s[100] = {'a','b','c','d'};
+typedef struct { char s[8]; int x; } S;
+
+__attribute__ ((noinline)) int 
+f1 (S *s) 
+{ 
+  return __builtin_strcmp (s->s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f2 (void) 
+{ 
+  return __builtin_strcmp (s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f3 (S *s) 
+{ 
+  return __builtin_strcmp ("abc", s->s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f4 (void) 
+{ 
+  return __builtin_strcmp ("abc", s) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f5 (S *s) 
+{ 
+  return __builtin_strncmp (s->s, "abc", 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f6 (void) 
+{ 
+  return __builtin_strncmp (s, "abc", 2) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f7 (S *s) 
+{ 
+  return __builtin_strncmp ("abc", s->s, 3) != 0; 
+}
+
+__attribute__ ((noinline)) int 
+f8 (void) 
+{ 
+  return __builtin_strncmp ("abc", s, 2) != 0; 
+}
+
+int main (void)
+{
+  S ss = {{'a','b','c'}, 2};
+
+  if (f1 (&ss) != 0 || f2 () != 1 || f3 (&ss) != 0 ||
+      f4 () != 1 || f5 (&ss) != 0 || f6 () != 0 ||
+      f7 (&ss) != 0 || f8 () != 0)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 8 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c
new file mode 100644
index 0000000..86a0d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c
@@ -0,0 +1,31 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+__attribute__ ((noinline)) int 
+f1 (void) 
+{ 
+  char *s0= "abcd";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp(s, "abc") != 0; 
+}
+
+__attribute__ ((noinline)) int
+f2 (void) 
+{ 
+  char *s0 = "ab";
+  char s[8];
+  __builtin_strcpy (s, s0);
+  return __builtin_strcmp("abc", s) != 0; 
+}
+
+int main (void)
+{
+  if (f1 () != 1 
+      || f2 () != 1)
+    __builtin_abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c
new file mode 100644
index 0000000..d727bc3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+typedef struct { char s[8]; int x; } S;
+extern int max_i;
+
+int
+f1 (S * s)
+{ 
+  int result, i;
+  for (i = 0; i < max_i; i++)
+    result += __builtin_strcmp (s->s, "abc") != 0 ? 2 : 1;
+  return result;
+}
+
+/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 1 "strlen" } } */
diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 94f20ef..57563ef 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -2540,6 +2540,216 @@  handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/* Given an index to the strinfo vector, compute the string length for the
+   corresponding string. Return -1 when unknown.  */
+ 
+static HOST_WIDE_INT 
+compute_string_length (int idx)
+{
+  HOST_WIDE_INT string_leni = -1; 
+  gcc_assert (idx != 0);
+
+  if (idx < 0)
+    string_leni = ~idx;
+  else
+    {
+      strinfo *si = get_strinfo (idx);
+      if (si)
+	{
+	  tree const_string_len = get_string_length (si);
+	  if (const_string_len && tree_fits_shwi_p (const_string_len))
+	    string_leni = tree_to_shwi (const_string_len);
+	}
+    }
+
+  if (string_leni < 0)
+    return -1;
+
+  return string_leni;
+}
+
+/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
+   equality test against zero:
+
+   A. When both arguments are constant strings and it's a strcmp:
+      * if the length of the strings are NOT equal, we can safely fold the call
+        to a non-zero value.
+      * otherwise, do nothing now.
+  
+   B. When one of the arguments is constant string, try to replace the call with
+   a __builtin_str(n)cmp_eq call where possible, i.e:
+
+   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
+   constant string, C is a constant.
+     if (C <= strlen(STR) && sizeof_array(s) > C)
+       {
+         replace this call with
+         strncmp_eq (s, STR, C) (!)= 0
+       }
+     if (C > strlen(STR)
+       {
+         it can be safely treated as a call to strcmp (s, STR) (!)= 0
+         can handled by the following strcmp.
+       }
+
+   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
+   constant string.
+     if  (sizeof_array(s) > strlen(STR))
+       {
+         replace this call with
+         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
+       }
+ */ 
+
+static bool
+handle_builtin_string_cmp (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree res = gimple_call_lhs (stmt);
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  int idx1 = get_stridx (arg1);
+  int idx2 = get_stridx (arg2);
+  HOST_WIDE_INT length = -1;
+  bool is_ncmp = false;
+
+  if (!res)
+    return true;
+
+  /* When both arguments are unknown, do nothing.  */
+  if (idx1 == 0 && idx2 == 0)
+    return true;
+
+  /* Handle strncmp function.  */
+  if (gimple_call_num_args (stmt) == 3)
+    {
+      tree len = gimple_call_arg (stmt, 2);
+      if (tree_fits_shwi_p (len))
+        length = tree_to_shwi (len);
+
+      is_ncmp = true;
+    }
+
+  /* For strncmp, if the length argument is NOT known, do nothing.  */
+  if (is_ncmp && length < 0)
+    return true;
+
+  /* When the result is ONLY used to do equality test against zero.  */
+  FOR_EACH_IMM_USE_FAST (use_p, iter, res) 
+    {    
+      gimple *use_stmt = USE_STMT (use_p);
+
+      if (is_gimple_debug (use_stmt))
+        continue;
+      if (gimple_code (use_stmt) == GIMPLE_ASSIGN)
+	{    
+	  tree_code code = gimple_assign_rhs_code (use_stmt);
+	  if (code == COND_EXPR) 
+	    {
+	      tree cond_expr = gimple_assign_rhs1 (use_stmt);
+	      if ((TREE_CODE (cond_expr) != EQ_EXPR 
+		   && (TREE_CODE (cond_expr) != NE_EXPR))
+		  || !integer_zerop (TREE_OPERAND (cond_expr, 1)))
+		return true;
+	    }
+	  else if (code == EQ_EXPR || code == NE_EXPR)
+	    {
+	      if (!integer_zerop (gimple_assign_rhs2 (use_stmt)))
+		return true;
+            }
+	  else 
+	    return true;
+	}
+      else if (gimple_code (use_stmt) == GIMPLE_COND)
+	{
+	  tree_code code = gimple_cond_code (use_stmt);
+	  if ((code != EQ_EXPR && code != NE_EXPR)
+	      || !integer_zerop (gimple_cond_rhs (use_stmt)))
+	    return true;
+	}
+      else
+        return true;
+    }
+  
+  /* When both arguments are known, and their strlens are unequal, we can 
+     safely fold the call to a non-zero value for strcmp;
+     othewise, do nothing now.  */
+  if (idx1 != 0 && idx2 != 0)
+    {
+      HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1);
+      HOST_WIDE_INT const_string_leni2 = compute_string_length (idx2);
+
+      if (!is_ncmp 
+	  && const_string_leni1 != -1
+	  && const_string_leni2 != -1
+	  && const_string_leni1 != const_string_leni2) 
+	{
+	  replace_call_with_value (gsi, integer_one_node); 
+	  return false;
+	}
+      return true;
+    }
+
+  /* When one of args is constant string.  */
+  tree var_string = NULL_TREE;
+  HOST_WIDE_INT const_string_leni = -1;
+  
+  if (idx1)
+    {
+      const_string_leni = compute_string_length (idx1);
+      var_string = arg2;
+    } 
+  else 
+    {
+      gcc_checking_assert (idx2);
+      const_string_leni = compute_string_length (idx2);
+      var_string = arg1;
+    } 
+
+  if (const_string_leni < 0) 
+    return true;
+ 
+  unsigned HOST_WIDE_INT var_sizei = 0;
+  /* try to determine the minimum size of the object pointed by var_string.  */
+  tree size = compute_objsize (var_string, 2);
+
+  if (!size)
+    return true;
+ 
+  if (tree_fits_uhwi_p (size))
+    var_sizei = tree_to_uhwi (size);
+
+  if (var_sizei == 0)
+    return true;
+
+  /* For strncmp, if length > const_string_leni , this call can be safely 
+     transformed to a strcmp.  */
+  if (is_ncmp && length > const_string_leni)
+    is_ncmp = false;
+
+  unsigned HOST_WIDE_INT final_length 
+    = is_ncmp ? length : const_string_leni + 1;
+
+  /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq.  */
+  if (var_sizei > final_length) 
+    {
+      tree fn 
+	= (is_ncmp 
+	   ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ) 
+	   : builtin_decl_implicit (BUILT_IN_STRCMP_EQ));
+      if (!fn)
+	return true;
+      tree const_string_len = build_int_cst (size_type_node, final_length); 
+      update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len);
+    }
+  else 
+    return true;
+
+  return false;
+}
+
 /* Handle a POINTER_PLUS_EXPR statement.
    For p = "abcd" + 2; compute associated length, or if
    p = q + off is pointing to a '\0' character of a string, call
@@ -2939,6 +3149,11 @@  strlen_optimize_stmt (gimple_stmt_iterator *gsi)
 	    if (!handle_builtin_memcmp (gsi))
 	      return false;
 	    break;
+	  case BUILT_IN_STRCMP:
+	  case BUILT_IN_STRNCMP:
+	    if (!handle_builtin_string_cmp (gsi))
+	      return false;
+	    break;
 	  default:
 	    break;
 	  }
diff --git a/gcc/tree.c b/gcc/tree.c
index ed1852b..e23b3cd 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10047,6 +10047,14 @@  build_common_builtin_nodes (void)
 			"__builtin_memcmp_eq",
 			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
 
+  local_define_builtin ("__builtin_strncmp_eq", ftype, BUILT_IN_STRNCMP_EQ,
+			"__builtin_strncmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
+  local_define_builtin ("__builtin_strcmp_eq", ftype, BUILT_IN_STRCMP_EQ,
+			"__builtin_strcmp_eq",
+			ECF_PURE | ECF_NOTHROW | ECF_LEAF);
+
   /* If there's a possibility that we might use the ARM EABI, build the
     alternate __cxa_end_cleanup node used to resume from C++.  */
   if (targetm.arm_eabi_unwinder)