Message ID | A4FB5DB3-1280-4155-A20F-B3D03E09D252@oracle.com |
---|---|
State | New |
Headers | show |
Series | [Middle-end,version,2] 2nd patch of PR78809 and PR83026 | expand |
On 12/21/2017 02:25 PM, Qing Zhao wrote: > Hi, > > I updated my patch based on all your comments. > > the major changes are the following: > > 1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq; > in builtins.c, when expanding the new __builtin_str(n)cmp_eq call, expand them first as > __builtin_memcmp_eq, if Not succeed, change the call back to __builtin_str(n)cmp. > 2. change the call to “get_range_strlen” with “compute_objsize”. Please read the big comment before compute_objsize. If you are going to use it to influence code generation or optimization, then you're most likely doing something wrong. compute_objsize can return an estimate in some circumstances. > 3. add the missing case for equality checking with zero; > 4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3. > 5. update “uhwi” to “shwi” for where it needs; > 6. some other minor format changes. > > the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue. > > Okay for trunk? > > thanks. > > Qing > > Please see the updated patch: > > gcc/ChangeLog: > > +2017-12-21 Qing Zhao <qing.zhao@oracle.com> > + > + PR middle-end/78809 > + PR middle-end/83026 > + * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ > + and BUILT_IN_STRNCMP_EQ. > + * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and > + BUILT_IN_STRNCMP_EQ. > + * tree-ssa-strlen.c (compute_string_length): New function. > + (handle_builtin_string_cmp): New function to handle calls to > + string compare functions. > + (strlen_optimize_stmt): Add handling to builtin string compare > + calls. > + * tree.c (build_common_builtin_nodes): Add new defines of > + BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ. > + > gcc/testsuite/ChangeLog > > +2017-12-21 Qing Zhao <qing.zhao@oracle.com> > + > + PR middle-end/78809 > + * gcc.dg/strcmpopt_2.c: New testcase. > + * gcc.dg/strcmpopt_3.c: New testcase. > + > + PR middle-end/83026 > + * gcc.dg/strcmpopt_3.c: New testcase. What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ. ISTM that if you're going to introduce those new builtins, then you have to audit all the code that runs between their introduction into the IL and when you expand them to ensure they're handled properly. All you're really doing is carrying along a status bit about what tree-ssa-strlen did. So you could possibly store that status bit somewhere. The concern with both is that something later invalidates the analysis you've done. I'm having a hard time coming up with a case where this could happen, but I'm definitely concerned about this possibility. Though I feel it's more likely to happen if we store a status bit vs using STRCMP_EQ STRNCMP_EQ. [ For example, we have two calls with the same arguments, but one feeds an equality test, the other does not. If we store a status bit that one could be transformed, but then we end up CSE-ing the two calls, the status bit would be incorrect because one of the calls did not feed an equality test. Hmmm. ] > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c > index 94f20ef..57563ef 100644 > --- a/gcc/tree-ssa-strlen.c > +++ b/gcc/tree-ssa-strlen.c > @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi) > return false; > } > > +/* Given an index to the strinfo vector, compute the string length for the > + corresponding string. Return -1 when unknown. */ > + > +static HOST_WIDE_INT > +compute_string_length (int idx) > +{ > + HOST_WIDE_INT string_leni = -1; > + gcc_assert (idx != 0); > + > + if (idx < 0) > + string_leni = ~idx; So it seems to me you should just return ~idx; Then appropriately simplify the rest of the code. > + > +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do > + equality test against zero: > + > + A. When both arguments are constant strings and it's a strcmp: > + * if the length of the strings are NOT equal, we can safely fold the call > + to a non-zero value. > + * otherwise, do nothing now. I'm guessing your comment needs a bit of work. If both arguments are constant strings, then we can just use the host str[n]cmp to fold the str[n]cmp to a constant. Presumably this is handled earlier :-) So what I'm guessing is you're really referring to the case where the lengths are known constants, even if the contents of the strings themselves are not. In that case if its an equality comparison, then you can fold to a constant. Otherwise we do nothing. So I think the comment needs updating here. > + > + B. When one of the arguments is constant string, try to replace the call with > + a __builtin_str(n)cmp_eq call where possible, i.e: > + > + strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a > + constant string, C is a constant. > + if (C <= strlen(STR) && sizeof_array(s) > C) > + { > + replace this call with > + strncmp_eq (s, STR, C) (!)= 0 > + } > + if (C > strlen(STR) > + { > + it can be safely treated as a call to strcmp (s, STR) (!)= 0 > + can handled by the following strcmp. > + } > + > + strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a > + constant string. > + if (sizeof_array(s) > strlen(STR)) > + { > + replace this call with > + strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 So I'd hazard a guess that this comment has the same problem. It's mixing up the concept of a constant string and the concept of a nonconstant string, but which has a known constant length. First, if one of the arguments is a string constant, then it should be easy to get the constant at expansion time with minimal effort. It's also the case that knowing if we're comparing the result against zero is trivial to figure out at expansion time. This would probably be a reasonble thing to add to the expanders. Your function comment for handle_builtin_string_cmp does not indicate what the return value means. From reading the code is appears to return TRUE when it does nothing and FALSE when it optimizes the call in some way. Is that correct? That seems inverted from the way we'd normally write this stuff. > + > + /* When one of args is constant string. */ > + tree var_string = NULL_TREE; > + HOST_WIDE_INT const_string_leni = -1; > + > + if (idx1) > + { > + const_string_leni = compute_string_length (idx1); > + var_string = arg2; > + } > + else > + { > + gcc_checking_assert (idx2); > + const_string_leni = compute_string_length (idx2); > + var_string = arg1; > + } > + > + if (const_string_leni < 0) > + return true; > + > + unsigned HOST_WIDE_INT var_sizei = 0; > + /* try to determine the minimum size of the object pointed by var_string. */ > + tree size = compute_objsize (var_string, 2); So this worries me as well. compute_objsize is not supposed to be used to influence code generation or optimization. It is not guaranteed to return an exact size, but instead may return an estimate! I'd really like to hear Jakub or Richi chime in with their thoughts on how this transforms one builtin into another and whether or not they think that is wise. jeff
On Fri, 12 Jan 2018, Jeff Law wrote: > On 12/21/2017 02:25 PM, Qing Zhao wrote: > > Hi, > > > > I updated my patch based on all your comments. > > > > the major changes are the following: > > > > 1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq; > > in builtins.c, when expanding the new __builtin_str(n)cmp_eq call, expand them first as > > __builtin_memcmp_eq, if Not succeed, change the call back to __builtin_str(n)cmp. > > 2. change the call to “get_range_strlen” with “compute_objsize”. > Please read the big comment before compute_objsize. If you are going to > use it to influence code generation or optimization, then you're most > likely doing something wrong. > > compute_objsize can return an estimate in some circumstances. > > > > 3. add the missing case for equality checking with zero; > > 4. adjust the new testing case for PR83026; add a new testing case for the missing case added in 3. > > 5. update “uhwi” to “shwi” for where it needs; > > 6. some other minor format changes. > > > > the changes are retested on x86 and aarch64, bootstrapped and regression tested. no issue. > > > > Okay for trunk? > > > > thanks. > > > > Qing > > > > Please see the updated patch: > > > > gcc/ChangeLog: > > > > +2017-12-21 Qing Zhao <qing.zhao@oracle.com> > > + > > + PR middle-end/78809 > > + PR middle-end/83026 > > + * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ > > + and BUILT_IN_STRNCMP_EQ. > > + * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and > > + BUILT_IN_STRNCMP_EQ. > > + * tree-ssa-strlen.c (compute_string_length): New function. > > + (handle_builtin_string_cmp): New function to handle calls to > > + string compare functions. > > + (strlen_optimize_stmt): Add handling to builtin string compare > > + calls. > > + * tree.c (build_common_builtin_nodes): Add new defines of > > + BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ. > > + > > gcc/testsuite/ChangeLog > > > > +2017-12-21 Qing Zhao <qing.zhao@oracle.com> > > + > > + PR middle-end/78809 > > + * gcc.dg/strcmpopt_2.c: New testcase. > > + * gcc.dg/strcmpopt_3.c: New testcase. > > + > > + PR middle-end/83026 > > + * gcc.dg/strcmpopt_3.c: New testcase. > What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ. > ISTM that if you're going to introduce those new builtins, then you have > to audit all the code that runs between their introduction into the IL > and when you expand them to ensure they're handled properly. > > All you're really doing is carrying along a status bit about what > tree-ssa-strlen did. So you could possibly store that status bit somewhere. > > The concern with both is that something later invalidates the analysis > you've done. I'm having a hard time coming up with a case where this > could happen, but I'm definitely concerned about this possibility. > Though I feel it's more likely to happen if we store a status bit vs > using STRCMP_EQ STRNCMP_EQ. > > [ For example, we have two calls with the same arguments, but one feeds > an equality test, the other does not. If we store a status bit that one > could be transformed, but then we end up CSE-ing the two calls, the > status bit would be incorrect because one of the calls did not feed an > equality test. Hmmm. ] > > > > > > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c > > index 94f20ef..57563ef 100644 > > --- a/gcc/tree-ssa-strlen.c > > +++ b/gcc/tree-ssa-strlen.c > > @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi) > > return false; > > } > > > > +/* Given an index to the strinfo vector, compute the string length for the > > + corresponding string. Return -1 when unknown. */ > > + > > +static HOST_WIDE_INT > > +compute_string_length (int idx) > > +{ > > + HOST_WIDE_INT string_leni = -1; > > + gcc_assert (idx != 0); > > + > > + if (idx < 0) > > + string_leni = ~idx; > So it seems to me you should just > return ~idx; > > Then appropriately simplify the rest of the code. > > > + > > +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do > > + equality test against zero: > > + > > + A. When both arguments are constant strings and it's a strcmp: > > + * if the length of the strings are NOT equal, we can safely fold the call > > + to a non-zero value. > > + * otherwise, do nothing now. > I'm guessing your comment needs a bit of work. If both arguments are > constant strings, then we can just use the host str[n]cmp to fold the > str[n]cmp to a constant. Presumably this is handled earlier :-) > > So what I'm guessing is you're really referring to the case where the > lengths are known constants, even if the contents of the strings > themselves are not. In that case if its an equality comparison, then > you can fold to a constant. Otherwise we do nothing. So I think the > comment needs updating here. > > > > > > > > + > > + B. When one of the arguments is constant string, try to replace the call with > > + a __builtin_str(n)cmp_eq call where possible, i.e: > > + > > + strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a > > + constant string, C is a constant. > > + if (C <= strlen(STR) && sizeof_array(s) > C) > > + { > > + replace this call with > > + strncmp_eq (s, STR, C) (!)= 0 > > + } > > + if (C > strlen(STR) > > + { > > + it can be safely treated as a call to strcmp (s, STR) (!)= 0 > > + can handled by the following strcmp. > > + } > > + > > + strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a > > + constant string. > > + if (sizeof_array(s) > strlen(STR)) > > + { > > + replace this call with > > + strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 > So I'd hazard a guess that this comment has the same problem. It's > mixing up the concept of a constant string and the concept of a > nonconstant string, but which has a known constant length. > > First, if one of the arguments is a string constant, then it should be > easy to get the constant at expansion time with minimal effort. It's > also the case that knowing if we're comparing the result against zero is > trivial to figure out at expansion time. This would probably be a > reasonble thing to add to the expanders. > > > > Your function comment for handle_builtin_string_cmp does not indicate > what the return value means. From reading the code is appears to return > TRUE when it does nothing and FALSE when it optimizes the call in some > way. Is that correct? That seems inverted from the way we'd normally > write this stuff. > > > > > + > > + /* When one of args is constant string. */ > > + tree var_string = NULL_TREE; > > + HOST_WIDE_INT const_string_leni = -1; > > + > > + if (idx1) > > + { > > + const_string_leni = compute_string_length (idx1); > > + var_string = arg2; > > + } > > + else > > + { > > + gcc_checking_assert (idx2); > > + const_string_leni = compute_string_length (idx2); > > + var_string = arg1; > > + } > > + > > + if (const_string_leni < 0) > > + return true; > > + > > + unsigned HOST_WIDE_INT var_sizei = 0; > > + /* try to determine the minimum size of the object pointed by var_string. */ > > + tree size = compute_objsize (var_string, 2); > So this worries me as well. compute_objsize is not supposed to be used > to influence code generation or optimization. It is not guaranteed to > return an exact size, but instead may return an estimate! Yes, compute_objsize cannot be used for optimizations. > I'd really like to hear Jakub or Richi chime in with their thoughts on > how this transforms one builtin into another and whether or not they > think that is wise. Given it follows how we handle memcmp () != 0 it's ok to trigger inline expansion. Didn't look into the patch whether it does this. And yes, places that look at STR[N]CMP now also have to look at STR[N]CMP_EQ. We're now in stage4 so please queue this patch and ping it during next stage1. Thanks, Richard.
Hi, Jeff, Really sorry for my delay. Your email on 1/12/2018 and Richard’s email on 1/15/2018, were completely lost in my mailboxes until yesterday my colleague, Paolo Carlini, forwarded it to me. I really apologize for the late reply. Please see my reply below: > On Jan 12, 2018, at 4:47 PM, Jeff Law <law@redhat.com <mailto:law@redhat.com>> wrote: >> >> >> 1. replace the candidate calls with __builtin_str(n)cmp_eq instead of __builtin_memcmp_eq; >> in builtins.c, when expanding the new __builtin_str(n)cmp_eq call, expand them first as >> __builtin_memcmp_eq, if Not succeed, change the call back to __builtin_str(n)cmp. >> 2. change the call to “get_range_strlen” with “compute_objsize”. > Please read the big comment before compute_objsize. If you are going to > use it to influence code generation or optimization, then you're most > likely doing something wrong. > > compute_objsize can return an estimate in some circumstances. > On Jan 15, 2018, at 2:48 AM, Richard Biener <rguenther@suse.de <mailto:rguenther@suse.de>> wrote: > > Yes, compute_objsize cannot be used for optimizations. Yes, I just checked the compute_objsize again, you are right, it might return an estimated code size. > On Jan 12, 2018, at 4:47 PM, Jeff Law <law@redhat.com <mailto:law@redhat.com>> wrote: > What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ. > ISTM that if you're going to introduce those new builtins, then you have > to audit all the code that runs between their introduction into the IL > and when you expand them to ensure they're handled properly. > > All you're really doing is carrying along a status bit about what > tree-ssa-strlen did. So you could possibly store that status bit somewhere. > > The concern with both is that something later invalidates the analysis > you've done. I'm having a hard time coming up with a case where this > could happen, but I'm definitely concerned about this possibility. > Though I feel it's more likely to happen if we store a status bit vs > using STRCMP_EQ STRNCMP_EQ. > > [ For example, we have two calls with the same arguments, but one feeds > an equality test, the other does not. If we store a status bit that one > could be transformed, but then we end up CSE-ing the two calls, the > status bit would be incorrect because one of the calls did not feed an > equality test. Hmmm. ] See below. >> +static HOST_WIDE_INT >> +compute_string_length (int idx) >> +{ >> + HOST_WIDE_INT string_leni = -1; >> + gcc_assert (idx != 0); >> + >> + if (idx < 0) >> + string_leni = ~idx; > So it seems to me you should just > return ~idx; > > Then appropriately simplify the rest of the code. Okay, I will update my code per your suggestion. > >> + >> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do >> + equality test against zero: >> + >> + A. When both arguments are constant strings and it's a strcmp: >> + * if the length of the strings are NOT equal, we can safely fold the call >> + to a non-zero value. >> + * otherwise, do nothing now. > I'm guessing your comment needs a bit of work. If both arguments are > constant strings, then we can just use the host str[n]cmp to fold the > str[n]cmp to a constant. Presumably this is handled earlier :-) > > So what I'm guessing is you're really referring to the case where the > lengths are known constants, even if the contents of the strings > themselves are not. In that case if its an equality comparison, then > you can fold to a constant. Otherwise we do nothing. So I think the > comment needs updating here. your understanding is correct, I will update the comments to make it more accurate. >> + >> + B. When one of the arguments is constant string, try to replace the call with >> + a __builtin_str(n)cmp_eq call where possible, i.e: >> + >> + strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a >> + constant string, C is a constant. >> + if (C <= strlen(STR) && sizeof_array(s) > C) >> + { >> + replace this call with >> + strncmp_eq (s, STR, C) (!)= 0 >> + } >> + if (C > strlen(STR) >> + { >> + it can be safely treated as a call to strcmp (s, STR) (!)= 0 >> + can handled by the following strcmp. >> + } >> + >> + strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a >> + constant string. >> + if (sizeof_array(s) > strlen(STR)) >> + { >> + replace this call with >> + strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 > So I'd hazard a guess that this comment has the same problem. It's > mixing up the concept of a constant string and the concept of a > nonconstant string, but which has a known constant length. > > First, if one of the arguments is a string constant, then it should be > easy to get the constant at expansion time with minimal effort. It's > also the case that knowing if we're comparing the result against zero is > trivial to figure out at expansion time. This would probably be a > reasonble thing to add to the expanders. Yes, should be a string with constant length. I will update my comments to reflect this. > Your function comment for handle_builtin_string_cmp does not indicate > what the return value means. From reading the code is appears to return > TRUE when it does nothing and FALSE when it optimizes the call in some > way. Is that correct? Yes. > That seems inverted from the way we'd normally > write this stuff. This is just following the handling of memcmp (please see the routine handle_builtin_memcmp), try to be consistent with them. >> + >> + /* When one of args is constant string. */ >> + tree var_string = NULL_TREE; >> + HOST_WIDE_INT const_string_leni = -1; >> + >> + if (idx1) >> + { >> + const_string_leni = compute_string_length (idx1); >> + var_string = arg2; >> + } >> + else >> + { >> + gcc_checking_assert (idx2); >> + const_string_leni = compute_string_length (idx2); >> + var_string = arg1; >> + } >> + >> + if (const_string_leni < 0) >> + return true; >> + >> + unsigned HOST_WIDE_INT var_sizei = 0; >> + /* try to determine the minimum size of the object pointed by var_string. */ >> + tree size = compute_objsize (var_string, 2); > So this worries me as well. compute_objsize is not supposed to be used > to influence code generation or optimization. It is not guaranteed to > return an exact size, but instead may return an estimate! As I mentioned in above: Yes, I just checked the compute_objsize again, you are right, it might return an estimated code size. there are 3 parts in compute_objsize: A. call “compute_builtin_object_size” to get the code size; B. If A failed, use value_RANGE info when the referenced object involves a non-constant offset in some range. C. if both A and B failed, use the TYPE info to determine the code size. So, From my understanding: both A and C should provide accurate code size info, but B will NOT. for my purpose, if I do NOT use B, then it will be return accurate code size, right? > > > I'd really like to hear Jakub or Richi chime in with their thoughts on > how this transforms one builtin into another and whether or not they think that is wise On Jan 15, 2018, at 2:48 AM, Richard Biener <rguenther@suse.de <mailto:rguenther@suse.de>> wrote: > Given it follows how we handle memcmp () != 0 it's ok to trigger inline expansion. > Didn't look into the patch whether it does this. > And yes, places that look at STR[N]CMP now also have to look atSTR[N]CMP_EQ. Based on both you and Richard’s above comments, my understanding is: introducing STRCMP_EQ/STRNCMP_EQ follows the current handling of memcmp ()!=0 in GCC, should be fine. but I should check all the places that look at STRCMP/STRNCMP to include STRCMP_EQ/STRNCMP_EQ now. Thanks. Qing
> > We're now in stage4 so please queue this patch and ping it during > next stage1. > I will update my patch based on Jeff and your comments, and send it during next stage 1. thanks. Qing
Hi, this is the 3rd version for this patch. the main change compared with 2nd version are: 1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s comment. Instead, add a new function “determine_min_objsize” for this purpose. This new function calls “compute_builtin_object_size” to get the minimum objsize, however, due to the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t know the exactly reason for this), inside “determine_min_objsize”, I have to add more code to catch up some simple SSA_NAME cases. 2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked. 3. some format change and comments change per Jeff’s comment. let me know if you have any comments. thanks a lot. Qing ********************************* 2nd Patch for PR78009 Patch for PR83026 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 Inline strcmp with small constant strings https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 missing strlen optimization for strcmp of unequal strings The design doc for PR78809 is at: https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html this patch is for the second part of change of PR78809 and PR83026: B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0 B.1. (PR83026) When the lengths of both arguments are constant and it's a strcmp: * if the lengths are NOT equal, we can safely fold the call to a non-zero value. * otherwise, do nothing now. B.2. (PR78809) When the length of one argument is constant, try to replace the call with a __builtin_str(n)cmp_eq call where possible, i.e: strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a string with constant length, C is a constant. if (C <= strlen(STR) && sizeof_array(s) > C) { replace this call with __builtin_strncmp_eq (s, STR, C) (!)= 0 } if (C > strlen(STR) { it can be safely treated as a call to strcmp (s, STR) (!)= 0 can handled by the following strcmp. } strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a string with constant length. if (sizeof_array(s) > strlen(STR)) { replace this call with __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 } later when expanding the new __builtin_str(n)cmp_eq calls, first expand them as __builtin_memcmp_eq, if the expansion does not succeed, change them back to call to __builtin_str(n)cmp. adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026 bootstraped and tested on both X86 and Aarch64. no regression. ************************************************ gcc/ChangeLog +2018-02-02 <qing.zhao@oracle.com> + + PR middle-end/78809 + PR middle-end/83026 + * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ + and BUILT_IN_STRNCMP_EQ. + * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and + BUILT_IN_STRNCMP_EQ. + * gimple-fold.c (gimple_fold_builtin_string_compare): Add the + handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ. + (gimple_fold_builtin): Likewise. + * tree-ssa-strlen.c (compute_string_length): New function. + (determine_min_obsize): New function. + (handle_builtin_string_cmp): New function to handle calls to + string compare functions. + (strlen_optimize_stmt): Add handling to builtin string compare + calls. + * tree-ssa-structalias.c (find_func_aliases_for_builtin_call): + Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ. + * tree.c (build_common_builtin_nodes): Add new defines of + BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ. + gcc/testsuite/ChangeLog +2018-02-02 <qing.zhao@oracle.com> + + PR middle-end/78809 + * gcc.dg/strcmpopt_2.c: New testcase. + * gcc.dg/strcmpopt_3.c: New testcase. + + PR middle-end/83026 + * gcc.dg/strcmpopt_3.c: New testcase. > >> >> We're now in stage4 so please queue this patch and ping it during >> next stage1. >> > > I will update my patch based on Jeff and your comments, and send it during next stage 1. > > thanks. > > Qing >
Ping for the following patch sent in 3 months ago in the end of GCC8: https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg184075.html I have rebased the patch on the latest GCC9 thunk. bootstraped and tested on both X86 and Aarch64. no regression. the following are more details: ======== Hi, this is the 3rd version for this patch. the main change compared with 2nd version are: 1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s comment. Instead, add a new function “determine_min_objsize” for this purpose. This new function calls “compute_builtin_object_size” to get the minimum objsize, however, due to the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t know the exactly reason for this), inside “determine_min_objsize”, I have to add more code to catch up some simple SSA_NAME cases. 2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked. 3. some format change and comments change per Jeff’s comment. let me know if you have any comments. thanks a lot. Qing ********************************* 2nd Patch for PR78009 Patch for PR83026 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 Inline strcmp with small constant strings https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 missing strlen optimization for strcmp of unequal strings The design doc for PR78809 is at: https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html this patch is for the second part of change of PR78809 and PR83026: B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0 B.1. (PR83026) When the lengths of both arguments are constant and it's a strcmp: * if the lengths are NOT equal, we can safely fold the call to a non-zero value. * otherwise, do nothing now. B.2. (PR78809) When the length of one argument is constant, try to replace the call with a __builtin_str(n)cmp_eq call where possible, i.e: strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a string with constant length, C is a constant. if (C <= strlen(STR) && sizeof_array(s) > C) { replace this call with __builtin_strncmp_eq (s, STR, C) (!)= 0 } if (C > strlen(STR) { it can be safely treated as a call to strcmp (s, STR) (!)= 0 can handled by the following strcmp. } strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a string with constant length. if (sizeof_array(s) > strlen(STR)) { replace this call with __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 } later when expanding the new __builtin_str(n)cmp_eq calls, first expand them as __builtin_memcmp_eq, if the expansion does not succeed, change them back to call to __builtin_str(n)cmp. adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026 bootstraped and tested on both X86 and Aarch64. no regression. ************************************************ gcc/ChangeLog +2018-05-21 <qing.zhao@oracle.com> + + PR middle-end/78809 + PR middle-end/83026 + * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ + and BUILT_IN_STRNCMP_EQ. + * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and + BUILT_IN_STRNCMP_EQ. + * gimple-fold.c (gimple_fold_builtin_string_compare): Add the + handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ. + (gimple_fold_builtin): Likewise. + * tree-ssa-strlen.c (compute_string_length): New function. + (determine_min_obsize): New function. + (handle_builtin_string_cmp): New function to handle calls to + string compare functions. + (strlen_optimize_stmt): Add handling to builtin string compare + calls. + * tree-ssa-structalias.c (find_func_aliases_for_builtin_call): + Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ. + * tree.c (build_common_builtin_nodes): Add new defines of + BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ. + gcc/testsuite/ChangeLog +2018-05-21 <qing.zhao@oracle.com> + + PR middle-end/78809 + * gcc.dg/strcmpopt_2.c: New testcase. + * gcc.dg/strcmpopt_3.c: New testcase. + + PR middle-end/83026 + * gcc.dg/strcmpopt_3.c: New testcase.
On 02/07/2018 08:36 AM, Qing Zhao wrote: > Hi, this is the 3rd version for this patch. > > the main change compared with 2nd version are: > 1. do not use “compute_objsize” to get the minimum object size per Jeff and Richard’s > comment. Instead, add a new function “determine_min_objsize” for this purpose. This new > function calls “compute_builtin_object_size” to get the minimum objsize, however, due to > the fact that “compute_builtin_object_size” does nothing for SSA_NAME when optimize > 0 (don’t > know the exactly reason for this), inside “determine_min_objsize”, I have to add more code > to catch up some simple SSA_NAME cases. > > 2. in gimple-fold.c and tree-ssa-structalias.c, add the handling of the new > BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ in the same places where > BUILT_IN_STRCMP and BUILT_IN_STRNCMP is checked. > > 3. some format change and comments change per Jeff’s comment. > > let me know if you have any comments. > > thanks a lot. > > Qing > > ********************************* > > 2nd Patch for PR78009 > Patch for PR83026 > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 > Inline strcmp with small constant strings > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 > missing strlen optimization for strcmp of unequal strings > > The design doc for PR78809 is at: > https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html > > this patch is for the second part of change of PR78809 and PR83026: > > B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0 > > B.1. (PR83026) When the lengths of both arguments are constant and > it's a strcmp: > * if the lengths are NOT equal, we can safely fold the call > to a non-zero value. > * otherwise, do nothing now. > > B.2. (PR78809) When the length of one argument is constant, try to replace > the call with a __builtin_str(n)cmp_eq call where possible, i.e: > > strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a > string with constant length, C is a constant. > if (C <= strlen(STR) && sizeof_array(s) > C) > { > replace this call with > __builtin_strncmp_eq (s, STR, C) (!)= 0 > } > if (C > strlen(STR) > { > it can be safely treated as a call to strcmp (s, STR) (!)= 0 > can handled by the following strcmp. > } > > strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a > string with constant length. > if (sizeof_array(s) > strlen(STR)) > { > replace this call with > __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 > } > > later when expanding the new __builtin_str(n)cmp_eq calls, first expand them > as __builtin_memcmp_eq, if the expansion does not succeed, change them back > to call to __builtin_str(n)cmp. > > adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of > PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026 > > bootstraped and tested on both X86 and Aarch64. no regression. > > ************************************************ > gcc/ChangeLog > > +2018-02-02 <qing.zhao@oracle.com> > + > + PR middle-end/78809 > + PR middle-end/83026 > + * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ > + and BUILT_IN_STRNCMP_EQ. > + * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and > + BUILT_IN_STRNCMP_EQ. > + * gimple-fold.c (gimple_fold_builtin_string_compare): Add the > + handling of BUILTIN_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ. > + (gimple_fold_builtin): Likewise. > + * tree-ssa-strlen.c (compute_string_length): New function. > + (determine_min_obsize): New function. > + (handle_builtin_string_cmp): New function to handle calls to > + string compare functions. > + (strlen_optimize_stmt): Add handling to builtin string compare > + calls. > + * tree-ssa-structalias.c (find_func_aliases_for_builtin_call): > + Add the handling of BUILT_IN_STRCMP_EQ and BUILT_IN_STRNCMP_EQ. > + * tree.c (build_common_builtin_nodes): Add new defines of > + BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ. > + > > gcc/testsuite/ChangeLog > > +2018-02-02 <qing.zhao@oracle.com> > + > + PR middle-end/78809 > + * gcc.dg/strcmpopt_2.c: New testcase. > + * gcc.dg/strcmpopt_3.c: New testcase. > + > + PR middle-end/83026 > + * gcc.dg/strcmpopt_3.c: New testcase. > > > > + > +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do > + equality test against zero: > + > + A. When the lengths of both arguments are constant and it's a strcmp: > + * if the lengths are NOT equal, we can safely fold the call > + to a non-zero value. > + * otherwise, do nothing now. > + > + B. When the length of one argument is constant, try to replace the call with > + a __builtin_str(n)cmp_eq call where possible, i.e: > + > + strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a > + string with constant length , C is a constant. > + if (C <= strlen(STR) && sizeof_array(s) > C) > + { > + replace this call with > + strncmp_eq (s, STR, C) (!)= 0 > + } > + if (C > strlen(STR) > + { > + it can be safely treated as a call to strcmp (s, STR) (!)= 0 > + can handled by the following strcmp. > + } > + > + strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a > + string with constant length. > + if (sizeof_array(s) > strlen(STR)) > + { > + replace this call with > + strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 > + } > + > + Return false when the call is transformed, return true otherwise. > + */ > + > +static bool > +handle_builtin_string_cmp (gimple_stmt_iterator *gsi) So I originally thought you had the core logic wrong in the immediate uses loop. But it's actually the case that the return value is the exact opposite of what I expected. ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it was not transformed. Can you fix that so it's not so confusing? I think with that change we'll be good to go, but please repost for a final looksie. THanks, Jeff
Hi, Jeff, Thanks a lot for your review and comments. I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64. please take a look at the patch again. thanks. Qing > On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote: > So I originally thought you had the core logic wrong in the immediate > uses loop. But it's actually the case that the return value is the > exact opposite of what I expected. > > ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it > was not transformed. > > Can you fix that so it's not so confusing? > > I think with that change we'll be good to go, but please repost for a > final looksie. > > THanks, > Jeff
Hi, I have committed the patch as revision 261039. thanks. Qing > On May 29, 2018, at 7:08 PM, Qing Zhao <qing.zhao@oracle.com> wrote: > > Hi, Jeff, > > Thanks a lot for your review and comments. > > I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64. > > please take a look at the patch again. > > thanks. > > Qing > >> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote: > >> So I originally thought you had the core logic wrong in the immediate >> uses loop. But it's actually the case that the return value is the >> exact opposite of what I expected. >> >> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it >> was not transformed. >> >> Can you fix that so it's not so confusing? >> >> I think with that change we'll be good to go, but please repost for a >> final looksie. >> >> THanks, >> Jeff
On 05/29/2018 06:08 PM, Qing Zhao wrote: > Hi, Jeff, > > Thanks a lot for your review and comments. > > I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64. > > please take a look at the patch again. > > thanks. > > Qing > >> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote: >> So I originally thought you had the core logic wrong in the immediate >> uses loop. But it's actually the case that the return value is the >> exact opposite of what I expected. >> >> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it >> was not transformed. >> >> Can you fix that so it's not so confusing? >> >> I think with that change we'll be good to go, but please repost for a >> final looksie. >> >> THanks, >> Jeff > > > 0001-2nd-Patch-for-PR78009.patch > > > From 750f44ee0777d55b568f07e263babdedd532d315 Mon Sep 17 00:00:00 2001 > From: qing zhao <qing.zhao@oracle.com> > Date: Tue, 29 May 2018 16:15:21 -0400 > Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026 > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 > Inline strcmp with small constant strings > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 > missing strlen optimization for strcmp of unequal strings > > The design doc for PR78809 is at: > https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html > > this patch is for the second part of change of PR78809 and PR83026: > > B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0 > > B.1. (PR83026) When the lengths of both arguments are constant and > it's a strcmp: > * if the lengths are NOT equal, we can safely fold the call > to a non-zero value. > * otherwise, do nothing now. > > B.2. (PR78809) When the length of one argument is constant, try to replace > the call with a __builtin_str(n)cmp_eq call where possible, i.e: > > strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a > string with constant length, C is a constant. > if (C <= strlen(STR) && sizeof_array(s) > C) > { > replace this call with > __builtin_strncmp_eq (s, STR, C) (!)= 0 > } > if (C > strlen(STR) > { > it can be safely treated as a call to strcmp (s, STR) (!)= 0 > can handled by the following strcmp. > } > > strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a > string with constant length. > if (sizeof_array(s) > strlen(STR)) > { > replace this call with > __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 > } > > later when expanding the new __builtin_str(n)cmp_eq calls, first expand them > as __builtin_memcmp_eq, if the expansion does not succeed, change them back > to call to __builtin_str(n)cmp. > > adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of > PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026 > > bootstraped and tested on both X86 and Aarch64. no regression. > --- > gcc/builtins.c | 33 ++++ > gcc/builtins.def | 5 + > gcc/gimple-fold.c | 5 + > gcc/testsuite/gcc.dg/strcmpopt_2.c | 67 ++++++++ > gcc/testsuite/gcc.dg/strcmpopt_3.c | 31 ++++ > gcc/testsuite/gcc.dg/strcmpopt_4.c | 16 ++ > gcc/tree-ssa-strlen.c | 304 ++++++++++++++++++++++++++++++++++--- > gcc/tree-ssa-structalias.c | 2 + > gcc/tree.c | 8 + > 9 files changed, 454 insertions(+), 17 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c > create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c > create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c Sorry for the long delay. This needs a ChangeLog. With the ChangeLog it is OK for the trunk. Jeff
> On Jun 22, 2018, at 11:49 PM, Jeff Law <law@redhat.com> wrote: > > On 05/29/2018 06:08 PM, Qing Zhao wrote: >> Hi, Jeff, >> >> Thanks a lot for your review and comments. >> >> I have updated my patch based on your suggestion, and retested this whole patch on both X86 and aarch64. >> >> please take a look at the patch again. >> >> thanks. >> >> Qing >> >>> On May 25, 2018, at 3:38 PM, Jeff Law <law@redhat.com> wrote: >>> So I originally thought you had the core logic wrong in the immediate >>> uses loop. But it's actually the case that the return value is the >>> exact opposite of what I expected. >>> >>> ie, I expected "TRUE" to mean the call was transformed, "FALSE" if it >>> was not transformed. >>> >>> Can you fix that so it's not so confusing? >>> >>> I think with that change we'll be good to go, but please repost for a >>> final looksie. >>> >>> THanks, >>> Jeff >> >> >> 0001-2nd-Patch-for-PR78009.patch >> >> >> From 750f44ee0777d55b568f07e263babdedd532d315 Mon Sep 17 00:00:00 2001 >> From: qing zhao <qing.zhao@oracle.com <mailto:qing.zhao@oracle.com>> >> Date: Tue, 29 May 2018 16:15:21 -0400 >> Subject: [PATCH] 2nd Patch for PR78009 Patch for PR83026 >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809> >> Inline strcmp with small constant strings >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026 <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83026> >> missing strlen optimization for strcmp of unequal strings >> >> The design doc for PR78809 is at: >> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html <https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html> >> >> this patch is for the second part of change of PR78809 and PR83026: >> >> B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0 >> >> B.1. (PR83026) When the lengths of both arguments are constant and >> it's a strcmp: >> * if the lengths are NOT equal, we can safely fold the call >> to a non-zero value. >> * otherwise, do nothing now. >> >> B.2. (PR78809) When the length of one argument is constant, try to replace >> the call with a __builtin_str(n)cmp_eq call where possible, i.e: >> >> strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a >> string with constant length, C is a constant. >> if (C <= strlen(STR) && sizeof_array(s) > C) >> { >> replace this call with >> __builtin_strncmp_eq (s, STR, C) (!)= 0 >> } >> if (C > strlen(STR) >> { >> it can be safely treated as a call to strcmp (s, STR) (!)= 0 >> can handled by the following strcmp. >> } >> >> strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a >> string with constant length. >> if (sizeof_array(s) > strlen(STR)) >> { >> replace this call with >> __builtin_strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 >> } >> >> later when expanding the new __builtin_str(n)cmp_eq calls, first expand them >> as __builtin_memcmp_eq, if the expansion does not succeed, change them back >> to call to __builtin_str(n)cmp. >> >> adding test case strcmpopt_2.c and strcmpopt_4.c into gcc.dg for part B of >> PR78809 adding test case strcmpopt_3.c into gcc.dg for PR83026 >> >> bootstraped and tested on both X86 and Aarch64. no regression. >> --- >> gcc/builtins.c | 33 ++++ >> gcc/builtins.def | 5 + >> gcc/gimple-fold.c | 5 + >> gcc/testsuite/gcc.dg/strcmpopt_2.c | 67 ++++++++ >> gcc/testsuite/gcc.dg/strcmpopt_3.c | 31 ++++ >> gcc/testsuite/gcc.dg/strcmpopt_4.c | 16 ++ >> gcc/tree-ssa-strlen.c | 304 ++++++++++++++++++++++++++++++++++--- >> gcc/tree-ssa-structalias.c | 2 + >> gcc/tree.c | 8 + >> 9 files changed, 454 insertions(+), 17 deletions(-) >> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_2.c >> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_3.c >> create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_4.c > Sorry for the long delay. This needs a ChangeLog. With the ChangeLog > it is OK for the trunk. Hi, Jeff, the patch had been committed with ChangeLogs as following: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261039 <https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261039> thanks. Qing > > Jeff
diff --git a/gcc/builtins.c b/gcc/builtins.c index 6b25253..a5f6885 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -6953,12 +6953,45 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, return target; break; + /* expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it + back to a BUILT_IN_STRCMP. Remember to delete the 3rd paramater + when changing it to a strcmp call. */ + case BUILT_IN_STRCMP_EQ: + target = expand_builtin_memcmp (exp, target, true); + if (target) + return target; + + /* change this call back to a BUILT_IN_STRCMP. */ + TREE_OPERAND (exp, 1) + = build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRCMP)); + + /* delete the last parameter. */ + unsigned int i; + vec<tree, va_gc> *arg_vec; + vec_alloc (arg_vec, 2); + for (i = 0; i < 2; i++) + arg_vec->quick_push (CALL_EXPR_ARG (exp, i)); + exp = build_call_vec (TREE_TYPE (exp), CALL_EXPR_FN (exp), arg_vec); + /* FALLTHROUGH */ + case BUILT_IN_STRCMP: target = expand_builtin_strcmp (exp, target); if (target) return target; break; + /* expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it + back to a BUILT_IN_STRNCMP. */ + case BUILT_IN_STRNCMP_EQ: + target = expand_builtin_memcmp (exp, target, true); + if (target) + return target; + + /* change it back to a BUILT_IN_STRNCMP. */ + TREE_OPERAND (exp, 1) + = build_fold_addr_expr (builtin_decl_explicit (BUILT_IN_STRNCMP)); + /* FALLTHROUGH */ + case BUILT_IN_STRNCMP: target = expand_builtin_strncmp (exp, target, mode); if (target) diff --git a/gcc/builtins.def b/gcc/builtins.def index 2281021..9eb79fd 100644 --- a/gcc/builtins.def +++ b/gcc/builtins.def @@ -949,6 +949,11 @@ DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX, "__builtin_alloca_with_ali equality with zero. */ DEF_BUILTIN_STUB (BUILT_IN_MEMCMP_EQ, "__builtin_memcmp_eq") +/* An internal version of strcmp/strncmp, used when the result is only + tested for equality with zero. */ +DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq") +DEF_BUILTIN_STUB (BUILT_IN_STRNCMP_EQ, "__builtin_strncmp_eq") + /* Object size checking builtins. */ DEF_GCC_BUILTIN (BUILT_IN_OBJECT_SIZE, "object_size", BT_FN_SIZE_CONST_PTR_INT, ATTR_PURE_NOTHROW_LEAF_LIST) DEF_EXT_LIB_BUILTIN_CHKP (BUILT_IN_MEMCPY_CHK, "__memcpy_chk", BT_FN_PTR_PTR_CONST_PTR_SIZE_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF) diff --git a/gcc/testsuite/gcc.dg/strcmpopt_2.c b/gcc/testsuite/gcc.dg/strcmpopt_2.c new file mode 100644 index 0000000..0131b8f --- /dev/null +++ b/gcc/testsuite/gcc.dg/strcmpopt_2.c @@ -0,0 +1,67 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +char s[100] = {'a','b','c','d'}; +typedef struct { char s[8]; int x; } S; + +__attribute__ ((noinline)) int +f1 (S *s) +{ + return __builtin_strcmp (s->s, "abc") != 0; +} + +__attribute__ ((noinline)) int +f2 (void) +{ + return __builtin_strcmp (s, "abc") != 0; +} + +__attribute__ ((noinline)) int +f3 (S *s) +{ + return __builtin_strcmp ("abc", s->s) != 0; +} + +__attribute__ ((noinline)) int +f4 (void) +{ + return __builtin_strcmp ("abc", s) != 0; +} + +__attribute__ ((noinline)) int +f5 (S *s) +{ + return __builtin_strncmp (s->s, "abc", 3) != 0; +} + +__attribute__ ((noinline)) int +f6 (void) +{ + return __builtin_strncmp (s, "abc", 2) != 0; +} + +__attribute__ ((noinline)) int +f7 (S *s) +{ + return __builtin_strncmp ("abc", s->s, 3) != 0; +} + +__attribute__ ((noinline)) int +f8 (void) +{ + return __builtin_strncmp ("abc", s, 2) != 0; +} + +int main (void) +{ + S ss = {{'a','b','c'}, 2}; + + if (f1 (&ss) != 0 || f2 () != 1 || f3 (&ss) != 0 || + f4 () != 1 || f5 (&ss) != 0 || f6 () != 0 || + f7 (&ss) != 0 || f8 () != 0) + __builtin_abort (); + + return 0; +} + +/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 8 "strlen" } } */ diff --git a/gcc/testsuite/gcc.dg/strcmpopt_3.c b/gcc/testsuite/gcc.dg/strcmpopt_3.c new file mode 100644 index 0000000..86a0d7a --- /dev/null +++ b/gcc/testsuite/gcc.dg/strcmpopt_3.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +__attribute__ ((noinline)) int +f1 (void) +{ + char *s0= "abcd"; + char s[8]; + __builtin_strcpy (s, s0); + return __builtin_strcmp(s, "abc") != 0; +} + +__attribute__ ((noinline)) int +f2 (void) +{ + char *s0 = "ab"; + char s[8]; + __builtin_strcpy (s, s0); + return __builtin_strcmp("abc", s) != 0; +} + +int main (void) +{ + if (f1 () != 1 + || f2 () != 1) + __builtin_abort (); + + return 0; +} + +/* { dg-final { scan-tree-dump-times "strcmp" 0 "strlen" } } */ diff --git a/gcc/testsuite/gcc.dg/strcmpopt_4.c b/gcc/testsuite/gcc.dg/strcmpopt_4.c new file mode 100644 index 0000000..d727bc3 --- /dev/null +++ b/gcc/testsuite/gcc.dg/strcmpopt_4.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +typedef struct { char s[8]; int x; } S; +extern int max_i; + +int +f1 (S * s) +{ + int result, i; + for (i = 0; i < max_i; i++) + result += __builtin_strcmp (s->s, "abc") != 0 ? 2 : 1; + return result; +} + +/* { dg-final { scan-tree-dump-times "cmp_eq \\(" 1 "strlen" } } */ diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index 94f20ef..57563ef 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi) return false; } +/* Given an index to the strinfo vector, compute the string length for the + corresponding string. Return -1 when unknown. */ + +static HOST_WIDE_INT +compute_string_length (int idx) +{ + HOST_WIDE_INT string_leni = -1; + gcc_assert (idx != 0); + + if (idx < 0) + string_leni = ~idx; + else + { + strinfo *si = get_strinfo (idx); + if (si) + { + tree const_string_len = get_string_length (si); + if (const_string_len && tree_fits_shwi_p (const_string_len)) + string_leni = tree_to_shwi (const_string_len); + } + } + + if (string_leni < 0) + return -1; + + return string_leni; +} + +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do + equality test against zero: + + A. When both arguments are constant strings and it's a strcmp: + * if the length of the strings are NOT equal, we can safely fold the call + to a non-zero value. + * otherwise, do nothing now. + + B. When one of the arguments is constant string, try to replace the call with + a __builtin_str(n)cmp_eq call where possible, i.e: + + strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a + constant string, C is a constant. + if (C <= strlen(STR) && sizeof_array(s) > C) + { + replace this call with + strncmp_eq (s, STR, C) (!)= 0 + } + if (C > strlen(STR) + { + it can be safely treated as a call to strcmp (s, STR) (!)= 0 + can handled by the following strcmp. + } + + strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a + constant string. + if (sizeof_array(s) > strlen(STR)) + { + replace this call with + strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 + } + */ + +static bool +handle_builtin_string_cmp (gimple_stmt_iterator *gsi) +{ + gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi)); + tree res = gimple_call_lhs (stmt); + use_operand_p use_p; + imm_use_iterator iter; + tree arg1 = gimple_call_arg (stmt, 0); + tree arg2 = gimple_call_arg (stmt, 1); + int idx1 = get_stridx (arg1); + int idx2 = get_stridx (arg2); + HOST_WIDE_INT length = -1; + bool is_ncmp = false; + + if (!res) + return true; + + /* When both arguments are unknown, do nothing. */ + if (idx1 == 0 && idx2 == 0) + return true; + + /* Handle strncmp function. */ + if (gimple_call_num_args (stmt) == 3) + { + tree len = gimple_call_arg (stmt, 2); + if (tree_fits_shwi_p (len)) + length = tree_to_shwi (len); + + is_ncmp = true; + } + + /* For strncmp, if the length argument is NOT known, do nothing. */ + if (is_ncmp && length < 0) + return true; + + /* When the result is ONLY used to do equality test against zero. */ + FOR_EACH_IMM_USE_FAST (use_p, iter, res) + { + gimple *use_stmt = USE_STMT (use_p); + + if (is_gimple_debug (use_stmt)) + continue; + if (gimple_code (use_stmt) == GIMPLE_ASSIGN) + { + tree_code code = gimple_assign_rhs_code (use_stmt); + if (code == COND_EXPR) + { + tree cond_expr = gimple_assign_rhs1 (use_stmt); + if ((TREE_CODE (cond_expr) != EQ_EXPR + && (TREE_CODE (cond_expr) != NE_EXPR)) + || !integer_zerop (TREE_OPERAND (cond_expr, 1))) + return true; + } + else if (code == EQ_EXPR || code == NE_EXPR) + { + if (!integer_zerop (gimple_assign_rhs2 (use_stmt))) + return true; + } + else + return true; + } + else if (gimple_code (use_stmt) == GIMPLE_COND) + { + tree_code code = gimple_cond_code (use_stmt); + if ((code != EQ_EXPR && code != NE_EXPR) + || !integer_zerop (gimple_cond_rhs (use_stmt))) + return true; + } + else + return true; + } + + /* When both arguments are known, and their strlens are unequal, we can + safely fold the call to a non-zero value for strcmp; + othewise, do nothing now. */ + if (idx1 != 0 && idx2 != 0) + { + HOST_WIDE_INT const_string_leni1 = compute_string_length (idx1); + HOST_WIDE_INT const_string_leni2 = compute_string_length (idx2); + + if (!is_ncmp + && const_string_leni1 != -1 + && const_string_leni2 != -1 + && const_string_leni1 != const_string_leni2) + { + replace_call_with_value (gsi, integer_one_node); + return false; + } + return true; + } + + /* When one of args is constant string. */ + tree var_string = NULL_TREE; + HOST_WIDE_INT const_string_leni = -1; + + if (idx1) + { + const_string_leni = compute_string_length (idx1); + var_string = arg2; + } + else + { + gcc_checking_assert (idx2); + const_string_leni = compute_string_length (idx2); + var_string = arg1; + } + + if (const_string_leni < 0) + return true; + + unsigned HOST_WIDE_INT var_sizei = 0; + /* try to determine the minimum size of the object pointed by var_string. */ + tree size = compute_objsize (var_string, 2); + + if (!size) + return true; + + if (tree_fits_uhwi_p (size)) + var_sizei = tree_to_uhwi (size); + + if (var_sizei == 0) + return true; + + /* For strncmp, if length > const_string_leni , this call can be safely + transformed to a strcmp. */ + if (is_ncmp && length > const_string_leni) + is_ncmp = false; + + unsigned HOST_WIDE_INT final_length + = is_ncmp ? length : const_string_leni + 1; + + /* Replace strcmp or strncmp with the corresponding str(n)cmp_eq. */ + if (var_sizei > final_length) + { + tree fn + = (is_ncmp + ? builtin_decl_implicit (BUILT_IN_STRNCMP_EQ) + : builtin_decl_implicit (BUILT_IN_STRCMP_EQ)); + if (!fn) + return true; + tree const_string_len = build_int_cst (size_type_node, final_length); + update_gimple_call (gsi, fn, 3, arg1, arg2, const_string_len); + } + else + return true; + + return false; +} + /* Handle a POINTER_PLUS_EXPR statement. For p = "abcd" + 2; compute associated length, or if p = q + off is pointing to a '\0' character of a string, call @@ -2939,6 +3149,11 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi) if (!handle_builtin_memcmp (gsi)) return false; break; + case BUILT_IN_STRCMP: + case BUILT_IN_STRNCMP: + if (!handle_builtin_string_cmp (gsi)) + return false; + break; default: break; } diff --git a/gcc/tree.c b/gcc/tree.c index ed1852b..e23b3cd 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -10047,6 +10047,14 @@ build_common_builtin_nodes (void) "__builtin_memcmp_eq", ECF_PURE | ECF_NOTHROW | ECF_LEAF); + local_define_builtin ("__builtin_strncmp_eq", ftype, BUILT_IN_STRNCMP_EQ, + "__builtin_strncmp_eq", + ECF_PURE | ECF_NOTHROW | ECF_LEAF); + + local_define_builtin ("__builtin_strcmp_eq", ftype, BUILT_IN_STRCMP_EQ, + "__builtin_strcmp_eq", + ECF_PURE | ECF_NOTHROW | ECF_LEAF); + /* If there's a possibility that we might use the ARM EABI, build the alternate __cxa_end_cleanup node used to resume from C++. */ if (targetm.arm_eabi_unwinder)