Message ID | CAO9iq9Eq6nSUYB61CZN7_u5aPSa_UD7Hcx1R5N7FTyC7LT7rNw@mail.gmail.com |
---|---|
State | New |
Headers | show |
Hi Janne, sorry for being late in voicing my opinion, but I personally would prefer to have this patch in a separately. That way bisecting for performance regressions points only to the offending code and not to the change of the character length (the latter might add a tiny bit of cost, too). Regards, Andre On Fri, 23 Dec 2016 11:27:10 +0200 Janne Blomqvist <blomqvist.janne@gmail.com> wrote: > On Wed, Dec 21, 2016 at 3:14 PM, Andre Vehreschild <vehre@gmx.de> wrote: > >> Now when I think about this some more, I have a vague recollection > >> that a long time ago it used to be something like that. The problem > >> is that MIN_EXPR<CONSTANT, NON-CONSTANT> will of course be > >> NON-CONSTANT, so the memcpy call can't be inlined. Hence it was > >> changed to two separate __builtin_memmove() calls to have better > >> opportunity to inline. So probably a no-go to change it back. :( > > > > I don't get that. From the former only one of the memmove's could have been > > inlined assuming that only CONSTANT sizes are inlined. > > Yes. Which is better than not being able to inline, assuming the > inlined branch is more likely to be taken, no? > > > The other one had a > > NON-CONSTANT as long as pointer following and constant propagation was not > > effective together. In our case the "NON-CONSTANT branch" would have been > > used, which is resolved by constant propagation (of the size of constant > > memory p points to). I assume the warning is triggered, because dead-code > > elimination has not removed the else part. > > Probably yes, in this case. My performance worries were more about the > general case. But perhaps they are unfounded, IDK really. Does anybody > have a battery of string operation benchmarks that mirrors how real > Fortran code does string handling? > > Anyway, the attached patch accomplishes that. It turns the tree dump I > showed two days ago into something like > > { > integer(kind=8) D.3484; > unsigned long D.3485; > > D.3484 = *_p; > D.3485 = (unsigned long) D.3484 + 18446744073709551608; > if (D.3484 != 0) > { > __builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1 > sz: 1}, MIN_EXPR <(unsigned long) D.3484, 8>); > if ((unsigned long) D.3484 > 8) > { > __builtin_memset ((void *) *p + 8, 32, D.3485); > } > } > } > > and gets rid of the -Wstringop-overflow warning. (It causes a two new > testsuite failures (gfortran.dg/dependency_49.f90 and > gfortran.dg/transfer_intrinsic_1.f90), but those are just tests that > grep for patterns in the .original dump and need to be adjusted). > > If that is Ok, do you prefer it as part of the charlen-size_t patch, > or would you (GFortran maintainers in general) prefer that it's a > separate patch? >
On Tue, Dec 27, 2016 at 12:47 PM, Andre Vehreschild <vehre@gmx.de> wrote: > Hi Janne, > > sorry for being late in voicing my opinion, but I personally would prefer to > have this patch in a separately. That way bisecting for performance > regressions points only to the offending code and not to the change of the > character length (the latter might add a tiny bit of cost, too). No worries. I included this in the updated charlen-size_t patch I posted and which FX reviewed, but I can certainly commit this part separately. > > Regards, > Andre > > > On Fri, 23 Dec 2016 11:27:10 +0200 > Janne Blomqvist <blomqvist.janne@gmail.com> wrote: > >> On Wed, Dec 21, 2016 at 3:14 PM, Andre Vehreschild <vehre@gmx.de> wrote: >> >> Now when I think about this some more, I have a vague recollection >> >> that a long time ago it used to be something like that. The problem >> >> is that MIN_EXPR<CONSTANT, NON-CONSTANT> will of course be >> >> NON-CONSTANT, so the memcpy call can't be inlined. Hence it was >> >> changed to two separate __builtin_memmove() calls to have better >> >> opportunity to inline. So probably a no-go to change it back. :( >> > >> > I don't get that. From the former only one of the memmove's could have been >> > inlined assuming that only CONSTANT sizes are inlined. >> >> Yes. Which is better than not being able to inline, assuming the >> inlined branch is more likely to be taken, no? >> >> > The other one had a >> > NON-CONSTANT as long as pointer following and constant propagation was not >> > effective together. In our case the "NON-CONSTANT branch" would have been >> > used, which is resolved by constant propagation (of the size of constant >> > memory p points to). I assume the warning is triggered, because dead-code >> > elimination has not removed the else part. >> >> Probably yes, in this case. My performance worries were more about the >> general case. But perhaps they are unfounded, IDK really. Does anybody >> have a battery of string operation benchmarks that mirrors how real >> Fortran code does string handling? >> >> Anyway, the attached patch accomplishes that. It turns the tree dump I >> showed two days ago into something like >> >> { >> integer(kind=8) D.3484; >> unsigned long D.3485; >> >> D.3484 = *_p; >> D.3485 = (unsigned long) D.3484 + 18446744073709551608; >> if (D.3484 != 0) >> { >> __builtin_memmove ((void *) *p, (void *) &"12345679"[1]{lb: 1 >> sz: 1}, MIN_EXPR <(unsigned long) D.3484, 8>); >> if ((unsigned long) D.3484 > 8) >> { >> __builtin_memset ((void *) *p + 8, 32, D.3485); >> } >> } >> } >> >> and gets rid of the -Wstringop-overflow warning. (It causes a two new >> testsuite failures (gfortran.dg/dependency_49.f90 and >> gfortran.dg/transfer_intrinsic_1.f90), but those are just tests that >> grep for patterns in the .original dump and need to be adjusted). >> >> If that is Ok, do you prefer it as part of the charlen-size_t patch, >> or would you (GFortran maintainers in general) prefer that it's a >> separate patch? >> > > > -- > Andre Vehreschild * Email: vehre ad gmx dot de
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 3767a28..6708ee03 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -6455,33 +6455,19 @@ gfc_trans_string_copy (stmtblock_t * block, tree dlength, tree dest, return; } + /* The string copy algorithm below generates code like + + if (dlen > 0) { + memmove (dest, src, min(dlen, slen)); + if (slen < dlen) + memset(&dest[slen], ' ', dlen - slen); + } + */ + /* Do nothing if the destination length is zero. */ cond = fold_build2_loc (input_location, GT_EXPR, boolean_type_node, dlen, build_int_cst (size_type_node, 0)); - /* The following code was previously in _gfortran_copy_string: - - // The two strings may overlap so we use memmove. - void - copy_string (GFC_INTEGER_4 destlen, char * dest, - GFC_INTEGER_4 srclen, const char * src) - { - if (srclen >= destlen) - { - // This will truncate if too long. - memmove (dest, src, destlen); - } - else - { - memmove (dest, src, srclen); - // Pad with spaces. - memset (&dest[srclen], ' ', destlen - srclen); - } - } - - We're now doing it here for better optimization, but the logic - is the same. */ - /* For non-default character kinds, we have to multiply the string length by the base type size. */ chartype = gfc_get_char_type (dkind); @@ -6504,17 +6490,19 @@ gfc_trans_string_copy (stmtblock_t * block, tree dlength, tree dest, else src = gfc_build_addr_expr (pvoid_type_node, src); - /* Truncate string if source is too long. */ - cond2 = fold_build2_loc (input_location, GE_EXPR, boolean_type_node, slen, - dlen); + /* First do the memmove. */ + tmp2 = fold_build2_loc (input_location, MIN_EXPR, TREE_TYPE (dlen), dlen, + slen); tmp2 = build_call_expr_loc (input_location, builtin_decl_explicit (BUILT_IN_MEMMOVE), - 3, dest, src, dlen); + 3, dest, src, tmp2); + stmtblock_t tmpblock2; + gfc_init_block (&tmpblock2); + gfc_add_expr_to_block (&tmpblock2, tmp2); - /* Else copy and pad with spaces. */ - tmp3 = build_call_expr_loc (input_location, - builtin_decl_explicit (BUILT_IN_MEMMOVE), - 3, dest, src, slen); + /* If the destination is longer, fill the end with spaces. */ + cond2 = fold_build2_loc (input_location, LT_EXPR, boolean_type_node, slen, + dlen); /* Wstringop-overflow appears at -O3 even though this warning is not explicitly available in fortran nor can it be switched off. If the @@ -6530,13 +6518,14 @@ gfc_trans_string_copy (stmtblock_t * block, tree dlength, tree dest, tmp4 = fill_with_spaces (tmp4, chartype, tmp); gfc_init_block (&tempblock); - gfc_add_expr_to_block (&tempblock, tmp3); gfc_add_expr_to_block (&tempblock, tmp4); tmp3 = gfc_finish_block (&tempblock); /* The whole copy_string function is there. */ tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node, cond2, - tmp2, tmp3); + tmp3, build_empty_stmt (input_location)); + gfc_add_expr_to_block (&tmpblock2, tmp); + tmp = gfc_finish_block (&tmpblock2); tmp = fold_build3_loc (input_location, COND_EXPR, void_type_node, cond, tmp, build_empty_stmt (input_location)); gfc_add_expr_to_block (block, tmp);