Message ID | 20171219172049.GT2353@tucnak |
---|---|
State | New |
Headers | show |
Series | Fix up one mishandled plural in diagnostics in gimple-ssa-sprintf.c | expand |
On 12/19/2017 10:20 AM, Jakub Jelinek wrote: > Hi! > > This inform is correct in english, but wrong in many other languages. I think someone already mentioned this in another review but the computation below is far from intuitive: + ? (fmtres.range.likely % 1000000) + 1000000 + : fmtres.range.likely, It may not be a big deal in this one isolated instance but there are many more examples of this problem in this code and elsewhere that should be fixed (the sprintf warnings aren't as easy to deal with due to bug 77810). Other examples of the same problem are going to be in some of the other warnings I added (-Walloc-size-larger-than, -Wstringop-overflow/truncation, -Wrestrict). Since all these should be adjusted and ideally handled consistently, duplicating the same computation as above would be tedious and error-prone. I'd much rather have a general solution that obviated having to deal with it at every call site. Can the math be moved into inform_n (and warning_n) itself? If that doesn't make sense, then providing a helper for warning_n and inform_n callers to call might be an alternative. I CC David for his thoughts on how best to deal with it. As a separate enhancement, it would also be nice to put in place a checker (such a post-processing script) to help detect this problem before it's introduced into the code base. Martin > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2017-12-19 Jakub Jelinek <jakub@redhat.com> > > * gimple-ssa-sprintf.c (format_directive): Use inform_n instead of > inform with hardcoded english plural handling. > > --- gcc/gimple-ssa-sprintf.c.jj 2017-12-14 12:00:32.000000000 +0100 > +++ gcc/gimple-ssa-sprintf.c 2017-12-19 12:19:10.975042433 +0100 > @@ -2933,13 +2933,15 @@ format_directive (const sprintf_dom_walk > > if (warned && fmtres.range.min < fmtres.range.likely > && fmtres.range.likely < fmtres.range.max) > - { > - inform (info.fmtloc, > - (1 == fmtres.range.likely > - ? G_("assuming directive output of %wu byte") > - : G_("assuming directive output of %wu bytes")), > + /* Some languages have special plural rules even for large values, > + but it is periodic with period of 10, 100, 1000 etc. */ > + inform_n (info.fmtloc, > + fmtres.range.likely > INT_MAX > + ? (fmtres.range.likely % 1000000) + 1000000 > + : fmtres.range.likely, > + "assuming directive output of %wu byte", > + "assuming directive output of %wu bytes", > fmtres.range.likely); > - } > > if (warned && fmtres.argmin) > { > > Jakub >
On Tue, Dec 19, 2017 at 10:49:07AM -0700, Martin Sebor wrote:
> Can the math be moved into inform_n (and warning_n) itself?
No. I'm against having dozens of inform_n and warning_n etc.
overloads, it is already bad enough we have so many diagnostics
entry points now that putting breakpoints on them is so hard.
Those functions just have int, and we can have
perhaps some other function (plural_count or some better name)
that is inline and overloaded on the various types and can be used
if the count isn't int.
As for gimple-ssa-sprintf.c, there are several other cases that
want a plural form, but I haven't changed them because
format_warning_at_substring doesn't have _n suffixed variant.
Jakub
On December 19, 2017 6:58:15 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote: >On Tue, Dec 19, 2017 at 10:49:07AM -0700, Martin Sebor wrote: >> Can the math be moved into inform_n (and warning_n) itself? > >No. I'm against having dozens of inform_n and warning_n etc. >overloads, it is already bad enough we have so many diagnostics >entry points now that putting breakpoints on them is so hard. >Those functions just have int, and we can have >perhaps some other function (plural_count or some better name) >that is inline and overloaded on the various types and can be used >if the count isn't int. > >As for gimple-ssa-sprintf.c, there are several other cases that >want a plural form, but I haven't changed them because >format_warning_at_substring doesn't have _n suffixed variant. Can't we simply use bytes (plural) and not care for the detail of special values like one? Other languages (Czech?) even have special cases for two and three... Richard. > Jakub
On Tue, Dec 19, 2017 at 07:02:38PM +0100, Richard Biener wrote: > On December 19, 2017 6:58:15 PM GMT+01:00, Jakub Jelinek <jakub@redhat.com> wrote: > >On Tue, Dec 19, 2017 at 10:49:07AM -0700, Martin Sebor wrote: > >> Can the math be moved into inform_n (and warning_n) itself? > > > >No. I'm against having dozens of inform_n and warning_n etc. > >overloads, it is already bad enough we have so many diagnostics > >entry points now that putting breakpoints on them is so hard. > >Those functions just have int, and we can have > >perhaps some other function (plural_count or some better name) > >that is inline and overloaded on the various types and can be used > >if the count isn't int. > > > >As for gimple-ssa-sprintf.c, there are several other cases that > >want a plural form, but I haven't changed them because > >format_warning_at_substring doesn't have _n suffixed variant. > > Can't we simply use bytes (plural) and not care for the detail of special > values like one? Other languages (Czech?) even have special cases for two > and three... If it is supposed to be gramattically correct, it should use the _n suffixed APIs. There are languages with just 1 or 2, 3, 4 and 6 forms, see https://www.gnu.org/savannah-checkouts/gnu/gettext/manual/html_node/Plural-forms.html for details. Jakub
On 12/19/2017 10:58 AM, Jakub Jelinek wrote: > On Tue, Dec 19, 2017 at 10:49:07AM -0700, Martin Sebor wrote: >> Can the math be moved into inform_n (and warning_n) itself? > > No. I'm against having dozens of inform_n and warning_n etc. The question/suggestion is to change the existing inform_n and warning_n functions to do the math instead of having each caller do it. (I.e., not add any more overloads.) > overloads, it is already bad enough we have so many diagnostics > entry points now that putting breakpoints on them is so hard. > Those functions just have int, and we can have > perhaps some other function (plural_count or some better name) > that is inline and overloaded on the various types and can be used > if the count isn't int. If the change I suggested above doesn't work for some reason then let's consider introducing such a helper function. > As for gimple-ssa-sprintf.c, there are several other cases that > want a plural form, but I haven't changed them because > format_warning_at_substring doesn't have _n suffixed variant. Yes, I think I mentioned that. It involves adding even more "overloads." I'm not a fan of a proliferation of APIs with only subtle differences either. They are too easy to misuse. I'd prefer fewer "smart" APIs that git it right on their own. It's probably too late to add something like that for GCC 8 but I'm hoping we can come up with an idea and implement it in GCC 9 to avoid these pitfalls while (hopefully) also making it easier to write messages involving more than one plural form. Martin
On Tue, Dec 19, 2017 at 11:19:44AM -0700, Martin Sebor wrote: > On 12/19/2017 10:58 AM, Jakub Jelinek wrote: > > On Tue, Dec 19, 2017 at 10:49:07AM -0700, Martin Sebor wrote: > > > Can the math be moved into inform_n (and warning_n) itself? > > > > No. I'm against having dozens of inform_n and warning_n etc. > > The question/suggestion is to change the existing inform_n and > warning_n functions to do the math instead of having each caller > do it. (I.e., not add any more overloads.) The "math" is a mapping from some type (in this case UHWI) to some other type (int, used by our *_n APIs), so that it gives the same plural forms for known translations. Are you suggesting we change the type of the *_n APIs to UHWI instead? What if the caller needs something different? E.g. wide_int, widest_int, offset_int, __int128, whatever else? Jakub
On 12/19/2017 11:28 AM, Jakub Jelinek wrote: > On Tue, Dec 19, 2017 at 11:19:44AM -0700, Martin Sebor wrote: >> On 12/19/2017 10:58 AM, Jakub Jelinek wrote: >>> On Tue, Dec 19, 2017 at 10:49:07AM -0700, Martin Sebor wrote: >>>> Can the math be moved into inform_n (and warning_n) itself? >>> >>> No. I'm against having dozens of inform_n and warning_n etc. >> >> The question/suggestion is to change the existing inform_n and >> warning_n functions to do the math instead of having each caller >> do it. (I.e., not add any more overloads.) > > The "math" is a mapping from some type (in this case UHWI) to > some other type (int, used by our *_n APIs), so that it gives the > same plural forms for known translations. I understand that. Presumably this mapping should be the same for all messages (whether we're dealing with "bytes" or "elements" or whatever other quantity). > Are you suggesting we change the type of the *_n APIs to UHWI > instead? What if the caller needs something different? E.g. > wide_int, widest_int, offset_int, __int128, whatever else? I'd expect HWI (either signed or unsigned) to work for existing callers. There is (unfortunately) no directive to format wide ints so callers that also format the number (as in "%wu bytes") in addition to using it to choose a plural vs singular form have to convert it to an integer as it is(*). If/when in the future a directive to format an offset_int and wide_int (and the various flavors of poly_int) is added we can revise the API to take whatever type is the most appropriate, or (more likely) add true overloads (i.e., APIs with the same name but different type) to handle each form. My "vision" is to have a single warning_n() name (if not a single function) handle all there nuances so we don't have to remember to use waning_uhwi_n() and warning_wide_int_n() and warning_poly_int_n(), etc., or even call get_plural_form (n), but still get values of all types formatted optimally and consistently. Martin [*] Extending the dollar flag (or coming up with some other intuitive notation) to refer to the N argument might be a useful enhancement to make the association between it and the form of the message explicit: inform_n (loc, nbytes, "%0$wu byte", "%0$wu bytes"); and avoid the mistake of passing one value as N and another as the argument to the numeric directive.
On 12/19/2017 10:20 AM, Jakub Jelinek wrote: > Hi! > > This inform is correct in english, but wrong in many other languages. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2017-12-19 Jakub Jelinek <jakub@redhat.com> > > * gimple-ssa-sprintf.c (format_directive): Use inform_n instead of > inform with hardcoded english plural handling. OK. jeff
On Tue, 2017-12-19 at 18:58 +0100, Jakub Jelinek wrote: > On Tue, Dec 19, 2017 at 10:49:07AM -0700, Martin Sebor wrote: > > Can the math be moved into inform_n (and warning_n) itself? > > No. I'm against having dozens of inform_n and warning_n etc. > overloads, it is already bad enough we have so many diagnostics > entry points now that putting breakpoints on them is so hard. FWIW, I added a "break-on-diagnostic" command to our .gdbinit script, in case that helps: https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01382.html > Those functions just have int, and we can have > perhaps some other function (plural_count or some better name) > that is inline and overloaded on the various types and can be used > if the count isn't int. > > As for gimple-ssa-sprintf.c, there are several other cases that > want a plural form, but I haven't changed them because > format_warning_at_substring doesn't have _n suffixed variant. > > Jakub
--- gcc/gimple-ssa-sprintf.c.jj 2017-12-14 12:00:32.000000000 +0100 +++ gcc/gimple-ssa-sprintf.c 2017-12-19 12:19:10.975042433 +0100 @@ -2933,13 +2933,15 @@ format_directive (const sprintf_dom_walk if (warned && fmtres.range.min < fmtres.range.likely && fmtres.range.likely < fmtres.range.max) - { - inform (info.fmtloc, - (1 == fmtres.range.likely - ? G_("assuming directive output of %wu byte") - : G_("assuming directive output of %wu bytes")), + /* Some languages have special plural rules even for large values, + but it is periodic with period of 10, 100, 1000 etc. */ + inform_n (info.fmtloc, + fmtres.range.likely > INT_MAX + ? (fmtres.range.likely % 1000000) + 1000000 + : fmtres.range.likely, + "assuming directive output of %wu byte", + "assuming directive output of %wu bytes", fmtres.range.likely); - } if (warned && fmtres.argmin) {