adjust warning_n() to take uhwi (PR 84207)

Message ID 7c200219-2d6e-02ca-e720-9d3a8616a1e6@gmail.com
State New
Headers show
Series
  • adjust warning_n() to take uhwi (PR 84207)
Related show

Commit Message

Martin Sebor Feb. 13, 2018, 3:10 a.m.
Bug 84207 - Hard coded plural in gimple-fold.c points out one
of a number of warning_at() calls where warning_n() should have
been used.  The attached patch both replaces the calls and also
changes the signatures of the warning_n(), error_n(), and
inform_n() functions to take an unsigned HOST_WIDE_INT argument
instead of int.  I also changed the implementation of
diagnostic_n_impl() to deal with unsigned HOST_WIDE_INT values
in excess of ULONG_MAX (the maximum value ngettext handles) so
callers don't need to.

Bootstrapped/regtested on x86_64-linux.

Martin

PS Is there any reason why diagnostic-core.h and diagnostic.c
does not/should not include tree.h and other GCC headers?

Comments

Manuel López-Ibáñez Feb. 13, 2018, 5:07 p.m. | #1
On 13/02/18 03:10, Martin Sebor wrote:
> PS Is there any reason why diagnostic-core.h and diagnostic.c
> does not/should not include tree.h and other GCC headers?

The short reason is that we want to keep the core diagnostics as independent of 
the rest of the compiler as possible so that using it doesn't bring undesired 
dependencies. It took some effort a few years ago to clean-up headers, in 
particular, the mess of toplev.h, inclusion of rtl in FE files, inclusion of 
tree.h in back-end files, etc. I'm not sure what is the status now but I'm 
afraid that without any definition of actual internal/external interfaces and 
without actual physical separation of files and compilation steps (read: 
libraries), the headers may have crept back in.

Long time ago there was a plan of making GCC more modular, perhaps moving to a 
library framework like LLVM/Clang. I even had a patch that moved all core 
diagnostics and line-map stuff to its own libdiagnostic that was used by libcpp 
and the rest of the compiler (instead of overriding call-backs in libcpp like 
we do now and forcing libcpp into every executable that wishes to use 
diagnostics or line-maps). Unfortunately, it never worked properly because of 
the build machinery.

More background (probably very much outdated and forgotten):

https://docs.google.com/document/pub?id=1ZfyfkB62EFaR4_g4JKm4--guz3vxm9pciOBziMHTnK4

https://gcc.gnu.org/wiki/rearch

https://gcc.gnu.org/wiki/ModularGCC

Cheers,

Manuel.
Martin Sebor Feb. 13, 2018, 5:58 p.m. | #2
On 02/13/2018 10:07 AM, Manuel López-Ibáñez wrote:
> On 13/02/18 03:10, Martin Sebor wrote:
>> PS Is there any reason why diagnostic-core.h and diagnostic.c
>> does not/should not include tree.h and other GCC headers?
>
> The short reason is that we want to keep the core diagnostics as
> independent of the rest of the compiler as possible so that using it
> doesn't bring undesired dependencies. It took some effort a few years
> ago to clean-up headers, in particular, the mess of toplev.h, inclusion
> of rtl in FE files, inclusion of tree.h in back-end files, etc. I'm not
> sure what is the status now but I'm afraid that without any definition
> of actual internal/external interfaces and without actual physical
> separation of files and compilation steps (read: libraries), the headers
> may have crept back in.

Thanks for the background.  It makes sense to keep header
dependencies to a minimum.  Here it seems like more than
that because adding calls to (for instance) tree_to_uhwi()
to diagnostic.c causes:

/src/gcc/svn/gcc/diagnostic.c:56: undefined reference to 
`tree_to_uhwi(tree_node const*)'
collect2: error: ld returned 1 exit status
Makefile:2929: recipe for target 'gcov' failed
make[2]: *** [gcov] Error 1

I wanted to make the _n() functions like warning_n() more
robust by letting them accept a tree argument (as well as
offset_int and wide_int) in addition to HOST_WIDE_INT but
I can't do it if they can't work with these types.

> Long time ago there was a plan of making GCC more modular, perhaps
> moving to a library framework like LLVM/Clang. I even had a patch that
> moved all core diagnostics and line-map stuff to its own libdiagnostic
> that was used by libcpp and the rest of the compiler (instead of
> overriding call-backs in libcpp like we do now and forcing libcpp into
> every executable that wishes to use diagnostics or line-maps).
> Unfortunately, it never worked properly because of the build machinery.
>
> More background (probably very much outdated and forgotten):
>
> https://docs.google.com/document/pub?id=1ZfyfkB62EFaR4_g4JKm4--guz3vxm9pciOBziMHTnK4
>
>
> https://gcc.gnu.org/wiki/rearch
>
> https://gcc.gnu.org/wiki/ModularGCC
>

Yes, better/greater modularity would be useful.

Thanks again for the background.

Martin
Manuel López-Ibáñez Feb. 13, 2018, 8:59 p.m. | #3
On 13 Feb 2018 5:58 pm, "Martin Sebor" <msebor@gmail.com> wrote:


I wanted to make the _n() functions like warning_n() more
robust by letting them accept a tree argument (as well as
offset_int and wide_int) in addition to HOST_WIDE_INT but
I can't do it if they can't work with these types.


There must be a tree-diagnostics.c where you can add those functions and
then call the general diagnostic functions. Same for RTL.

Note that pretty-printer.c works in the same way.

Cheers,

Manuel.
Joseph Myers Feb. 13, 2018, 9:05 p.m. | #4
On Mon, 12 Feb 2018, Martin Sebor wrote:

> Bug 84207 - Hard coded plural in gimple-fold.c points out one
> of a number of warning_at() calls where warning_n() should have
> been used.  The attached patch both replaces the calls and also
> changes the signatures of the warning_n(), error_n(), and
> inform_n() functions to take an unsigned HOST_WIDE_INT argument
> instead of int.  I also changed the implementation of
> diagnostic_n_impl() to deal with unsigned HOST_WIDE_INT values
> in excess of ULONG_MAX (the maximum value ngettext handles) so
> callers don't need to.

Saturating to ULONG_MAX is not correct for languages where the plural form 
depends on n%10 or n%100 (see the various Plural-Forms entries in the .po 
files).  If n is too large you want something like n % 1000000 + 1000000 
instead to get the correct plural form in all cases.
Martin Sebor Feb. 13, 2018, 10:37 p.m. | #5
On 02/13/2018 01:59 PM, Manuel López-Ibáñez wrote:
>
>
> On 13 Feb 2018 5:58 pm, "Martin Sebor" <msebor@gmail.com
> <mailto:msebor@gmail.com>> wrote:
>
>
>     I wanted to make the _n() functions like warning_n() more
>     robust by letting them accept a tree argument (as well as
>     offset_int and wide_int) in addition to HOST_WIDE_INT but
>     I can't do it if they can't work with these types.
>
>
> There must be a tree-diagnostics.c where you can add those functions and
> then call the general diagnostic functions. Same for RTL.

I don't see how to do that.

Here's a sketch of what I tried to do:

   struct IntegerConverter
   {
     union {
       tree t;
       unsigned HOST_WIDE_INT hwi;
       // buffer for offset_int, wide_int, etc.
     } value;

     IntegerConverter (tree t)
     {
       value.t = t;
     }

     IntegerConverter (unsigned HOST_WIDE_INT x)
     {
       value.x = x;
     }

     ...
   };

   void error_n (int, const IntegerConverter &, const char*, ...);
   ...

With that, the call

   error_n (loc, t, "%E thing", "%E things", t);

works when t is a tree, and the call to the same function

   error_n (loc, i, "%wu thing", "%wu things", i);

also works when i is a HOST_WIDE_INT.  I chose this approach
to avoid introducing additional overloads of the error_n()
functions.

> Note that pretty-printer.c works in the same way.

It works that way for arguments passed through the ellipsis.
I was trying to use tree (indirectly) in the signature of
the _n() functions and for that I need "tree.h" etc. in
diagnostic-core.h and be able to call functions from
the header in diagnostic-core.c.  I suppose I could make
the calls indirectly somehow but I can't avoid including
GCC headers in diagnostic-core.h.

Martin
David Malcolm Feb. 13, 2018, 11:05 p.m. | #6
On Tue, 2018-02-13 at 15:37 -0700, Martin Sebor wrote:
> On 02/13/2018 01:59 PM, Manuel López-Ibáñez wrote:
> > 
> > 
> > On 13 Feb 2018 5:58 pm, "Martin Sebor" <msebor@gmail.com
> > <mailto:msebor@gmail.com>> wrote:
> > 
> > 
> >     I wanted to make the _n() functions like warning_n() more
> >     robust by letting them accept a tree argument (as well as
> >     offset_int and wide_int) in addition to HOST_WIDE_INT but
> >     I can't do it if they can't work with these types.
> > 
> > 
> > There must be a tree-diagnostics.c where you can add those
> > functions and
> > then call the general diagnostic functions. Same for RTL.
> 
> I don't see how to do that.
> 
> Here's a sketch of what I tried to do:
> 
>    struct IntegerConverter
>    {
>      union {
>        tree t;
>        unsigned HOST_WIDE_INT hwi;
>        // buffer for offset_int, wide_int, etc.
>      } value;
> 
>      IntegerConverter (tree t)
>      {
>        value.t = t;
>      }
> 
>      IntegerConverter (unsigned HOST_WIDE_INT x)
>      {
>        value.x = x;
>      }
> 
>      ...
>    };
> 
>    void error_n (int, const IntegerConverter &, const char*, ...);
>    ...
> 
> With that, the call
> 
>    error_n (loc, t, "%E thing", "%E things", t);
> 
> works when t is a tree, and the call to the same function
> 
>    error_n (loc, i, "%wu thing", "%wu things", i);
> 
> also works when i is a HOST_WIDE_INT.  I chose this approach
> to avoid introducing additional overloads of the error_n()
> functions.

Why can't you simply introduce a tree-diagnostic.h and tree-
diagnostic.c and put overloads there?

In diagnostic.c the various _n all use diagnostic_n_impl, which
ultimately calls out to 
  ngettext (singular_gmsgid, plural_gmsgid, n)
so there would presumably be an analogous "diagnostic_n_hwi_impl" (or
somesuch) function there to do the same thing.

That would keep the tree stuff separated from the rest of the
diagnostics subsystem.

Or am I missing something?

Hope this is constructive
Dave

> > Note that pretty-printer.c works in the same way.
> 
> It works that way for arguments passed through the ellipsis.
> I was trying to use tree (indirectly) in the signature of
> the _n() functions and for that I need "tree.h" etc. in
> diagnostic-core.h and be able to call functions from
> the header in diagnostic-core.c.  I suppose I could make
> the calls indirectly somehow but I can't avoid including
> GCC headers in diagnostic-core.h.
> 
> Martin
>
Martin Sebor Feb. 13, 2018, 11:33 p.m. | #7
On 02/13/2018 04:05 PM, David Malcolm wrote:
> On Tue, 2018-02-13 at 15:37 -0700, Martin Sebor wrote:
>> On 02/13/2018 01:59 PM, Manuel López-Ibáñez wrote:
>>>
>>>
>>> On 13 Feb 2018 5:58 pm, "Martin Sebor" <msebor@gmail.com
>>> <mailto:msebor@gmail.com>> wrote:
>>>
>>>
>>>     I wanted to make the _n() functions like warning_n() more
>>>     robust by letting them accept a tree argument (as well as
>>>     offset_int and wide_int) in addition to HOST_WIDE_INT but
>>>     I can't do it if they can't work with these types.
>>>
>>>
>>> There must be a tree-diagnostics.c where you can add those
>>> functions and
>>> then call the general diagnostic functions. Same for RTL.
>>
>> I don't see how to do that.
>>
>> Here's a sketch of what I tried to do:
>>
>>    struct IntegerConverter
>>    {
>>      union {
>>        tree t;
>>        unsigned HOST_WIDE_INT hwi;
>>        // buffer for offset_int, wide_int, etc.
>>      } value;
>>
>>      IntegerConverter (tree t)
>>      {
>>        value.t = t;
>>      }
>>
>>      IntegerConverter (unsigned HOST_WIDE_INT x)
>>      {
>>        value.x = x;
>>      }
>>
>>      ...
>>    };
>>
>>    void error_n (int, const IntegerConverter &, const char*, ...);
>>    ...
>>
>> With that, the call
>>
>>    error_n (loc, t, "%E thing", "%E things", t);
>>
>> works when t is a tree, and the call to the same function
>>
>>    error_n (loc, i, "%wu thing", "%wu things", i);
>>
>> also works when i is a HOST_WIDE_INT.  I chose this approach
>> to avoid introducing additional overloads of the error_n()
>> functions.
>
> Why can't you simply introduce a tree-diagnostic.h and tree-
> diagnostic.c and put overloads there?
>
> In diagnostic.c the various _n all use diagnostic_n_impl, which
> ultimately calls out to
>   ngettext (singular_gmsgid, plural_gmsgid, n)
> so there would presumably be an analogous "diagnostic_n_hwi_impl" (or
> somesuch) function there to do the same thing.

The analogous function would, in fact, do the exact same
thing.

>
> That would keep the tree stuff separated from the rest of the
> diagnostics subsystem.
>
> Or am I missing something?

(Sorry if I'm being overly verbose below.  I'm just trying
to make things clear.)

I wasn't introducing overloads.  I was replacing each of
the existing error_n() and related functions with one that
takes an IntegerConverter argument instead of the int n
argument.  The IntegerConverter constructor is overloaded
on all the types I want to be able to call error_n() with
(tree, offset_int, wide_int, and perhaps also the various
flavors of poly_int).  That has the effect if overloading
each of the functions on each of the types
the IntegerConverter class constructor is overloaded on.

Introducing new overloads of error_n() et al. and keeping
the ones in diagnostic-core.h would lead to ambiguities.
E.g., the call to

   error_n (loc, i, "%wu thing", "%wu things", i);

would be ambiguous if i were signed long (converting it
to unsigned long is no better than converting it to int).

I could change the existing error_n() to take HOST_WIDE_INT
instead of int as an argument and also add the overload(s)
as you suggest.  But the last time this issue came up Jakub
had a strong preference for not adding overloads of these
functions.  It also didn't occur to me to duplicate all
these functions.  Does that seem like a good approach to
you? (I mean, to have two parallel sets of functions that
do the same thing and only differ in the type of one of
their arguments.  If duplicating the signatures is
thought to be the right solution to achieve modularity
then I would think that ultimately one should call
the other.)

Martin

>
> Hope this is constructive
> Dave
>
>>> Note that pretty-printer.c works in the same way.
>>
>> It works that way for arguments passed through the ellipsis.
>> I was trying to use tree (indirectly) in the signature of
>> the _n() functions and for that I need "tree.h" etc. in
>> diagnostic-core.h and be able to call functions from
>> the header in diagnostic-core.c.  I suppose I could make
>> the calls indirectly somehow but I can't avoid including
>> GCC headers in diagnostic-core.h.
>>
>> Martin
>>
Martin Sebor Feb. 14, 2018, 6:02 p.m. | #8
On 02/13/2018 02:05 PM, Joseph Myers wrote:
> On Mon, 12 Feb 2018, Martin Sebor wrote:
>
>> Bug 84207 - Hard coded plural in gimple-fold.c points out one
>> of a number of warning_at() calls where warning_n() should have
>> been used.  The attached patch both replaces the calls and also
>> changes the signatures of the warning_n(), error_n(), and
>> inform_n() functions to take an unsigned HOST_WIDE_INT argument
>> instead of int.  I also changed the implementation of
>> diagnostic_n_impl() to deal with unsigned HOST_WIDE_INT values
>> in excess of ULONG_MAX (the maximum value ngettext handles) so
>> callers don't need to.
>
> Saturating to ULONG_MAX is not correct for languages where the plural form
> depends on n%10 or n%100 (see the various Plural-Forms entries in the .po
> files).  If n is too large you want something like n % 1000000 + 1000000
> instead to get the correct plural form in all cases.

Thanks.  I've made that change in the attached patch.

I was also hoping to test it, either now if it's easy, or if
it's complicated, sometime in the future but I couldn't find
a .po file where it would make a difference.  I could have
easily missed one but none of those I've looked seems to do
much with the plural forms where such large numbers could
come up.  The strings are either all empty or all look
the same.  Do you happen to know of one where it matters
and a suggestion for how to test it?  I suppose I could
create a dummy .po file with a non-trivial Plural-Forms but
then how would I plug it into GCC to verify (in an automated
test) that the right form is used?

Martin
PR translation/84207 - Hard coded plural in gimple-fold.c

gcc/ChangeLog:

	PR translation/84207
	* diagnostic-core.h (warning_n, error_n, inform_n): Change
	n argument to unsigned HOST_WIDE_INT.
	* diagnostic.c (warning_n, error_n, inform_n): Ditto.
	(diagnostic_n_impl): Ditto.  Handle arguments in excess of LONG_MAX.
	* gimple-fold.c (gimple_fold_builtin_strncpy): Use warning_n.
	* gimple-ssa-sprintf.c (format_directive): Simplify inform_n call.

Index: gcc/diagnostic-core.h
===================================================================
--- gcc/diagnostic-core.h	(revision 257665)
+++ gcc/diagnostic-core.h	(working copy)
@@ -59,10 +59,11 @@ extern void internal_error_no_backtrace (const cha
      ATTRIBUTE_GCC_DIAG(1,2) ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the first parameter.  */
 extern bool warning (int, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern bool warning_n (location_t, int, int, const char *, const char *, ...)
+extern bool warning_n (location_t, int, unsigned HOST_WIDE_INT,
+		       const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
-extern bool warning_n (rich_location *, int, int, const char *,
-		       const char *, ...)
+extern bool warning_n (rich_location *, int, unsigned HOST_WIDE_INT,
+		       const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4, 6) ATTRIBUTE_GCC_DIAG(5, 6);
 extern bool warning_at (location_t, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
@@ -69,7 +70,8 @@ extern bool warning_at (location_t, int, const cha
 extern bool warning_at (rich_location *, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
-extern void error_n (location_t, int, const char *, const char *, ...)
+extern void error_n (location_t, unsigned HOST_WIDE_INT, const char *,
+		     const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void error_at (rich_location *, const char *, ...)
@@ -87,7 +89,8 @@ extern bool permerror (rich_location *, const char
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform (rich_location *, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern void inform_n (location_t, int, const char *, const char *, ...)
+extern void inform_n (location_t, unsigned HOST_WIDE_INT, const char *,
+		      const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern bool emit_diagnostic (diagnostic_t, location_t, int,
Index: gcc/diagnostic.c
===================================================================
--- gcc/diagnostic.c	(revision 257665)
+++ gcc/diagnostic.c	(working copy)
@@ -51,8 +51,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Prototypes.  */
 static bool diagnostic_impl (rich_location *, int, const char *,
 			     va_list *, diagnostic_t) ATTRIBUTE_GCC_DIAG(3,0);
-static bool diagnostic_n_impl (rich_location *, int, int, const char *,
-			       const char *, va_list *,
+static bool diagnostic_n_impl (rich_location *, int, unsigned HOST_WIDE_INT,
+			       const char *, const char *, va_list *,
 			       diagnostic_t) ATTRIBUTE_GCC_DIAG(5,0);
 
 static void error_recursion (diagnostic_context *) ATTRIBUTE_NORETURN;
@@ -1111,15 +1111,24 @@ diagnostic_impl (rich_location *richloc, int opt,
 /* Implement inform_n, warning_n, and error_n, as documented and
    defined below.  */
 static bool
-diagnostic_n_impl (rich_location *richloc, int opt, int n,
+diagnostic_n_impl (rich_location *richloc, int opt, unsigned HOST_WIDE_INT n,
 		   const char *singular_gmsgid,
 		   const char *plural_gmsgid,
 		   va_list *ap, diagnostic_t kind)
 {
   diagnostic_info diagnostic;
-  diagnostic_set_info_translated (&diagnostic,
-                                  ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  ap, richloc, kind);
+  unsigned long gtn;
+
+  if (sizeof n <= sizeof gtn)
+    gtn = n;
+  else
+    /* Use the largest number ngettext() can handle, otherwise
+       preserve the six least significant decimal digits for
+       languages where the plural form depends on them.  */
+    gtn = n <= ULONG_MAX ? n : n % 1000000LU + 1000000LU;
+
+  const char *text = ngettext (singular_gmsgid, plural_gmsgid, gtn);
+  diagnostic_set_info_translated (&diagnostic, text, ap, richloc, kind);
   if (kind == DK_WARNING)
     diagnostic.option_index = opt;
   return diagnostic_report_diagnostic (global_dc, &diagnostic);
@@ -1176,8 +1185,8 @@ inform (rich_location *richloc, const char *gmsgid
 /* An informative note at LOCATION.  Use this for additional details on an
    error message.  */
 void
-inform_n (location_t location, int n, const char *singular_gmsgid,
-          const char *plural_gmsgid, ...)
+inform_n (location_t location, unsigned HOST_WIDE_INT n,
+	  const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
   va_list ap;
   va_start (ap, plural_gmsgid);
@@ -1233,7 +1242,7 @@ warning_at (rich_location *richloc, int opt, const
 /* Same as warning_n plural variant below, but using RICHLOC.  */
 
 bool
-warning_n (rich_location *richloc, int opt, int n,
+warning_n (rich_location *richloc, int opt, unsigned HOST_WIDE_INT n,
 	   const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
   gcc_assert (richloc);
@@ -1252,8 +1261,8 @@ bool
    Returns true if the warning was printed, false if it was inhibited.  */
 
 bool
-warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
-	   const char *plural_gmsgid, ...)
+warning_n (location_t location, int opt, unsigned HOST_WIDE_INT n,
+	   const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
   va_list ap;
   va_start (ap, plural_gmsgid);
@@ -1350,8 +1359,8 @@ error (const char *gmsgid, ...)
 /* A hard error: the code is definitely ill-formed, and an object file
    will not be produced.  */
 void
-error_n (location_t location, int n, const char *singular_gmsgid,
-         const char *plural_gmsgid, ...)
+error_n (location_t location, unsigned HOST_WIDE_INT n,
+	 const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
   va_list ap;
   va_start (ap, plural_gmsgid);
Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c	(revision 257665)
+++ gcc/gimple-fold.c	(working copy)
@@ -1700,13 +1700,13 @@ gimple_fold_builtin_strncpy (gimple_stmt_iterator
 	  tree fndecl = gimple_call_fndecl (stmt);
 	  gcall *call = as_a <gcall *> (stmt);
 
-	  warning_at (loc, OPT_Wstringop_truncation,
-		      (tree_int_cst_equal (size_one_node, len)
-		       ? G_("%G%qD output truncated copying %E byte "
-			    "from a string of length %E")
-		       : G_("%G%qD output truncated copying %E bytes "
-			    "from a string of length %E")),
-		      call, fndecl, len, slen);
+	  warning_n (loc, OPT_Wstringop_truncation,
+		     tree_to_uhwi (len),
+		     "%G%qD output truncated copying %E byte "
+		     "from a string of length %E",
+		     "%G%qD output truncated copying %E bytes "
+		     "from a string of length %E",
+		     call, fndecl, len, slen);
 	}
       else if (tree_int_cst_equal (len, slen))
 	{
@@ -1713,15 +1713,15 @@ gimple_fold_builtin_strncpy (gimple_stmt_iterator
 	  tree fndecl = gimple_call_fndecl (stmt);
 	  gcall *call = as_a <gcall *> (stmt);
 
-	  warning_at (loc, OPT_Wstringop_truncation,
-		      (tree_int_cst_equal (size_one_node, len)
-		       ? G_("%G%qD output truncated before terminating nul "
-			    "copying %E byte from a string of the same "
-			    "length")
-		       : G_("%G%qD output truncated before terminating nul "
-			    "copying %E bytes from a string of the same "
-			    "length")),
-		      call, fndecl, len);
+	  warning_n (loc, OPT_Wstringop_truncation,
+		     tree_to_uhwi (len),
+		     "%G%qD output truncated before terminating nul "
+		     "copying %E byte from a string of the same "
+		     "length",
+		     "%G%qD output truncated before terminating nul "
+		     "copying %E bytes from a string of the same "
+		     "length",
+		     call, fndecl, len);
 	}
     }
 
Index: gcc/gimple-ssa-sprintf.c
===================================================================
--- gcc/gimple-ssa-sprintf.c	(revision 257665)
+++ gcc/gimple-ssa-sprintf.c	(working copy)
@@ -2937,10 +2937,7 @@ format_directive (const sprintf_dom_walker::call_i
       && fmtres.range.likely < fmtres.range.max)
     /* Some languages have special plural rules even for large values,
        but it is periodic with period of 10, 100, 1000 etc.  */
-    inform_n (info.fmtloc,
-	      fmtres.range.likely > INT_MAX
-	      ? (fmtres.range.likely % 1000000) + 1000000
-	      : fmtres.range.likely,
+    inform_n (info.fmtloc, fmtres.range.likely,
 	      "assuming directive output of %wu byte",
 	      "assuming directive output of %wu bytes",
 	      fmtres.range.likely);
Joseph Myers Feb. 14, 2018, 6:12 p.m. | #9
On Wed, 14 Feb 2018, Martin Sebor wrote:

> I was also hoping to test it, either now if it's easy, or if
> it's complicated, sometime in the future but I couldn't find
> a .po file where it would make a difference.  I could have
> easily missed one but none of those I've looked seems to do
> much with the plural forms where such large numbers could
> come up.  The strings are either all empty or all look
> the same.  Do you happen to know of one where it matters

There are cases where there are more than two translations, but I don't 
have an example where such large numbers could come up (which in any case 
requires a 32-bit host, to have an unsigned HOST_WIDE_INT value above 
ULONG_MAX).  (For example, the Russian translation of "  candidate expects 
%d argument, %d provided" appears to have three different translations, 
using "%d аргумент", "%d аргумента" and "%d аргументов".)
Pedro Alves Feb. 14, 2018, 8:16 p.m. | #10
On 02/13/2018 10:37 PM, Martin Sebor wrote:
> On 02/13/2018 01:59 PM, Manuel López-Ibáñez wrote:
>>
> Here's a sketch of what I tried to do:
> 
>   struct IntegerConverter
>   {
>     union {
>       tree t;
>       unsigned HOST_WIDE_INT hwi;
>       // buffer for offset_int, wide_int, etc.
>     } value;
> 
>     IntegerConverter (tree t)
>     {
>       value.t = t;
>     }
> 
>     IntegerConverter (unsigned HOST_WIDE_INT x)
>     {
>       value.x = x;
>     }
> 
>     ...
>   };
> 
>   void error_n (int, const IntegerConverter &, const char*, ...);
>   ...
> 
> With that, the call
> 
>   error_n (loc, t, "%E thing", "%E things", t);
> 
> works when t is a tree, and the call to the same function
> 
>   error_n (loc, i, "%wu thing", "%wu things", i);
> 
> also works when i is a HOST_WIDE_INT.  I chose this approach
> to avoid introducing additional overloads of the error_n()
> functions.

Instead of a class that has to have a constructor for every type
you want to pass as plural selector to the _n functions, which
increases coupling, I'd suggest using a conversion function, and
overload that.  I.e., something like, in the core diagnostics code:

static inline unsigned HOST_WIDE_INT
as_plural_form (unsigned HOST_WIDE_INT val)
{
  return val;
}

/* In some tree-diagnostics header.  */

static inline unsigned HOST_WIDE_INT
as_plural_form (tree t)
{
   // extract & return a HWI
}

/* In some whatever-other-type-diagnostics header.  */

static inline unsigned HOST_WIDE_INT
as_plural_form (whatever_other_type v)
{
   // like above
}

and then you call error_n and other similar functions like this:

    error_n (loc, u, "%u thing", "%u things", u); 
    error_n (loc, as_plural_form (u), "%u thing", "%u things", u); 
    error_n (loc, as_plural_form (t), "%E thing", "%E things", t); 
    error_n (loc, as_plural_form (i), "%wu thing", "%wu things", i);

This is similar in spirit to std::to_string, etc.

BTW, the "plural_form" naming above comes from ngettext's documentation
of the N parameter:

  char * ngettext (const char *msgid1, const char *msgid2, unsigned long int n);

  "The parameter n is used to determine the plural form."

Pedro Alves
Manuel López-Ibáñez Feb. 14, 2018, 9:47 p.m. | #11
On 14 Feb 2018 8:16 pm, "Pedro Alves" <palves@redhat.com> wrote:

Instead of a class that has to have a constructor for every type
you want to pass as plural selector to the _n functions, which
increases coupling, I'd suggest using a conversion function, and
overload that.  I.e., something like, in the core diagnostics code:

static inline unsigned HOST_WIDE_INT
as_plural_form (unsigned HOST_WIDE_INT val)
{
  return val;
}

/* In some tree-diagnostics header.  */

static inline unsigned HOST_WIDE_INT
as_plural_form (tree t)
{
   // extract & return a HWI
}

/* In some whatever-other-type-diagnostics header.  */

static inline unsigned HOST_WIDE_INT
as_plural_form (whatever_other_type v)
{
   // like above
}

and then you call error_n and other similar functions like this:

    error_n (loc, u, "%u thing", "%u things", u);
    error_n (loc, as_plural_form (u), "%u thing", "%u things", u);
    error_n (loc, as_plural_form (t), "%E thing", "%E things", t);
    error_n (loc, as_plural_form (i), "%wu thing", "%wu things", i);

This is similar in spirit to std::to_string, etc.


If that's desired, why not simply have GCC::to_uhwi() ? It would likely be
useful in other contexts.

Cheers,

Manuel.
Pedro Alves Feb. 14, 2018, 10:41 p.m. | #12
On 02/14/2018 09:47 PM, Manuel López-Ibáñez wrote:
> On 14 Feb 2018 8:16 pm, "Pedro Alves" <palves@redhat.com <mailto:palves@redhat.com>> wrote:
> 
>     Instead of a class that has to have a constructor for every type
>     you want to pass as plural selector to the _n functions, which
>     increases coupling, I'd suggest using a conversion function, and
>     overload that.  I.e., something like, in the core diagnostics code:
> 
>     static inline unsigned HOST_WIDE_INT
>     as_plural_form (unsigned HOST_WIDE_INT val)
>     {
>       return val;
>     }
> 
>     /* In some tree-diagnostics header.  */
> 
>     static inline unsigned HOST_WIDE_INT
>     as_plural_form (tree t)
>     {
>        // extract & return a HWI
>     }
> 
>     /* In some whatever-other-type-diagnostics header.  */
> 
>     static inline unsigned HOST_WIDE_INT
>     as_plural_form (whatever_other_type v)
>     {
>        // like above
>     }
> 
>     and then you call error_n and other similar functions like this:
> 
>         error_n (loc, u, "%u thing", "%u things", u);
>         error_n (loc, as_plural_form (u), "%u thing", "%u things", u);
>         error_n (loc, as_plural_form (t), "%E thing", "%E things", t);
>         error_n (loc, as_plural_form (i), "%wu thing", "%wu things", i);
> 
>     This is similar in spirit to std::to_string, etc.
> 
> 
> If that's desired, why not simply have GCC::to_uhwi() ? It would likely be useful in other contexts.

Because of types that (e.g. wide_int specializations) that can store
values larger than what fit in uhwi.  GCC::to_uhwi's semantics for that
aren't clear -- could saturate, could unsigned wrap, could
throw/abort/assert, could be undefined.

as_plural_form has clear semantics -- it'd return a value that does
the right thing for ngettext's N parameter.  I.e., it'd do
the "n % 1000000 + 1000000" operation as a wide_int still, and
before converting/cast the result to uhwi. 

Thanks,
Pedro Alves
Martin Sebor Feb. 22, 2018, 4:18 p.m. | #13
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00858.html

This is just a tweak to fix a translation bug introduced by
one of my warnings (calling warning() where warning_n() is
more appropriate), and to enhance warning_n() et al. to do
the n % 1000000 + 1000000 computation so callers don't have
to worry about it.

On 02/14/2018 11:02 AM, Martin Sebor wrote:
> On 02/13/2018 02:05 PM, Joseph Myers wrote:
>> On Mon, 12 Feb 2018, Martin Sebor wrote:
>>
>>> Bug 84207 - Hard coded plural in gimple-fold.c points out one
>>> of a number of warning_at() calls where warning_n() should have
>>> been used.  The attached patch both replaces the calls and also
>>> changes the signatures of the warning_n(), error_n(), and
>>> inform_n() functions to take an unsigned HOST_WIDE_INT argument
>>> instead of int.  I also changed the implementation of
>>> diagnostic_n_impl() to deal with unsigned HOST_WIDE_INT values
>>> in excess of ULONG_MAX (the maximum value ngettext handles) so
>>> callers don't need to.
>>
>> Saturating to ULONG_MAX is not correct for languages where the plural
>> form
>> depends on n%10 or n%100 (see the various Plural-Forms entries in the .po
>> files).  If n is too large you want something like n % 1000000 + 1000000
>> instead to get the correct plural form in all cases.
>
> Thanks.  I've made that change in the attached patch.
>
> I was also hoping to test it, either now if it's easy, or if
> it's complicated, sometime in the future but I couldn't find
> a .po file where it would make a difference.  I could have
> easily missed one but none of those I've looked seems to do
> much with the plural forms where such large numbers could
> come up.  The strings are either all empty or all look
> the same.  Do you happen to know of one where it matters
> and a suggestion for how to test it?  I suppose I could
> create a dummy .po file with a non-trivial Plural-Forms but
> then how would I plug it into GCC to verify (in an automated
> test) that the right form is used?
>
> Martin
Joseph Myers Feb. 22, 2018, 9:15 p.m. | #14
On Thu, 22 Feb 2018, Martin Sebor wrote:

> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00858.html
> 
> This is just a tweak to fix a translation bug introduced by
> one of my warnings (calling warning() where warning_n() is
> more appropriate), and to enhance warning_n() et al. to do
> the n % 1000000 + 1000000 computation so callers don't have
> to worry about it.

OK in the absence of diagnostic maintainer objections within 48 hours, 
with the comment saying "ngettext()" changed to remove the "()" (see the 
GNU Coding Standards: "Please do not write @samp{()} after a function name 
just to indicate it is a function.  @code{foo ()} is not a function, it is 
a function call with no arguments.").
Martin Sebor Feb. 27, 2018, 10:06 p.m. | #15
On 02/22/2018 02:15 PM, Joseph Myers wrote:
> On Thu, 22 Feb 2018, Martin Sebor wrote:
>
>> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00858.html
>>
>> This is just a tweak to fix a translation bug introduced by
>> one of my warnings (calling warning() where warning_n() is
>> more appropriate), and to enhance warning_n() et al. to do
>> the n % 1000000 + 1000000 computation so callers don't have
>> to worry about it.
>
> OK in the absence of diagnostic maintainer objections within 48 hours,
> with the comment saying "ngettext()" changed to remove the "()" (see the
> GNU Coding Standards: "Please do not write @samp{()} after a function name
> just to indicate it is a function.  @code{foo ()} is not a function, it is
> a function call with no arguments.").

Committed as r258044.

Martin
Jakub Jelinek Feb. 28, 2018, 9:01 a.m. | #16
On Tue, Feb 27, 2018 at 03:06:53PM -0700, Martin Sebor wrote:
> On 02/22/2018 02:15 PM, Joseph Myers wrote:
> > On Thu, 22 Feb 2018, Martin Sebor wrote:
> > 
> > > Ping: https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00858.html
> > > 
> > > This is just a tweak to fix a translation bug introduced by
> > > one of my warnings (calling warning() where warning_n() is
> > > more appropriate), and to enhance warning_n() et al. to do
> > > the n % 1000000 + 1000000 computation so callers don't have
> > > to worry about it.
> > 
> > OK in the absence of diagnostic maintainer objections within 48 hours,
> > with the comment saying "ngettext()" changed to remove the "()" (see the
> > GNU Coding Standards: "Please do not write @samp{()} after a function name
> > just to indicate it is a function.  @code{foo ()} is not a function, it is
> > a function call with no arguments.").
> 
> Committed as r258044.

With the above change I can revert the
eltscnt > INT_MAX ? (eltscnt % 1000000) + 1000000 : eltscnt
stuff from inform_n call, as it can now accept all UHWI values.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed as obvious.

2018-02-28  Jakub Jelinek  <jakub@redhat.com>

	* decl.c (cp_finish_decomp): Don't adjust eltscnt when calling
	inform_n.

--- gcc/cp/decl.c.jj	2018-02-26 20:49:54.938331630 +0100
+++ gcc/cp/decl.c	2018-02-27 23:20:24.315913318 +0100
@@ -7493,10 +7493,7 @@ cp_finish_decomp (tree decl, tree first,
 	    error_n (loc, count,
 		     "only %u name provided for structured binding",
 		     "only %u names provided for structured binding", count);
-	  /* Some languages have special plural rules even for large values,
-	     but it is periodic with period of 10, 100, 1000 etc.  */
-	  inform_n (loc, eltscnt > INT_MAX
-			 ? (eltscnt % 1000000) + 1000000 : eltscnt,
+	  inform_n (loc, eltscnt,
 		    "while %qT decomposes into %wu element",
 		    "while %qT decomposes into %wu elements",
 		    type, eltscnt);

	Jakub

Patch

PR translation/84207 - Hard coded plural in gimple-fold.c

gcc/ChangeLog:

	PR translation/84207
	* diagnostic-core.h (warning_n, error_n, inform_n): Change
	n argument to unsigned HOST_WIDE_INT.
	* diagnostic.c (warning_n, error_n, inform_n): Ditto.
	(diagnostic_n_impl): Ditto.  Handle arguments in excess of LONG_MAX.
	* gimple-fold.c (gimple_fold_builtin_strncpy): Use warning_n.

Index: gcc/diagnostic-core.h
===================================================================
--- gcc/diagnostic-core.h	(revision 257607)
+++ gcc/diagnostic-core.h	(working copy)
@@ -59,10 +59,11 @@  extern void internal_error_no_backtrace (const cha
      ATTRIBUTE_GCC_DIAG(1,2) ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the first parameter.  */
 extern bool warning (int, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern bool warning_n (location_t, int, int, const char *, const char *, ...)
+extern bool warning_n (location_t, int, unsigned HOST_WIDE_INT,
+		       const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
-extern bool warning_n (rich_location *, int, int, const char *,
-		       const char *, ...)
+extern bool warning_n (rich_location *, int, unsigned HOST_WIDE_INT,
+		       const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4, 6) ATTRIBUTE_GCC_DIAG(5, 6);
 extern bool warning_at (location_t, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
@@ -69,7 +70,8 @@  extern bool warning_at (location_t, int, const cha
 extern bool warning_at (rich_location *, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
-extern void error_n (location_t, int, const char *, const char *, ...)
+extern void error_n (location_t, unsigned HOST_WIDE_INT, const char *,
+		     const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void error_at (rich_location *, const char *, ...)
@@ -87,7 +89,8 @@  extern bool permerror (rich_location *, const char
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform (rich_location *, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
-extern void inform_n (location_t, int, const char *, const char *, ...)
+extern void inform_n (location_t, unsigned HOST_WIDE_INT, const char *,
+		      const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern bool emit_diagnostic (diagnostic_t, location_t, int,
Index: gcc/diagnostic.c
===================================================================
--- gcc/diagnostic.c	(revision 257607)
+++ gcc/diagnostic.c	(working copy)
@@ -51,8 +51,8 @@  along with GCC; see the file COPYING3.  If not see
 /* Prototypes.  */
 static bool diagnostic_impl (rich_location *, int, const char *,
 			     va_list *, diagnostic_t) ATTRIBUTE_GCC_DIAG(3,0);
-static bool diagnostic_n_impl (rich_location *, int, int, const char *,
-			       const char *, va_list *,
+static bool diagnostic_n_impl (rich_location *, int, unsigned HOST_WIDE_INT,
+			       const char *, const char *, va_list *,
 			       diagnostic_t) ATTRIBUTE_GCC_DIAG(5,0);
 
 static void error_recursion (diagnostic_context *) ATTRIBUTE_NORETURN;
@@ -1111,15 +1111,22 @@  diagnostic_impl (rich_location *richloc, int opt,
 /* Implement inform_n, warning_n, and error_n, as documented and
    defined below.  */
 static bool
-diagnostic_n_impl (rich_location *richloc, int opt, int n,
+diagnostic_n_impl (rich_location *richloc, int opt, unsigned HOST_WIDE_INT n,
 		   const char *singular_gmsgid,
 		   const char *plural_gmsgid,
 		   va_list *ap, diagnostic_t kind)
 {
   diagnostic_info diagnostic;
-  diagnostic_set_info_translated (&diagnostic,
-                                  ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  ap, richloc, kind);
+  unsigned long gtn;
+
+  if (sizeof n <= sizeof gtn)
+    gtn = n;
+  else
+    /* Use the largest number ngettext() can handle.  */
+    gtn = n <= ULONG_MAX ? n : ULONG_MAX;
+
+  const char *text = ngettext (singular_gmsgid, plural_gmsgid, gtn);
+  diagnostic_set_info_translated (&diagnostic, text, ap, richloc, kind);
   if (kind == DK_WARNING)
     diagnostic.option_index = opt;
   return diagnostic_report_diagnostic (global_dc, &diagnostic);
@@ -1176,8 +1183,8 @@  inform (rich_location *richloc, const char *gmsgid
 /* An informative note at LOCATION.  Use this for additional details on an
    error message.  */
 void
-inform_n (location_t location, int n, const char *singular_gmsgid,
-          const char *plural_gmsgid, ...)
+inform_n (location_t location, unsigned HOST_WIDE_INT n,
+	  const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
   va_list ap;
   va_start (ap, plural_gmsgid);
@@ -1233,7 +1240,7 @@  warning_at (rich_location *richloc, int opt, const
 /* Same as warning_n plural variant below, but using RICHLOC.  */
 
 bool
-warning_n (rich_location *richloc, int opt, int n,
+warning_n (rich_location *richloc, int opt, unsigned HOST_WIDE_INT n,
 	   const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
   gcc_assert (richloc);
@@ -1252,8 +1259,8 @@  bool
    Returns true if the warning was printed, false if it was inhibited.  */
 
 bool
-warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
-	   const char *plural_gmsgid, ...)
+warning_n (location_t location, int opt, unsigned HOST_WIDE_INT n,
+	   const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
   va_list ap;
   va_start (ap, plural_gmsgid);
@@ -1350,8 +1357,8 @@  error (const char *gmsgid, ...)
 /* A hard error: the code is definitely ill-formed, and an object file
    will not be produced.  */
 void
-error_n (location_t location, int n, const char *singular_gmsgid,
-         const char *plural_gmsgid, ...)
+error_n (location_t location, unsigned HOST_WIDE_INT n,
+	 const char *singular_gmsgid, const char *plural_gmsgid, ...)
 {
   va_list ap;
   va_start (ap, plural_gmsgid);
Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c	(revision 257607)
+++ gcc/gimple-fold.c	(working copy)
@@ -1694,13 +1694,13 @@  gimple_fold_builtin_strncpy (gimple_stmt_iterator
 	  tree fndecl = gimple_call_fndecl (stmt);
 	  gcall *call = as_a <gcall *> (stmt);
 
-	  warning_at (loc, OPT_Wstringop_truncation,
-		      (tree_int_cst_equal (size_one_node, len)
-		       ? G_("%G%qD output truncated copying %E byte "
-			    "from a string of length %E")
-		       : G_("%G%qD output truncated copying %E bytes "
-			    "from a string of length %E")),
-		      call, fndecl, len, slen);
+	  warning_n (loc, OPT_Wstringop_truncation,
+		     tree_to_uhwi (len),
+		     "%G%qD output truncated copying %E byte "
+		     "from a string of length %E",
+		     "%G%qD output truncated copying %E bytes "
+		     "from a string of length %E",
+		     call, fndecl, len, slen);
 	}
       else if (tree_int_cst_equal (len, slen))
 	{
@@ -1707,15 +1707,15 @@  gimple_fold_builtin_strncpy (gimple_stmt_iterator
 	  tree fndecl = gimple_call_fndecl (stmt);
 	  gcall *call = as_a <gcall *> (stmt);
 
-	  warning_at (loc, OPT_Wstringop_truncation,
-		      (tree_int_cst_equal (size_one_node, len)
-		       ? G_("%G%qD output truncated before terminating nul "
-			    "copying %E byte from a string of the same "
-			    "length")
-		       : G_("%G%qD output truncated before terminating nul "
-			    "copying %E bytes from a string of the same "
-			    "length")),
-		      call, fndecl, len);
+	  warning_n (loc, OPT_Wstringop_truncation,
+		     tree_to_uhwi (len),
+		     "%G%qD output truncated before terminating nul "
+		     "copying %E byte from a string of the same "
+		     "length",
+		     "%G%qD output truncated before terminating nul "
+		     "copying %E bytes from a string of the same "
+		     "length",
+		     call, fndecl, len);
 	}
     }