diff mbox

- improve sprintf buffer overflow detection (middle-end/49905)

Message ID yddoa3gb93q.fsf@CeBiTec.Uni-Bielefeld.DE
State New
Headers show

Commit Message

Rainer Orth Sept. 22, 2016, 9:02 a.m. UTC
Hi Martin,

>> Another nit, if I may: FWIW I'm not in love with the wording of the
>> messages.  Sorry to bikeshed, but how about:
>>   warning: buffer overflow will occur when writing terminating NUL
>> and:
>>   note: formatted output of 2 bytes into a destination of size 1
>> or somesuch.
>
> I won't claim the text of the messages is perfect but I did spend
> a lot of time tweaking them.  That's not to say they can't be
> improved but changing them means quite a bit of work adjusting
> the tests.  At this point, I'd like to focus on getting the patch
> committed.  After that, if there's still time, I'm happy to take
> feedback and tweak the diagnostics based on it.
>
> Thanks again for your help and the suggestions above!

your patch broke bootstrap with MPFR 2.4.2, which is still the
recommended (or perhaps minimal) version according to install.texi:

/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c: In function 'int {anonymous}::format_floating_max(tree, char)':
/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:1128:27: error: 'MPFR_RNDN' was not declared in this scope
   mpfr_from_real (x, &rv, MPFR_RNDN);
                           ^
/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c: In function '{anonymous}::fmtresult {anonymous}::format_floating(const {anonymous}::conversion_spec&, tree)':
/vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:1328:37: error: 'MPFR_RNDN' was not declared in this scope
       mpfr_from_real (mpfrval, rvp, MPFR_RNDN);
                                     ^

MPFR_RNDN was only introduced in mpfr 3.0.0, and everywhere else in gcc
GMP_RNDN is used instead.  mpfr 3.0.0 <mpfr.h> has 

/* kept for compatibility with MPFR 2.4.x and before */
#define GMP_RNDN MPFR_RNDN

The following patch (together with your other one to fix ILP32 targets)
allows a sparc-sun-solaris2.12 bootstrap to continue.  I'm going to
commit it as obvious.

	Rainer


2016-09-22  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc:
	* gimple-ssa-sprintf.c (format_floating_max): Use GMP_RNDN instead
	of MPFR_RNDN.
	(format_floating): Likewise.

Comments

Rainer Orth Sept. 22, 2016, 12:14 p.m. UTC | #1
Hi Martin,

> your patch broke bootstrap with MPFR 2.4.2, which is still the
> recommended (or perhaps minimal) version according to install.texi:
[...]
> The following patch (together with your other one to fix ILP32 targets)
> allows a sparc-sun-solaris2.12 bootstrap to continue.  I'm going to
> commit it as obvious.

done now.  Once the bootstrap had finished, I see quite a number of
testsuite failures (i386-pc-solaris2.12 still running), both 32 and
64-bit:

+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1220)
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1270)
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1381)
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 356)
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 99)
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:209:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:210:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:211:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:211:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:212:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:213:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:213:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1238:3: warning: format '%lc' expects argument of type 'wint_t', but argument 4 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1239:3: warning: format '%lc' expects argument of type 'wint_t', but argument 4 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1240:3: warning: format '%lc' expects argument of type 'wint_t', but argument 4 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1287:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1288:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1289:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:99:3: warning: '%p' directive writing 1 byte into a region of size 0 [-Wformat-length=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1406:3: warning: specified size 4294967295 exceeds the size 2 of the destination object [-Wformat-length=]

+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-2.c  (test for warnings, line 50)
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-2.c sprintf transformed into strcpy 
(test for warnings, line 83)
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-2.c sprintf transformed into strcpy 
(test for warnings, line 84)

+FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-4.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:10:21: warning: writing a terminating nul past the end of the destination [-Wformat-length=]/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:22:22: warning: '%-s' directive writing 4 bytes into a region of size 1 [-Wformat-length=]

+FAIL: gcc.dg/tree-ssa/builtin-sprintf.c execution test

FAIL: test_a_double:364: "%a" expected result for "0x0.0000000000000p+0" doesn't match function call return value: 20 != 6
FAIL: test_a_double:365: "%a" expected result for "0x1.0000000000000p+0" doesn't match function call return value: 20 != 6
FAIL: test_a_double:366: "%a" expected result for "0x1.0000000000000p+1" doesn't match function call return value: 20 != 6

FAIL: test_a_long_double:375: "%La" expected result for "0x0.0000000000000000000000000000p+0" doesn't match function call return value: 35 != 6
FAIL: test_a_long_double:376: "%La" expected result for "0x1.0000000000000000000000000000p+0" doesn't match function call return value: 35 != 6
FAIL: test_a_long_double:377: "%La" expected result for "0x1.0000000000000000000000000000p+1" doesn't match function call return value: 35 != 6

	Rainer
Martin Sebor Sept. 22, 2016, 2:55 p.m. UTC | #2
> your patch broke bootstrap with MPFR 2.4.2, which is still the
> recommended (or perhaps minimal) version according to install.texi:
>
> /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c: In function 'int {anonymous}::format_floating_max(tree, char)':
> /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:1128:27: error: 'MPFR_RNDN' was not declared in this scope
>    mpfr_from_real (x, &rv, MPFR_RNDN);
>                            ^
> /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c: In function '{anonymous}::fmtresult {anonymous}::format_floating(const {anonymous}::conversion_spec&, tree)':
> /vol/gcc/src/hg/trunk/local/gcc/gimple-ssa-sprintf.c:1328:37: error: 'MPFR_RNDN' was not declared in this scope
>        mpfr_from_real (mpfrval, rvp, MPFR_RNDN);
>                                      ^
>
> MPFR_RNDN was only introduced in mpfr 3.0.0, and everywhere else in gcc
> GMP_RNDN is used instead.  mpfr 3.0.0 <mpfr.h> has
>
> /* kept for compatibility with MPFR 2.4.x and before */
> #define GMP_RNDN MPFR_RNDN
>
> The following patch (together with your other one to fix ILP32 targets)
> allows a sparc-sun-solaris2.12 bootstrap to continue.  I'm going to
> commit it as obvious.

Thanks.  I didn't realize the older MPFR was still recommended.
FWIW, I was using MPFR 2.4 until recently but after running into
some (unexplained) runtime issues I reran the download_prerequisites
script to update them and it installed 3.1.4.

It seems that either the recommended MPFR version should be bumped
up (or the script downgraded to 2.4 to avoid this risk, though I
suspect there are reasons why it was updated to the latest).

Based on the log the script was updated from MPFR 2.4 to 3.0 in
r235763.  The change also updated install.texi but not the MPFR
version mentioned there.  It might be worth checking with the
author, Bernd Edlinger, to see how to resolve this.  Let me ping
him in a separate thread.

Martin
Martin Sebor Sept. 22, 2016, 3:04 p.m. UTC | #3
On 09/22/2016 06:14 AM, Rainer Orth wrote:
> Hi Martin,
>
>> your patch broke bootstrap with MPFR 2.4.2, which is still the
>> recommended (or perhaps minimal) version according to install.texi:
> [...]
>> The following patch (together with your other one to fix ILP32 targets)
>> allows a sparc-sun-solaris2.12 bootstrap to continue.  I'm going to
>> commit it as obvious.
>
> done now.  Once the bootstrap had finished, I see quite a number of
> testsuite failures (i386-pc-solaris2.12 still running), both 32 and
> 64-bit:
>
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1220)
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1270)
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1381)
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 356)
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 99)
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)

I have a patch for (hopefully) most of these failures that I will
commit along with the one for pr77676 as soon as it's approved.

> Excess errors:
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:209:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:210:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:211:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:211:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:212:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:213:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:213:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1238:3: warning: format '%lc' expects argument of type 'wint_t', but argument 4 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1239:3: warning: format '%lc' expects argument of type 'wint_t', but argument 4 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1240:3: warning: format '%lc' expects argument of type 'wint_t', but argument 4 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1287:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1288:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1289:3: warning: format '%lc' expects argument of type 'wint_t', but argument 6 has type 'int' [-Wformat=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:99:3: warning: '%p' directive writing 1 byte into a region of size 0 [-Wformat-length=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1406:3: warning: specified size 4294967295 exceeds the size 2 of the destination object [-Wformat-length=]
>
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-2.c  (test for warnings, line 50)
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-2.c sprintf transformed into strcpy
> (test for warnings, line 83)
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-2.c sprintf transformed into strcpy
> (test for warnings, line 84)
>
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-4.c (test for excess errors)
>
> Excess errors:
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:10:21: warning: writing a terminating nul past the end of the destination [-Wformat-length=]/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:22:22: warning: '%-s' directive writing 4 bytes into a region of size 1 [-Wformat-length=]
>
> +FAIL: gcc.dg/tree-ssa/builtin-sprintf.c execution test
>
> FAIL: test_a_double:364: "%a" expected result for "0x0.0000000000000p+0" doesn't match function call return value: 20 != 6
> FAIL: test_a_double:365: "%a" expected result for "0x1.0000000000000p+0" doesn't match function call return value: 20 != 6
> FAIL: test_a_double:366: "%a" expected result for "0x1.0000000000000p+1" doesn't match function call return value: 20 != 6
>
> FAIL: test_a_long_double:375: "%La" expected result for "0x0.0000000000000000000000000000p+0" doesn't match function call return value: 35 != 6
> FAIL: test_a_long_double:376: "%La" expected result for "0x1.0000000000000000000000000000p+0" doesn't match function call return value: 35 != 6
> FAIL: test_a_long_double:377: "%La" expected result for "0x1.0000000000000000000000000000p+1" doesn't match function call return value: 35 != 6

I don't know about these.  It looks like the Solaris printf doesn't
handle the %a directive correctly and the tests (and the related
checks/optimization) might need to be disabled, which in turn might
involve extending the existing printf hook or adding a new one.
I don't have access to Solaris to fully debug and test this there.
Would you mind helping with it?

Martin
Rainer Orth Sept. 23, 2016, 1:47 p.m. UTC | #4
Hi Martin,

> On 09/22/2016 06:14 AM, Rainer Orth wrote:
>> Hi Martin,
>>
>>> your patch broke bootstrap with MPFR 2.4.2, which is still the
>>> recommended (or perhaps minimal) version according to install.texi:
>> [...]
>>> The following patch (together with your other one to fix ILP32 targets)
>>> allows a sparc-sun-solaris2.12 bootstrap to continue.  I'm going to
>>> commit it as obvious.
>>
>> done now.  Once the bootstrap had finished, I see quite a number of
>> testsuite failures (i386-pc-solaris2.12 still running), both 32 and
>> 64-bit:
>>
>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for warnings, line
>> 1220)
>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for warnings, line
>> 1270)
>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for warnings, line
>> 1381)
>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 356)
>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 99)
>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)
>
> I have a patch for (hopefully) most of these failures that I will
> commit along with the one for pr77676 as soon as it's approved.

as of r240389, a few of those failures occur also on x86_64-pc-linux-gnu
with -m32:

FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1222)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1272)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1383)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)

>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf.c execution test
>>
>> FAIL: test_a_double:364: "%a" expected result for "0x0.0000000000000p+0"
>> doesn't match function call return value: 20 != 6
>> FAIL: test_a_double:365: "%a" expected result for "0x1.0000000000000p+0"
>> doesn't match function call return value: 20 != 6
>> FAIL: test_a_double:366: "%a" expected result for "0x1.0000000000000p+1"
>> doesn't match function call return value: 20 != 6
>>
>> FAIL: test_a_long_double:375: "%La" expected result for
>> "0x0.0000000000000000000000000000p+0" doesn't match function call return
>> value: 35 != 6
>> FAIL: test_a_long_double:376: "%La" expected result for
>> "0x1.0000000000000000000000000000p+0" doesn't match function call return
>> value: 35 != 6
>> FAIL: test_a_long_double:377: "%La" expected result for
>> "0x1.0000000000000000000000000000p+1" doesn't match function call return
>> value: 35 != 6
>
> I don't know about these.  It looks like the Solaris printf doesn't
> handle the %a directive correctly and the tests (and the related
> checks/optimization) might need to be disabled, which in turn might
> involve extending the existing printf hook or adding a new one.

I've found the following in Solaris 10 (and up) printf(3C):

     a, A    A  double  argument  representing  a  floating-point
             number  is  converted in the style "[-]0xh.hhhhp+d",
             where the single  hexadecimal  digit  preceding  the
             radix  point is 0 if the value converted is zero and
             1 otherwise and the  number  of  hexadecimal  digits
             after it is equal to the precision; if the precision
             is missing, the number of digits printed  after  the
             radix  point  is  13  for the conversion of a double
             value, 16 for the conversion of a long double  value
             on  x86,  and 28 for the conversion of a long double
             value on SPARC; if the precision is zero and the '#'
             flag  is  not  specified, no decimal-point character
             will appear. The letters "abcdef"  are  used  for  a
             conversion  and  the  letters "ABCDEF" for A conver-
             sion. The A conversion specifier produces  a  number
             with  'X'  and  'P'  instead  of  'x'  and  'p'. The
             exponent will always contain at least one digit, and
             only  as  many more digits as necessary to represent
             the decimal exponent of 2. If the value is zero, the
             exponent is zero.

             The converted value is rounded to fit the  specified
             output  format  according to the prevailing floating
             point rounding direction mode. If the conversion  is
             not exact, an inexact exception is raised.

             A double argument representing an infinity or NaN is
             converted in the SUSv3 style of an e or E conversion
             specifier.

I tried to check the relevant sections of the latest C99 and C11 drafts
to check if this handling of missing precision is allowed by the
standard, but I couldn't even fully parse the language there.

> I don't have access to Solaris to fully debug and test this there.
> Would you mind helping with it?

Not at all: if it turns out that Solaris has bugs in this area, I can
easily file them, too.

Thanks.
        Rainer
Andreas Schwab Sept. 24, 2016, 4:39 p.m. UTC | #5
I'm still seeing these failures on m68k:

FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 358)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1222)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1272)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1383)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)
Excess errors:
/daten/aranym/gcc/gcc-20160924/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1408:3: warning: specified size 4294967295 exceeds the size 2 of the destination object [-Wformat-length=]

Andreas.
Martin Sebor Sept. 28, 2016, 6:50 p.m. UTC | #6
On 09/24/2016 10:39 AM, Andreas Schwab wrote:
> I'm still seeing these failures on m68k:
>
> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 358)
> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1222)
> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1272)
> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c  (test for warnings, line 1383)
> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)
> Excess errors:
> /daten/aranym/gcc/gcc-20160924/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:1408:3: warning: specified size 4294967295 exceeds the size 2 of the destination object [-Wformat-length=]

I'm pretty sure the first two are due to the same problem as bug 77735.
I plan to look into fixing it today or later this week.  The last two
were caused by bug v77762 and should be fixed by now.

Martin
Martin Sebor Oct. 3, 2016, 6:05 p.m. UTC | #7
>>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf.c execution test
>>>
>>> FAIL: test_a_double:364: "%a" expected result for "0x0.0000000000000p+0"
>>> doesn't match function call return value: 20 != 6
>>> FAIL: test_a_double:365: "%a" expected result for "0x1.0000000000000p+0"
>>> doesn't match function call return value: 20 != 6
>>> FAIL: test_a_double:366: "%a" expected result for "0x1.0000000000000p+1"
>>> doesn't match function call return value: 20 != 6
>>>
>>> FAIL: test_a_long_double:375: "%La" expected result for
>>> "0x0.0000000000000000000000000000p+0" doesn't match function call return
>>> value: 35 != 6
>>> FAIL: test_a_long_double:376: "%La" expected result for
>>> "0x1.0000000000000000000000000000p+0" doesn't match function call return
>>> value: 35 != 6
>>> FAIL: test_a_long_double:377: "%La" expected result for
>>> "0x1.0000000000000000000000000000p+1" doesn't match function call return
>>> value: 35 != 6
>>
>> I don't know about these.  It looks like the Solaris printf doesn't
>> handle the %a directive correctly and the tests (and the related
>> checks/optimization) might need to be disabled, which in turn might
>> involve extending the existing printf hook or adding a new one.
>
> I've found the following in Solaris 10 (and up) printf(3C):
>
>      a, A    A  double  argument  representing  a  floating-point
>              number  is  converted in the style "[-]0xh.hhhhp+d",
>              where the single  hexadecimal  digit  preceding  the
>              radix  point is 0 if the value converted is zero and
>              1 otherwise and the  number  of  hexadecimal  digits
>              after it is equal to the precision; if the precision
>              is missing, the number of digits printed  after  the
>              radix  point  is  13  for the conversion of a double
>              value, 16 for the conversion of a long double  value
>              on  x86,  and 28 for the conversion of a long double
>              value on SPARC; if the precision is zero and the '#'
>              flag  is  not  specified, no decimal-point character
>              will appear. The letters "abcdef"  are  used  for  a
>              conversion  and  the  letters "ABCDEF" for A conver-
>              sion. The A conversion specifier produces  a  number
>              with  'X'  and  'P'  instead  of  'x'  and  'p'. The
>              exponent will always contain at least one digit, and
>              only  as  many more digits as necessary to represent
>              the decimal exponent of 2. If the value is zero, the
>              exponent is zero.
>
>              The converted value is rounded to fit the  specified
>              output  format  according to the prevailing floating
>              point rounding direction mode. If the conversion  is
>              not exact, an inexact exception is raised.
>
>              A double argument representing an infinity or NaN is
>              converted in the SUSv3 style of an e or E conversion
>              specifier.
>
> I tried to check the relevant sections of the latest C99 and C11 drafts
> to check if this handling of missing precision is allowed by the
> standard, but I couldn't even fully parse the language there.
>
>> I don't have access to Solaris to fully debug and test this there.
>> Would you mind helping with it?
>
> Not at all: if it turns out that Solaris has bugs in this area, I can
> easily file them, too.

I think it's actually a defect in the C standard.  It doesn't
specify how many decimal digits an implementation must produce
on output for a plain %a directive (i.e., when precision isn't
explicitly specified).  With Glibc, for instance, printf("%a",
1.0) prints 0x8p-3 while on Solaris it prints  0x8.000000p-3.
Both seem reasonable but neither is actually specified.  In
theory, an implementation is allowed print any number of zeros
after the decimal point, which the standard should (IMO) not
permit.  There should be a cap (e.g., of at most 6 decimal
digits when precision is not specified with %a, just like
there us with %e).  I'll propose to change the standard and
forward it to the C committee.  Until then, I've worked
around it in the patch for pr77735 (under review).  If you
have a moment and could try it out on Solaris and let me
know how it goes I'd be grateful.

Thanks
Martin
Rainer Orth Oct. 4, 2016, 9:28 a.m. UTC | #8
Hi Martin,

>>>> +FAIL: gcc.dg/tree-ssa/builtin-sprintf.c execution test
>>>>
>>>> FAIL: test_a_double:364: "%a" expected result for "0x0.0000000000000p+0"
>>>> doesn't match function call return value: 20 != 6
>>>> FAIL: test_a_double:365: "%a" expected result for "0x1.0000000000000p+0"
>>>> doesn't match function call return value: 20 != 6
>>>> FAIL: test_a_double:366: "%a" expected result for "0x1.0000000000000p+1"
>>>> doesn't match function call return value: 20 != 6
>>>>
>>>> FAIL: test_a_long_double:375: "%La" expected result for
>>>> "0x0.0000000000000000000000000000p+0" doesn't match function call return
>>>> value: 35 != 6
>>>> FAIL: test_a_long_double:376: "%La" expected result for
>>>> "0x1.0000000000000000000000000000p+0" doesn't match function call return
>>>> value: 35 != 6
>>>> FAIL: test_a_long_double:377: "%La" expected result for
>>>> "0x1.0000000000000000000000000000p+1" doesn't match function call return
>>>> value: 35 != 6
>>>
>>> I don't know about these.  It looks like the Solaris printf doesn't
>>> handle the %a directive correctly and the tests (and the related
>>> checks/optimization) might need to be disabled, which in turn might
>>> involve extending the existing printf hook or adding a new one.
>>
>> I've found the following in Solaris 10 (and up) printf(3C):
>>
>>      a, A    A  double  argument  representing  a  floating-point
>>              number  is  converted in the style "[-]0xh.hhhhp+d",
>>              where the single  hexadecimal  digit  preceding  the
>>              radix  point is 0 if the value converted is zero and
>>              1 otherwise and the  number  of  hexadecimal  digits
>>              after it is equal to the precision; if the precision
>>              is missing, the number of digits printed  after  the
>>              radix  point  is  13  for the conversion of a double
>>              value, 16 for the conversion of a long double  value
>>              on  x86,  and 28 for the conversion of a long double
>>              value on SPARC; if the precision is zero and the '#'
>>              flag  is  not  specified, no decimal-point character
>>              will appear. The letters "abcdef"  are  used  for  a
>>              conversion  and  the  letters "ABCDEF" for A conver-
>>              sion. The A conversion specifier produces  a  number
>>              with  'X'  and  'P'  instead  of  'x'  and  'p'. The
>>              exponent will always contain at least one digit, and
>>              only  as  many more digits as necessary to represent
>>              the decimal exponent of 2. If the value is zero, the
>>              exponent is zero.
>>
>>              The converted value is rounded to fit the  specified
>>              output  format  according to the prevailing floating
>>              point rounding direction mode. If the conversion  is
>>              not exact, an inexact exception is raised.
>>
>>              A double argument representing an infinity or NaN is
>>              converted in the SUSv3 style of an e or E conversion
>>              specifier.
>>
>> I tried to check the relevant sections of the latest C99 and C11 drafts
>> to check if this handling of missing precision is allowed by the
>> standard, but I couldn't even fully parse the language there.
>>
>>> I don't have access to Solaris to fully debug and test this there.
>>> Would you mind helping with it?
>>
>> Not at all: if it turns out that Solaris has bugs in this area, I can
>> easily file them, too.
>
> I think it's actually a defect in the C standard.  It doesn't
> specify how many decimal digits an implementation must produce
> on output for a plain %a directive (i.e., when precision isn't
> explicitly specified).  With Glibc, for instance, printf("%a",
> 1.0) prints 0x8p-3 while on Solaris it prints  0x8.000000p-3.
> Both seem reasonable but neither is actually specified.  In
> theory, an implementation is allowed print any number of zeros
> after the decimal point, which the standard should (IMO) not
> permit.  There should be a cap (e.g., of at most 6 decimal
> digits when precision is not specified with %a, just like
> there us with %e).  I'll propose to change the standard and
> forward it to the C committee.  Until then, I've worked
> around it in the patch for pr77735 (under review).  If you
> have a moment and could try it out on Solaris and let me
> know how it goes I'd be grateful.

as it happens, I'd already started bootstraps with your patch before
your mail arrived :-)

We're left with

FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-4.c (test for excess errors)

for 32 bit and

FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-4.c (test for excess errors)

for 64 bit on both i386-pc-solaris2.12 and sparc-sun-solaris2.12.

In the 32-bit builtin-sprintf-warn-1.c case, there are many instances of

Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:224:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]

while the second is

Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:15:23: warning: writing a terminating nul past the end of the destination [-Wformat-length=]/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:30:21: warning: writing format character '4' at offset 3 past the end of the destination [-Wformat-length=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:46:21: warning: writing format character '4' at offset 3 past the end of the destination [-Wformat-length=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:61:25: warning: writing a terminating nul past the end of the destination [-Wformat-length=]
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:74:22: warning: '%-s' directive writing 4 bytes into a region of size 1 [-Wformat-length=]

I've no idea yet why in the first error message two different messages
are joined into one line.  Probably something with DejaGnu mangling the
output...

	Rainer
Martin Sebor Oct. 4, 2016, 11:30 p.m. UTC | #9
> as it happens, I'd already started bootstraps with your patch before
> your mail arrived :-)

Thanks for your help getting to the bottom of this!

>
> We're left with
>
> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test for excess errors)
> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-4.c (test for excess errors)
>
> for 32 bit and
>
> FAIL: gcc.dg/tree-ssa/builtin-sprintf-warn-4.c (test for excess errors)
>
> for 64 bit on both i386-pc-solaris2.12 and sparc-sun-solaris2.12.
>
> In the 32-bit builtin-sprintf-warn-1.c case, there are many instances of
>
> Excess errors:
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-1.c:224:3: warning: format '%lc' expects argument of type 'wint_t', but argument 5 has type 'int' [-Wformat=]

I've built the sparc-sun-solaris2.12 toolchain and reproduced these
warnings.  They are vestiges of those I saw and some of which I fixed
before.  The problem is that %lc expects a wint_t argument which on
this target is an alias for long in but the argument of 0 has type
int.  The warning is coming out of the -Wformat checker which doesn't
seem to care that int and long have the same size.  I've committed
r240758 that should fix the remaining warnings of this kind but long
term I think GCC should change to avoid warning in this case (Clang
doesn't).

>
> while the second is
>
> Excess errors:
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:15:23: warning: writing a terminating nul past the end of the destination [-Wformat-length=]/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:30:21: warning: writing format character '4' at offset 3 past the end of the destination [-Wformat-length=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:46:21: warning: writing format character '4' at offset 3 past the end of the destination [-Wformat-length=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:61:25: warning: writing a terminating nul past the end of the destination [-Wformat-length=]
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-4.c:74:22: warning: '%-s' directive writing 4 bytes into a region of size 1 [-Wformat-length=]
>
> I've no idea yet why in the first error message two different messages
> are joined into one line.  Probably something with DejaGnu mangling the
> output...

I've reproduced this as well and it took me a while to see the
problem.  It turns out that the target specifier I used in the
test (*-*-*-*) happened to match my native target
x86_64-pc-linux-gnu but not sparc-sun-solaris2.12.  Let me fix
that in the next patch.  Hopefully with that all the remaining
failures should clear up.

Thanks again for your help and patience!

Martin
Joseph Myers Oct. 5, 2016, 12:21 a.m. UTC | #10
On Tue, 4 Oct 2016, Martin Sebor wrote:

> I've built the sparc-sun-solaris2.12 toolchain and reproduced these
> warnings.  They are vestiges of those I saw and some of which I fixed
> before.  The problem is that %lc expects a wint_t argument which on
> this target is an alias for long in but the argument of 0 has type
> int.  The warning is coming out of the -Wformat checker which doesn't
> seem to care that int and long have the same size.  I've committed
> r240758 that should fix the remaining warnings of this kind but long
> term I think GCC should change to avoid warning in this case (Clang
> doesn't).

Well, typically cases where one of long and int is passed and the other is 
expected, but they have the same size, are bugs waiting to happen when the 
code is built on a 64-bit system.  That is, they *should* warn.

There have been arguments that we should go further and warn for e.g. %zu 
with a type that happens to be the same as size_t but doesn't use the 
size_t typedef (or sizeof etc.), %td for something that happens to be the 
same as ptrdiff_t but doesn't use the typedef (or pointer difference 
etc.), etc. - which would get many similar cases of bugs waiting to happen 
on a different system, but is also tricker because you need to decide 
whether a given type is logically size_t etc. or not - code could validly 
use autoconf to identify the underlying type, or use __SIZE_TYPE__, or use 
an expression mixing size_t with other types, or in the case of %j* 
(intmax_t) use the INTMAX_C macro to construct constants.  That probably 
*would* need an option to disable just those format warnings (whereas I 
don't see the need for such an option for the case of mixing int and 
long).
Martin Sebor Oct. 5, 2016, 12:31 a.m. UTC | #11
On 10/04/2016 06:21 PM, Joseph Myers wrote:
> On Tue, 4 Oct 2016, Martin Sebor wrote:
>
>> I've built the sparc-sun-solaris2.12 toolchain and reproduced these
>> warnings.  They are vestiges of those I saw and some of which I fixed
>> before.  The problem is that %lc expects a wint_t argument which on
>> this target is an alias for long in but the argument of 0 has type
>> int.  The warning is coming out of the -Wformat checker which doesn't
>> seem to care that int and long have the same size.  I've committed
>> r240758 that should fix the remaining warnings of this kind but long
>> term I think GCC should change to avoid warning in this case (Clang
>> doesn't).
>
> Well, typically cases where one of long and int is passed and the other is
> expected, but they have the same size, are bugs waiting to happen when the
> code is built on a 64-bit system.  That is, they *should* warn.

Typically, yes.  In the case of wchar_t (or int) and wint_t I don't
think it's helpful.  Consider this case from my comment #5 on bug
72858.  I don't think there is any point in issuing a warning here.
On the majority of targets they either all are or promote to a type
of the same size, don't they?

$ cat t.c && gcc -S -Wall -m32 t.c
typedef __WCHAR_TYPE__ wchar_t;

void f (const wchar_t *ws)
{
   __builtin_printf ("%lc", *ws);
}
t.c: In function ‘f’:
t.c:5:24: warning: format ‘%lc’ expects argument of type ‘wint_t’, but 
argument 2 has type ‘wchar_t {aka const long int}’ [-Wformat=]
    __builtin_printf ("%lc", *ws);
                       ~~^   ~~~
                       %ld
>
> There have been arguments that we should go further and warn for e.g. %zu
> with a type that happens to be the same as size_t but doesn't use the
> size_t typedef (or sizeof etc.), %td for something that happens to be the
> same as ptrdiff_t but doesn't use the typedef (or pointer difference
> etc.), etc. - which would get many similar cases of bugs waiting to happen
> on a different system, but is also tricker because you need to decide
> whether a given type is logically size_t etc. or not - code could validly
> use autoconf to identify the underlying type, or use __SIZE_TYPE__, or use
> an expression mixing size_t with other types, or in the case of %j*
> (intmax_t) use the INTMAX_C macro to construct constants.  That probably
> *would* need an option to disable just those format warnings (whereas I
> don't see the need for such an option for the case of mixing int and
> long).

I would view it useful if GCC warned on %zu with an argument that's
not size_t, and similarly for other directives and typedefs, for the
reason you mention.  But I don't think the same rationale applies
to warning on %lc with wchar_t or int arguments.

Martin
Joseph Myers Oct. 5, 2016, 3:40 p.m. UTC | #12
On Tue, 4 Oct 2016, Martin Sebor wrote:

> > Well, typically cases where one of long and int is passed and the other is
> > expected, but they have the same size, are bugs waiting to happen when the
> > code is built on a 64-bit system.  That is, they *should* warn.
> 
> Typically, yes.  In the case of wchar_t (or int) and wint_t I don't
> think it's helpful.  Consider this case from my comment #5 on bug
> 72858.  I don't think there is any point in issuing a warning here.
> On the majority of targets they either all are or promote to a type
> of the same size, don't they?

I'm unconvinced by that "majority of targets" of argument (which once 
would have been true for int and long, and you could say it is true for 
long and size_t).  There's a clear correct type here, and it's wint_t, and 
it's always easy to fix the code to use the correct type.
Jakub Jelinek Oct. 5, 2016, 3:44 p.m. UTC | #13
On Wed, Oct 05, 2016 at 03:40:18PM +0000, Joseph Myers wrote:
> On Tue, 4 Oct 2016, Martin Sebor wrote:
> 
> > > Well, typically cases where one of long and int is passed and the other is
> > > expected, but they have the same size, are bugs waiting to happen when the
> > > code is built on a 64-bit system.  That is, they *should* warn.
> > 
> > Typically, yes.  In the case of wchar_t (or int) and wint_t I don't
> > think it's helpful.  Consider this case from my comment #5 on bug
> > 72858.  I don't think there is any point in issuing a warning here.
> > On the majority of targets they either all are or promote to a type
> > of the same size, don't they?
> 
> I'm unconvinced by that "majority of targets" of argument (which once 
> would have been true for int and long, and you could say it is true for 
> long and size_t).  There's a clear correct type here, and it's wint_t, and 
> it's always easy to fix the code to use the correct type.

But, can we reliably detect differences between wint_t and unsigned int if
wint_t is a typedef to that, or size_t and unsigned long etc., I mean doesn't
early folding already throw it away?

	Jakub
Joseph Myers Oct. 5, 2016, 4:10 p.m. UTC | #14
On Wed, 5 Oct 2016, Jakub Jelinek wrote:

> On Wed, Oct 05, 2016 at 03:40:18PM +0000, Joseph Myers wrote:
> > On Tue, 4 Oct 2016, Martin Sebor wrote:
> > 
> > > > Well, typically cases where one of long and int is passed and the other is
> > > > expected, but they have the same size, are bugs waiting to happen when the
> > > > code is built on a 64-bit system.  That is, they *should* warn.
> > > 
> > > Typically, yes.  In the case of wchar_t (or int) and wint_t I don't
> > > think it's helpful.  Consider this case from my comment #5 on bug
> > > 72858.  I don't think there is any point in issuing a warning here.
> > > On the majority of targets they either all are or promote to a type
> > > of the same size, don't they?
> > 
> > I'm unconvinced by that "majority of targets" of argument (which once 
> > would have been true for int and long, and you could say it is true for 
> > long and size_t).  There's a clear correct type here, and it's wint_t, and 
> > it's always easy to fix the code to use the correct type.
> 
> But, can we reliably detect differences between wint_t and unsigned int if
> wint_t is a typedef to that, or size_t and unsigned long etc., I mean doesn't
> early folding already throw it away?

Detecting it in all cases is hard as for size_t etc., but that shouldn't 
stop us warning in the cases where the types are different enough that we 
can tell the program is wrong.
Martin Sebor Oct. 5, 2016, 4:20 p.m. UTC | #15
On 10/05/2016 09:40 AM, Joseph Myers wrote:
> On Tue, 4 Oct 2016, Martin Sebor wrote:
>
>>> Well, typically cases where one of long and int is passed and the other is
>>> expected, but they have the same size, are bugs waiting to happen when the
>>> code is built on a 64-bit system.  That is, they *should* warn.
>>
>> Typically, yes.  In the case of wchar_t (or int) and wint_t I don't
>> think it's helpful.  Consider this case from my comment #5 on bug
>> 72858.  I don't think there is any point in issuing a warning here.
>> On the majority of targets they either all are or promote to a type
>> of the same size, don't they?
>
> I'm unconvinced by that "majority of targets" of argument (which once
> would have been true for int and long, and you could say it is true for
> long and size_t).  There's a clear correct type here, and it's wint_t, and
> it's always easy to fix the code to use the correct type.

The warning would only helpful if there was a target where wchar_t
didn't promote to a type with the same precision as wint_t, or in
other words where printf("%lc", L'a') did something different than
printf("%lc", (wint_t) L'a'). I don't know of such a target.  Can
you give an example of one?

It isn't any of the ILP32 targets where wint_t is int and wchar_t
is long and where the warning was found to cause test suite failures,
and it isn't any other other targets where the warning didn't trigger
because the two types have the same precision.

Otherwise, when there is no such target there is nothing to fix, and
changing working code just for the sake of pedantry only runs the
unnecessary risk of introducing bugs.  IMO, this is especially true
in subtle cases like this where the people who make the changes often
don't fully appreciate the subtleties of the code.  Suggesting the
wrong directive (replacing %lc with %ld) compounds the problem
further(*).

Martin

[*] This is not meant as a dig at David but rather to underscore
that both the "problem" is subtle and the expected solution not
obvious.
Joseph Myers Oct. 5, 2016, 4:27 p.m. UTC | #16
On Wed, 5 Oct 2016, Martin Sebor wrote:

> The warning would only helpful if there was a target where wchar_t
> didn't promote to a type with the same precision as wint_t, or in

Or it could show up a logical error where the code should have had a 
wint_t variable and checked for WEOF somewhere but just assumed things 
would fit in wchar_t (and so could be passing a negative value where the 
ABI means the callee expects wint_t values to be zero-extended, with 
consequent undefined behavior there).  The correct fix involves thinking 
about where you know something is a wide character, and where something 
might be a wide character or WEOF.  It might then involve a cast to 
wint_t, or changing the type of a variable, or more changes than that.
Martin Sebor Oct. 5, 2016, 5:11 p.m. UTC | #17
On 10/05/2016 10:27 AM, Joseph Myers wrote:
> On Wed, 5 Oct 2016, Martin Sebor wrote:
>
>> The warning would only helpful if there was a target where wchar_t
>> didn't promote to a type with the same precision as wint_t, or in
>
> Or it could show up a logical error where the code should have had a
> wint_t variable and checked for WEOF somewhere but just assumed things
> would fit in wchar_t

WEOF that fits in a wint_t also fits in a wchar_t when the two types
have the same precision, the case where GCC issues the warning (and
where WEOF expands to 0xffffffffu).  So this isn't an example of
a problem that could arise on the targets that brought this up, or
any other targets that I'm aware of.

> (and so could be passing a negative value where the
> ABI means the callee expects wint_t values to be zero-extended, with
> consequent undefined behavior there).

Sign extension is also not an issue on the target where the warning
is issued (x86_64 ILP32) and where wchar_t is int and wint_t is
unsigned long.

> The correct fix involves thinking
> about where you know something is a wide character, and where something
> might be a wide character or WEOF.  It might then involve a cast to
> wint_t, or changing the type of a variable, or more changes than that.

There is nothing to fix when wchar_t and wint_t have the same
precision and  printf("%lc", L'x') does the same thing as
printf("%lc", (wint_t) L'x').  The cast is entirely superfluous.

Even so, the warning could have merit if there were a different
target to which the program might need to be ported where this
property didn't hold.  But as I said, I don't know of one and,
as it seems, neither do you.

I'm very much in favor of strict warnings that help point out
either real or potential bugs.  But I see no value in warning
about constructs that are perfectly safe.  To the contrary,
I think such warnings are harmful because they lead to
unnecessary churn and to bug injection.

It might be worthwhile to mention that Clang used to issue the
same warning as GCC until 2011 when they fixed it.  The bug is
here:

   https://llvm.org/bugs/show_bug.cgi?id=7981

(Although I think there is a flaw in the logic they used in
comment #2 it's only hypothetical because there is no target
that Clang supports where the submitted test case doesn't do
what it's expected to do either.)

Martin
Joseph Myers Oct. 5, 2016, 5:30 p.m. UTC | #18
On Wed, 5 Oct 2016, Martin Sebor wrote:

> > Or it could show up a logical error where the code should have had a
> > wint_t variable and checked for WEOF somewhere but just assumed things
> > would fit in wchar_t
> 
> WEOF that fits in a wint_t also fits in a wchar_t when the two types
> have the same precision, the case where GCC issues the warning (and
> where WEOF expands to 0xffffffffu).  So this isn't an example of
> a problem that could arise on the targets that brought this up, or
> any other targets that I'm aware of.

The problem is e.g. code calling getwchar and not checking for WEOF.  
0xffffffffu does not fit in a 32-bit signed variable unless your code is 
confusing signed and unsigned freely.  And even if you consider it OK to 
check after conversion to wchar_t rather than before, the lack of a wint_t 
variable for the result seems likely to accompany a missing check.

> > (and so could be passing a negative value where the
> > ABI means the callee expects wint_t values to be zero-extended, with
> > consequent undefined behavior there).
> 
> Sign extension is also not an issue on the target where the warning
> is issued (x86_64 ILP32) and where wchar_t is int and wint_t is
> unsigned long.

There are plenty of architectures where the ABI does require the high part 
to be extended in a particular way; that's why the Linux kernel has 
syscall wrappers, because improperly extended 32-bit syscall arguments 
caused security issues in the past.  In some cases it's even 
architecturally undefined what the processor does when an instruction 
input isn't properly extended (although at least in the MIPS case the 
required extension there is always sign-extension, whether the 32-bit 
value is considered signed or unsigned).

I'd actually hope for future forms of sanitization to be able to detect at 
runtime when you have undefined behavior from incorrectly typed arguments 
to variadic functions.

I think the warning is the correct thing for anyone trying to write C as a 
high-level language and be type-correct and properly check before doing 
conversions that might change values unless those changes are actually 
intended, and that warning rather than lesser warnings for "C as portable 
assembler" is the appropriate default for format checking.
Martin Sebor Oct. 5, 2016, 10:59 p.m. UTC | #19
> The problem is e.g. code calling getwchar and not checking for WEOF.
> 0xffffffffu does not fit in a 32-bit signed variable unless your code is
> confusing signed and unsigned freely.  And even if you consider it OK to
> check after conversion to wchar_t rather than before, the lack of a wint_t
> variable for the result seems likely to accompany a missing check.

Calling printf("%lc", x) with a wchar_t argument hardly suggests
anything about where the argument originated or what checking it
may have been subjected to.  Issuing a warning for a safe piece
of code only on the basis that there might be some other piece
of code in the program that does something wrong makes no sense.
Suggesting to the user to change the safe piece of code is
misleading and counterproductive.

> There are plenty of architectures where the ABI does require the high part
> to be extended in a particular way;

There is no high part when the wchar_t and wint_t types have
the same size, which they do not only in the case of interest but
also in all the other cases we know of.  Again, unless you know
of a counterexample this point isn't relevant to the discussion.

> I think the warning is the correct thing for anyone trying to write C as a
> high-level language and be type-correct and properly check before doing
> conversions that might change values unless those changes are actually
> intended, and that warning rather than lesser warnings for "C as portable
> assembler" is the appropriate default for format checking.

I certainly agree with this general philosophy.  Unfortunately,
the wchar_t and wint_t types on the target of interest (i386
or ILP32 on x86_64) are defined in an unhelpful way that the
-Wformat checker normally uses to diagnose real (or potential)
problems that can do do come up with their underlying types.
They just can't come up with wchar_t and wint_t.  Useful
warnings shouldn't pedantically enforce type rules in cases
where the rules serve no practical purpose or are loose enough
to allow for nonsensical implementations.  This is a basic
principle that's been followed by the vast majority of GCC
warnings and that users have come to appreciate and rely on.

The underlying problem here (and I claim a bug) is simply that
the -Wformat code doesn't distinguish the wint_t typedef (and
in C, the wchar_t typedef as well) from their underlying types
and treats the typedefs the same.  It should not, at least not
at the default warning level.  I wouldn't object to retaining
the warning with -Wpedantic but otherwise it's unhelpful.

Martin
Joseph Myers Oct. 5, 2016, 11:11 p.m. UTC | #20
On Wed, 5 Oct 2016, Martin Sebor wrote:

> may have been subjected to.  Issuing a warning for a safe piece
> of code only on the basis that there might be some other piece
> of code in the program that does something wrong makes no sense.
> Suggesting to the user to change the safe piece of code is
> misleading and counterproductive.

It's warning because it's *bad code*.  Whether it's safe or not is 
peripheral.  Lots of warnings are for things that are dubious even if they 
may happen to do what the user wants on the platforms they care about.  
I'd consider even cases of different signedness (e.g. passing one of int 
and unsigned int to a format expecting the other) to be dubious without a 
reason why values not representable in the other type can't reach that 
code - and I'd consider cases where the types are different not just in 
signedness to be clearly worse than the cases that we only warn for with 
-Wformat-signedness.

> > There are plenty of architectures where the ABI does require the high part
> > to be extended in a particular way;
> 
> There is no high part when the wchar_t and wint_t types have
> the same size, which they do not only in the case of interest but

I mean high parts if the ABI is passing those arguments in 64-bit 
registers / stack slots and has requirements on how 32-bit arguments are 
extended for passing as 64-bit.
Martin Sebor Oct. 5, 2016, 11:27 p.m. UTC | #21
On 10/05/2016 05:11 PM, Joseph Myers wrote:
> On Wed, 5 Oct 2016, Martin Sebor wrote:
>
>> may have been subjected to.  Issuing a warning for a safe piece
>> of code only on the basis that there might be some other piece
>> of code in the program that does something wrong makes no sense.
>> Suggesting to the user to change the safe piece of code is
>> misleading and counterproductive.
>
> It's warning because it's *bad code*.

Why?  What can go wrong and on what system?  Please name at least
one specific example.

> Whether it's safe or not is
> peripheral.  Lots of warnings are for things that are dubious even if they
> may happen to do what the user wants on the platforms they care about.
> I'd consider even cases of different signedness (e.g. passing one of int
> and unsigned int to a format expecting the other) to be dubious without a
> reason why values not representable in the other type can't reach that
> code - and I'd consider cases where the types are different not just in
> signedness to be clearly worse than the cases that we only warn for with
> -Wformat-signedness.
>
>>> There are plenty of architectures where the ABI does require the high part
>>> to be extended in a particular way;
>>
>> There is no high part when the wchar_t and wint_t types have
>> the same size, which they do not only in the case of interest but
>
> I mean high parts if the ABI is passing those arguments in 64-bit
> registers / stack slots and has requirements on how 32-bit arguments are
> extended for passing as 64-bit.

I don't understand what you mean.  We're talking about passing
an integer (wchar_t) via the ellipsis.  That's just as well
defined as passing any other integer of that size, as is (for
all practical purposes) extracting it via an integer of the
opposite signedness.  printf then stuffs the extracted wint_t
value (which is of the same size as the original wchar_t) into
a wchar_t, appends a nul, and formats the result via %ls (at
least that's how it's specified).

Please explain what can go wrong on i386 when someone calls
printf("%lc", L'a'), or on any other target GCC supports.
I will be happy to agree that the warning is justified if
you show me how this breaks and where.

Martin
Joseph Myers Oct. 5, 2016, 11:54 p.m. UTC | #22
On Wed, 5 Oct 2016, Martin Sebor wrote:

> On 10/05/2016 05:11 PM, Joseph Myers wrote:
> > On Wed, 5 Oct 2016, Martin Sebor wrote:
> > 
> > > may have been subjected to.  Issuing a warning for a safe piece
> > > of code only on the basis that there might be some other piece
> > > of code in the program that does something wrong makes no sense.
> > > Suggesting to the user to change the safe piece of code is
> > > misleading and counterproductive.
> > 
> > It's warning because it's *bad code*.
> 
> Why?  What can go wrong and on what system?  Please name at least
> one specific example.

It's bad because getting types wrong (more generally than just in this 
case) is a careless coding practice.  It's a code smell.  By requiring 
much more careful reasoning about whether it is OK in a particular 
context, given knowledge of ABIs and compiler optimizations, such code 
smells are liable to distract attention from other problems in the code, 
as well as being symptomatic of possible other problems in the code.

It should be possible to accommodate -Wall within a wide range of code 
styles, but you may will still need to adapt your code to it in some ways 
if you are doing things that are commonly dubious, even if you have 
reasons why they are safe in the cases that appear in your code and in the 
contexts in which that code will be used.

> > I mean high parts if the ABI is passing those arguments in 64-bit
> > registers / stack slots and has requirements on how 32-bit arguments are
> > extended for passing as 64-bit.
> 
> I don't understand what you mean.  We're talking about passing
> an integer (wchar_t) via the ellipsis.  That's just as well
> defined as passing any other integer of that size, as is (for
> all practical purposes) extracting it via an integer of the
> opposite signedness.  printf then stuffs the extracted wint_t
> value (which is of the same size as the original wchar_t) into
> a wchar_t, appends a nul, and formats the result via %ls (at
> least that's how it's specified).

Consider the powerpc64 ABI used for GNU/Linux (both BE and LE ABIs have 
this property).  A 32-bit argument is passed in a register (or in memory 
after a certain point), sign-extended to 64-bit if of a signed type and 
zero-extended to 64-bit if of an unsigned type.  wchar_t is int and wint_t 
is unsigned int in this case.  So if you pass a negative wchar_t argument, 
and the callee uses va_arg with type wint_t to access it, the compiler 
compiling the callee is entitled to optimize it on the basis that the 
value is already zero-extended to 64 bits, resulting in undefined behavior 
because the caller in fact sign-extended it.  Even if GCC doesn't optimize 
on that basis today, it could in future - and with a printf call, it's 
very likely that code compiled now will be used with a future libc.so 
shared library compiled with a newer compiler version, so printf may be 
built with a newer compiler than the code calling it.

The problem is that extracting via an integer of the opposite signedness 
is *not* defined unless the argument has a value representable in both 
types - both as a matter of ISO C rules on variadic functions, and as a 
practical matter of what various ABIs say about how sub-word arguments are 
extended when passed as function arguments.

(Negative wchar_t isn't actually a valid wide character, but without 
knowing where the wchar_t came from it's entirely possible it could have a 
value of type wchar_t that doesn't represent a valid character.  If you 
properly pass a wint_t value, then you avoid undefined behavior for 
invalid characters, since they are defined to produce an EILSEQ error.)

But my basic point is: if something can lead to analysis in this level of 
detail of whether the code is or is not safe in a particular context, 
while there is a trivial fix to the code that would short-circuit all that 
analysis, that by itself is enough evidence that the code deserves a 
warning and should be cleaned up to make it more obviously safe.
Martin Sebor Oct. 6, 2016, 9:22 p.m. UTC | #23
On 10/05/2016 05:54 PM, Joseph Myers wrote:
> On Wed, 5 Oct 2016, Martin Sebor wrote:
>
>> On 10/05/2016 05:11 PM, Joseph Myers wrote:
>>> On Wed, 5 Oct 2016, Martin Sebor wrote:
>>>
>>>> may have been subjected to.  Issuing a warning for a safe piece
>>>> of code only on the basis that there might be some other piece
>>>> of code in the program that does something wrong makes no sense.
>>>> Suggesting to the user to change the safe piece of code is
>>>> misleading and counterproductive.
>>>
>>> It's warning because it's *bad code*.
>>
>> Why?  What can go wrong and on what system?  Please name at least
>> one specific example.
>
> It's bad because getting types wrong (more generally than just in this
> case) is a careless coding practice.  It's a code smell.

There are lots of other "smells" that GCC doesn't warn about by
default even though it easily could, including -Wformat-signedness.

> The problem is that extracting via an integer of the opposite signedness
> is *not* defined unless the argument has a value representable in both
> types - both as a matter of ISO C rules on variadic functions, and as a
> practical matter of what various ABIs say about how sub-word arguments are
> extended when passed as function arguments.

No, sign mismatch is not the problem we're discussing.  GCC doesn't
warn for %lc when wchar_t and wint_t differ in signedness except
with -Wformat-signedness.  The cause of the warning is that wchar_t
is a distinct type from wint_t such as long from int (irrespective
of their sign), even if the two have the same size, and the checker
doesn't distinguish the typedefs from the underlying types.

My position is that when int and long have the same size, although
warning for objects of those types is appropriate because their
sizes often do differ from one target to the next, the %lc warning
is unhelpful because, unlike the underlying types, wchar_t and
wint_t are specifically designed to be interoperable and the same
propensity for a difference in size doesn't apply to them.

I did say, however, that the warning might have some merit even
in this case if there also were a target to which the code could
be ported where wchar_t and wint_t had different sizes (i.e., as
a portability warning).  But for such a portability warning to
be useful it would need to be issued by all GCC compilers, not
just when compiling for the target with that property.  (But
since neither of us knows(*) of such a target this is a moot
point.)

> But my basic point is: if something can lead to analysis in this level of
> detail of whether the code is or is not safe in a particular context,
> while there is a trivial fix to the code that would short-circuit all that
> analysis, that by itself is enough evidence that the code deserves a
> warning and should be cleaned up to make it more obviously safe.

The premise that "it's trivial to fix" is wrong and at the core
of my objection to the warning (by default; I would have no issue
with it being issued under a separate option or under -Wpedantic).

The text of the warning itself indicates that even suggesting how
to fix it isn't trivial.  GCC warns for just for the %lc directive
(not for the sign mismatch) and suggests to fix it by replacing
the %lc directive with %ld.  That's clearly wrong.  Printing
a hint with fix you're looking for (i.e., casting the argument
to wint_t) may be difficult because of the lack of location
information for local variables.

   $ cat a.c && gcc -S -Wall -m32 a.c
   typedef __WCHAR_TYPE__ wchar_t;

   void f (int x, unsigned y)
   {
     extern wchar_t w;

     __builtin_printf ("%lc", w);
     __builtin_printf ("%lc", x);
     __builtin_printf ("%lc", y);
   }
   a.c: In function ‘f’:
   a.c:7:24: warning: format ‘%lc’ expects argument of type ‘wint_t’, 
but argument 2 has type ‘wchar_t {aka long int}’ [-Wformat=]
      __builtin_printf ("%lc", w);
                         ~~^
                         %ld

But I think we're both starting to repeat ourselves without making
any headway.  I spent more time on this discussion (and the research
I did for it) than I should have an I don't view this as serious
enough of a problem to be worth spending more time on, just a minor
wart in GCC.  There are plenty others with a much better ROI.

Martin

[*] Since you've persistently dodged the question about the target
where this might be a problem I surveyed the GCC config files to
see if I could find one.  All but two of the operating systems GCC
apparently knows about define wint_t to be an alias for signed int.
The two exceptions are Ming32 and VxWorks both of which define both
wint_t and wchar_t to be unsigned short, and so those two targets
can be ruled out.

Besides these, the are a bunch of back ends that define wchar_t
to be long where, if long were wider than int, calling
printf("%lc", L'x') would be a problem.  But based on a closer
review of some of these targets it looks like they define both
int and long as 32-bit types and so there is no issue there.
I didn't look at all of them but defining wint_t to be narrower
than wchar_t would make no sense because then the former wouldn't
be able to represent all the values of the latter, which it is
specifically intended to do.  So while I can't definitively say
that no such target exists, it seems exceedingly unlikely.
Given that, it's pretty conclusive that issuing the warning is
not necessary or helpful on any target.
Joseph Myers Oct. 6, 2016, 9:30 p.m. UTC | #24
On Thu, 6 Oct 2016, Martin Sebor wrote:

> > The problem is that extracting via an integer of the opposite signedness
> > is *not* defined unless the argument has a value representable in both
> > types - both as a matter of ISO C rules on variadic functions, and as a
> > practical matter of what various ABIs say about how sub-word arguments are
> > extended when passed as function arguments.
> 
> No, sign mismatch is not the problem we're discussing.  GCC doesn't
> warn for %lc when wchar_t and wint_t differ in signedness except
> with -Wformat-signedness.  The cause of the warning is that wchar_t
> is a distinct type from wint_t such as long from int (irrespective
> of their sign), even if the two have the same size, and the checker
> doesn't distinguish the typedefs from the underlying types.

For most configurations of GCC, wint_t is unsigned int (the default from 
defaults.h).  Only a few specific (OS, CPU) pairs make a different choice.

> I did say, however, that the warning might have some merit even
> in this case if there also were a target to which the code could
> be ported where wchar_t and wint_t had different sizes (i.e., as
> a portability warning).  But for such a portability warning to
> be useful it would need to be issued by all GCC compilers, not
> just when compiling for the target with that property.  (But
> since neither of us knows(*) of such a target this is a moot
> point.)

As I pointed out, wint_t and wchar_t are ABI-incompatible when passed to 
functions on powerpc64-linux-gnu, because of the different signedness, 
even though they have the same size.

Thus, diagnosing the wrong choice of type is useful as a portability 
warning.  Unfortunately, it's hard to diagnose it on architectures where 
wchar_t and wint_t are the same type or too similar, but by diagnosing 
where they are sufficiently different at the C level (but possibly not at 
the ABI level), code can be fixed so it doesn't break in cases where the 
difference matters.

> The text of the warning itself indicates that even suggesting how
> to fix it isn't trivial.  GCC warns for just for the %lc directive
> (not for the sign mismatch) and suggests to fix it by replacing
> the %lc directive with %ld.  That's clearly wrong.  Printing

It's true the hint is wrong, but the warning is right.
diff mbox

Patch

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -1125,7 +1127,7 @@  format_floating_max (tree type, char spe
      round-to-nearest mode.  */
   mpfr_t x;
   mpfr_init2 (x, rfmt->p);
-  mpfr_from_real (x, &rv, MPFR_RNDN);
+  mpfr_from_real (x, &rv, GMP_RNDN);
 
   const char fmt[] = { '%', 'R', spec, '\0' };
   int n = mpfr_snprintf (NULL, 0, fmt, x);
@@ -1325,7 +1327,7 @@  format_floating (const conversion_spec &
 	 round-to-nearest mode.  */
       mpfr_t mpfrval;
       mpfr_init2 (mpfrval, rfmt->p);
-      mpfr_from_real (mpfrval, rvp, MPFR_RNDN);
+      mpfr_from_real (mpfrval, rvp, GMP_RNDN);
 
       char fmtstr [40];
       char *pfmt = fmtstr;