diff mbox series

add attribute nonstring

Message ID d0432148-24d3-baaa-faf6-b00c33e45103@gmail.com
State New
Headers show
Series add attribute nonstring | expand

Commit Message

Martin Sebor Nov. 10, 2017, 5:31 p.m. UTC
The latest revision of GCC 8.0 adds a -Wstringop-truncation option
to detect common misuses of the strncpy and strncat functions that
may truncate the copy and leave the result without a terminating
nul character.

On its own, the warning cannot distinguish the intended and safe
uses of the functions (to fill a buffer with data) from the
incorrect and unsafe ones (to make a "bounded" copy of a string).
Because the misuses have become prevalent and the correct/intended
uses are in a minority, GCC has added a new attribute called
nonstring to annotate character arrays that aren't meant to be
treated as nul-terminated strings.  The attribute disables
the checker.  In the future, GCC will also use the attribute to
issue warnings when such arrays are passed to functions that
expect nul-terminated strings.

Attached is a patch to annotate such arrays in Glibc to prevent
warnings on x86_64 (I didn't test any other targets).  The patch
touches the timezone/tzfile.h header even though (IIUC) it comes
from another project.  I'm not familiar with the process for
keeping the timezone directory in sync with tzcode.  Please let
me know how you'd like to go about making this change.

Thanks
Martin

Comments

Joseph Myers Nov. 10, 2017, 5:53 p.m. UTC | #1
On Fri, 10 Nov 2017, Martin Sebor wrote:

> Attached is a patch to annotate such arrays in Glibc to prevent
> warnings on x86_64 (I didn't test any other targets).  The patch
> touches the timezone/tzfile.h header even though (IIUC) it comes
> from another project.  I'm not familiar with the process for
> keeping the timezone directory in sync with tzcode.  Please let
> me know how you'd like to go about making this change.

For tzcode the normal practice is a bulk update from the latest release of 
tzcode.  We might consider a fix from https://github.com/eggert/tz.git if 
something newer than the latest release is needed in a particular case, 
but release versions are preferred.
Martin Sebor Nov. 10, 2017, 11:11 p.m. UTC | #2
On 11/10/2017 10:53 AM, Joseph Myers wrote:
> On Fri, 10 Nov 2017, Martin Sebor wrote:
>
>> Attached is a patch to annotate such arrays in Glibc to prevent
>> warnings on x86_64 (I didn't test any other targets).  The patch
>> touches the timezone/tzfile.h header even though (IIUC) it comes
>> from another project.  I'm not familiar with the process for
>> keeping the timezone directory in sync with tzcode.  Please let
>> me know how you'd like to go about making this change.
>
> For tzcode the normal practice is a bulk update from the latest release of
> tzcode.  We might consider a fix from https://github.com/eggert/tz.git if
> something newer than the latest release is needed in a particular case,
> but release versions are preferred.

Okay, thanks.  I've read that you run periodic Glibc builds with
the latest GCC so I wanted to give you a heads up on this change
and the patch to avoid breaking them.  Unless that's not a concern
(do you use -Werror in those builds?) what is the best way to
coordinate this update?

I CC Paul for his input in case updating tzcode first is important.

Martin
Joseph Myers Nov. 10, 2017, 11:45 p.m. UTC | #3
On Fri, 10 Nov 2017, Martin Sebor wrote:

> Okay, thanks.  I've read that you run periodic Glibc builds with
> the latest GCC so I wanted to give you a heads up on this change
> and the patch to avoid breaking them.  Unless that's not a concern
> (do you use -Werror in those builds?) what is the best way to
> coordinate this update?

The builds use -Werror, and run daily (actually more like every 30 hours 
when the builds succeed, because each build starts 24 hours after the 
previous one finished).  The main issue with a GCC patch breaking the 
build going in some time before a glibc patch fixing the build is that it 
may make it harder to find what GCC patch introduced any other build 
breakage that appears while there are warnings (we currently have two such 
breakages with GCC mainline, for powerpc64le and sh).

The most obvious question for people to consider in reviewing this patch 
(apart from getting a tzfile.h fix from upstream) is coding style: do we 
want __NONSTRING on a separate line as in this patch, or on the same line 
as the member declaration in question (before the type, or after the [])?  
Also, sys/cdefs.h should be using __GNUC_PREREQ (8, 0), not a direct test 
of __GNUC__.
Paul Eggert Nov. 12, 2017, 9:06 a.m. UTC | #4
Thanks for mentioning the problem with GCC 8. If the intent of 
__attribute__((__nonstring__)) is to consistently mark char arrays that are not 
necessarily null-terminated strings, then lots of data structures in the tzdb 
code would need to be marked with this attribute. For example, every member of 
tzfile.h's 'struct tzhead' would have the attribute, because they are all char 
arrays that might not be null-terminated.

Looking at the proposed glibc patch, it seems its intent is more modest. All 
it's trying to do is to pacify GCC 8 so that GCC doesn't complain about calls to 
strncpy and strncat. In that case, a simpler and cleaner fix for tzdb is to stop 
using strncpy and strncat. (I suspect that the main reason tzdb uses strncpy is 
that its code is so old that it predates the widespread availability of memcpy.) 
So I installed the attached patch into tzdb's zic.c and it should be part of the 
next tzdb release.

Perhaps you can this idea elsewhere in the rest of the glibc patch too. It would 
be a lot of work to consistently change glibc to use 
__attribute__((__nonstring__)) for its documented purpose. If glibc can avoid 
calling strncpy and strncat, it won't need __NONSTRING; if not, then I suggest 
changing this:

/* Describes a char array that is not necessarily a NUL-terminated
    string.  */
# define __NONSTRING __attribute__ ((__nonstring__))

to something more like this:

/* Describes a char array whose address can safely be passed as the first
    argument to strncpy and strncat, as the char array is not necessarily
    a NUL-terminated string.  */
# define __NONSTRING __attribute__ ((__nonstring__))

in sys/cdefs.h, so that the more-modest goal of the patch is clearer to the reader.

I'll CC: this to tz@iana.org as a heads-up there.
From 641f2c48b10dec5ab6dd90f560b66e061aa660ba Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 12 Nov 2017 00:57:19 -0800
Subject: [PROPOSED] Port to GCC 8 -Wstringop-truncation

Problem reported by Martin Sebor in:
https://sourceware.org/ml/libc-alpha/2017-11/msg00336.html
* NEWS: Mention this.
* zic.c (writezone): Use memcpy, not strncpy.
---
 NEWS  | 3 +++
 zic.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 966f9b6..7881936 100644
--- a/NEWS
+++ b/NEWS
@@ -29,6 +29,9 @@ Unreleased, experimental changes
     Diagnostics and commentary now distinguish UT from UTC more
     carefully; see theory.html for more information about UT vs UTC.
 
+    zic has been ported to GCC 8's -Wstringop-truncation option.
+    (Problem reported by Martin Sebor.)
+
 
 Release 2017c - 2017-10-20 14:49:34 -0700
 
diff --git a/zic.c b/zic.c
index 82e653d..e36301f 100644
--- a/zic.c
+++ b/zic.c
@@ -1960,7 +1960,7 @@ writezone(const char *const name, const char *const string, char version)
 		}
 #define DO(field)	fwrite(tzh.field, sizeof tzh.field, 1, fp)
 		tzh = tzh0;
-		strncpy(tzh.tzh_magic, TZ_MAGIC, sizeof tzh.tzh_magic);
+		memcpy(tzh.tzh_magic, TZ_MAGIC, sizeof tzh.tzh_magic);
 		tzh.tzh_version[0] = version;
 		convert(thistypecnt, tzh.tzh_ttisgmtcnt);
 		convert(thistypecnt, tzh.tzh_ttisstdcnt);
Martin Sebor Nov. 12, 2017, 5:33 p.m. UTC | #5
On 11/12/2017 02:06 AM, Paul Eggert wrote:
> Thanks for mentioning the problem with GCC 8. If the intent of
> __attribute__((__nonstring__)) is to consistently mark char arrays that
> are not necessarily null-terminated strings, then lots of data
> structures in the tzdb code would need to be marked with this attribute.
> For example, every member of tzfile.h's 'struct tzhead' would have the
> attribute, because they are all char arrays that might not be
> null-terminated.
>
> Looking at the proposed glibc patch, it seems its intent is more modest.
> All it's trying to do is to pacify GCC 8 so that GCC doesn't complain
> about calls to strncpy and strncat. In that case, a simpler and cleaner
> fix for tzdb is to stop using strncpy and strncat. (I suspect that the
> main reason tzdb uses strncpy is that its code is so old that it
> predates the widespread availability of memcpy.) So I installed the
> attached patch into tzdb's zic.c and it should be part of the next tzdb
> release.

Great!

>
> Perhaps you can this idea elsewhere in the rest of the glibc patch too.
> It would be a lot of work to consistently change glibc to use
> __attribute__((__nonstring__)) for its documented purpose. If glibc can
> avoid calling strncpy and strncat, it won't need __NONSTRING;

Replacing strncpy with memcpy works well in some case (such
the TZ_MAGIC macro).  In other cases the string pointed to by
the constant argument is unknown and can be shorter than the size
of the destination so in those the replacement would need to be
more involved.  I tried to make only minimal changes to suppress
the new warning to avoid excessive churn.  I also didn't want to
remove the originally intended and correct uses of strncpy (to
completely fill a buffer).

I have another GCC enhancement that I plan to submit this week
that points out uses of these non-string arrays with common
string functions like strlen.  With it, annotating more of
the non-string arrays with the attribute to help detect their
accidental misuses might be preferable to using memcpy.  I'll
leave that decision to the Glibc folks here.

  if not,
> then I suggest changing this:
>
> /* Describes a char array that is not necessarily a NUL-terminated
>    string.  */
> # define __NONSTRING __attribute__ ((__nonstring__))
>
> to something more like this:
>
> /* Describes a char array whose address can safely be passed as the first
>    argument to strncpy and strncat, as the char array is not necessarily
>    a NUL-terminated string.  */
> # define __NONSTRING __attribute__ ((__nonstring__))
>
> in sys/cdefs.h, so that the more-modest goal of the patch is clearer to
> the reader.
>
> I'll CC: this to tz@iana.org as a heads-up there.

That would make sense to me as well  I'll wait to submit an updated
patch until this and Joseph's style question have been settled.

Thanks
Martin
Paul Eggert Nov. 12, 2017, 8:39 p.m. UTC | #6
[Dropping tz@iana.org as this email isn't relevant to tzdb.]

Martin Sebor wrote:
> I tried to make only minimal changes to suppress
> the new warning to avoid excessive churn.

As the main goal here is to find bugs not to pacify GCC, it's helpful to take 
these diagnostics as an opportunity to look more carefully at the strncpy calls. 
I just now looked at the other two data structures affected by the patch, and 
found some issues.

First, the patch to sysdeps/gnu/bits/utmp.h seems reasonable, as the members 
ut_user etc. have used strncpy format since the 1970s and people who deal with 
this format are used to it. However, this format does not appear to be 
documented in the manual. I suggest accompanying it with a patch to the part of 
manual/users.texi that talks about ut_user etc., so that the __NONSTRING issue 
is documented there.

Second, the patch to sysdeps/gnu/net/if.h is dubious. Its goal appears to be to 
pacify calls like the first call to strncpy in 
sysdeps/unix/sysv/linux/if_index.c. But that first call appears to be a bug in 
glibc, since the underlying Linux system call expects a null-terminated string. 
My guess is that
this particular data structure should indeed be a null-terminated string, and 
should not be marked with __NONSTRING, and that we should instead change the 
glibc code to not use strncpy here. I suppose the code should report an error 
instead of creating a non-null-terminated array, and that it need not initialize 
the rest of the array to zeros.
Martin Sebor Nov. 12, 2017, 11:49 p.m. UTC | #7
On 11/12/2017 01:39 PM, Paul Eggert wrote:
> [Dropping tz@iana.org as this email isn't relevant to tzdb.]
>
> Martin Sebor wrote:
>> I tried to make only minimal changes to suppress
>> the new warning to avoid excessive churn.
>
> As the main goal here is to find bugs not to pacify GCC, it's helpful to
> take these diagnostics as an opportunity to look more carefully at the
> strncpy calls. I just now looked at the other two data structures
> affected by the patch, and found some issues.
>
> First, the patch to sysdeps/gnu/bits/utmp.h seems reasonable, as the
> members ut_user etc. have used strncpy format since the 1970s and people
> who deal with this format are used to it. However, this format does not
> appear to be documented in the manual. I suggest accompanying it with a
> patch to the part of manual/users.texi that talks about ut_user etc., so
> that the __NONSTRING issue is documented there.
>
> Second, the patch to sysdeps/gnu/net/if.h is dubious. Its goal appears
> to be to pacify calls like the first call to strncpy in
> sysdeps/unix/sysv/linux/if_index.c. But that first call appears to be a
> bug in glibc, since the underlying Linux system call expects a
> null-terminated string. My guess is that
> this particular data structure should indeed be a null-terminated
> string, and should not be marked with __NONSTRING, and that we should
> instead change the glibc code to not use strncpy here. I suppose the
> code should report an error instead of creating a non-null-terminated
> array, and that it need not initialize the rest of the array to zeros.

I'm not familiar with this ioct call beyond what I've read on
the Linux netdevice man page before submitting the patch.  If
ifr_name is, in fact, supposed to be a string and I missed it
then that would actually be awesome because it would prove
the warning useful.

Martin

PS I still don't see it discussed on the Linux man page but
I did find such a requirement on an AIX 6.1 ioctl man page:
https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_61/apis/ioctl.htm

The descriptions of the if_indextoname and if_nametoindex
functions specified by RFC 3493 also talk about the name being
a nul-terminated string so it looks to me like you are correct
and the warning has found a Glibc bug.  Yay! :)
Joseph Myers Nov. 13, 2017, 12:50 a.m. UTC | #8
On Sun, 12 Nov 2017, Paul Eggert wrote:

> availability of memcpy.) So I installed the attached patch into tzdb's zic.c
> and it should be part of the next tzdb release.

Please commit this zic.c fix also to glibc.  (We can do a bulk update from 
tzcode after the next tzcode release.)
Steve Ellcey Nov. 13, 2017, 5:35 p.m. UTC | #9
On Sun, 2017-11-12 at 16:49 -0700, Martin Sebor wrote:
> 
> PS I still don't see it discussed on the Linux man page but
> I did find such a requirement on an AIX 6.1 ioctl man page:
> https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_61/apis/ioct
> l.htm
> 
> The descriptions of the if_indextoname and if_nametoindex
> functions specified by RFC 3493 also talk about the name being
> a nul-terminated string so it looks to me like you are correct
> and the warning has found a Glibc bug.  Yay! :)

I think this is a bug and that if_nametoindex should check for a name
that is too long.  Based on RFC 3493 it would appear that we don't need
to set errno in this case though I am not sure if that is a correct
interpretation.  I tested this patch:


2017-11-13  Steve Ellcey  <sellcey@cavium.com>

	* sysdeps/unix/sysv/linux/if_index.c (__if_nametoindex):
	Check if ifname is too long.


diff --git a/sysdeps/unix/sysv/linux/if_index.c
b/sysdeps/unix/sysv/linux/if_index.c
index 56f3f13..1e081c0 100644
--- a/sysdeps/unix/sysv/linux/if_index.c
+++ b/sysdeps/unix/sysv/linux/if_index.c
@@ -43,6 +43,9 @@ __if_nametoindex (const char *ifname)
   if (fd < 0)
     return 0;
 
+  if (strlen (ifname) >= IFNAMSIZ)
+    return 0;
+
   strncpy (ifr.ifr_name, ifname, sizeof (ifr.ifr_name));
   if (__ioctl (fd, SIOCGIFINDEX, &ifr) < 0)
     {


And it compiled fine using the latest GCC.  Apparently GCC's constant
propogation allowed it to see that sizeof could not longer be IFNAMSIZ.

Steve Ellcey
Joseph Myers Nov. 14, 2017, 4:36 p.m. UTC | #10
On Mon, 13 Nov 2017, Steve Ellcey wrote:

> On Sun, 2017-11-12 at 16:49 -0700, Martin Sebor wrote:
> > 
> > PS I still don't see it discussed on the Linux man page but
> > I did find such a requirement on an AIX 6.1 ioctl man page:
> > https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_61/apis/ioct
> > l.htm
> > 
> > The descriptions of the if_indextoname and if_nametoindex
> > functions specified by RFC 3493 also talk about the name being
> > a nul-terminated string so it looks to me like you are correct
> > and the warning has found a Glibc bug.  Yay! :)
> 
> I think this is a bug and that if_nametoindex should check for a name
> that is too long.  Based on RFC 3493 it would appear that we don't need
> to set errno in this case though I am not sure if that is a correct
> interpretation.  I tested this patch:
> 
> 
> 2017-11-13  Steve Ellcey  <sellcey@cavium.com>
> 
> 	* sysdeps/unix/sysv/linux/if_index.c (__if_nametoindex):
> 	Check if ifname is too long.

Florian, any comments on the proper handling of too-long names here, and 
so on how we should avoid the strncpy warnings in this case?
Florian Weimer Nov. 14, 2017, 4:49 p.m. UTC | #11
On 11/13/2017 06:35 PM, Steve Ellcey wrote:
> On Sun, 2017-11-12 at 16:49 -0700, Martin Sebor wrote:
>>
>> PS I still don't see it discussed on the Linux man page but
>> I did find such a requirement on an AIX 6.1 ioctl man page:
>> https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_61/apis/ioct
>> l.htm
>>
>> The descriptions of the if_indextoname and if_nametoindex
>> functions specified by RFC 3493 also talk about the name being
>> a nul-terminated string so it looks to me like you are correct
>> and the warning has found a Glibc bug.  Yay! :)
> 
> I think this is a bug and that if_nametoindex should check for a name
> that is too long.  Based on RFC 3493 it would appear that we don't need
> to set errno in this case though I am not sure if that is a correct
> interpretation.  I tested this patch:
> 
> 
> 2017-11-13  Steve Ellcey  <sellcey@cavium.com>
> 
> 	* sysdeps/unix/sysv/linux/if_index.c (__if_nametoindex):
> 	Check if ifname is too long.
> 
> 
> diff --git a/sysdeps/unix/sysv/linux/if_index.c
> b/sysdeps/unix/sysv/linux/if_index.c
> index 56f3f13..1e081c0 100644
> --- a/sysdeps/unix/sysv/linux/if_index.c
> +++ b/sysdeps/unix/sysv/linux/if_index.c
> @@ -43,6 +43,9 @@ __if_nametoindex (const char *ifname)
>     if (fd < 0)
>       return 0;
>   
> +  if (strlen (ifname) >= IFNAMSIZ)
> +    return 0;
> +
>     strncpy (ifr.ifr_name, ifname, sizeof (ifr.ifr_name));
>     if (__ioctl (fd, SIOCGIFINDEX, &ifr) < 0)
>       {

The Linux kernel has this in include/linux/netdevice.h:

| struct net_device {
|         char                    name[IFNAMSIZ];

If I traced this correctly, the ioctl compares the passed-in name using 
strncmp:

| struct net_device *dev_get_by_name_rcu(struct net *net, const char *name)
| {
|         struct net_device *dev;
|         struct hlist_head *head = dev_name_hash(net, name);
|
|         hlist_for_each_entry_rcu(dev, head, name_hlist)
|                 if (!strncmp(dev->name, name, IFNAMSIZ))
|                         return dev;
|
|         return NULL;
| }

So this means that we should add the nostring attribute and not the 
length check.

I'll ask one of our networking people to double-check this.

Thanks,
Florian
Joseph Myers Nov. 14, 2017, 5:06 p.m. UTC | #12
On Tue, 14 Nov 2017, Florian Weimer wrote:

> So this means that we should add the nostring attribute and not the length
> check.

If the name passed is actually longer than this field, is it undefined 
behavior or not?  If it's not undefined behavior, I think we should avoid 
silent truncation (which means a length check, even if the check allows 
the possibility of a non-NUL-terminated value ending up in the field, 
unless for some reason truncation is always OK and does not affect the 
semantics).
Andreas Schwab Nov. 14, 2017, 5:55 p.m. UTC | #13
On Nov 14 2017, Joseph Myers <joseph@codesourcery.com> wrote:

> On Tue, 14 Nov 2017, Florian Weimer wrote:
>
>> So this means that we should add the nostring attribute and not the length
>> check.
>
> If the name passed is actually longer than this field, is it undefined 
> behavior or not?

The kernel will always zero-terminate the string at IFNAMSIZ-1, see
dev_ifname in net/core/dev_ioctl.c.

Andreas.
Joseph Myers Nov. 14, 2017, 5:58 p.m. UTC | #14
On further investigation of the coding style issue: the normal practice in 
glibc is definitely that attributes (except on function definitions) go 
immediately before the ';' at the end of a declaration (there are e.g. 
various __attribute__ ((__aligned__ (something))) attributes like that on 
struct members).  Also, the normal naming practice in sys/cdefs.h would be 
__attribute_nonstring__ unless there is some reason to deviate from that.
Florian Weimer Nov. 14, 2017, 6:21 p.m. UTC | #15
On 11/14/2017 06:55 PM, Andreas Schwab wrote:
> On Nov 14 2017, Joseph Myers <joseph@codesourcery.com> wrote:
> 
>> On Tue, 14 Nov 2017, Florian Weimer wrote:
>>
>>> So this means that we should add the nostring attribute and not the length
>>> check.
>>
>> If the name passed is actually longer than this field, is it undefined
>> behavior or not?
> 
> The kernel will always zero-terminate the string at IFNAMSIZ-1, see
> dev_ifname in net/core/dev_ioctl.c.

Yes, Hannes and I went over the code and reached the same conclusion 
(for the ioctl interface, netlink is a bit more involved to check).

However, the original patch should really use strnlen or memchr, and not 
strlen.  As posted, the strlen is either invalid because the array is 
not NUL-terminated, or it passes because the string is short enough.

Thanks,
Florian
Joseph Myers Nov. 14, 2017, 6:27 p.m. UTC | #16
On Tue, 14 Nov 2017, Florian Weimer wrote:

> However, the original patch should really use strnlen or memchr, and not
> strlen.  As posted, the strlen is either invalid because the array is not
> NUL-terminated, or it passes because the string is short enough.

if_nametoindex takes a char * argument.  POSIX doesn't say explicitly, but 
I'd presume that must be a NUL-terminated string, with undefined behavior 
otherwise.  That's separate from what the kernel interface is.
Florian Weimer Nov. 14, 2017, 6:29 p.m. UTC | #17
On 11/14/2017 07:27 PM, Joseph Myers wrote:
> On Tue, 14 Nov 2017, Florian Weimer wrote:
> 
>> However, the original patch should really use strnlen or memchr, and not
>> strlen.  As posted, the strlen is either invalid because the array is not
>> NUL-terminated, or it passes because the string is short enough.
> 
> if_nametoindex takes a char * argument.  POSIX doesn't say explicitly, but
> I'd presume that must be a NUL-terminated string, with undefined behavior
> otherwise.  That's separate from what the kernel interface is.

Oh, you are of course right.  strlen is fine under these circumstances.

So the only thing that's missing is the __set_errno (ENODEV); call, I 
think.  (It's what the ioctl should fail with for an unknown interface 
name.)

Thanks,
Florian
Steve Ellcey Nov. 14, 2017, 11:22 p.m. UTC | #18
On Tue, 2017-11-14 at 19:29 +0100, Florian Weimer wrote:
> 
> So the only thing that's missing is the __set_errno (ENODEV); call, I 
> think.  (It's what the ioctl should fail with for an unknown interface 
> name.)
> 
> Thanks,
> Florian

OK, here is a new version of the patch that sets errno to ENODEV.  I
tested it (Thanks to Joseph for fixing the tests that would not compile
due to the new GCC warning) and got three failures that I think are all
unrelated to this change:

FAIL: crypt/badsalttest
FAIL: nptl/tst-thread_local1
FAIL: nss/tst-nss-files-hosts-multi

tst-thread_local1 is a failure I have seen before, tst-nss-files-hosts-
multi is a new test, and I am not sure what is happening with
badsalttest but it does not seem to be related to this change.  I also
have the not-checked-in __NONSTRING changes in utmp.h in my tree in
order to get this to build.

OK to checkin?

Steve Ellcey

2017-11-14  Steve Ellcey  <sellcey@cavium.com>

	* sysdeps/unix/sysv/linux/if_index.c (__if_nametoindex):
	Check if ifname is too long.

diff --git a/sysdeps/unix/sysv/linux/if_index.c b/sysdeps/unix/sysv/linux/if_index.c
index 56f3f13..e7ca27b 100644
--- a/sysdeps/unix/sysv/linux/if_index.c
+++ b/sysdeps/unix/sysv/linux/if_index.c
@@ -43,6 +43,12 @@ __if_nametoindex (const char *ifname)
   if (fd < 0)
     return 0;
 
+  if (strlen (ifname) >= IFNAMSIZ)
+    {
+      __set_errno (ENODEV);
+      return 0;
+    }
+
   strncpy (ifr.ifr_name, ifname, sizeof (ifr.ifr_name));
   if (__ioctl (fd, SIOCGIFINDEX, &ifr) < 0)
     {
Florian Weimer Nov. 15, 2017, 10:48 a.m. UTC | #19
On 11/15/2017 12:22 AM, Steve Ellcey wrote:
> On Tue, 2017-11-14 at 19:29 +0100, Florian Weimer wrote:
>>
>> So the only thing that's missing is the __set_errno (ENODEV); call, I
>> think.  (It's what the ioctl should fail with for an unknown interface
>> name.)
>>
>> Thanks,
>> Florian
> 
> OK, here is a new version of the patch that sets errno to ENODEV.  I
> tested it (Thanks to Joseph for fixing the tests that would not compile
> due to the new GCC warning) and got three failures that I think are all
> unrelated to this change:
> 
> FAIL: crypt/badsalttest
> FAIL: nptl/tst-thread_local1
> FAIL: nss/tst-nss-files-hosts-multi

Can you provide more details (such as .out file contents)?  Even if 
these failures are unrelated, they might point to something we need to fix.

> tst-thread_local1 is a failure I have seen before, tst-nss-files-hosts-
> multi is a new test, and I am not sure what is happening with
> badsalttest but it does not seem to be related to this change.  I also
> have the not-checked-in __NONSTRING changes in utmp.h in my tree in
> order to get this to build.
> 
> OK to checkin?

Patch looks good to me.  Please add [BZ #22442] to both the ChangeLog 
entry and the commit message because it is a user-visible bug.

Thanks,
Florian
Steve Ellcey Nov. 15, 2017, 5:58 p.m. UTC | #20
On Wed, 2017-11-15 at 11:48 +0100, Florian Weimer wrote:

> > FAIL: crypt/badsalttest
> > FAIL: nptl/tst-thread_local1
> > FAIL: nss/tst-nss-files-hosts-multi
> Can you provide more details (such as .out file contents)?  Even if 
> these failures are unrelated, they might point to something we need
> to fix.

bassalttest.out:

Didn't expect signal from child: got `Segmentation fault'


tst-thread_local1.out is empty, tst-thread_local1.test-result has:

FAIL: nptl/tst-thread_local1
original exit status 1

tst-nss-files-hosts-multi.out:

Timed out: killed the child process



Steve Ellcey
sellcey@cavium.com
Steve Ellcey Nov. 15, 2017, 10:09 p.m. UTC | #21
On Fri, 2017-11-10 at 16:11 -0700, Martin Sebor wrote:

> Okay, thanks.  I've read that you run periodic Glibc builds with
> the latest GCC so I wanted to give you a heads up on this change
> and the patch to avoid breaking them.  Unless that's not a concern
> (do you use -Werror in those builds?) what is the best way to
> coordinate this update?
> 
> I CC Paul for his input in case updating tzcode first is important.
> 
> Martin

Martin, were you going to update this patch with Joseph's comments and
resubmit?  The tz changes are checked in and the ifreq struct is taken
care of so all that remains are the changes to cdefs.h to define
__NONSTRING and then to use in utmp.h.  I don't have any comments on
that change beyond what Joseph made but I did do a build and run on
aarch64 with the latest GCC to verify that those changes work for me.

Steve Ellcey
sellcey@cavium.com
Martin Sebor Nov. 15, 2017, 11:05 p.m. UTC | #22
On 11/15/2017 03:09 PM, Steve Ellcey wrote:
> On Fri, 2017-11-10 at 16:11 -0700, Martin Sebor wrote:
>>
>> Okay, thanks.  I've read that you run periodic Glibc builds with
>> the latest GCC so I wanted to give you a heads up on this change
>> and the patch to avoid breaking them.  Unless that's not a concern
>> (do you use -Werror in those builds?) what is the best way to
>> coordinate this update?
>>
>> I CC Paul for his input in case updating tzcode first is important.
>>
>> Martin
>
> Martin, were you going to update this patch with Joseph's comments and
> resubmit?  The tz changes are checked in and the ifreq struct is taken
> care of so all that remains are the changes to cdefs.h to define
> __NONSTRING and then to use in utmp.h.  I don't have any comments on
> that change beyond what Joseph made but I did do a build and run on
> aarch64 with the latest GCC to verify that those changes work for me.

Sure.  I've got the attached patch.  According to the Linux utmp
man page:
        String fields are terminated by a null byte ('\0')
        if they are shorter than the size of the field.

The patch lets Glibc compile fine with the top of GCC 8 trunk.

As a heads up, the new attribute triggers warnings with my patch
that enhances GCC to check for string functions being passed
arrays declared with the attribute.  Some are false positives
that I'll try to handle in GCC but at least one of these looks
like a Glibc bug.  Assuming the GCC patch gets approved there
may be some more fallout for Glibc to deal with.

../sysdeps/unix/getlogin_r.c:83:23: warning: ‘strlen’ argument 1 
declared attribute ‘nonstring’ [-Wstringop-overflow=]
        size_t needed = strlen (ut->ut_user) + 1;
                        ^~~~~~~~~~~~~~~~~~~~
In file included from ../login/utmp.h:29,
                  from ../include/utmp.h:2,
                  from ../sysdeps/unix/getlogin_r.c:26,
                  from ../sysdeps/unix/sysv/linux/getlogin_r.c:25:
../sysdeps/gnu/bits/utmp.h:65:8: note: argument ‘ut_user’ declared here
    char ut_user[UT_NAMESIZE]
         ^~~~~~~

Martin
The -Wstringop-truncation option new in GCC 8 detects common misuses
of the strncat and strncpy function that may result in truncating
the copied string before the terminating NUL.  To avoid false positive
warnings for correct code that intentionally creates sequences of
characters that aren't guaranteed to be NUL-terminated, arrays that
are intended to store such sequences should be decorated with a new
nonstring attribute.  This change add this attribute to Glibc and
uses it to suppress such false positives.

ChangeLog:
	* misc/sys/cdefs.h (__attribute_nonstring__): New macro.
	* sysdeps/gnu/bits/utmp.h (struct utmp): Use it.

diff --git a/misc/sys/cdefs.h b/misc/sys/cdefs.h
index cfd39d5..a603cb9 100644
--- a/misc/sys/cdefs.h
+++ b/misc/sys/cdefs.h
@@ -407,6 +407,15 @@
 # endif
 #endif
 
+#if __GNUC_PREREQ (8, 0)
+/* Describes a char array whose address can safely be passed as the first
+   argument to strncpy and strncat, as the char array is not necessarily
+   a NUL-terminated string.  */
+# define __attribute_nonstring__ __attribute__ ((__nonstring__))
+#else
+# define __attribute_nonstring__
+#endif
+
 #if (!defined _Static_assert && !defined __cplusplus \
      && (defined __STDC_VERSION__ ? __STDC_VERSION__ : 0) < 201112 \
      && (!__GNUC_PREREQ (4, 6) || defined __STRICT_ANSI__))
diff --git a/sysdeps/gnu/bits/utmp.h b/sysdeps/gnu/bits/utmp.h
index 2ee11cb..71c9fa2 100644
--- a/sysdeps/gnu/bits/utmp.h
+++ b/sysdeps/gnu/bits/utmp.h
@@ -59,10 +59,13 @@ struct utmp
 {
   short int ut_type;           /* Type of login.  */
   pid_t ut_pid;                        /* Process ID of login process.  */
-  char ut_line[UT_LINESIZE];   /* Devicename.  */
+  char ut_line[UT_LINESIZE]
+    __attribute_nonstring__;   /* Devicename.  */
   char ut_id[4];               /* Inittab ID.  */
-  char ut_user[UT_NAMESIZE];   /* Username.  */
-  char ut_host[UT_HOSTSIZE];   /* Hostname for remote login.  */
+  char ut_user[UT_NAMESIZE]
+    __attribute_nonstring__;   /* Username.  */
+  char ut_host[UT_HOSTSIZE]
+    __attribute_nonstring__;   /* Hostname for remote login.  */
   struct exit_status ut_exit;  /* Exit status of a process marked
                                   as DEAD_PROCESS.  */
 /* The ut_session and ut_tv fields must be the same size when compiled
Joseph Myers Nov. 15, 2017, 11:14 p.m. UTC | #23
On Wed, 15 Nov 2017, Martin Sebor wrote:

> Sure.  I've got the attached patch.  According to the Linux utmp
> man page:
>        String fields are terminated by a null byte ('\0')
>        if they are shorter than the size of the field.
> 
> The patch lets Glibc compile fine with the top of GCC 8 trunk.

This patch is OK, with the corresponding change also to 
sysdeps/unix/sysv/linux/s390/bits/utmp.h unless you have reason to believe 
that version of bits/utmp.h doesn't need changing.
Adhemerval Zanella Netto Nov. 16, 2017, 12:37 a.m. UTC | #24
On 15/11/2017 15:58, Steve Ellcey wrote:
> On Wed, 2017-11-15 at 11:48 +0100, Florian Weimer wrote:
>>  
>>> FAIL: crypt/badsalttest
>>> FAIL: nptl/tst-thread_local1
>>> FAIL: nss/tst-nss-files-hosts-multi
>> Can you provide more details (such as .out file contents)?  Even if 
>> these failures are unrelated, they might point to something we need
>> to fix.
> 
> bassalttest.out:
> 
> Didn't expect signal from child: got `Segmentation fault'

This is something we should investigate.

> 
> 
> tst-thread_local1.out is empty, tst-thread_local1.test-result has:
> 
> FAIL: nptl/tst-thread_local1
> original exit status 1

I usually see it when I try with a toolchain that requires a GLIBCXX_*
version higher than then one installed on the system (for instance
the one generated by build-many-glibcs.py). The test should ran without
issues if you pass the path of built libstdc++ with the usual paths
in testrun.sh.

> 
> tst-nss-files-hosts-multi.out:
> 
> Timed out: killed the child process
> 
> 
> 
> Steve Ellcey
> sellcey@cavium.com
>
Szabolcs Nagy Nov. 17, 2017, 12:49 p.m. UTC | #25
On 15/11/17 17:58, Steve Ellcey wrote:
> tst-nss-files-hosts-multi.out:
> 
> Timed out: killed the child process
> 

i've seen this timeout too (in virtual machine),
it does 1.5M syscalls a third of which is slow socket syscalls.

i think that's a bit too much, less iterations should be enough.
Florian Weimer Nov. 17, 2017, 12:56 p.m. UTC | #26
On 11/17/2017 01:49 PM, Szabolcs Nagy wrote:
> On 15/11/17 17:58, Steve Ellcey wrote:
>> tst-nss-files-hosts-multi.out:
>>
>> Timed out: killed the child process
>>
> 
> i've seen this timeout too (in virtual machine),
> it does 1.5M syscalls a third of which is slow socket syscalls.
> 
> i think that's a bit too much, less iterations should be enough.

The buffer code in nss_files (and nss_db) is really sensitive to input 
sizes, and we have seen bugs which are triggered with very specific 
result sizes only, hence the the first loop with the small counts.

The 22222 count is arbitrary, but a fairly large value is needed to make 
the super-linear behavior of the old implementation immediately obvious. 
  An alternative would be to reduce the RLIMIT_AS limit considerably, 
but that might unduly affect architectures which are wasteful with 
address space.

Thanks,
Florian
Paul Eggert Nov. 21, 2017, 5:27 a.m. UTC | #27
Joseph Myers wrote:
> This patch is OK, with the corresponding change also to
> sysdeps/unix/sysv/linux/s390/bits/utmp.h unless you have reason to believe
> that version of bits/utmp.h doesn't need changing

Are similar changes also needed to bits/utmp.h and to 
sysdeps/unix/sysv/linux/s390/s390-32/utmp32.h? They also define these arrays.

Also, manual/users.texi should document that the relevant arrays are not 
necessarily null-terminated.

Also, what about struct utmpx? It has similarly-named fields, which also appear 
to use strncpy format. This affects sysdeps/unix/sysv/linux/s390/bits/utmpx.h, 
sysdeps/unix/sysv/linux/s390/s390-32/utmpx32.h, sysdeps/gnu/bits/utmpx.h, and 
manual/users.texi.
Joseph Myers Nov. 21, 2017, 1:38 p.m. UTC | #28
On Tue, 21 Nov 2017, Paul Eggert wrote:

> Joseph Myers wrote:
> > This patch is OK, with the corresponding change also to
> > sysdeps/unix/sysv/linux/s390/bits/utmp.h unless you have reason to believe
> > that version of bits/utmp.h doesn't need changing
> 
> Are similar changes also needed to bits/utmp.h and to
> sysdeps/unix/sysv/linux/s390/s390-32/utmp32.h? They also define these arrays.

bits/utmp.h is not used by any current glibc configuration, though it 
could reasonably be changed.  
sysdeps/unix/sysv/linux/s390/s390-32/utmp32.h is a purely internal header, 
not an installed one, only relevant for converting between different utmp 
formats.

> Also, what about struct utmpx? It has similarly-named fields, which also
> appear to use strncpy format. This affects

I don't see anything clear in the POSIX specification of utmpx.h about 
whether these fields are NUL-terminated.  In any case, that structure was 
not an observed cause of build failures.
Siddhesh Poyarekar Nov. 27, 2017, 11:05 a.m. UTC | #29
On Friday 10 November 2017 11:01 PM, Martin Sebor wrote:
> diff --git a/sysdeps/gnu/bits/utmp.h b/sysdeps/gnu/bits/utmp.h
> index 2ee11cb..d4885f3 100644
> --- a/sysdeps/gnu/bits/utmp.h
> +++ b/sysdeps/gnu/bits/utmp.h
> @@ -59,9 +59,13 @@ struct utmp
>  {
>    short int ut_type;		/* Type of login.  */
>    pid_t ut_pid;			/* Process ID of login process.  */
> +  __NONSTRING
>    char ut_line[UT_LINESIZE];	/* Devicename.  */
> +  __NONSTRING
>    char ut_id[4];		/* Inittab ID.  */
> +  __NONSTRING
>    char ut_user[UT_NAMESIZE];	/* Username.  */
> +  __NONSTRING
>    char ut_host[UT_HOSTSIZE];	/* Hostname for remote login.  */
>    struct exit_status ut_exit;	/* Exit status of a process marked
>  				   as DEAD_PROCESS.  */

This breaks builds with the latest gcc since your commit to warn on
non-string arguments to strlen:


commit 0c45740b611e0930073eeae00422a17c62c2d983
Author: msebor <msebor@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Tue Nov 21 20:01:58 2017 +0000

    PR tree-optimization/82945 - add warning for passing non-strings to
functions that expect string arguments

    gcc/ChangeLog:

        PR tree-optimization/82945
        * builtins.c (expand_builtin_strlen): Call maybe_warn_nonstring_arg.
        * calls.h (maybe_warn_nonstring_arg): Declare new function.
        * calls.c (get_attr_nonstring_decl, maybe_warn_nonstring_arg): New
        functions.
        (initialize_argument_information): Call maybe_warn_nonstring_arg.
        * calls.h (get_attr_nonstring_decl): Declare new function.
        * doc/extend.texi (attribute nonstring): Update.
        * gimple-fold.c (gimple_fold_builtin_strncpy): Call
        get_attr_nonstring_decl and handle it.
        * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.  Improve
        detection of nul-termination.
        (strlen_to_stridx): Change to a pointer.
        (handle_builtin_strlen, handle_builtin_stxncpy): Adjust.
        (pass_strlen::execute): Same.

    gcc/testsuite/ChangeLog:

        PR tree-optimization/82945
        * c-c++-common/Wstringop-truncation-2.c: New test.
        * c-c++-common/Wstringop-truncation.c: Adjust.
        * c-c++-common/attr-nonstring-2.c: Adjust.
        * c-c++-common/attr-nonstring-3.c: New test.


Siddhesh
Martin Sebor Nov. 27, 2017, 4:04 p.m. UTC | #30
On 11/27/2017 04:05 AM, Siddhesh Poyarekar wrote:
> On Friday 10 November 2017 11:01 PM, Martin Sebor wrote:
>> diff --git a/sysdeps/gnu/bits/utmp.h b/sysdeps/gnu/bits/utmp.h
>> index 2ee11cb..d4885f3 100644
>> --- a/sysdeps/gnu/bits/utmp.h
>> +++ b/sysdeps/gnu/bits/utmp.h
>> @@ -59,9 +59,13 @@ struct utmp
>>  {
>>    short int ut_type;		/* Type of login.  */
>>    pid_t ut_pid;			/* Process ID of login process.  */
>> +  __NONSTRING
>>    char ut_line[UT_LINESIZE];	/* Devicename.  */
>> +  __NONSTRING
>>    char ut_id[4];		/* Inittab ID.  */
>> +  __NONSTRING
>>    char ut_user[UT_NAMESIZE];	/* Username.  */
>> +  __NONSTRING
>>    char ut_host[UT_HOSTSIZE];	/* Hostname for remote login.  */
>>    struct exit_status ut_exit;	/* Exit status of a process marked
>>  				   as DEAD_PROCESS.  */
>
> This breaks builds with the latest gcc since your commit to warn on
> non-string arguments to strlen:

I didn't see any warnings in my x86_64-linux build but there could
be others on targets I don't test.  The only problem I do know of
(bug 22447) was fixed last Wednesday.

It would be helpful if you could share some details about the code
that triggers the warning, what target, etc.

Martin
Siddhesh Poyarekar Nov. 27, 2017, 4:24 p.m. UTC | #31
On Monday 27 November 2017 09:34 PM, Martin Sebor wrote:
> I didn't see any warnings in my x86_64-linux build but there could
> be others on targets I don't test.  The only problem I do know of
> (bug 22447) was fixed last Wednesday.
> 
> It would be helpful if you could share some details about the code
> that triggers the warning, what target, etc.

I saw this during an aarch64 cross-build on an x86_64 host:

In file included from ../sysdeps/unix/sysv/linux/getlogin_r.c:25:
../sysdeps/unix/getlogin_r.c: In function ‘getlogin_r_fd0’:
../sysdeps/unix/getlogin_r.c:83:23: error: ‘strlen’ argument 1 declared
attribute ‘nonstring’ [-Werror=stringop-overflow=]
       size_t needed = strlen (ut->ut_user) + 1;
                       ^~~~~~~~~~~~~~~~~~~~
In file included from ../login/utmp.h:29,
                 from ../include/utmp.h:2,
                 from ../sysdeps/unix/getlogin_r.c:26,
                 from ../sysdeps/unix/sysv/linux/getlogin_r.c:25:
../sysdeps/gnu/bits/utmp.h:65:8: note: argument ‘ut_user’ declared here
   char ut_user[UT_NAMESIZE]
        ^~~~~~~
cc1: all warnings being treated as errors

Siddhesh
Siddhesh Poyarekar Nov. 27, 2017, 4:26 p.m. UTC | #32
On Monday 27 November 2017 09:54 PM, Siddhesh Poyarekar wrote:
> On Monday 27 November 2017 09:34 PM, Martin Sebor wrote:
>> I didn't see any warnings in my x86_64-linux build but there could
>> be others on targets I don't test.  The only problem I do know of
>> (bug 22447) was fixed last Wednesday.
>>
>> It would be helpful if you could share some details about the code
>> that triggers the warning, what target, etc.
> 
> I saw this during an aarch64 cross-build on an x86_64 host:
> 
> In file included from ../sysdeps/unix/sysv/linux/getlogin_r.c:25:
> ../sysdeps/unix/getlogin_r.c: In function ‘getlogin_r_fd0’:
> ../sysdeps/unix/getlogin_r.c:83:23: error: ‘strlen’ argument 1 declared
> attribute ‘nonstring’ [-Werror=stringop-overflow=]
>        size_t needed = strlen (ut->ut_user) + 1;
>                        ^~~~~~~~~~~~~~~~~~~~
> In file included from ../login/utmp.h:29,
>                  from ../include/utmp.h:2,
>                  from ../sysdeps/unix/getlogin_r.c:26,
>                  from ../sysdeps/unix/sysv/linux/getlogin_r.c:25:
> ../sysdeps/gnu/bits/utmp.h:65:8: note: argument ‘ut_user’ declared here
>    char ut_user[UT_NAMESIZE]
>         ^~~~~~~
> cc1: all warnings being treated as errors

Ugh, sorry, that is in fact the fix.  I just noticed that I had not
synced to the HEAD.

Thanks,
Siddhesh
diff mbox series

Patch

The -Wstringop-truncation option new in GCC 8 detects common misuses
of the strncat and strncpy function that may result in truncating
the copied string before the terminating NUL.  To avoid false positive
warnings for correct code that intentionally creates sequences of
characters that aren't guaranteed to be NUL-terminated, arrays that
are intended to store such sequences should be decorated with a new
nonstring attribute.  This change add this attribute to Glibc and
uses it to suppress such false positives.

ChangeLog:
	* misc/sys/cdefs.h (__NONSTRING): New macro.
	* sysdeps/gnu/bits/utmp.h (struct utmp): Use it.
	* sysdeps/gnu/net/if.h (struct ifreq): Same.
	* timezone/tzfile.h (struct tzhead): Same.

diff --git a/misc/sys/cdefs.h b/misc/sys/cdefs.h
index cfd39d5..af6d1ac 100644
--- a/misc/sys/cdefs.h
+++ b/misc/sys/cdefs.h
@@ -407,6 +407,14 @@ 
 # endif
 #endif
 
+#if __GNUC__ >= 8
+/* Describes a char array that is not necessarily a NUL-terminated
+   string.  */
+# define __NONSTRING __attribute__ ((__nonstring__))
+#else
+# define __NONSTRING
+#endif
+
 #if (!defined _Static_assert && !defined __cplusplus \
      && (defined __STDC_VERSION__ ? __STDC_VERSION__ : 0) < 201112 \
      && (!__GNUC_PREREQ (4, 6) || defined __STRICT_ANSI__))
diff --git a/sysdeps/gnu/bits/utmp.h b/sysdeps/gnu/bits/utmp.h
index 2ee11cb..d4885f3 100644
--- a/sysdeps/gnu/bits/utmp.h
+++ b/sysdeps/gnu/bits/utmp.h
@@ -59,9 +59,13 @@  struct utmp
 {
   short int ut_type;		/* Type of login.  */
   pid_t ut_pid;			/* Process ID of login process.  */
+  __NONSTRING
   char ut_line[UT_LINESIZE];	/* Devicename.  */
+  __NONSTRING
   char ut_id[4];		/* Inittab ID.  */
+  __NONSTRING
   char ut_user[UT_NAMESIZE];	/* Username.  */
+  __NONSTRING
   char ut_host[UT_HOSTSIZE];	/* Hostname for remote login.  */
   struct exit_status ut_exit;	/* Exit status of a process marked
 				   as DEAD_PROCESS.  */
diff --git a/sysdeps/gnu/net/if.h b/sysdeps/gnu/net/if.h
index 0afce08..eb11813 100644
--- a/sysdeps/gnu/net/if.h
+++ b/sysdeps/gnu/net/if.h
@@ -129,6 +129,7 @@  struct ifreq
 # define IFNAMSIZ	IF_NAMESIZE
     union
       {
+	__NONSTRING
 	char ifrn_name[IFNAMSIZ];	/* Interface name, e.g. "en0".  */
       } ifr_ifrn;
 
diff --git a/timezone/tzfile.h b/timezone/tzfile.h
index 0e51dce..19fe20a 100644
--- a/timezone/tzfile.h
+++ b/timezone/tzfile.h
@@ -38,6 +38,7 @@ 
 #define	TZ_MAGIC	"TZif"
 
 struct tzhead {
+	__NONSTRING
 	char	tzh_magic[4];		/* TZ_MAGIC */
 	char	tzh_version[1];		/* '\0' or '2' or '3' as of 2013 */
 	char	tzh_reserved[15];	/* reserved; must be zero */