diff mbox series

[v2] socket: Implement sockaddr_storage with an anonymous union

Message ID 20230120134043.10247-1-alx@kernel.org
State New
Headers show
Series [v2] socket: Implement sockaddr_storage with an anonymous union | expand

Commit Message

Alejandro Colomar Jan. 20, 2023, 1:40 p.m. UTC
The historical design of `sockaddr_storage` makes it impossible to use
without breaking strict aliasing rules.  Not only this type is unusable,
but even the use of other `sockaddr_*` structures directly (when the
programmer only cares about a single address family) is also incorrect,
since at some point the structure will be accessed as a `sockaddr`, and
that breaks strict aliasing rules too.

So, the only way for a programmer to not invoke Undefined Behavior is to
declare a union that includes `sockaddr` and any `sockaddr_*` structures
that are of interest, which allows later accessing as either the correct
structure or plain `sockaddr` for the sa_family.

This patch fixes sockaddr_storage to remove UB on its uses and make it
that structure that everybody should be using.  It also allows removing
many casts in code that needs to pass a sockaddr as a side effect.

The following is an example of how this improves both existing code and
new code:

void
foo(foo)
{
    struct old_sockaddr_storage  oss;
    struct new_sockaddr_storage  nss;

    // ... (initialize oss and nss)

    inet_sockaddr2str(&nss.sa);  // correct (and has no casts)
    inet_sockaddr2str((struct sockaddr *)&oss);  // UB
    inet_sockaddr2str((struct sockaddr *)&nss);  // correct
}

/* This function is correct, as far as the accessed object has the
 * type we're using.  That's only possible through a `union`, since
 * we're accessing it with 2 different types: `sockaddr` for the
 * `sa_family` and then the appropriate subtype for the address
 * itself.
 */
const char *
inet_sockaddr2str(const struct sockaddr *sa)
{
    struct sockaddr_in   *sin;
    struct sockaddr_in6  *sin6;

    static char          buf[MAXHOSTNAMELEN];

    switch (sa->sa_family) {
    case AF_INET:
        sin = (struct sockaddr_in *) sa;
        inet_ntop(AF_INET, &sin->sin_addr, buf, NITEMS(buf));
        return buf;
    case AF_INET6:
        sin6 = (struct sockaddr_in6 *) sa;
        inet_ntop(AF_INET6, &sin6->sin6_addr, buf, NITEMS(buf));
        return buf;
    default:
        errno = EAFNOSUPPORT;
        return NULL;
    }
}

While it's not necessary to do the same for `sockaddr`, it might still
be interesting to so, since it will allow removing many casts in the
implementation of many libc functions.

Link: <https://lore.kernel.org/linux-man/2860541.uBSZ6KuyZf@portable-bastien/T/>
Link: <https://inbox.sourceware.org/libc-alpha/0f25d60f-f183-b518-b6c1-6d46aa63ee57@gmail.com/T/>
Link: <https://stackoverflow.com/a/42190913/6872717>
Link: <https://software.codidact.com/posts/287748>
Cc: Bastien Roucariès <rouca@debian.org>
Cc: Eric Blake <eblake@redhat.com>
Cc: Zack Weinberg <zack@owlfolio.org>
Cc: Stefan Puiu <stefan.puiu@gmail.com>
Cc: Igor Sysoev <igor@sysoev.ru>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---

v2:

-  Fix incorrect cast in commit message.

 bits/socket.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Joseph Myers Jan. 20, 2023, 5:49 p.m. UTC | #1
On Fri, 20 Jan 2023, Alejandro Colomar via Libc-alpha wrote:

> This patch fixes sockaddr_storage to remove UB on its uses and make it
> that structure that everybody should be using.  It also allows removing
> many casts in code that needs to pass a sockaddr as a side effect.

This patch only changes the fallback version of bits/socket.h that would 
be used by a new OS port that doesn't have its own, not either of the 
versions that any current glibc port actually uses 
(sysdeps/unix/sysv/linux/bits/socket.h and 
sysdeps/mach/hurd/bits/socket.h).
Zack Weinberg Jan. 20, 2023, 6:04 p.m. UTC | #2
On Fri, Jan 20, 2023, at 8:40 AM, Alejandro Colomar wrote:
> The historical design of `sockaddr_storage` makes it impossible to use
> without breaking strict aliasing rules.  Not only this type is unusable,
> but even the use of other `sockaddr_*` structures directly (when the
> programmer only cares about a single address family) is also incorrect,
> since at some point the structure will be accessed as a `sockaddr`, and
> that breaks strict aliasing rules too.
>
> So, the only way for a programmer to not invoke Undefined Behavior is to
> declare a union that includes `sockaddr` and any `sockaddr_*` structures
> that are of interest, which allows later accessing as either the correct
> structure or plain `sockaddr` for the sa_family.

...

>     struct new_sockaddr_storage  nss;
>
>     // ... (initialize oss and nss)
>
>     inet_sockaddr2str(&nss.sa);  // correct (and has no casts)

I think we need to move slowly here and be _sure_ that any proposed change
does in fact reduce the amount of UB.  This construct, in particular, might
not actually be correct in practice: see https://godbolt.org/z/rn51cracn for
a case where, if I'm reading it right, the compiler assumes that a write
through a `struct fancy *` cannot alter the values accessible through a
`struct simple *` even though both pointers point into the same union.
(Test case provided by <https://stackoverflow.com/users/363751/supercat>;
I don't know any other identifier for them.)

zw
Alejandro Colomar Jan. 20, 2023, 7:25 p.m. UTC | #3
[CC += GCC]  // pun not intended :P

Hi Zack,

On 1/20/23 19:04, Zack Weinberg wrote:
> On Fri, Jan 20, 2023, at 8:40 AM, Alejandro Colomar wrote:
>> The historical design of `sockaddr_storage` makes it impossible to use
>> without breaking strict aliasing rules.  Not only this type is unusable,
>> but even the use of other `sockaddr_*` structures directly (when the
>> programmer only cares about a single address family) is also incorrect,
>> since at some point the structure will be accessed as a `sockaddr`, and
>> that breaks strict aliasing rules too.
>>
>> So, the only way for a programmer to not invoke Undefined Behavior is to
>> declare a union that includes `sockaddr` and any `sockaddr_*` structures
>> that are of interest, which allows later accessing as either the correct
>> structure or plain `sockaddr` for the sa_family.
> 
> ...
> 
>>      struct new_sockaddr_storage  nss;
>>
>>      // ... (initialize oss and nss)
>>
>>      inet_sockaddr2str(&nss.sa);  // correct (and has no casts)
> 
> I think we need to move slowly here and be _sure_ that any proposed change
> does in fact reduce the amount of UB.

Sure, I'm just sending the patch to polish the idea around some concrete code. 
While I checked that it compiles, I didn't add any tests about it or anything, 
to see that it's usable (and Joseph's email showed me that it's far from being 
finished).  I expect that this'll take some time.


>  This construct, in particular, might
> not actually be correct in practice: see https://godbolt.org/z/rn51cracn for
> a case where, if I'm reading it right, the compiler assumes that a write
> through a `struct fancy *` cannot alter the values accessible through a
> `struct simple *` even though both pointers point into the same union.
> (Test case provided by <https://stackoverflow.com/users/363751/supercat>;

I @supercat around?  Maybe you want to open a question on SO (or Codidact) and 
we can discuss it there; it would be interesting.  :)

About the program, I have doubts.  It's more or less what I asked on Codidact 
yesterday[1].  I can't find anything in the standard to support GCC's behavior, 
and am inclined to think that it's a compiler bug.  @Lundin's answer[2] seems 
reasonable.  But still, my understanding until yesterday seemed to be in line 
with the compiler behavior that you showed: the compiler sees a pointer to a 
different type, and assumes things.

I believe it's fine according to the common initial sequence rule in 
C11::6.5.2.3/6 <https://port70.net/%7Ensz/c/c11/n1570.html#6.5.2.3p6>
in combination with the pointer conversion rule in C11::6.3.2.3/7 
<https://port70.net/%7Ensz/c/c11/n1570.html#6.3.2.3p7>.

The test you showed in godbolt shows two behaviors: sometimes it prints 1/3 
(this is correct; it happens for high values of -O) and sometimes it prints 3/3 
(this is invalid; it happens for low values of -O).  BTW, I'm a bit surprised 
that GCC optimizes out the local in -O0 or even -Og.

What I guess that GCC is doing in the invalid case is that in main() it sees 
that myThing is modified in vtest(), and then the structure is only used through 
a pointer to a different member, which according to 6.5.2.3/6 is only allowed 
for reading but not writing, so it assumes that loading the structure again at 
printf() will be fine, so it optimizes out the local.  However, GCC didn't take 
into account that the pointer can later be casted back to the right member, and 
that way it's allowed to modify it, according to 6.3.2.3/7.

I had dropped GCC from the CC in the email to not spam them too much, but since 
this concerns them, I'll keep them in the loop from now on.



[1]:  <https://software.codidact.com/posts/287748>

[2]:  <https://software.codidact.com/posts/287748/287750#answer-287750>

> I don't know any other identifier for them.)
> 
> zw

Cheers,

Alex
Alejandro Colomar Jan. 20, 2023, 7:26 p.m. UTC | #4
Hi Joseph,

On 1/20/23 18:49, Joseph Myers wrote:
> On Fri, 20 Jan 2023, Alejandro Colomar via Libc-alpha wrote:
> 
>> This patch fixes sockaddr_storage to remove UB on its uses and make it
>> that structure that everybody should be using.  It also allows removing
>> many casts in code that needs to pass a sockaddr as a side effect.
> 
> This patch only changes the fallback version of bits/socket.h that would
> be used by a new OS port that doesn't have its own, not either of the
> versions that any current glibc port actually uses
> (sysdeps/unix/sysv/linux/bits/socket.h and
> sysdeps/mach/hurd/bits/socket.h).

Hi Joseph!

Thanks; I didn't notice about those.  It seems that patching those is not so 
trivial, since bits/socket.h is included by the headers that provide the 
specialized sockaddr_* types.  I guess we need to move sockaddr_{in,in6,un} to 
bits/ headers, right?

Cheers,

Alex

>
Bastien Roucariès Jan. 20, 2023, 8:32 p.m. UTC | #5
Le vendredi 20 janvier 2023, 13:40:44 UTC Alejandro Colomar a écrit :
> The historical design of `sockaddr_storage` makes it impossible to use
> without breaking strict aliasing rules.  Not only this type is unusable,
> but even the use of other `sockaddr_*` structures directly (when the
> programmer only cares about a single address family) is also incorrect,
> since at some point the structure will be accessed as a `sockaddr`, and
> that breaks strict aliasing rules too.
> 
> So, the only way for a programmer to not invoke Undefined Behavior is to
> declare a union that includes `sockaddr` and any `sockaddr_*` structures
> that are of interest, which allows later accessing as either the correct
> structure or plain `sockaddr` for the sa_family.
> 
> This patch fixes sockaddr_storage to remove UB on its uses and make it
> that structure that everybody should be using.  It also allows removing
> many casts in code that needs to pass a sockaddr as a side effect.
> 
> The following is an example of how this improves both existing code and
> new code:
> 
> void
> foo(foo)
> {
>     struct old_sockaddr_storage  oss;
>     struct new_sockaddr_storage  nss;
> 
>     // ... (initialize oss and nss)
> 
>     inet_sockaddr2str(&nss.sa);  // correct (and has no casts)
>     inet_sockaddr2str((struct sockaddr *)&oss);  // UB
>     inet_sockaddr2str((struct sockaddr *)&nss);  // correct
> }
> 
> /* This function is correct, as far as the accessed object has the
>  * type we're using.  That's only possible through a `union`, since
>  * we're accessing it with 2 different types: `sockaddr` for the
>  * `sa_family` and then the appropriate subtype for the address
>  * itself.
>  */
> const char *
> inet_sockaddr2str(const struct sockaddr *sa)
> {
>     struct sockaddr_in   *sin;
>     struct sockaddr_in6  *sin6;
> 
>     static char          buf[MAXHOSTNAMELEN];
> 
>     switch (sa->sa_family) {
>     case AF_INET:
>         sin = (struct sockaddr_in *) sa;
>         inet_ntop(AF_INET, &sin->sin_addr, buf, NITEMS(buf));
>         return buf;
>     case AF_INET6:
>         sin6 = (struct sockaddr_in6 *) sa;
>         inet_ntop(AF_INET6, &sin6->sin6_addr, buf, NITEMS(buf));
>         return buf;
>     default:
>         errno = EAFNOSUPPORT;
>         return NULL;
>     }
> }
> 
> While it's not necessary to do the same for `sockaddr`, it might still
> be interesting to so, since it will allow removing many casts in the
> implementation of many libc functions.
> 
> Link: <https://lore.kernel.org/linux-man/2860541.uBSZ6KuyZf@portable-bastien/T/>
> Link: <https://inbox.sourceware.org/libc-alpha/0f25d60f-f183-b518-b6c1-6d46aa63ee57@gmail.com/T/>
> Link: <https://stackoverflow.com/a/42190913/6872717>
> Link: <https://software.codidact.com/posts/287748>
> Cc: Bastien Roucariès <rouca@debian.org>
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Zack Weinberg <zack@owlfolio.org>
> Cc: Stefan Puiu <stefan.puiu@gmail.com>
> Cc: Igor Sysoev <igor@sysoev.ru>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
> 
> v2:
> 
> -  Fix incorrect cast in commit message.
> 
>  bits/socket.h | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/bits/socket.h b/bits/socket.h
> index aac8c49b00..c0c23b4e84 100644
> --- a/bits/socket.h
> +++ b/bits/socket.h
> @@ -168,9 +168,14 @@ struct sockaddr
>  
>  struct sockaddr_storage
>    {
> -    __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
> -    char __ss_padding[_SS_PADSIZE];
> -    __ss_aligntype __ss_align;	/* Force desired alignment.  */
no this is not correct you break ABI by reducing size
> +    union
> +      {
> +        __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
> +        struct sockaddr      sa;
> +        struct sockaddr_in   sin;
> +        struct sockaddr_in6  sin6;
> +        struct sockaddr_un   sun;
> +      };
>    };

Correct one structure is

struct __private_sock_storage {
    __SOCKADDR_COMMON (ssprivate_);   /* Address family, etc. */
    char __ss_padding[_SS_PADSIZE];
    __ss_aligntype __ss_align; /* Force desired alignment. */
}

 struct sockaddr_storage
   {
       union
      {
         __SOCKADDR_COMMON (ss_);       /* Address family, etc. */
        struct sockaddr      sa;
         struct sockaddr_in   sin;
        struct sockaddr_in6  sin6;
        struct sockaddr_un   sun;
        struct __private_sock_storage _private;
      };
};

May it could be dropped later using align construct for modern C and padding

Bastien
>  
>  
>
Alejandro Colomar Jan. 20, 2023, 8:38 p.m. UTC | #6
Hi Bastien,

On 1/20/23 21:32, Bastien Roucariès wrote:
[...]

>> diff --git a/bits/socket.h b/bits/socket.h
>> index aac8c49b00..c0c23b4e84 100644
>> --- a/bits/socket.h
>> +++ b/bits/socket.h
>> @@ -168,9 +168,14 @@ struct sockaddr
>>   
>>   struct sockaddr_storage
>>     {
>> -    __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
>> -    char __ss_padding[_SS_PADSIZE];
>> -    __ss_aligntype __ss_align;	/* Force desired alignment.  */
> no this is not correct you break ABI by reducing size
>> +    union
>> +      {
>> +        __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
>> +        struct sockaddr      sa;
>> +        struct sockaddr_in   sin;
>> +        struct sockaddr_in6  sin6;
>> +        struct sockaddr_un   sun;
>> +      };
>>     };
> 
> Correct one structure is
> 
> struct __private_sock_storage {
>      __SOCKADDR_COMMON (ssprivate_);   /* Address family, etc. */
>      char __ss_padding[_SS_PADSIZE];
>      __ss_aligntype __ss_align; /* Force desired alignment. */
> }

What is this structure for?  I expect that it's for declaring a wide-enough and 
correctly aligned type, but the union containing all the other types already 
guarantees a size as wide as any other sockaddr_* and with the widest alignment.

Also, any member that is necessary for superalignment or padding could be added 
at the end of sockaddr_storage, after the anon union; you don't need the extra 
struct, I guess.

Right?

> 
>   struct sockaddr_storage
>     {
>         union
>        {
>           __SOCKADDR_COMMON (ss_);       /* Address family, etc. */
>          struct sockaddr      sa;
>           struct sockaddr_in   sin;
>          struct sockaddr_in6  sin6;
>          struct sockaddr_un   sun;
>          struct __private_sock_storage _private;
>        };
> };
> 
> May it could be dropped later using align construct for modern C and padding
> 

Cheers,

Alex

> Bastien
>>   
>>   
>>
>
Bastien Roucariès Jan. 20, 2023, 8:46 p.m. UTC | #7
Le vendredi 20 janvier 2023, 20:38:32 UTC Alejandro Colomar a écrit :
> Hi Bastien,
> 
> On 1/20/23 21:32, Bastien Roucariès wrote:
> [...]
> 
> >> diff --git a/bits/socket.h b/bits/socket.h
> >> index aac8c49b00..c0c23b4e84 100644
> >> --- a/bits/socket.h
> >> +++ b/bits/socket.h
> >> @@ -168,9 +168,14 @@ struct sockaddr
> >>   
> >>   struct sockaddr_storage
> >>     {
> >> -    __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
> >> -    char __ss_padding[_SS_PADSIZE];
> >> -    __ss_aligntype __ss_align;	/* Force desired alignment.  */
> > no this is not correct you break ABI by reducing size
> >> +    union
> >> +      {
> >> +        __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
> >> +        struct sockaddr      sa;
> >> +        struct sockaddr_in   sin;
> >> +        struct sockaddr_in6  sin6;
> >> +        struct sockaddr_un   sun;
> >> +      };
> >>     };
> > 
> > Correct one structure is
> > 
> > struct __private_sock_storage {
> >      __SOCKADDR_COMMON (ssprivate_);   /* Address family, etc. */
> >      char __ss_padding[_SS_PADSIZE];
> >      __ss_aligntype __ss_align; /* Force desired alignment. */
> > }
> 
> What is this structure for?  I expect that it's for declaring a wide-enough and 
> correctly aligned type, but the union containing all the other types already 
> guarantees a size as wide as any other sockaddr_* and with the widest alignment.
> 
> Also, any member that is necessary for superalignment or padding could be added 
> at the end of sockaddr_storage, after the anon union; you don't need the extra 
> struct, I guess.

No we need it, max of structure is struct sockaddr_un   sun and is size of 108.
sizeof(sockaddr_storage) is 128...

Did you see the line of the kernel source I send you ? kernel expect size of 109 for un aka we should pad by a nul byte...

I think it is safer in a first step, to keep the old structure... Maybe later simplify

Did you also see 
https://github.com/bminor/glibc/blob/master/socket/sys/socket.h#L63

Bastien


> 
> Right?
> 
> > 
> >   struct sockaddr_storage
> >     {
> >         union
> >        {
> >           __SOCKADDR_COMMON (ss_);       /* Address family, etc. */
> >          struct sockaddr      sa;
> >           struct sockaddr_in   sin;
> >          struct sockaddr_in6  sin6;
> >          struct sockaddr_un   sun;
> >          struct __private_sock_storage _private;
> >        };
> > };
> > 
> > May it could be dropped later using align construct for modern C and padding
> > 
> 
> Cheers,
> 
> Alex
> 
> > Bastien
> >>   
> >>   
> >>
> > 
> 
> -- 
> <http://www.alejandro-colomar.es/>
>
Alejandro Colomar Jan. 20, 2023, 8:51 p.m. UTC | #8
On 1/20/23 21:46, Bastien Roucariès wrote:
> Le vendredi 20 janvier 2023, 20:38:32 UTC Alejandro Colomar a écrit :
>> Hi Bastien,
>>
>> On 1/20/23 21:32, Bastien Roucariès wrote:
>> [...]
>>
>>>> diff --git a/bits/socket.h b/bits/socket.h
>>>> index aac8c49b00..c0c23b4e84 100644
>>>> --- a/bits/socket.h
>>>> +++ b/bits/socket.h
>>>> @@ -168,9 +168,14 @@ struct sockaddr
>>>>    
>>>>    struct sockaddr_storage
>>>>      {
>>>> -    __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
>>>> -    char __ss_padding[_SS_PADSIZE];
>>>> -    __ss_aligntype __ss_align;	/* Force desired alignment.  */
>>> no this is not correct you break ABI by reducing size
>>>> +    union
>>>> +      {
>>>> +        __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
>>>> +        struct sockaddr      sa;
>>>> +        struct sockaddr_in   sin;
>>>> +        struct sockaddr_in6  sin6;
>>>> +        struct sockaddr_un   sun;
>>>> +      };
>>>>      };
>>>
>>> Correct one structure is
>>>
>>> struct __private_sock_storage {
>>>       __SOCKADDR_COMMON (ssprivate_);   /* Address family, etc. */
>>>       char __ss_padding[_SS_PADSIZE];
>>>       __ss_aligntype __ss_align; /* Force desired alignment. */
>>> }
>>
>> What is this structure for?  I expect that it's for declaring a wide-enough and
>> correctly aligned type, but the union containing all the other types already
>> guarantees a size as wide as any other sockaddr_* and with the widest alignment.
>>
>> Also, any member that is necessary for superalignment or padding could be added
>> at the end of sockaddr_storage, after the anon union; you don't need the extra
>> struct, I guess.
> 
> No we need it, max of structure is struct sockaddr_un   sun and is size of 108.
> sizeof(sockaddr_storage) is 128...
> 
> Did you see the line of the kernel source I send you ? kernel expect size of 109 for un aka we should pad by a nul byte...

Yes, I saw it.  But that line from the kernel is already Undefined Behavior. 
The correct fix should go to sockaddr_un, not sockaddr_storage, IMO.

However, applying this change to sockaddr_storage would expose that kernel bug, 
so I think a prepatch to sockaddr_un that adds a padding byte to sockaddr_un 
would make sense.

struct sockaddr_un {
	__kernel_sa_family_t sun_family; /* AF_UNIX */
	char sun_path[UNIX_PATH_MAX];	/* pathname */
	char __null;  // make sure sun_path is terminated
};


> 
> I think it is safer in a first step, to keep the old structure... Maybe later simplify
> 
> Did you also see
> https://github.com/bminor/glibc/blob/master/socket/sys/socket.h#L63

Heh, I didn't see that one :)
I'll take it into account for a revision of the patch.

Cheers,

Alex
Alejandro Colomar Jan. 21, 2023, 2:38 a.m. UTC | #9
Hi Zack,

On 1/20/23 20:25, Alejandro Colomar wrote:
> [CC += GCC]  // pun not intended :P
> 
> Hi Zack,
> 
> On 1/20/23 19:04, Zack Weinberg wrote:
>> On Fri, Jan 20, 2023, at 8:40 AM, Alejandro Colomar wrote:
>>> The historical design of `sockaddr_storage` makes it impossible to use
>>> without breaking strict aliasing rules.  Not only this type is unusable,
>>> but even the use of other `sockaddr_*` structures directly (when the
>>> programmer only cares about a single address family) is also incorrect,
>>> since at some point the structure will be accessed as a `sockaddr`, and
>>> that breaks strict aliasing rules too.
>>>
>>> So, the only way for a programmer to not invoke Undefined Behavior is to
>>> declare a union that includes `sockaddr` and any `sockaddr_*` structures
>>> that are of interest, which allows later accessing as either the correct
>>> structure or plain `sockaddr` for the sa_family.
>>
>> ...
>>
>>>      struct new_sockaddr_storage  nss;
>>>
>>>      // ... (initialize oss and nss)
>>>
>>>      inet_sockaddr2str(&nss.sa);  // correct (and has no casts)
>>
>> I think we need to move slowly here and be _sure_ that any proposed change
>> does in fact reduce the amount of UB.
> 
> Sure, I'm just sending the patch to polish the idea around some concrete code. 
> While I checked that it compiles, I didn't add any tests about it or anything, 
> to see that it's usable (and Joseph's email showed me that it's far from being 
> finished).  I expect that this'll take some time.
> 
> 
>>  This construct, in particular, might
>> not actually be correct in practice: see https://godbolt.org/z/rn51cracn for
>> a case where, if I'm reading it right, the compiler assumes that a write
>> through a `struct fancy *` cannot alter the values accessible through a
>> `struct simple *` even though both pointers point into the same union.
>> (Test case provided by <https://stackoverflow.com/users/363751/supercat>;
> 

[...]

I was wrong in my guess; the correct output is 3/3; I think I had read it the 
other way around.  So yes, I believe it's doing what you just wrote there, but 
can't understand why.

I reduced @supercat's example to a smaller reproducer program (I couldn't 
minimize it any more than this; any further simplification removes the incorrect 
behavior):

#include <stdio.h>

struct a { int y[1];};
struct b { int y[1];};
union u  { struct a a; struct b b; };


int read_a(struct a *a)
{
     return a->y[0];
}

void write_b(struct b *b, int j)
{
     b->y[j] = 2;
}

int use_union(union u *u, int j)
{
     if (u->a.y[0] == 0)
         write_b(&u->b, j);
         //write_b((struct b *)u, j);   // this has the same issue
     return read_a(&u->a);
     return read_a((struct a *)u);      // this has the same issue
}

int (*volatile vtest)(union u *u, int j) = use_union;

int main(void)
{
     int       r1, r2;
     union u   u;
     struct b  b = {0};

     u.b = b;
     r1 = vtest(&u, 0);
     r2 = u.a.y[0];

     printf("%d/%d\n", r1, r2);
}


Cheers,

Alex
Alejandro Colomar Jan. 21, 2023, 3:17 a.m. UTC | #10
Hi Zack,

On 1/21/23 03:38, Alejandro Colomar wrote:
> Hi Zack,
> 
> On 1/20/23 20:25, Alejandro Colomar wrote:
>> [CC += GCC]  // pun not intended :P
>>
>> Hi Zack,
>>
>> On 1/20/23 19:04, Zack Weinberg wrote:
>>> On Fri, Jan 20, 2023, at 8:40 AM, Alejandro Colomar wrote:
>>>> The historical design of `sockaddr_storage` makes it impossible to use
>>>> without breaking strict aliasing rules.  Not only this type is unusable,
>>>> but even the use of other `sockaddr_*` structures directly (when the
>>>> programmer only cares about a single address family) is also incorrect,
>>>> since at some point the structure will be accessed as a `sockaddr`, and
>>>> that breaks strict aliasing rules too.
>>>>
>>>> So, the only way for a programmer to not invoke Undefined Behavior is to
>>>> declare a union that includes `sockaddr` and any `sockaddr_*` structures
>>>> that are of interest, which allows later accessing as either the correct
>>>> structure or plain `sockaddr` for the sa_family.
>>>
>>> ...
>>>
>>>>      struct new_sockaddr_storage  nss;
>>>>
>>>>      // ... (initialize oss and nss)
>>>>
>>>>      inet_sockaddr2str(&nss.sa);  // correct (and has no casts)
>>>
>>> I think we need to move slowly here and be _sure_ that any proposed change
>>> does in fact reduce the amount of UB.
>>
>> Sure, I'm just sending the patch to polish the idea around some concrete code. 
>> While I checked that it compiles, I didn't add any tests about it or anything, 
>> to see that it's usable (and Joseph's email showed me that it's far from being 
>> finished).  I expect that this'll take some time.
>>
>>
>>>  This construct, in particular, might
>>> not actually be correct in practice: see https://godbolt.org/z/rn51cracn for
>>> a case where, if I'm reading it right, the compiler assumes that a write
>>> through a `struct fancy *` cannot alter the values accessible through a
>>> `struct simple *` even though both pointers point into the same union.
>>> (Test case provided by <https://stackoverflow.com/users/363751/supercat>;
>>
> 
> [...]
> 
> I was wrong in my guess; the correct output is 3/3; I think I had read it the 
> other way around.  So yes, I believe it's doing what you just wrote there, but 
> can't understand why.
> 
> I reduced @supercat's example to a smaller reproducer program (I couldn't 
> minimize it any more than this; any further simplification removes the incorrect 
> behavior):
> 
> #include <stdio.h>
> 
> struct a { int y[1];};
> struct b { int y[1];};
> union u  { struct a a; struct b b; };
> 
> 
> int read_a(struct a *a)
> {
>      return a->y[0];
> }
> 
> void write_b(struct b *b, int j)
> {
>      b->y[j] = 2;
> }
> 
> int use_union(union u *u, int j)
> {
>      if (u->a.y[0] == 0)
>          write_b(&u->b, j);
>          //write_b((struct b *)u, j);   // this has the same issue
>      return read_a(&u->a);
>      return read_a((struct a *)u);      // this has the same issue
> }
> 
> int (*volatile vtest)(union u *u, int j) = use_union;
> 
> int main(void)
> {
>      int       r1, r2;
>      union u   u;
>      struct b  b = {0};
> 
>      u.b = b;
>      r1 = vtest(&u, 0);
>      r2 = u.a.y[0];
> 
>      printf("%d/%d\n", r1, r2);
> }


Ahh, indeed it seems to be UB.  It's in the same 6.5.2.3/6:  there's a 
requirement that the information about the union is kept in the function in 
which it's accessed.

The standard presents an example, which is a bit ambiguous:

      The following is not a valid fragment (because the union type is not 
visible within function f):

           struct t1 { int m; };
           struct t2 { int m; };
           int f(struct t1 *p1, struct t2 *p2)
           {
                 if (p1->m < 0)
                         p2->m = -p2->m;
                 return p1->m;
           }
           int g()
           {
                 union {
                         struct t1 s1;
                         struct t2 s2;
                 } u;
                 /* ... */
                 return f(&u.s1, &u.s2);
           }

I don't know what's the intention if the union type was global but the variable 
`u` was still not seen by f().  But it seems GCC's interpretation is UB, 
according to the test we just saw.

The solution that I can see for that is making sockaddr also be a union.  That 
way, the union is kept across all calls (since they all use sockaddr).

struct sockaddr {
	union {
		struct {
			sa_family_t  sa_family;
			char         sa_data[14];  // why 14?
		}
		struct sockaddr_in   sin;
		struct sockaddr_in6  sin6;
		struct sockaddr_un   sun;
	};
};


struct sockaddr_storage {
	union {
		sa_family_t          ss_family;
		struct sockaddr      sa;
	};
};


With this, sockaddr_storage becomes useless, but still usable.  New code, could 
just use sockaddr and use the internal union members as necessary.  Union info 
is kept across all function boundaries.

Cheers,

Alex
Bastien Roucariès Jan. 21, 2023, 1:30 p.m. UTC | #11
Le samedi 21 janvier 2023, 03:17:39 UTC Alejandro Colomar a écrit :
> Hi Zack,
> 
> On 1/21/23 03:38, Alejandro Colomar wrote:
> > Hi Zack,
> > 
> > On 1/20/23 20:25, Alejandro Colomar wrote:
> >> [CC += GCC]  // pun not intended :P
> >>
> >> Hi Zack,
> >>
> >> On 1/20/23 19:04, Zack Weinberg wrote:
> >>> On Fri, Jan 20, 2023, at 8:40 AM, Alejandro Colomar wrote:
> >>>> The historical design of `sockaddr_storage` makes it impossible to use
> >>>> without breaking strict aliasing rules.  Not only this type is unusable,
> >>>> but even the use of other `sockaddr_*` structures directly (when the
> >>>> programmer only cares about a single address family) is also incorrect,
> >>>> since at some point the structure will be accessed as a `sockaddr`, and
> >>>> that breaks strict aliasing rules too.
> >>>>
> >>>> So, the only way for a programmer to not invoke Undefined Behavior is to
> >>>> declare a union that includes `sockaddr` and any `sockaddr_*` structures
> >>>> that are of interest, which allows later accessing as either the correct
> >>>> structure or plain `sockaddr` for the sa_family.
> >>>
> >>> ...
> >>>
> >>>>      struct new_sockaddr_storage  nss;
> >>>>
> >>>>      // ... (initialize oss and nss)
> >>>>
> >>>>      inet_sockaddr2str(&nss.sa);  // correct (and has no casts)
> >>>
> >>> I think we need to move slowly here and be _sure_ that any proposed change
> >>> does in fact reduce the amount of UB.
> >>
> >> Sure, I'm just sending the patch to polish the idea around some concrete code. 
> >> While I checked that it compiles, I didn't add any tests about it or anything, 
> >> to see that it's usable (and Joseph's email showed me that it's far from being 
> >> finished).  I expect that this'll take some time.
> >>
> >>
> >>>  This construct, in particular, might
> >>> not actually be correct in practice: see https://godbolt.org/z/rn51cracn for
> >>> a case where, if I'm reading it right, the compiler assumes that a write
> >>> through a `struct fancy *` cannot alter the values accessible through a
> >>> `struct simple *` even though both pointers point into the same union.
> >>> (Test case provided by <https://stackoverflow.com/users/363751/supercat>;
> >>
> > 
> > [...]
> > 
> > I was wrong in my guess; the correct output is 3/3; I think I had read it the 
> > other way around.  So yes, I believe it's doing what you just wrote there, but 
> > can't understand why.
> > 
> > I reduced @supercat's example to a smaller reproducer program (I couldn't 
> > minimize it any more than this; any further simplification removes the incorrect 
> > behavior):
> > 
> > #include <stdio.h>
> > 
> > struct a { int y[1];};
> > struct b { int y[1];};
> > union u  { struct a a; struct b b; };
> > 
> > 
> > int read_a(struct a *a)
> > {
> >      return a->y[0];
> > }
> > 
> > void write_b(struct b *b, int j)
> > {
> >      b->y[j] = 2;
> > }
> > 
> > int use_union(union u *u, int j)
> > {
> >      if (u->a.y[0] == 0)
> >          write_b(&u->b, j);
> >          //write_b((struct b *)u, j);   // this has the same issue
> >      return read_a(&u->a);
> >      return read_a((struct a *)u);      // this has the same issue
> > }
> > 
> > int (*volatile vtest)(union u *u, int j) = use_union;
> > 
> > int main(void)
> > {
> >      int       r1, r2;
> >      union u   u;
> >      struct b  b = {0};
> > 
> >      u.b = b;
> >      r1 = vtest(&u, 0);
> >      r2 = u.a.y[0];
> > 
> >      printf("%d/%d\n", r1, r2);
> > }
> 
> 
> Ahh, indeed it seems to be UB.  It's in the same 6.5.2.3/6:  there's a 
> requirement that the information about the union is kept in the function in 
> which it's accessed.
> 
> The standard presents an example, which is a bit ambiguous:
> 
>       The following is not a valid fragment (because the union type is not 
> visible within function f):
> 
>            struct t1 { int m; };
>            struct t2 { int m; };
>            int f(struct t1 *p1, struct t2 *p2)
>            {
>                  if (p1->m < 0)
>                          p2->m = -p2->m;
>                  return p1->m;
>            }
>            int g()
>            {
>                  union {
>                          struct t1 s1;
>                          struct t2 s2;
>                  } u;
>                  /* ... */
>                  return f(&u.s1, &u.s2);
>            }
> 
> I don't know what's the intention if the union type was global but the variable 
> `u` was still not seen by f().  But it seems GCC's interpretation is UB, 
> according to the test we just saw.
> 
> The solution that I can see for that is making sockaddr also be a union.  That 
> way, the union is kept across all calls (since they all use sockaddr).
> 
> struct sockaddr {
> 	union {
> 		struct {
> 			sa_family_t  sa_family;
> 			char         sa_data[14];  // why 14?
> 		}
> 		struct sockaddr_in   sin;
> 		struct sockaddr_in6  sin6;
> 		struct sockaddr_un   sun;
> 	};
> };

No the solution is to avoid sockaddr and mark as deprecated. The problem it should be part of union without raising a warning each time we use a safe type...

The other solution is to render public  and ABI stable the type here
https://github.com/bminor/glibc/blob/ae612c45efb5e34713859a5facf92368307efb6e/socket/sys/socket.h#L78
under for instance sockaddr_ptr and sockaddr_const_ptr

Moreover this are some patch arch by arch
https://sourceware.org/legacy-ml/libc-alpha/2016-02/msg00340.html that should be made default 

Bastien



> 
> struct sockaddr_storage {
> 	union {
> 		sa_family_t          ss_family;
> 		struct sockaddr      sa;
> 	};
> };
> 
> 
> With this, sockaddr_storage becomes useless, but still usable.  New code, could 
> just use sockaddr and use the internal union members as necessary.  Union info 
> is kept across all function boundaries.
> 
> Cheers,
> 
> Alex
> 
> -- 
> <http://www.alejandro-colomar.es/>
>
Alejandro Colomar Jan. 21, 2023, 2:30 p.m. UTC | #12
Hi Bastien,

On 1/21/23 14:30, Bastien Roucariès wrote:

[...]

>> Ahh, indeed it seems to be UB.  It's in the same 6.5.2.3/6:  there's a
>> requirement that the information about the union is kept in the function in
>> which it's accessed.
>>
>> The standard presents an example, which is a bit ambiguous:
>>
>>        The following is not a valid fragment (because the union type is not
>> visible within function f):
>>

[...]

>>
>> I don't know what's the intention if the union type was global but the variable
>> `u` was still not seen by f().  But it seems GCC's interpretation is UB,
>> according to the test we just saw.
>>
>> The solution that I can see for that is making sockaddr also be a union.  That
>> way, the union is kept across all calls (since they all use sockaddr).
>>
>> struct sockaddr {
>> 	union {
>> 		struct {
>> 			sa_family_t  sa_family;
>> 			char         sa_data[14];  // why 14?
>> 		}
>> 		struct sockaddr_in   sin;
>> 		struct sockaddr_in6  sin6;
>> 		struct sockaddr_un   sun;
>> 	};
>> };
> 
> No the solution is to avoid sockaddr and mark as deprecated.

Declaring `sockaddr` as deprecated means deprecating also:

accept(2)
accept4(2)
bind(2)
connect(2)
getpeername(2)
getsockname(2)
recvfrom(2)
sendto(2)
getnameinfo(3)

which use the type in their prototype.

Also, other types such as `addrinfo`, which contain `sockaddr` would also need 
to be deprecated, which would itself require deprecating:

getaddrinfo(3)
freeaddrinfo(3)

Since addrinfo is itself contained in other structures such as `gaicb`, we would 
also need to deprecate those, which would in turn require deprecating more APIs:

getaddrinfo_a(3)
gai_error(3)
gai_cancel(3)

And maybe I left some.  This feels like nuking the entire networking API, which 
I don't see happening soon.


Otherwise, we need to come up with a solution that keeps these APIs compatible 
with whatever new type we suggest using.  Changing them to use `void*` instead 
of `sockaddr*` would be possible, but would decrease type safety considerably, 
so there should be a good reason for that.

Suggesting to use always `sockaddr_storage` but using `sockaddr` in the function 
parameters keeps the current not-so-nice casting issues, which are not Undefined 
Behavior per se, but not ideal either (in fact, I don't think `void*` is much 
worse than code full of casts).  And it would also be error-prone, since users 
could get the idea that `sockaddr` can be used safely, since it's what gets 
passed as the parameter.

> The problem it should be part of union without raising a warning each time we use a safe type...

I don't understand this; please detail.

> 
> The other solution is to render public  and ABI stable the type here
> https://github.com/bminor/glibc/blob/ae612c45efb5e34713859a5facf92368307efb6e/socket/sys/socket.h#L78
> under for instance sockaddr_ptr and sockaddr_const_ptr

[[gnu::transparent_union]] (used in that link) seems like a "the design of this 
interface is bad, sorry, this workaround will just make it work".  I guess it 
just works, and probably it's the reason that so much undefined behavior hasn't 
exploded so far.  However, if we can solve this using core language features, 
I'd go that way.

> 
> Moreover this are some patch arch by arch
> https://sourceware.org/legacy-ml/libc-alpha/2016-02/msg00340.html that should be made default
> 
> Bastien
> 
> 
> 
>>
>> struct sockaddr_storage {
>> 	union {
>> 		sa_family_t          ss_family;
>> 		struct sockaddr      sa;
>> 	};
>> };

Hmm, this isn't still perfect.  Since the APIs get the sockaddr, this union 
information would be lost.  `sockaddr` needs to be the type that is declared. 
`sockaddr_storage` should just die; there's no way to make it compatible with 
APIs getting a `sockaddr`.  The attribute `transparent_union` is the only way to 
use is safely.

Cheers,

Alex

>>
>>
>> With this, sockaddr_storage becomes useless, but still usable.  New code, could
>> just use sockaddr and use the internal union members as necessary.  Union info
>> is kept across all function boundaries.
Bastien Roucariès Jan. 22, 2023, 2:12 p.m. UTC | #13
Le samedi 21 janvier 2023, 14:30:11 UTC Alejandro Colomar a écrit :
> Hi Bastien,
> 
> On 1/21/23 14:30, Bastien Roucariès wrote:
> 
> [...]
> 
> >> Ahh, indeed it seems to be UB.  It's in the same 6.5.2.3/6:  there's a
> >> requirement that the information about the union is kept in the function in
> >> which it's accessed.
> >>
> >> The standard presents an example, which is a bit ambiguous:
> >>
> >>        The following is not a valid fragment (because the union type is not
> >> visible within function f):
> >>
> 
> [...]
> 
> >>
> >> I don't know what's the intention if the union type was global but the variable
> >> `u` was still not seen by f().  But it seems GCC's interpretation is UB,
> >> according to the test we just saw.
> >>
> >> The solution that I can see for that is making sockaddr also be a union.  That
> >> way, the union is kept across all calls (since they all use sockaddr).
> >>
> >> struct sockaddr {
> >> 	union {
> >> 		struct {
> >> 			sa_family_t  sa_family;
> >> 			char         sa_data[14];  // why 14?
> >> 		}
> >> 		struct sockaddr_in   sin;
> >> 		struct sockaddr_in6  sin6;
> >> 		struct sockaddr_un   sun;
> >> 	};
> >> };
> > 
> > No the solution is to avoid sockaddr and mark as deprecated.
> 
> Declaring `sockaddr` as deprecated means deprecating also:
> 
> accept(2)
> accept4(2)
> bind(2)
> connect(2)
> getpeername(2)
> getsockname(2)
> recvfrom(2)
> sendto(2)
> getnameinfo(3)
> 
> which use the type in their prototype.
> 
> Also, other types such as `addrinfo`, which contain `sockaddr` would also need 
> to be deprecated, which would itself require deprecating:

No because this function will take a opaque transparent union pointer. I mean only raise 
a warning when user declare a variable (storage) of struct sockaddr...
> 
> getaddrinfo(3)
> freeaddrinfo(3)
> 
> Since addrinfo is itself contained in other structures such as `gaicb`, we would 
> also need to deprecate those, which would in turn require deprecating more APIs:
> 
> getaddrinfo_a(3)
> gai_error(3)
> gai_cancel(3)
> 
> And maybe I left some.  This feels like nuking the entire networking API, which 
> I don't see happening soon.
> 
> 
> Otherwise, we need to come up with a solution that keeps these APIs compatible 
> with whatever new type we suggest using.  Changing them to use `void*` instead 
> of `sockaddr*` would be possible, but would decrease type safety considerably, 
> so there should be a good reason for that.
> 
> Suggesting to use always `sockaddr_storage` but using `sockaddr` in the function 
> parameters keeps the current not-so-nice casting issues, which are not Undefined 
> Behavior per se, but not ideal either (in fact, I don't think `void*` is much 
> worse than code full of casts).  And it would also be error-prone, since users 
> could get the idea that `sockaddr` can be used safely, since it's what gets 
> passed as the parameter.
> 
> > The problem it should be part of union without raising a warning each time we use a safe type...
> 
> I don't understand this; please detail.

the transparent union will include sockaddr, thus even if we use it correctly raise a warning...
> 
> > 
> > The other solution is to render public  and ABI stable the type here
> > https://github.com/bminor/glibc/blob/ae612c45efb5e34713859a5facf92368307efb6e/socket/sys/socket.h#L78
> > under for instance sockaddr_ptr and sockaddr_const_ptr
> 
> [[gnu::transparent_union]] (used in that link) seems like a "the design of this 
> interface is bad, sorry, this workaround will just make it work".  I guess it 
> just works, and probably it's the reason that so much undefined behavior hasn't 
> exploded so far.  However, if we can solve this using core language features, 
> I'd go that way.

It solve the problems and could be used without the  [[gnu::transparent_union]], c++17 support transparent union.

The transparent union should also include pointer to sockaddr_storage and bluetooth socket.

Bastien
> 
> > 
> > Moreover this are some patch arch by arch
> > https://sourceware.org/legacy-ml/libc-alpha/2016-02/msg00340.html that should be made default
> > 
> > Bastien
> > 
> > 
> > 
> >>
> >> struct sockaddr_storage {
> >> 	union {
> >> 		sa_family_t          ss_family;
> >> 		struct sockaddr      sa;
> >> 	};
> >> };
> 
> Hmm, this isn't still perfect.  Since the APIs get the sockaddr, this union 
> information would be lost.  `sockaddr` needs to be the type that is declared. 
> `sockaddr_storage` should just die; there's no way to make it compatible with 
> APIs getting a `sockaddr`.  The attribute `transparent_union` is the only way to 
> use is safely.
> 
> Cheers,
> 
> Alex
> 
> >>
> >>
> >> With this, sockaddr_storage becomes useless, but still usable.  New code, could
> >> just use sockaddr and use the internal union members as necessary.  Union info
> >> is kept across all function boundaries.
> 
> -- 
> <http://www.alejandro-colomar.es/>
>
diff mbox series

Patch

diff --git a/bits/socket.h b/bits/socket.h
index aac8c49b00..c0c23b4e84 100644
--- a/bits/socket.h
+++ b/bits/socket.h
@@ -168,9 +168,14 @@  struct sockaddr
 
 struct sockaddr_storage
   {
-    __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
-    char __ss_padding[_SS_PADSIZE];
-    __ss_aligntype __ss_align;	/* Force desired alignment.  */
+    union
+      {
+        __SOCKADDR_COMMON (ss_);	/* Address family, etc.  */
+        struct sockaddr      sa;
+        struct sockaddr_in   sin;
+        struct sockaddr_in6  sin6;
+        struct sockaddr_un   sun;
+      };
   };