[2/3] network: recvmsg and sendmsg standard compliance (BZ#16919)

Message ID 1459175641-12520-3-git-send-email-adhemerval.zanella@linaro.org
State New

Commit Message

Adhemerval Zanella March 28, 2016, 2:34 p.m. UTC
POSIX specifies that msghdr::msg_iovlen and msghdr::msg_controllen be of
type int and socklen_t respectively.  However, Linux defines both as
size_t, so on 64-bit architectures some adjustments are required to make
the functions standard compliant.

This patch fixes it by creating a temporary header and zeroing the pad
fields on 64-bit architectures, where the size of size_t exceeds the size
of int.

Also the new recvmsg and sendmsg implementation is only added on libc,
with libpthread only containing a compat symbol.

Tested on x86_64, i686, aarch64, armhf, and powerpc64le.
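
As a rough illustration of the padding scheme described above, the sketch
below models a single length field.  The type and function names are
hypothetical (they are not the glibc definitions), and the code assumes the
GCC/Clang predefined macros __SIZEOF_SIZE_T__, __SIZEOF_INT__ and
__BYTE_ORDER__.  It shows why the pad has to be zeroed before the kernel
reads the same storage as a size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length field as the kernel reads it.  */
struct kernel_view
{
  size_t len;
};

/* Hypothetical stand-in for the same storage as POSIX-style user code
   sees it: an int, plus a pad where size_t is wider than int.  */
struct posix_view
{
#if __SIZEOF_SIZE_T__ > __SIZEOF_INT__ \
    && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  int pad;
  int len;
#elif __SIZEOF_SIZE_T__ > __SIZEOF_INT__
  int len;
  int pad;
#else
  int len;
#endif
};

/* Reinterpret the user-visible bytes the way the kernel would.  */
static size_t
kernel_len (const struct posix_view *p)
{
  struct kernel_view k;
  assert (p != NULL);
  memcpy (&k, p, sizeof k);
  return k.len;
}

/* Copy the field and zero the pad first, mirroring what the patch does
   with the temporary msghdr.  */
static size_t
kernel_len_fixed (const struct posix_view *p)
{
  struct posix_view tmp;
  assert (p != NULL);
  tmp = *p;
#if __SIZEOF_SIZE_T__ > __SIZEOF_INT__
  tmp.pad = 0;
#endif
  return kernel_len (&tmp);
}
```

With a pad left uninitialized, kernel_len can return a huge bogus length
on 64-bit targets, while kernel_len_fixed returns the value stored in len.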

	* conform/data/sys/socket.h-data (msghdr.msg_iovlen): Remove xfail-
	and change to correct expected type.
	(msghdr.msg_controllen): Likewise.
	(cmsghdr.cmsg_len): Likewise.
	* sysdeps/unix/sysv/linux/bits/socket.h (msghdr.msg_iovlen): Fix
	expected POSIX assumption about the size.
	(msghdr.msg_controllen): Likewise.
	(msghdr.__glibc_reserved1): Likewise.
	(msghdr.__glibc_reserved2): Likewise.
	(cmsghdr.cmsg_len): Likewise.
	(cmsghdr.__glibc_reserved1): Likewise.
	* nptl/Makefile (libpthread-routines): Remove ptw-recvmsg and ptw-sendmsg.
	Add ptw-oldrecvmsg and ptw-oldsendmsg.
	(CFLAGS-sendmsg.c): Remove rule.
	(CFLAGS-recvmsg.c): Likewise.
	(CFLAGS-oldsendmsg.c): Add rule.
	(CFLAGS-oldrecvmsg.c): Likewise.
	* sysdeps/unix/sysv/linux/alpha/Versions [libc] (GLIBC_2.24): Add
	recvmsg and sendmsg.
	* sysdeps/unix/sysv/linux/aarch64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/arm/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/hppa/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/i386/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/ia64/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/m68k/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/microblaze/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n32/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/nios2/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/Versions [libc]
	(GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/sh/Versions [libc] (GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/sparc/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/tile/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/Versions [libc]
	(GLIBC_2.24): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/x86_64/Versions [libc] (GLIBC_2.24):
	Likewise.
	* sysdeps/unix/sysv/linux/Makefile
	[$(subdir) = socket] (sysdep_headers): Add oldrecvmsg and oldsendmsg.
	(CFLAGS-sendmsg.c): Add rule.
	(CFLAGS-recvmsg.c): Likewise.
	(CFLAGS-oldsendmsg.c): Likewise.
	(CFLAGS-oldrecvmsg.c): Likewise.
	* sysdeps/unix/sysv/linux/check_native.c (__check_native): Fix msghdr
	initialization.
	* sysdeps/unix/sysv/linux/check_pf.c (make_request): Likewise.
	* sysdeps/unix/sysv/linux/ifaddrs.c (__netlink_request): Likewise.
	* sysdeps/unix/sysv/linux/oldrecvmsg.c: New file.
	* sysdeps/unix/sysv/linux/oldsendmsg.c: Likewise.
	* sysdeps/unix/sysv/linux/recvmsg.c (__libc_recvmsg): Adjust msghdr
	iovlen and controllen fields to adjust to POSIX specification.
	* sysdeps/unix/sysv/linux/sendmsg.c (__libc_sendmsg): Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libc.abilist: New version and
	added recvmsg and sendmsg.
	* sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/arm/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/hppa/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/nios2/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist:
	Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise.
---
 ChangeLog                                          | 106 +++++++++++++++++++++
 conform/data/sys/socket.h-data                     |   8 +-
 nptl/Makefile                                      |   9 +-
 sysdeps/unix/sysv/linux/Makefile                   |   6 +-
 sysdeps/unix/sysv/linux/aarch64/Versions           |   4 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist       |   3 +
 sysdeps/unix/sysv/linux/alpha/Versions             |   3 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist         |   3 +
 sysdeps/unix/sysv/linux/arm/Versions               |   3 +
 sysdeps/unix/sysv/linux/arm/libc.abilist           |   3 +
 sysdeps/unix/sysv/linux/bits/socket.h              |  45 +++++++--
 sysdeps/unix/sysv/linux/check_native.c             |  11 ++-
 sysdeps/unix/sysv/linux/check_pf.c                 |  11 ++-
 sysdeps/unix/sysv/linux/hppa/Versions              |   3 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist          |   3 +
 sysdeps/unix/sysv/linux/i386/Versions              |   3 +
 sysdeps/unix/sysv/linux/i386/libc.abilist          |   3 +
 sysdeps/unix/sysv/linux/ia64/Versions              |   3 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist          |   3 +
 sysdeps/unix/sysv/linux/ifaddrs.c                  |  11 ++-
 sysdeps/unix/sysv/linux/m68k/Versions              |   3 +
 sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist |   3 +
 sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist   |   3 +
 sysdeps/unix/sysv/linux/microblaze/Versions        |   3 +
 sysdeps/unix/sysv/linux/microblaze/libc.abilist    |   3 +
 sysdeps/unix/sysv/linux/mips/mips32/Versions       |   3 +
 .../unix/sysv/linux/mips/mips32/fpu/libc.abilist   |   3 +
 .../unix/sysv/linux/mips/mips32/nofpu/libc.abilist |   3 +
 sysdeps/unix/sysv/linux/mips/mips64/n32/Versions   |   3 +
 .../unix/sysv/linux/mips/mips64/n32/libc.abilist   |   3 +
 sysdeps/unix/sysv/linux/mips/mips64/n64/Versions   |   5 +
 .../unix/sysv/linux/mips/mips64/n64/libc.abilist   |   3 +
 sysdeps/unix/sysv/linux/nios2/Versions             |   3 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist         |   3 +
 sysdeps/unix/sysv/linux/oldrecvmsg.c               |  40 ++++++++
 sysdeps/unix/sysv/linux/oldsendmsg.c               |  40 ++++++++
 sysdeps/unix/sysv/linux/powerpc/Versions           |   3 +
 .../sysv/linux/powerpc/powerpc32/fpu/libc.abilist  |   3 +
 .../linux/powerpc/powerpc32/nofpu/libc.abilist     |   3 +
 sysdeps/unix/sysv/linux/powerpc/powerpc64/Versions |   3 +
 .../sysv/linux/powerpc/powerpc64/libc-le.abilist   |   3 +
 .../unix/sysv/linux/powerpc/powerpc64/libc.abilist |   3 +
 sysdeps/unix/sysv/linux/recvmsg.c                  |  36 +++++--
 sysdeps/unix/sysv/linux/s390/s390-32/Versions      |   3 +
 sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist  |   3 +
 sysdeps/unix/sysv/linux/s390/s390-64/Versions      |   3 +
 sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist  |   3 +
 sysdeps/unix/sysv/linux/sendmsg.c                  |  23 +++--
 sysdeps/unix/sysv/linux/sh/Versions                |   3 +
 sysdeps/unix/sysv/linux/sh/libc.abilist            |   3 +
 sysdeps/unix/sysv/linux/sparc/Versions             |   3 +
 sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist |   3 +
 sysdeps/unix/sysv/linux/sparc/sparc64/Versions     |   3 +
 sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist |   3 +
 sysdeps/unix/sysv/linux/tile/Versions              |   3 +
 .../sysv/linux/tile/tilegx/tilegx32/libc.abilist   |   3 +
 .../unix/sysv/linux/tile/tilegx/tilegx64/Versions  |   5 +
 .../sysv/linux/tile/tilegx/tilegx64/libc.abilist   |   3 +
 sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist  |   3 +
 sysdeps/unix/sysv/linux/x86_64/64/Versions         |   5 +
 sysdeps/unix/sysv/linux/x86_64/64/libc.abilist     |   3 +
 sysdeps/unix/sysv/linux/x86_64/Versions            |   3 +
 sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist    |   3 +
 63 files changed, 460 insertions(+), 46 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/mips/mips64/n64/Versions
 create mode 100644 sysdeps/unix/sysv/linux/oldrecvmsg.c
 create mode 100644 sysdeps/unix/sysv/linux/oldsendmsg.c
 create mode 100644 sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/Versions
 create mode 100644 sysdeps/unix/sysv/linux/x86_64/64/Versions

Comments

Szabolcs Nagy April 7, 2016, 9:22 a.m. UTC | #1
On 28/03/16 15:34, Adhemerval Zanella wrote:
> POSIX specifies that both msghdr::msg_iovlen and msghdr::msg_controllen
> to be of size int and socklen_t respectively.  However Linux defines it as
> both size_t and for 64-bit it requires some adjustments to make the
> functions standard compliance.
> 
> This patch fixes it by creating a temporary header and zeroing the pad
> fields for 64-bits architecture where size of size_t exceeds the size of
> the int.

sendmsg is harder to fix because cmsghdr also needs fix ups:

>  /* Structure used for storage of ancillary data object information.  */
>  struct cmsghdr
>    {
> -    size_t cmsg_len;		/* Length of data in cmsg_data plus length
> -				   of cmsghdr structure.
> -				   !! The type should be socklen_t but the
> -				   definition of the kernel is incompatible
> -				   with this.  */
> +#if __BYTE_ORDER == __BIG_ENDIAN
> +    int __glibc_reserved1;	/* Pad toadjust Linux size to POSIX defined
> +				   size for cmsg_len.  */
> +    socklen_t cmsg_len;		/* Length of data in cmsg_data plus length
> +				   of cmsghdr structure.  */
> +#else
> +    socklen_t cmsg_len;
> +    int __glibc_reserved1;
> +#endif
>      int cmsg_level;		/* Originating protocol.  */
>      int cmsg_type;		/* Protocol specific type.  */
...

>  ssize_t
>  __libc_sendmsg (int fd, const struct msghdr *msg, int flags)
>  {
> +  /* POSIX specifies that both msghdr::msg_iovlen and msghdr::msg_controllen
> +     to be int and socklen_t respectively.  However Linux defines it as
> +     both size_t.  So for 64-bit it requires some adjustments by copying to
> +     temporary header and zeroing the pad fields.  */
> +#if __WORDSIZE == 64
> +  struct msghdr hdr;
> +  if (msg != NULL)
> +    {
> +      hdr = *msg;
> +      hdr.__glibc_reserved1 = 0;
> +      hdr.__glibc_reserved2 = 0;
> +      msg = &hdr;
> +    }
> +#endif

e.g. user supplied msg.msg_control might contain cmsghdr
with __glibc_reserved1 != 0 since user code might not
initialize the struct.

in musl this is fixed by copying the controls to a tmp
buf on the stack (which has fixed size so it can fail)
and fixing the paddings there.
Florian Weimer April 7, 2016, 9:56 a.m. UTC | #2
On 03/28/2016 04:34 PM, Adhemerval Zanella wrote:
> diff --git a/sysdeps/unix/sysv/linux/check_native.c b/sysdeps/unix/sysv/linux/check_native.c
> index b3cbbe3..a8e447e 100644
> --- a/sysdeps/unix/sysv/linux/check_native.c
> +++ b/sysdeps/unix/sysv/linux/check_native.c
> @@ -111,10 +111,13 @@ __check_native (uint32_t a1_index, int *a1_native,
>      {
>        struct msghdr msg =
>  	{
> -	  (void *) &nladdr, sizeof (nladdr),
> -	  &iov, 1,
> -	  NULL, 0,
> -	  0
> +	  .msg_name = (void *) &nladdr,
> +	  .msg_namelen =  sizeof (nladdr),
> +	  .msg_iov = &iov,
> +	  .msg_iovlen = 1,
> +	  .msg_control = NULL,
> +	  .msg_controllen = 0,
> +	  .msg_flags = 0
>  	};

The requirement for such changes always makes me nervous.  If we have
breakage in our own code, how many applications are affected?

Note that the recvmsg manual page says “is defined as follows” about
struct msghdr, not “contains the following members in some arbitrary order”.

Is standards compliance here really worth this risk?

(I do not have a strong opinion either way, I just want to raise this
point.)

Florian
Szabolcs Nagy April 7, 2016, 11:37 a.m. UTC | #3
On 07/04/16 10:56, Florian Weimer wrote:
> On 03/28/2016 04:34 PM, Adhemerval Zanella wrote:
>> diff --git a/sysdeps/unix/sysv/linux/check_native.c b/sysdeps/unix/sysv/linux/check_native.c
>> index b3cbbe3..a8e447e 100644
>> --- a/sysdeps/unix/sysv/linux/check_native.c
>> +++ b/sysdeps/unix/sysv/linux/check_native.c
>> @@ -111,10 +111,13 @@ __check_native (uint32_t a1_index, int *a1_native,
>>      {
>>        struct msghdr msg =
>>  	{
>> -	  (void *) &nladdr, sizeof (nladdr),
>> -	  &iov, 1,
>> -	  NULL, 0,
>> -	  0
>> +	  .msg_name = (void *) &nladdr,
>> +	  .msg_namelen =  sizeof (nladdr),
>> +	  .msg_iov = &iov,
>> +	  .msg_iovlen = 1,
>> +	  .msg_control = NULL,
>> +	  .msg_controllen = 0,
>> +	  .msg_flags = 0
>>  	};
> 
> The requirement for such changes always makes me nervous.  If we have
> breakage in our own code, how many applications are affected?
> 

yes, it looks risky.

> Note that the recvmsg manual page says “is defined as follows” about
> struct msghdr, not “contains the following members in some arbitrary order”.
> 

i think the man page should be clearer that the
msg_controllen, msg_iovlen and cmsg_len types
conflict with posix, to prevent more non-portable
code being written based on the linux man page.

> Is standards compliance here really worth this risk?
> 
> (I do not have a strong opinion either way, I just want to raise this
> point.)
> 
> Florian
>
Adhemerval Zanella April 7, 2016, 12:23 p.m. UTC | #4
On 07-04-2016 06:22, Szabolcs Nagy wrote:
> On 28/03/16 15:34, Adhemerval Zanella wrote:
>> POSIX specifies that both msghdr::msg_iovlen and msghdr::msg_controllen
>> to be of size int and socklen_t respectively.  However Linux defines it as
>> both size_t and for 64-bit it requires some adjustments to make the
>> functions standard compliance.
>>
>> This patch fixes it by creating a temporary header and zeroing the pad
>> fields for 64-bits architecture where size of size_t exceeds the size of
>> the int.
> 
> sendmsg is harder to fix because cmsghdr also needs fix ups:
> 
>>  /* Structure used for storage of ancillary data object information.  */
>>  struct cmsghdr
>>    {
>> -    size_t cmsg_len;		/* Length of data in cmsg_data plus length
>> -				   of cmsghdr structure.
>> -				   !! The type should be socklen_t but the
>> -				   definition of the kernel is incompatible
>> -				   with this.  */
>> +#if __BYTE_ORDER == __BIG_ENDIAN
>> +    int __glibc_reserved1;	/* Pad toadjust Linux size to POSIX defined
>> +				   size for cmsg_len.  */
>> +    socklen_t cmsg_len;		/* Length of data in cmsg_data plus length
>> +				   of cmsghdr structure.  */
>> +#else
>> +    socklen_t cmsg_len;
>> +    int __glibc_reserved1;
>> +#endif
>>      int cmsg_level;		/* Originating protocol.  */
>>      int cmsg_type;		/* Protocol specific type.  */
> ...
> 
>>  ssize_t
>>  __libc_sendmsg (int fd, const struct msghdr *msg, int flags)
>>  {
>> +  /* POSIX specifies that both msghdr::msg_iovlen and msghdr::msg_controllen
>> +     to be int and socklen_t respectively.  However Linux defines it as
>> +     both size_t.  So for 64-bit it requires some adjustments by copying to
>> +     temporary header and zeroing the pad fields.  */
>> +#if __WORDSIZE == 64
>> +  struct msghdr hdr;
>> +  if (msg != NULL)
>> +    {
>> +      hdr = *msg;
>> +      hdr.__glibc_reserved1 = 0;
>> +      hdr.__glibc_reserved2 = 0;
>> +      msg = &hdr;
>> +    }
>> +#endif
> 
> e.g. user supplied msg.msg_control might contain cmsghdr
> with __glibc_reserved1 != 0 since user code might not
> initialize the struct.
> 
> in musl this is fixed by copying the controls to a tmp
> buf on the stack (which has fixed size so it can fail)
> and fixing the paddings there.
> 

Yes, I am aware, and I noted this in my patch header:

 1. The current sendmsg fix neither handles larger msg_control buffers
    nor pads the associated cmsghdr.  The problem is that a complete
    fix requires allocating a limited buffer, copying the incoming
    struct into it, and zeroing the pads.  Although that tends to work,
    it also limits the total msg_control length.
    The general usage for this facility is passing file descriptors
    and permissions between processes over unix sockets, so it might
    be feasible to use a large stack-allocated buffer (1024, 2048
    or larger) and return ENOMEM for larger buffers.

I am planning to send a fix for this based on this patch.
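
The bounded-copy idea from the patch header could look roughly like the
sketch below.  This is not the planned glibc fix and not musl's code: the
struct, the SKETCH_CONTROL_MAX limit and the function names are all
hypothetical, and the header layout is fixed to a little-endian ordering
for simplicity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed limit; the thread mentions 1024, 2048 or larger.  */
#define SKETCH_CONTROL_MAX 1024

/* Hypothetical control header with a POSIX-sized length followed by a
   pad; not the real struct cmsghdr.  */
struct sketch_cmsghdr
{
  unsigned int len;	/* Header plus data length, in bytes.  */
  int pad;		/* Must be zero before the kernel sees it.  */
  int level;
  int type;
};

/* Round a length up to the alignment assumed between headers.  */
static size_t
sketch_align (size_t n)
{
  return (n + sizeof (size_t) - 1) & ~(sizeof (size_t) - 1);
}

/* Copy CONTROL (LEN bytes) into DST, which must hold SKETCH_CONTROL_MAX
   bytes, and zero the pad of every well-formed header.  Return 0 on
   success, or -1 with errno set to ENOMEM if the input does not fit.  */
static int
sketch_fix_control (void *dst, const void *control, size_t len)
{
  size_t off = 0;

  assert (dst != NULL && control != NULL);
  if (len > SKETCH_CONTROL_MAX)
    {
      errno = ENOMEM;
      return -1;
    }
  memcpy (dst, control, len);

  while (len - off >= sizeof (struct sketch_cmsghdr))
    {
      struct sketch_cmsghdr h;
      size_t step;

      memcpy (&h, (char *) dst + off, sizeof h);
      if (h.len < sizeof h || h.len > len - off)
	break;			/* Malformed header: leave the rest as is.  */
      h.pad = 0;
      memcpy ((char *) dst + off, &h, sizeof h);

      step = sketch_align (h.len);
      if (step > len - off)
	break;			/* Last header: nothing follows it.  */
      off += step;
    }
  return 0;
}
```

A caller would declare a char array of SKETCH_CONTROL_MAX bytes on the
stack, run the copy, and point the temporary header's msg_control at it.
The cost is the one already noted: control data beyond the limit fails
with ENOMEM.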
Adhemerval Zanella April 7, 2016, 12:29 p.m. UTC | #5
On 07-04-2016 08:37, Szabolcs Nagy wrote:
> On 07/04/16 10:56, Florian Weimer wrote:
>> On 03/28/2016 04:34 PM, Adhemerval Zanella wrote:
>>> diff --git a/sysdeps/unix/sysv/linux/check_native.c b/sysdeps/unix/sysv/linux/check_native.c
>>> index b3cbbe3..a8e447e 100644
>>> --- a/sysdeps/unix/sysv/linux/check_native.c
>>> +++ b/sysdeps/unix/sysv/linux/check_native.c
>>> @@ -111,10 +111,13 @@ __check_native (uint32_t a1_index, int *a1_native,
>>>      {
>>>        struct msghdr msg =
>>>  	{
>>> -	  (void *) &nladdr, sizeof (nladdr),
>>> -	  &iov, 1,
>>> -	  NULL, 0,
>>> -	  0
>>> +	  .msg_name = (void *) &nladdr,
>>> +	  .msg_namelen =  sizeof (nladdr),
>>> +	  .msg_iov = &iov,
>>> +	  .msg_iovlen = 1,
>>> +	  .msg_control = NULL,
>>> +	  .msg_controllen = 0,
>>> +	  .msg_flags = 0
>>>  	};
>>
>> The requirement for such changes always makes me nervous.  If we have
>> breakage in our own code, how many applications are affected?
>>
> 
> yes, it looks risky.
> 
>> Note that the recvmsg manual page says “is defined as follows” about
>> struct msghdr, not “contains the following members in some arbitrary order”.
>>
> 
> i think the man page should be more clear about
> that the msg_controllen, msg_iovlen and cmsg_len
> types conflict with posix.
> 
> to prevent more non-portable code being written
> based on the linux man page.
> 
>> Is standards compliance here really worth this risk?
>>
>> (I do not have a strong opinion either way, I just want to raise this
>> point.)
>>
>> Florian
>>
> 

I do not have a strong opinion either, but I also do not see a
compelling reason *not* to follow the standard.  I would follow
Szabolcs' suggestion and update the manual to state that the more
portable way to initialize the structures is through designated
initializers (which gcc accepts in c90 as well).

The best option would be to get this fixed in the kernel, but that is
another thread.
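
To make the initializer concern concrete, here is a small sketch with a
hypothetical struct (not the real msghdr), modelled on the big-endian
ordering of the quoted cmsghdr change, where the pad comes first.  It
contrasts a positional initializer written for the old layout with a
designated one.

```c
#include <assert.h>

/* Hypothetical header after the change.  In the old layout the first
   member was the length itself; now a pad sits in that position.  */
struct sketch_new
{
  int pad;
  int iovlen;
  int flags;
};

/* The sketch assumes the three ints are laid out without extra padding.  */
static_assert (sizeof (struct sketch_new) == 3 * sizeof (int),
	       "unexpected padding in sketch_new");

/* Positional form, written when the length was the first member: the 1
   meant for iovlen now lands in pad.  */
static struct sketch_new
init_positional (void)
{
  struct sketch_new h = { 1, 0 };
  return h;
}

/* Designated form: independent of member order, and members that are
   not named (here the pad) are zero-initialized.  */
static struct sketch_new
init_designated (void)
{
  struct sketch_new h = { .iovlen = 1, .flags = 0 };
  return h;
}
```

The positional version still compiles, which is what makes this kind of
breakage easy to miss in existing applications.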
Szabolcs Nagy April 21, 2016, 2:01 p.m. UTC | #6
On 28/03/16 15:34, Adhemerval Zanella wrote:
>  /* Structure used for storage of ancillary data object information.  */
>  struct cmsghdr
>    {
> -    size_t cmsg_len;		/* Length of data in cmsg_data plus length
> -				   of cmsghdr structure.
> -				   !! The type should be socklen_t but the
> -				   definition of the kernel is incompatible
> -				   with this.  */
> +#if __BYTE_ORDER == __BIG_ENDIAN
> +    int __glibc_reserved1;	/* Pad toadjust Linux size to POSIX defined
> +				   size for cmsg_len.  */
> +    socklen_t cmsg_len;		/* Length of data in cmsg_data plus length
> +				   of cmsghdr structure.  */
> +#else
> +    socklen_t cmsg_len;
> +    int __glibc_reserved1;
> +#endif
>      int cmsg_level;		/* Originating protocol.  */
>      int cmsg_type;		/* Protocol specific type.  */

i think #if __WORDSIZE == 64 is missing here.

but even in that case there is a subtle issue:
if the size_t member is removed all other
members have 4byte alignment, so the struct
alignment changes from 8byte to 4byte.

it is not clear from the standard how the
msg_control buffer may be allocated (since
only CMSG_* macros can access it), on linux
the kernel makes a copy so it does not care
about alignment in user-space, but the struct
alignment is still visible in the c and c++ abi.

msg_control usage should probably be documented
in the linux man-page: glibc sunrpc sometimes
uses plain char[], nscd uses a union with struct
cmsghdr, i think neither of them makes a

  CMSG_FIRSTHDR (&msg)->cmsg_len

access strictly iso c conforming, but the latter at
least uses correct alignment.

maybe a posix issue should be filed to the
austin group.
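
a sketch of the alignment point, using hypothetical structs rather than
the real cmsghdr, and assuming c11 and an abi where a struct is aligned
like its most strictly aligned member (true of the common linux abis):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header with the linux-style size_t length.  */
struct sketch_cmsg_old
{
  size_t len;
  int level;
  int type;
};

/* Hypothetical header with the length split into int-sized members,
   as in the quoted patch.  */
struct sketch_cmsg_new
{
  unsigned int len;
  int pad;
  int level;
  int type;
};

/* Wrapping the buffer in a union makes it at least as aligned as the
   header type named in it.  */
union sketch_control
{
  char buf[64];
  struct sketch_cmsg_old align;
};

/* The old struct inherits the alignment of size_t ...  */
static_assert (_Alignof (struct sketch_cmsg_old) == _Alignof (size_t),
	       "old header is aligned like size_t");
/* ... while the new one only needs the alignment of int.  */
static_assert (_Alignof (struct sketch_cmsg_new) == _Alignof (int),
	       "new header is aligned like int");
static_assert (_Alignof (union sketch_control)
	       >= _Alignof (struct sketch_cmsg_old),
	       "union buffer is suitably aligned");
```

on a typical 64-bit target the first two alignments are 8 and 4, which
is the abi-visible difference; a plain char[] buffer gets neither
guarantee.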
Rich Felker April 21, 2016, 5:14 p.m. UTC | #7
On Mon, Mar 28, 2016 at 11:34:00AM -0300, Adhemerval Zanella wrote:
> POSIX specifies that both msghdr::msg_iovlen and msghdr::msg_controllen
> to be of size int and socklen_t respectively.  However Linux defines it as
> both size_t and for 64-bit it requires some adjustments to make the
> functions standard compliance.
> 
> This patch fixes it by creating a temporary header and zeroing the pad
> fields for 64-bits architecture where size of size_t exceeds the size of
> the int.
> 
> Also the new recvmsg and sendmsg implementation is only added on libc,
> with libpthread only containing a compat symbol.

Just a heads-up: this needs a bug report/patch to the Linux man-pages
project as well, since they're currently documenting the wrong types.
The documented types should probably be fixed to align with the
standard, with a note about them previously being wrong added to the
NOTES section.

Rich
Adhemerval Zanella April 21, 2016, 8:07 p.m. UTC | #8
On 21-04-2016 11:01, Szabolcs Nagy wrote:
> On 28/03/16 15:34, Adhemerval Zanella wrote:
>>  /* Structure used for storage of ancillary data object information.  */
>>  struct cmsghdr
>>    {
>> -    size_t cmsg_len;		/* Length of data in cmsg_data plus length
>> -				   of cmsghdr structure.
>> -				   !! The type should be socklen_t but the
>> -				   definition of the kernel is incompatible
>> -				   with this.  */
>> +#if __BYTE_ORDER == __BIG_ENDIAN
>> +    int __glibc_reserved1;	/* Pad toadjust Linux size to POSIX defined
>> +				   size for cmsg_len.  */
>> +    socklen_t cmsg_len;		/* Length of data in cmsg_data plus length
>> +				   of cmsghdr structure.  */
>> +#else
>> +    socklen_t cmsg_len;
>> +    int __glibc_reserved1;
>> +#endif
>>      int cmsg_level;		/* Originating protocol.  */
>>      int cmsg_type;		/* Protocol specific type.  */
> 
> i think #if __WORDSIZE == 64 is missing here.
> 

Right, I will add it.

> but even in that case there is a subtle issue:
> if the size_t member is removed all other
> members have 4byte alignment, so the struct
> alignment changes from 8byte to 4byte.
> 
> it is not clear from the standard how the
> msg_control buffer may be allocated (since
> only CMSG_* macros can access it), on linux
> the kernel makes a copy so it does not care
> about alignment in user-space, but the struct
> alignment is still visible in the c and c++ abi.

Indeed, and I am not sure how to enforce it in the cleanest way (or
whether it is really required).  Do you think this is a blocker for such
a fix?

I am asking because I am following the mips64 ip failure report thread
for musl, and it looks like it is not really related to the change of
struct alignment.  Also, I am testing on s390x, and both make check and
ip show no issues.

> 
> msg_control usage should be probably documented
> in the linux man-page: glibc sunrpc sometimes
> uses plain char[], nscd uses a union with struct
> cmsghdr, i think neither of them makes a
> 
>   CMSG_FIRSTHDR (&msg)->cmsg_len
> 
> access strictly iso c confrom, but the later at
> least uses correct alignment.
> 
> maybe a posix issue should be filed to the
> austin group.
>
Michael Kerrisk (man-pages) April 22, 2016, 8:04 a.m. UTC | #9
Hello Szabolcs

On 21 April 2016 at 16:01, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> On 28/03/16 15:34, Adhemerval Zanella wrote:
>>  /* Structure used for storage of ancillary data object information.  */
>>  struct cmsghdr
>>    {
>> -    size_t cmsg_len;         /* Length of data in cmsg_data plus length
>> -                                of cmsghdr structure.
>> -                                !! The type should be socklen_t but the
>> -                                definition of the kernel is incompatible
>> -                                with this.  */
>> +#if __BYTE_ORDER == __BIG_ENDIAN
>> +    int __glibc_reserved1;   /* Pad toadjust Linux size to POSIX defined
>> +                                size for cmsg_len.  */
>> +    socklen_t cmsg_len;              /* Length of data in cmsg_data plus length
>> +                                of cmsghdr structure.  */
>> +#else
>> +    socklen_t cmsg_len;
>> +    int __glibc_reserved1;
>> +#endif
>>      int cmsg_level;          /* Originating protocol.  */
>>      int cmsg_type;           /* Protocol specific type.  */
>
> i think #if __WORDSIZE == 64 is missing here.
>
> but even in that case there is a subtle issue:
> if the size_t member is removed all other
> members have 4byte alignment, so the struct
> alignment changes from 8byte to 4byte.
>
> it is not clear from the standard how the
> msg_control buffer may be allocated (since
> only CMSG_* macros can access it), on linux
> the kernel makes a copy so it does not care
> about alignment in user-space, but the struct
> alignment is still visible in the c and c++ abi.
>
> msg_control usage should be probably documented
> in the linux man-page: glibc sunrpc sometimes
> uses plain char[], nscd uses a union with struct
> cmsghdr, i think neither of them makes a
>
>   CMSG_FIRSTHDR (&msg)->cmsg_len
>
> access strictly iso c confrom, but the later at
> least uses correct alignment.
>
> maybe a posix issue should be filed to the
> austin group.

Below are the current snippets of code shown in the cmsg(3) man page.
Could you please tell me more exactly what you think needs fixing?

[[
    EXAMPLE
       This code looks for the IP_TTL option in a  received  ancillary
       buffer:

           struct msghdr msgh;
           struct cmsghdr *cmsg;
           int *ttlptr;
           int received_ttl;

           /* Receive auxiliary data in msgh */

           for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
                   cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
               if (cmsg->cmsg_level == IPPROTO_IP
                       && cmsg->cmsg_type == IP_TTL) {
                   ttlptr = (int *) CMSG_DATA(cmsg);
                   received_ttl = *ttlptr;
                   break;
               }
           }

           if (cmsg == NULL) {
               /* Error: IP_TTL not enabled or small buffer or I/O error */
           }

       The  code below passes an array of file descriptors over a UNIX
       domain socket using SCM_RIGHTS:

           struct msghdr msg = {0};
           struct cmsghdr *cmsg;
           int myfds[NUM_FD];  /* Contains the file descriptors to pass */
           int *fdptr;
           union {         /* Ancillary data buffer, wrapped in a union
                              in order to ensure it is suitably aligned */
               char buf[CMSG_SPACE(sizeof myfds)];
               struct cmsghdr align;
           } u;

           msg.msg_control = u.buf;
           msg.msg_controllen = sizeof u.buf;
           cmsg = CMSG_FIRSTHDR(&msg);
           cmsg->cmsg_level = SOL_SOCKET;
           cmsg->cmsg_type = SCM_RIGHTS;
           cmsg->cmsg_len = CMSG_LEN(sizeof(int) * NUM_FD);
           fdptr = (int *) CMSG_DATA(cmsg);    /* Initialize the payload */
           memcpy(fdptr, myfds, NUM_FD * sizeof(int));
]]

Cheers,

Michael
Szabolcs Nagy April 22, 2016, 10:25 a.m. UTC | #10
On 22/04/16 09:04, Michael Kerrisk (man-pages) wrote:
> On 21 April 2016 at 16:01, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>> it is not clear from the standard how the
>> msg_control buffer may be allocated (since
>> only CMSG_* macros can access it), on linux
>> the kernel makes a copy so it does not care
>> about alignment in user-space, but the struct
>> alignment is still visible in the c and c++ abi.
>>
>> msg_control usage should be probably documented
>> in the linux man-page: glibc sunrpc sometimes
>> uses plain char[], nscd uses a union with struct
>> cmsghdr, i think neither of them makes a
>>
>>   CMSG_FIRSTHDR (&msg)->cmsg_len
>>
>> access strictly iso c confrom, but the later at
>> least uses correct alignment.
>>
>> maybe a posix issue should be filed to the
>> austin group.
> 
> Below are the current snippets of code shown in the cmsg(3) man page.
> Could you please tell me more exactly what you think needs fixing?
> 

thanks, this is what i was looking for.
(i missed cmsg(3), only looked at recvmsg(2) and sendmsg(2).)

this means some code in glibc and iproute2 should be fixed
to use such a union. (what a nasty interface.)

> [[
>     EXAMPLE
>        This code looks for the IP_TTL option in a  received  ancillary
>        buffer:
> 
>            struct msghdr msgh;
>            struct cmsghdr *cmsg;
>            int *ttlptr;
>            int received_ttl;
> 
>            /* Receive auxiliary data in msgh */
> 
>            for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
>                    cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
>                if (cmsg->cmsg_level == IPPROTO_IP
>                        && cmsg->cmsg_type == IP_TTL) {
>                    ttlptr = (int *) CMSG_DATA(cmsg);
>                    received_ttl = *ttlptr;
>                    break;
>                }
>            }
> 
>            if (cmsg == NULL) {
>                /* Error: IP_TTL not enabled or small buffer or I/O error */
>            }
> 
>        The  code below passes an array of file descriptors over a UNIX
>        domain socket using SCM_RIGHTS:
> 
>            struct msghdr msg = {0};
>            struct cmsghdr *cmsg;
>            int myfds[NUM_FD];  /* Contains the file descriptors to pass */
>            int *fdptr;
>            union {         /* Ancillary data buffer, wrapped in a union
>                               in order to ensure it is suitably aligned */
>                char buf[CMSG_SPACE(sizeof myfds)];
>                struct cmsghdr align;
>            } u;
> 
>            msg.msg_control = u.buf;
>            msg.msg_controllen = sizeof u.buf;
>            cmsg = CMSG_FIRSTHDR(&msg);
>            cmsg->cmsg_level = SOL_SOCKET;
>            cmsg->cmsg_type = SCM_RIGHTS;
>            cmsg->cmsg_len = CMSG_LEN(sizeof(int) * NUM_FD);
>            fdptr = (int *) CMSG_DATA(cmsg);    /* Initialize the payload */
>            memcpy(fdptr, myfds, NUM_FD * sizeof(int));
> ]]
> 
> Cheers,
> 
> Michael
>
Michael Kerrisk (man-pages) April 22, 2016, 1:19 p.m. UTC | #11
Hi Szabolcs,

On 22 April 2016 at 12:25, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> On 22/04/16 09:04, Michael Kerrisk (man-pages) wrote:
>> On 21 April 2016 at 16:01, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>>> it is not clear from the standard how the
>>> msg_control buffer may be allocated (since
>>> only CMSG_* macros can access it), on linux
>>> the kernel makes a copy so it does not care
>>> about alignment in user-space, but the struct
>>> alignment is still visible in the c and c++ abi.
>>>
>>> msg_control usage should be probably documented
>>> in the linux man-page: glibc sunrpc sometimes
>>> uses plain char[], nscd uses a union with struct
>>> cmsghdr, i think neither of them makes a
>>>
>>>   CMSG_FIRSTHDR (&msg)->cmsg_len
>>>
>>> access strictly iso c conforming, but the latter at
>>> least uses correct alignment.
>>>
>>> maybe a posix issue should be filed to the
>>> austin group.
>>
>> Below are the current snippets of code shown in the cmsg(3) man page.
>> Could you please tell me more exactly what you think needs fixing?
>>
>
> thanks, this is what i was looking for.
> (i missed cmsg(3), only looked at recvmsg(2) and sendmsg(2).)
>
> this means some code in glibc and iproute2 should be fixed
> to use such a union. (what a nasty interface.)

So, the code in the cmsg(3) man page seems okay then?

Cheers,

Michael

Szabolcs Nagy April 22, 2016, 1:40 p.m. UTC | #12
On 22/04/16 14:19, Michael Kerrisk (man-pages) wrote:
> Hi Szabolcs,
> 
> On 22 April 2016 at 12:25, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>> On 22/04/16 09:04, Michael Kerrisk (man-pages) wrote:
>>> On 21 April 2016 at 16:01, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>>>> it is not clear from the standard how the
>>>> msg_control buffer may be allocated (since
>>>> only CMSG_* macros can access it), on linux
>>>> the kernel makes a copy so it does not care
>>>> about alignment in user-space, but the struct
>>>> alignment is still visible in the c and c++ abi.
>>>>
>>>> msg_control usage should be probably documented
>>>> in the linux man-page: glibc sunrpc sometimes
>>>> uses plain char[], nscd uses a union with struct
>>>> cmsghdr, i think neither of them makes a
>>>>
>>>>   CMSG_FIRSTHDR (&msg)->cmsg_len
>>>>
>>>> access strictly iso c conforming, but the latter at
>>>> least uses correct alignment.
>>>>
>>>> maybe a posix issue should be filed to the
>>>> austin group.
>>>
>>> Below are the current snippets of code shown in the cmsg(3) man page.
>>> Could you please tell me more exactly what you think needs fixing?
>>>
>>
>> thanks, this is what i was looking for.
>> (i missed cmsg(3), only looked at recvmsg(2) and sendmsg(2).)
>>
>> this means some code in glibc and iproute2 should be fixed
>> to use such a union. (what a nasty interface.)
> 
> So, the code in the cmsg(3) man page seems okay then?
> 

yes

(i originally thought the cast in CMSG_FIRSTHDR
from the union* to struct cmsghdr* would be
invalid on the sending side, but it's ok.

on the receiving side the dereference through the
(int*) cast is debatable, but should work in practice.)
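
As a minimal sketch of the two points above (not code from the man page
or from glibc): the union keeps the control buffer aligned for struct
cmsghdr, and memcpy() moves the payload in and out of CMSG_DATA() so no
(int *) dereference is needed.  The helper names send_fd() and recv_fd()
are made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one descriptor over a UNIX domain socket; returns 0 on success. */
static int send_fd(int sock, int fd)
{
    union {                 /* union forces cmsghdr alignment on buf */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    char byte = 'x';        /* carry at least one byte of ordinary data */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg = {0};
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);   /* buffer is large enough for one header */
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof fd);    /* no pointer cast */

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns it, or -1 if none arrived. */
static int recv_fd(int sock)
{
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg = {0};
    struct cmsghdr *cmsg;
    int fd = -1;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    if (recvmsg(sock, &msg, 0) < 0)
        return -1;

    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
            cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET
                && cmsg->cmsg_type == SCM_RIGHTS
                && cmsg->cmsg_len >= CMSG_LEN(sizeof(int))) {
            memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);    /* copy out */
            break;
        }
    }
    return fd;
}
```

Used over a socketpair(AF_UNIX, SOCK_DGRAM, 0, sv), send_fd(sv[0], fd)
followed by recv_fd(sv[1]) should hand back a new descriptor referring
to the same open file.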

Adhemerval Zanella May 2, 2016, 7:17 p.m. UTC | #13
On 21/04/2016 14:14, Rich Felker wrote:
> On Mon, Mar 28, 2016 at 11:34:00AM -0300, Adhemerval Zanella wrote:
>> POSIX specifies that both msghdr::msg_iovlen and msghdr::msg_controllen
>> to be of size int and socklen_t respectively.  However Linux defines it as
>> both size_t and for 64-bit it requires some adjustments to make the
>> functions standard compliance.
>>
>> This patch fixes it by creating a temporary header and zeroing the pad
>> fields for 64-bits architecture where size of size_t exceeds the size of
>> the int.
>>
>> Also the new recvmsg and sendmsg implementation is only added on libc,
>> with libpthread only containing a compat symbol.
> 
> Just a heads-up: this needs a bug report/patch to the Linux man-pages
> project as well, since they're currently documenting the wrong types.
> The documented types should probably be fixed to align with the
> standard, with a note about them previously being wrong added to the
> NOTES section.
> 
> Rich
> 

Based on previous messages I think Michael Kerrisk is already taking care
of it.

I think the remaining issues with this patch are 1. the alignment
issue for cmsghdr and 2. the sendmsg cmsg padding.

For 1. I do not think this should be a blocking issue, so I would like
to ask whether anyone has an objection that should hold back this patch.

For 2. I have already prepared a patch that I intend to send after
this fix is upstream.
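
For readers following along, the "temporary header with zeroed pad
fields" approach from the commit message can be sketched roughly as
below.  This is an illustration under assumptions, not the patch itself:
struct demo_msghdr, demo_sanitize() and demo_kernel_view() are invented
names, the struct is cut down to two fields, and the layout shown is the
little-endian one (length first, padding after).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout only: the length field is 32 bits wide, as POSIX
   wants, and is followed by explicit padding that overlays the upper
   half of the size_t the kernel reads on a 64-bit target.  */
struct demo_msghdr {
    void    *msg_control;
    uint32_t msg_controllen;    /* socklen_t in POSIX */
    uint32_t reserved;          /* upper 32 bits as seen by the kernel */
};

/* Copy the caller's header and clear the padding, so the caller's own
   object is never modified and the kernel never sees stale pad bytes.  */
static struct demo_msghdr demo_sanitize(const struct demo_msghdr *user)
{
    assert(user != NULL);
    struct demo_msghdr tmp = *user;
    tmp.reserved = 0;
    return tmp;
}

/* The 64-bit quantity a kernel would read starting at msg_controllen.  */
static uint64_t demo_kernel_view(const struct demo_msghdr *h)
{
    uint64_t v;
    memcpy(&v, (const char *) h + offsetof(struct demo_msghdr, msg_controllen),
           sizeof v);
    return v;
}
```

On a little-endian machine, a header whose padding holds garbage reads
back as a huge length through demo_kernel_view(), while the sanitized
copy reads back as the intended 32-bit value.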
Zack Weinberg June 7, 2016, 1:31 p.m. UTC | #14
On 03/28/2016 10:34 AM, Adhemerval Zanella wrote:
> POSIX specifies that both msghdr::msg_iovlen and msghdr::msg_controllen
> to be of size int and socklen_t respectively.  However Linux defines it as
> both size_t and for 64-bit it requires some adjustments to make the
> functions standard compliance.

I expressed objections to the follow-up patch to this (tackling the same
type inconsistency within cmsgbuf) and, on reflection, my objection
applies as well (with somewhat less force) to this patch, so I'm going
to explain myself once here.

send/recv(m)msg are kernel primitives, and the fundamental basis of my
objection is that I think the C library should always faithfully expose
all the true kernel interfaces, *even if they are in some way wrong*.
This conformance violation should be addressed by the kernel first, and
only then should the C library follow suit.  That means that neither
this patch, nor the follow-up patch tackling cmsgbuf, should be applied
at all.  If either has already been applied, they should be backed out.

(If the kernel developers refuse to fix a conformance violation, then
that kernel has chosen to permanently deviate from the standard and,
again, the C library should faithfully reflect that.)

This objection has extra force under four circumstances all of which
apply to send/recv(m)msg:

 * The problem is a minor deviation from a standard and is unlikely to
   affect non-contrived programs.

 * The kernel primitives in question are async-signal-safe; that is a
   difficult contract to uphold in user space -- the more work the C
   library does, the more likely it is to screw something up.

 * A completely transparent user-space workaround would need to allocate
   memory.  In this case, it is my considered opinion that the proposed
   hard upper limit on the size of cmsgbuf is *far* more likely to break
   existing programs than leaving the status quo alone.

 * The kernel primitives in question are arbitrarily extensible, so it
   is not possible to guarantee that a user-space workaround cannot
   break future additions to the interface.

Earlier, I said that I didn't like copying cmsgbuf because it wasn't
possible to be sure that no cmsg opcodes cared (now or in the future)
about the address of the buffer, and Adhemerval said (effectively) that
such an opcode would not make sense.  That's not true.  Imagine, if you
will, a cmsg that expects the ancillary buffer to be overlaid on a
shared memory area, and rewrites the *non*-ancillary buffer to reflect
the location of that memory area in the receiver. Contrived? Perhaps.
Can we be sure no one will ever want to do something like that?  No, we
cannot.

zw
Adhemerval Zanella June 7, 2016, 2:21 p.m. UTC | #15
On 07/06/2016 10:31, Zack Weinberg wrote:
> On 03/28/2016 10:34 AM, Adhemerval Zanella wrote:
>> POSIX specifies that both msghdr::msg_iovlen and msghdr::msg_controllen
>> to be of size int and socklen_t respectively.  However Linux defines it as
>> both size_t and for 64-bit it requires some adjustments to make the
>> functions standard compliance.
> 
> I expressed objections to the follow-up patch to this (tackling the same
> type inconsistency within cmsgbuf) and, on reflection, my objection
> applies as well (with somewhat less force) to this patch, so I'm going
> to explain myself once here.
> 
> send/recv(m)msg are kernel primitives, and the fundamental basis of my
> objection is that I think the C library should always faithfully expose
> all the true kernel interfaces, *even if they are in some way wrong*.
> This conformance violation should be addressed by the kernel first, and
> only then should the C library follow suit.  That means that neither
> this patch, nor the follow-up patch tackling cmsgbuf, should be applied
> at all.  If either has already been applied, they should be backed out.

I strongly disagree with this position.  The C library is still an
abstraction over the underlying kernel, and GLIBC should follow (and
does follow) the POSIX standard even where that deviates from the
kernel primitives.  The same commitment to standards is what drove the
various conformance fixes in both the math library and various
primitives (quick_exit is an example).

It is also why some in the community consider explicitly exposing
certain Linux primitives (such as gettid) to be a controversial subject.

But I do agree that it *should* be fixed in the kernel, in the same
way the kernel has fixed other deviations from the standard: by
providing a new syscall entry that glibc then uses directly.

> 
> (If the kernel developers refuse to fix a conformance violation, then
> that kernel has chosen to permanently deviate from the standard and,
> again, the C library should faithfully reflect that.)
> 
> This objection has extra force under four circumstances all of which
> apply to send/recv(m)msg:
> 
>  * The problem is a minor deviation from a standard and is unlikely to
>    affect non-contrived programs.

I agree it is a minor deviation, but it is still a deviation.  I see no
compelling reason to keep deviating from the standard just because
it might not affect non-contrived programs.

> 
>  * The kernel primitives in question are async-signal-safe; that is a
>    difficult contract to uphold in user space -- the more work the C
>    library does, the more likely it is to screw something up.

It is not the kernel that provides async-signal-safety, but rather the way
the libc wrapper is written.  If we had an async-signal-safe
malloc, we could allocate an arbitrarily large copy buffer.  And the current
wrappers do not call any non-async-signal-safe functions.

> 
>  * A completely transparent user-space workaround would need to allocate
>    memory.  In this case, it is my considered opinion that the proposed
>    hard upper limit on the size of cmsgbuf is *far* more likely to break
>    existing programs than leaving the status quo alone.

That's why I sent the second part of my fix as an RFC.  I think that part
is the most controversial bit, and I am not sure the possible
breakage is worth it.  I am proposing to use a large buffer to avoid
issues in most use cases, but indeed it might not suffice for some
rare cases.  But again, I presume such cases would be rare and somewhat
easily fixable.

> 
>  * The kernel primitives in question are arbitrarily extensible, so it
>    is not possible to guarantee that a user-space workaround cannot
>    break future additions to the interface.
> 
> Earlier, I said that I didn't like copying cmsgbuf because it wasn't
> possible to be sure that no cmsg opcodes cared (now or in the future)
> about the address of the buffer, and Adhemerval said (effectively) that
> such an opcode would not make sense.  That's not true.  Imagine, if you
> will, a cmsg that expects the ancillary buffer to be overlaid on a
> shared memory area, and rewrites the *non*-ancillary buffer to reflect
> the location of that memory area in the receiver.

Again, I see no problem in this scenario: the function prototype takes
a const msghdr, so sendmsg will not change its state.  Even if the ancillary
buffer might change, it is up to the application to synchronize access to it
and to call sendmsg at a point where the data is in a consistent state.  I
personally think that calling a syscall with a buffer in a racy condition
does not make sense.

> Contrived? Perhaps.
> Can we be sure no one will ever want to do something like that?  No, we
> cannot.
> 
> zw
>
Zack Weinberg June 8, 2016, 8:15 p.m. UTC | #16
On Tue, Jun 7, 2016 at 10:21 AM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
> On 07/06/2016 10:31, Zack Weinberg wrote:
>>
>> send/recv(m)msg are kernel primitives, and the fundamental basis of my
>> objection is that I think the C library should always faithfully expose
>> all the true kernel interfaces, *even if they are in some way wrong*.
>> This conformance violation should be addressed by the kernel first, and
>> only then should the C library follow suit.  That means that neither
>> this patch, nor the follow-up patch tackling cmsgbuf, should be applied
>> at all.  If either has already been applied, they should be backed out.
>
> I strongly disagree with this definition, the C library is still an
> abstraction on underlying kernel and GLIBC should and follows POSIX
> standards even when it deviates from the kernel primitives.  The same
> idea of standard is what drove the various fixes on both math library
> conformance and various primitives (quick_exit is an example).

You are going to have a very hard time persuading me to change my
position, and this ain't gonna do it. This is circular logic.  "We
should follow POSIX because we should follow POSIX."

I would consider a real (not made up for the purpose, and ideally,
already existing) program that is broken by not having these types be
as POSIX specifies to be a *valid argument* for changing the types,
but even that might not be a *persuasive* argument for changing the
types, especially since Florian has pointed out actual breakage from
changing them.  (Frankly, I think Florian's report of actual breakage
should be the last word on the subject - back the patch out already,
and let us never speak of this again.)

What would persuade you to accept *my* position on this issue?

(I'm cc:ing some of the usual standards-compliance gurus.  I'm
slightly more likely to be convinced by someone who is not advocating
for their own patch.)

> And it is also why some from community view explicit exposing some
> Linux primitives (such as gettid) to be a controversial subject.

As soon as I get some spare time (probably not in the 2.24 time frame)
I am going to post a patch that makes glibc expose every single one of
the Linux primitives that we don't already expose, because that's what
I think we should do.  But that's a tangent from this discussion.

...
>> Earlier, I said that I didn't like copying cmsgbuf because it wasn't
>> possible to be sure that no cmsg opcodes cared (now or in the future)
>> about the address of the buffer, and Adhemerval said (effectively) that
>> such an opcode would not make sense.  That's not true.  Imagine, if you
>> will, a cmsg that expects the ancillary buffer to be overlaid on a
>> shared memory area, and rewrites the *non*-ancillary buffer to reflect
>> the location of that memory area in the receiver.
>
> Again, I see to no problem in this scenario: the function prototype states
> a constant cmsghdr and it will not change its state. Even if the ancillary
> buffer might change, it is up to application to synchronize its access
> and call sendmsg in a flow where the data is a consistent state.  I personally
> see that calling a syscall with a buffer in racy condition does not make
> sense.

You clearly still don't get it.  It's not about the buffer being in a
racy condition.  It's that the *address might be part of the message.*
 "Nobody should do that" is NOT a valid objection, because this is an
arbitrarily extensible interface.

Let me try again with another example.  Imagine that there exists a
SCM_CREATE_SHMEM ancillary message whose effect is to *convert that
chunk of the ancillary buffer into a shared memory area*.  The kernel
will remap the data portion of the cmsg into the receiver, and supply
the receiver with the address at which it was mapped.  (You might be
about to object that it doesn't make sense to embed the desired shared
memory area in the cmsg, but, again, that is not a valid objection.
This is an arbitrarily extensible interface.  People can, will, and
*have* done arbitrarily bizarre things with it.)  Copying the
ancillary buffer, *in and of itself*, would break this message.  So
would applying any small size limit to the length of an ancillary
buffer.  And come to think of it, this hypothetical cmsg would also
justify the kernel's insisting to continue to use size_t for both
cmsg_len and msg_controllen.

zw
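
To make the two concerns above concrete, here is a small sketch of what
a copying wrapper does to the ancillary buffer.  Everything in it is
hypothetical: demo_stage_control(), the DEMO_CMSG_MAX cap and the ENOMEM
error are assumptions chosen for illustration, not values or names taken
from the RFC patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_CMSG_MAX 1024      /* hypothetical cap, not from the patch */

/* Copy the caller's ancillary buffer into wrapper-owned storage.
   Returns the pointer the kernel would be handed, or NULL with errno
   set when the caller's buffer is larger than the cap.  */
static const void *demo_stage_control(const void *user, size_t len,
                                      char scratch[DEMO_CMSG_MAX])
{
    assert(scratch != NULL);
    if (len > DEMO_CMSG_MAX) {
        errno = ENOMEM;         /* assumed error code */
        return NULL;
    }
    memcpy(scratch, user, len);
    return scratch;             /* same bytes, different address */
}
```

The contents survive the copy, but the address the kernel sees is no
longer the caller's, and any buffer above the cap is rejected outright.
Those are exactly the two properties a message like the imagined
SCM_CREATE_SHMEM would depend on.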
Adhemerval Zanella June 8, 2016, 8:57 p.m. UTC | #17
On 08/06/2016 17:15, Zack Weinberg wrote:
> On Tue, Jun 7, 2016 at 10:21 AM, Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>> On 07/06/2016 10:31, Zack Weinberg wrote:
>>>
>>> send/recv(m)msg are kernel primitives, and the fundamental basis of my
>>> objection is that I think the C library should always faithfully expose
>>> all the true kernel interfaces, *even if they are in some way wrong*.
>>> This conformance violation should be addressed by the kernel first, and
>>> only then should the C library follow suit.  That means that neither
>>> this patch, nor the follow-up patch tackling cmsgbuf, should be applied
>>> at all.  If either has already been applied, they should be backed out.
>>
>> I strongly disagree with this definition, the C library is still an
>> abstraction on underlying kernel and GLIBC should and follows POSIX
>> standards even when it deviates from the kernel primitives.  The same
>> idea of standard is what drove the various fixes on both math library
>> conformance and various primitives (quick_exit is an example).
> 
> You are going to have a very hard time persuading me to change my
> position, and this ain't gonna do it. This is circular logic.  "We
> should follow POSIX because we should follow POSIX."
> 
> I would consider a real (not made up for the purpose, and ideally,
> already existing) program that is broken by not having these types be
> as POSIX specifies to be a *valid argument* for changing the types,
> but even that might not be a *persuasive* argument for changing the
> types, especially since Florian has pointed out actual breakage from
> changing them.  (Frankly, I think Florian's report of actual breakage
> should be the last word on the subject - back the patch out already,
> and let us never speak of this again.)
> 
> What would persuade you to accept *my* position on this issue?

I am saying we follow POSIX for the very reason we follow any other
technical standard: to provide libc compatibility.

And the breakage Florian pointed out (and to which I replied) is a very
specific one that also requires the interposing library to know a
great deal about the interposed library.  This kind of tooling is tightly
coupled to the implementation, and there are various hacks and slight
breakages that minor GLIBC changes have already caused (for instance,
in libsanitizer every TCB size change needs to be explicitly taken
into account).

And I do not see the tooling breakage as a compelling reason to block
interface changes and fixes.

> 
> (I'm cc:ing some of the usual standards-compliance gurus.  I'm
> slightly more likely to be convinced by someone who is not advocating
> for their own patch.)
> 
>> And it is also why some from community view explicit exposing some
>> Linux primitives (such as gettid) to be a controversial subject.
> 
> As soon as I get some spare time (probably not in the 2.24 time frame)
> I am going to post a patch that makes glibc expose every single one of
> the Linux primitives that we don't already expose, because that's what
> I think we should do.  But that's a tangent from this discussion.

This has been discussed before, so I would suggest you first read
the list Joseph suggested [1].  The original thread [2] also has more
discussion of each syscall.

[1] https://sourceware.org/ml/libc-alpha/2015-11/msg00373.html
[2] https://sourceware.org/ml/libc-alpha/2013-02/msg00030.html

> 
> ...
>>> Earlier, I said that I didn't like copying cmsgbuf because it wasn't
>>> possible to be sure that no cmsg opcodes cared (now or in the future)
>>> about the address of the buffer, and Adhemerval said (effectively) that
>>> such an opcode would not make sense.  That's not true.  Imagine, if you
>>> will, a cmsg that expects the ancillary buffer to be overlaid on a
>>> shared memory area, and rewrites the *non*-ancillary buffer to reflect
>>> the location of that memory area in the receiver.
>>
>> Again, I see to no problem in this scenario: the function prototype states
>> a constant cmsghdr and it will not change its state. Even if the ancillary
>> buffer might change, it is up to application to synchronize its access
>> and call sendmsg in a flow where the data is a consistent state.  I personally
>> see that calling a syscall with a buffer in racy condition does not make
>> sense.
> 
> You clearly still don't get it.  It's not about the buffer being in a
> racy condition.  It's that the *address might be part of the message.*
>  "Nobody should do that" is NOT a valid objection, because this is an
> arbitrarily extensible interface.
> 
> Let me try again with another example.  Imagine that there exists a
> SCM_CREATE_SHMEM ancillary message whose effect is to *convert that
> chunk of the ancillary buffer into a shared memory area*.  The kernel
> will remap the data portion of the cmsg into the receiver, and supply
> the receiver with the address at which it was mapped.  (You might be
> about to object that it doesn't make sense to embed the desired shared
> memory area in the cmsg, but, again, that is not a valid objection.
> This is an arbitrarily extensible interface.  People can, will, and
> *have* done arbitrarily bizarre things with it.)  Copying the
> ancillary buffer, *in and of itself*, would break this message.  So
> would applying any small size limit to the length of an ancillary
> buffer.  And come to think of it, this hypothetical cmsg would also
> justify the kernel's insisting to continue to use size_t for both
> cmsg_len and msg_controllen.

This very interface does not make sense: the ancillary message would
have to be remapped anyway on the syscall transition to kernel
space.  So if you try to remap a 1GB buffer in this
hypothetical syscall, the kernel would first need to copy
the 1GB message to kernel space and then remap the original pointer.
I highly doubt the kernel will ever support such a syscall, or any syscall
to which you pass a buffer that is supposed to be volatile.
Carlos O'Donell June 9, 2016, 3:18 a.m. UTC | #18
On 06/08/2016 04:57 PM, Adhemerval Zanella wrote:
> On 08/06/2016 17:15, Zack Weinberg wrote:
>> What would persuade you to accept *my* position on this issue?
> 
> I am stating we follow POSIX to very reason we follow other technical
> standard: to provide libc compatibility.

There are some places where POSIX and Linux disagree.

In as much as we possibly can we should try to standardize on POSIX.

Each decision should be made on a case-by-case basis, and I think in
this case of recvmsg/sendmsg it is possible to comply with POSIX without
serious problems. I've only seen rare examples of theoretical breakage,
and the reported dlsym issue simply needs fixing in the dynamic loader.

I support the use of a standard, and working to try and comply with
the standard, and also to work with the standard to fix it as we go.
The mess that was futex return codes before Michael, Torvald and
others started documenting and refining things should show how out
of control some linux syscalls can get.

If we really want better network APIs then we should just be creating
them ourselves, implementing them in Linux, and exporting them via
glibc wrappers/headers. We should make APIs that are *better* than
recvmsg/sendmsg. Once we get a good API we can propose that to POSIX.
Rather than the other way around, which is a half-hearted attempt to
be POSIX compatible, but with enough changes to cause portability
issues and documentation problems.

I will end by giving a counter example...

In POSIX section 2.9.7 it states that 39 filesystem operations need to
be atomic with respect to each other across the threads of a process.  This
is a huge performance killer, particularly the write and fstat atomicity.
As of Linux 3.14, two threads calling write at exactly the same time
(on the same fd) will actually get two distinct offsets, which is good:
you get interleaved writes.  The rest of the functions will never be, and
should not be, guaranteed atomic (with respect to threads in the same
process), and glibc on Linux should never follow POSIX in this case.  To
support POSIX we would need userspace I/O locks to purposely serialize
these operations.

Each case for conformance should be considered on its own merits.

Regarding adding all linux syscalls, please see:
https://sourceware.org/glibc/wiki/Consensus#WIP:_Kernel_syscalls_wrappers
and the notes that Adhemerval provided.
Zack Weinberg June 9, 2016, 1:25 p.m. UTC | #19
On Wed, Jun 8, 2016 at 4:57 PM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
> On 08/06/2016 17:15, Zack Weinberg wrote:
>> On Tue, Jun 7, 2016 at 10:21 AM, Adhemerval Zanella
>> <adhemerval.zanella@linaro.org> wrote:
>>> On 07/06/2016 10:31, Zack Weinberg wrote:
>>>>
>>>> send/recv(m)msg are kernel primitives, and the fundamental basis of my
>>>> objection is that I think the C library should always faithfully expose
>>>> all the true kernel interfaces, *even if they are in some way wrong*.
...
>> You are going to have a very hard time persuading me to change my
>> position, and this ain't gonna do it. This is circular logic.  "We
>> should follow POSIX because we should follow POSIX."
>>
>> I would consider a real (not made up for the purpose, and ideally,
>> already existing) program that is broken by not having these types be
>> as POSIX specifies to be a *valid argument* for changing the types,
>> but even that might not be a *persuasive* argument for changing the
>> types[.]
...
>> What would persuade you to accept *my* position on this issue?
>
> I am stating we follow POSIX to very reason we follow other technical
> standard: to provide libc compatibility.

This is not a response to my argument.

I said that I consider faithfully exposing all the true kernel
interfaces to be MORE IMPORTANT than following POSIX.  I also told you
what I would consider to be a valid (but not necessarily persuasive)
argument against my position: exhibit a real program that is broken by
this conformance deviation.

You respond by reiterating that following POSIX is important in the
abstract.  Sure, but I still think that faithfully exposing all the
true kernel interfaces is *more* important.  You give a reason why
it's important in the abstract: to provide compatibility with other
systems.  Sure, but for that to have any persuasive force at all, you
need to exhibit a real program that cares.

And I would still like to know what could persuade *you* to agree with
*me*; maybe then I could actually present that.

> And the breakage Florian has pointed (and I replied) is a very
> specific one that also require the interposing library to know a
> very deal of the interposed library.  This kind of tooling is highly
> coupled with implementation and there are various hacks and slight
> breakages that minor GLIBC changes already incurred (for instance
> on libsanitizer, every TCB size change needs to be explicit take
> in account).

I'm sorry, I have not been able to make any sense whatsoever out of
this paragraph.

> And I do not see the tooling breakage as compelling reason to break
> interface changes and fixes.

If you're trying to say that you think the breakage Florian cited is
too unusual to worry about, I cannot agree with that.  We're bending
over backward to keep old Emacs binaries working that depended on
glibc-specific details of the malloc implementation -- interposition
of network APIs is commonplace by comparison.

(I also suspect that this *can't* be easily papered over by messing
with the semantics of dlsym() -- see upcoming reply to Carlos.)

>> You clearly still don't get it.  It's not about the buffer being in a
>> racy condition.  It's that the *address might be part of the message.*
>> "Nobody should do that" is NOT a valid objection, because this is an
>> arbitrarily extensible interface.
>>
>> Let me try again with another example.
[...]
> This very interface does not make sense

What part of '"Nobody should do that" is NOT a valid objection' was
unclear?

zw
Zack Weinberg June 9, 2016, 1:35 p.m. UTC | #20
On Wed, Jun 8, 2016 at 11:18 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 06/08/2016 04:57 PM, Adhemerval Zanella wrote:
>> On 08/06/2016 17:15, Zack Weinberg wrote:
>>> What would persuade you to accept *my* position on this issue?
>>
>> I am stating we follow POSIX for the very reason we follow other technical
>> standards: to provide libc compatibility.
>
> There are some places where POSIX and Linux disagree.
> In as much as we possibly can we should try to standardize on POSIX.

Right, so as I have been saying all along, I don't agree with this; I
think it is *more important* to faithfully reflect the semantics of
each supported kernel.  If that means that glibc-on-Linux has
different semantics than glibc-on-FreeBSD (for instance), well, that
was probably unavoidable anyway.

The reason for this is that POSIX is very slow, and the C library
should not stand in the way of *improvements* to the kernel interface.
The case we're arguing about is relatively minor, but maybe someone
has an actual need for size_t-sized ancillary data buffers - I don't
think we should be judging whether that is a valid use case.

(A much more significant issue here is the ridiculous delays in adding
things like gettid() and getrandom().)

> Each decision should be done on a case-by-case basis, and I think in
> this case of recvmsg/sendmsg it is possible to comply with POSIX without
> serious problems. I've only seen rare examples of theoretical breakage,
> and the reported dlsym issue simply needs fixing in the dynamic loader.

I do not think the reported dlsym issue is a simple matter to fix.  I
haven't sat down and worked it out, but I am fairly sure it is
possible, for most symbols that glibc exposes multiple versions of, to
construct a pair of interposition modules which are black-box
indistinguishable except that one of them needs version A and the
other needs version B -- which is to say, whatever dlsym("name",
RTLD_NEXT) does, it's going to be wrong for one of them.  Which is why
there is dlvsym()... but changing the semantics of dlsym() strikes me
as asking for much greater breakage.

zw
Joseph Myers June 9, 2016, 2:21 p.m. UTC | #21
On Wed, 8 Jun 2016, Carlos O'Donell wrote:

> Regarding adding all linux syscalls, please see:
> https://sourceware.org/glibc/wiki/Consensus#WIP:_Kernel_syscalls_wrappers
> and the notes that Adhemerval provided.

Note that we have never managed to reach consensus on syscall wrappers.  
However, I now disagree with the idea of a separate libinux-syscalls.so 
library, or of only adding to libc.so those syscalls considered 
appropriate for the OS-independent GNU API.  I think that all non-obsolete 
syscalls that can meaningfully be used outside of glibc in a glibc-using 
program, even those extremely OS-specific or only likely to be of use to a 
handful of programs on an OS, should have wrappers added to glibc, and 
that we should not need to decide when adding them whether they are 
appropriate for the OS-independent GNU API (although if there is consensus 
for putting them in the OS-independent GNU API at the time they are added, 
they should go there at that point - other functions might be added to the 
OS-independent GNU API later).

We *do* still need to decide for such wrappers what the userspace types 
are to use with them, what the prototype is, what header declares the 
functions, and how errno and thread cancellation are handled.  And we *do* 
need documentation in the glibc manual for all such wrappers, and 
testcases in the glibc testsuite (though some such tests may not be able 
to do more than test that a call to the function compiles and links).
Adhemerval Zanella June 9, 2016, 2:25 p.m. UTC | #22
On 09/06/2016 10:25, Zack Weinberg wrote:
> On Wed, Jun 8, 2016 at 4:57 PM, Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>> On 08/06/2016 17:15, Zack Weinberg wrote:
>>> On Tue, Jun 7, 2016 at 10:21 AM, Adhemerval Zanella
>>> <adhemerval.zanella@linaro.org> wrote:
>>>> On 07/06/2016 10:31, Zack Weinberg wrote:
>>>>>
>>>>> send/recv(m)msg are kernel primitives, and the fundamental basis of my
>>>>> objection is that I think the C library should always faithfully expose
>>>>> all the true kernel interfaces, *even if they are in some way wrong*.
> ...
>>> You are going to have a very hard time persuading me to change my
>>> position, and this ain't gonna do it. This is circular logic.  "We
>>> should follow POSIX because we should follow POSIX."
>>>
>>> I would consider a real (not made up for the purpose, and ideally,
>>> already existing) program that is broken by not having these types be
>>> as POSIX specifies to be a *valid argument* for changing the types,
>>> but even that might not be a *persuasive* argument for changing the
>>> types[.]
> ...
>>> What would persuade you to accept *my* position on this issue?
>>
>> I am stating we follow POSIX for the very reason we follow other technical
>> standards: to provide libc compatibility.
> 
> This is not a response to my argument.
> 
> I said that I consider faithfully exposing all the true kernel
> interfaces to be MORE IMPORTANT than following POSIX.  I also told you
> what I would consider to be a valid (but not necessarily persuasive)
> argument against my position: exhibit a real program that is broken by
> this conformance deviation.
> 
> You respond by reiterating that following POSIX is important in the
> abstract.  Sure, but I still think that faithfully exposing all the
> true kernel interfaces is *more* important.  You give a reason why
> it's important in the abstract: to provide compatibility with other
> systems.  Sure, but for that to have any persuasive force at all, you
> need to exhibit a real program that cares.
> 
> And I would still like to know what could persuade *you* to agree with
> *me*; maybe then I could actually present that.

As Carlos has pointed out, my idea is not to blindly follow every POSIX
statement, but rather to discuss whether each standard definition and
interface makes sense to implement in glibc. I see not only
conformance but also compatibility as goals GLIBC aims to provide.
I do not see why I should have to provide a real program, mostly
because real programs will work around the system difference by
themselves.

Now, regarding this specific issue: BZ#16919 was opened in 2014 and there
was some discussion about how it should be fixed in GLIBC. What you are
advocating is that we close it as WONTFIX, keep the documentation
that GLIBC does not follow the standard, and state that we won't fix
it in GLIBC.

I personally do not oppose this course of action, but we need
*consensus*. What I do not agree with is GLIBC blindly following kernel
interfaces either, even when they differ from POSIX or any other standard
GLIBC aims to follow.

But I do agree that the libsanitizer breakage is something we should
consider, even if it relies on implementation details not exposed by the
API. That is why I talked with Carlos and Joseph on IRC (and my mistake
here, I should have brought it up on the mailing list as well). My
understanding is that it is still desirable to push this fix.

> 
>> And the breakage Florian has pointed (and I replied) is a very
>> specific one that also require the interposing library to know a
>> very deal of the interposed library.  This kind of tooling is highly
>> coupled with implementation and there are various hacks and slight
>> breakages that minor GLIBC changes already incurred (for instance
>> on libsanitizer, every TCB size change needs to be explicit take
>> in account).
> 
> I'm sorry, I have not been able to make any sense whatsoever out of
> this paragraph.

I mean that the way libsanitizer couples with GLIBC requires knowing
internal details that are not part of the ABI/exported interfaces.
It is also possible to hook the libc-provided symbols with some
sanitizer changes, so stating that this change breaks libsanitizer
is not a strong reason to block this fix IMHO.

> 
>> And I do not see the tooling breakage as compelling reason to break
>> interface changes and fixes.
> 
> If you're trying to say that you think the breakage Florian cited is
> too unusual to worry about, I cannot agree with that.  We're bending
> over backward to keep old Emacs binaries working that depended on
> glibc-specific details of the malloc implementation -- interposition
> of network APIs is commonplace by comparison.
> 
> (I also suspect that this *can't* be easily papered over by messing
> with the semantics of dlsym() -- see upcoming reply to Carlos.)

Because EMACS uses a defined ABI/interface that GLIBC intended to expose.
The specific case Florian pointed out is another bug [1].

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=14932

> 
>>> You clearly still don't get it.  It's not about the buffer being in a
>>> racy condition.  It's that the *address might be part of the message.*
>>> "Nobody should do that" is NOT a valid objection, because this is an
>>> arbitrarily extensible interface.
>>>
>>> Let me try again with another example.
> [...]
>> This very interface does not make sense
> 
> What part of '"Nobody should do that" is NOT a valid objection' was
> unclear?

It is not 'nobody should do that', it is 'the interface you described
does not make sense'.
Carlos O'Donell June 10, 2016, 6:01 p.m. UTC | #23
On 06/09/2016 09:35 AM, Zack Weinberg wrote:
> On Wed, Jun 8, 2016 at 11:18 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 06/08/2016 04:57 PM, Adhemerval Zanella wrote:
>>> On 08/06/2016 17:15, Zack Weinberg wrote:
>>>> What would persuade you to accept *my* position on this issue?
>>>
>>> I am stating we follow POSIX for the very reason we follow other technical
>>> standards: to provide libc compatibility.
>>
>> There are some places where POSIX and Linux disagree.
>> In as much as we possibly can we should try to standardize on POSIX.
> 
> Right, so as I have been saying all along, I don't agree with this; I
> think it is *more important* to faithfully reflect the semantics of
> each supported kernel.  If that means that glibc-on-Linux has
> different semantics than glibc-on-FreeBSD (for instance), well, that
> was probably unavoidable anyway.

That's fine so long as you don't pretend to be POSIX-compatible.
 
> The reason for this is that POSIX is very slow, and the C library
> should not stand in the way of *improvements* to the kernel interface.

It doesn't stand in the way. We have lots of interfaces that are
non-POSIX and we aren't touching them.

> The case we're arguing about is relatively minor, but maybe someone
> has an actual need for size_t-sized ancillary data buffers - I don't
> think we should be judging whether that is a valid use case.

The problem I have with this line of argument is that a standards-conforming
interface name was taken and changed.

If you really wanted size_t sized ancillary data buffers we should have
just declared a different type and new functions to handle that. It's
that easy.

> (A much more significant issue here is the ridiculous delays in adding
> things like gettid() and getrandom().)

There are no delays in adding gettid or getrandom; there is only a
lack of skilled people willing to do the work at the level of quality
required of a core library implementation.

>> Each decision should be done on a case-by-case basis, and I think in
>> this case of recvmsg/sendmsg it is possible to comply with POSIX without
>> serious problems. I've only seen rare examples of theoretical breakage,
>> and the reported dlsym issue simply needs fixing in the dynamic loader.
> 
> I do not think the reported dlsym issue is a simple matter to fix.  I
> haven't sat down and worked it out, but I am fairly sure it is
> possible, for most symbols that glibc exposes multiple versions of, to
> construct a pair of interposition modules which are black-box
> indistinguishable except that one of them needs version A and the
> other needs version B -- which is to say, whatever dlsym("name",
> RTLD_NEXT) does, it's going to be wrong for one of them.  Which is why
> there is dlvsym()... but changing the semantics of dlsym() strikes me
> as asking for much greater breakage.
 
The entire dlsym/dlvsym discussion is a red herring. At some point you'll
need to interpose yourself in front of a versioned interface and you'll
need two things to do it properly:
* A versioned symbol of your own.
* Use of dlvsym().
Anything that doesn't work with this setup is a bug we need to fix.

Fixing dlsym() is a required fix because the semantics are just plain
wrong, and it happened to fix the present use case, but eventually it
doesn't work.
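
As an illustration of the dlsym()/dlvsym() distinction discussed above,
here is a minimal sketch of the lookup an interposer would perform.  The
helper name resolve_next is hypothetical; RTLD_NEXT, dlsym() and dlvsym()
are the existing glibc interfaces (the latter two need _GNU_SOURCE).  The
version string a real interposer would pass is architecture-specific and
is deliberately left to the caller here.

```c
#define _GNU_SOURCE		/* For RTLD_NEXT and dlvsym.  */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: find the next definition of NAME after the
   calling object.  With VERSION == NULL this is the plain dlsym lookup,
   which leaves the choice of symbol version to the dynamic loader.
   With a VERSION string the caller pins the exact version it was
   written against, which is what dlvsym exists for.  */
static void *
resolve_next (const char *name, const char *version)
{
  assert (name != NULL);
  if (version != NULL)
    return dlvsym (RTLD_NEXT, name, version);
  return dlsym (RTLD_NEXT, name);
}
```

An interposer built against the old struct layout would pass the old
version string, and one built against the new layout would pass the new
one; the unversioned lookup cannot tell the two apart.  On glibc older
than 2.34 this needs -ldl at link time.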
Joseph Myers June 10, 2016, 6:19 p.m. UTC | #24
On Fri, 10 Jun 2016, Carlos O'Donell wrote:

> > (A much more significant issue here is the ridiculous delays in adding
> > things like gettid() and getrandom().)
> 
> There are no delays in adding gettid or getrandom, there are only the
> lack of skilled people willing to do the work at level of quality required
> of a core library implementation.

For gettid I think the actual issue is difficulty in establishing 
consensus in the absence of unanimity.  That's caused issues for all of: 
gettid / pthread_gettid_np, explicit_bzero, strlcpy / strlcat, Linux 
syscall wrappers in general especially where not appropriate for the 
OS-independent GNU API.  (I think all of those are appropriate for 
inclusion in glibc.)
Carlos O'Donell June 10, 2016, 6:45 p.m. UTC | #25
On 06/10/2016 02:19 PM, Joseph Myers wrote:
> On Fri, 10 Jun 2016, Carlos O'Donell wrote:
> 
>>> (A much more significant issue here is the ridiculous delays in adding
>>> things like gettid() and getrandom().)
>>
>> There are no delays in adding gettid or getrandom, there are only the
>> lack of skilled people willing to do the work at level of quality required
>> of a core library implementation.
> 
> For gettid I think the actual issue is difficulty in establishing 
> consensus in the absence of unanimity.  That's caused issues for all of: 
> gettid / pthread_gettid_np, explicit_bzero, strlcpy / strlcat, Linux 
> syscall wrappers in general especially where not appropriate for the 
> OS-independent GNU API.  (I think all of those are appropriate for 
> inclusion in glibc.)
 
The OS-independent GNU API is in my mind an issue of documentation,
describing which APIs are Linux-specific and which are not.

I would be happy to see gettid implemented as a wrapper if someone
would step up to document exactly how it behaves and where it is valid
to use the resulting return value and in what APIs. That I think is
where everyone balks at the work.
Joseph Myers June 10, 2016, 8:17 p.m. UTC | #26
On Fri, 10 Jun 2016, Carlos O'Donell wrote:

> > For gettid I think the actual issue is difficulty in establishing 
> > consensus in the absence of unanimity.  That's caused issues for all of: 
> > gettid / pthread_gettid_np, explicit_bzero, strlcpy / strlcat, Linux 
> > syscall wrappers in general especially where not appropriate for the 
> > OS-independent GNU API.  (I think all of those are appropriate for 
> > inclusion in glibc.)
>  
> The OS-independent GNU API is in my mind an issue of documentation,
> describing which APIs are Linux-specific and which are not.

There were objections from Roland and Mike at least to adding 
Linux-specific APIs to libc.so.

Now, I disagree with the proposal for libinux-syscalls.so:

* Because of thread cancellation (something that needs considering for 
each syscall wrapper, along with errno and the choice of types), 
libinux-syscalls.so would be closely tied to a particular libc version, so 
you can't readily keep around copies from multiple libc versions.

* Thus, removing any interfaces from it to add them to libc.so would be 
problematic, because glibc would need to install multiple versions of 
libinux-syscalls.so.

* We cannot readily tell when adding an interface whether it would remain 
Linux-specific or be adopted by other OSes in future, and so whether it 
belongs in the OS-independent GNU API or not.  So we cannot tell when 
adding an interface which library it belongs in.  So we'd be likely 
repeatedly to add interfaces to libinux-syscalls.so and then add them to 
libc.so as well, accumulating compat duplicates in libinux-syscalls.so.

But it's not clear there's consensus on adding syscall wrappers to 
libc.so, and there certainly isn't unanimity.

> I would be happy to see gettid implemented as a wrapper if someone
> would step up to document exactly how it behaves and where it is valid
> to use the resulting return value and in what APIs. That I think is
> where everyone balks at the work.

I think the answer is that the return value may be used with syscalls that 
take tids (which are legitimate interfaces for glibc users to use, in most 
cases), some of which may have matching glibc interfaces and some of which 
may not.  While we should describe what wrappers do in the glibc manual, 
ultimately for Linux-specific syscall wrappers the kernel's behavior is 
more authoritative than the manual.
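
To make the "tid passed to a syscall that takes tids" case concrete, here
is a minimal sketch that goes through the generic syscall() function
rather than a dedicated wrapper.  The helper names raw_gettid and
tid_is_live are hypothetical; syscall(), SYS_gettid and SYS_tgkill are
the existing Linux interfaces.

```c
#define _GNU_SOURCE		/* For syscall().  */
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the kernel thread ID of the caller.  */
static pid_t
raw_gettid (void)
{
  pid_t tid = (pid_t) syscall (SYS_gettid);
  assert (tid > 0);
  return tid;
}

/* Hypothetical helper: nonzero if TID names a thread of this process.
   tgkill with signal 0 performs only the existence and permission
   checks and delivers nothing.  */
static int
tid_is_live (pid_t tid)
{
  return syscall (SYS_tgkill, getpid (), tid, 0) == 0;
}
```

In the initial thread of a process the value returned by raw_gettid is
equal to getpid(); in any other thread the two differ.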
Carlos O'Donell June 13, 2016, 2:43 p.m. UTC | #27
On 06/10/2016 04:17 PM, Joseph Myers wrote:
> I think the answer is that the return value may be used with syscalls that 
> take tids (which are legitimate interfaces for glibc users to use, in most 
> cases), some of which may have matching glibc interfaces and some of which 
> may not.  While we should describe what wrappers do in the glibc manual, 
> ultimately for Linux-specific syscall wrappers the kernel's behavior is 
> more authoritative than the manual.
 
The authoritative source in this case is the Linux man-pages project IMO.

Patch
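
The padding scheme this patch applies to struct msghdr can be sketched in
isolation as follows.  The two struct names and the helper are
hypothetical stand-ins, not the real definitions: one struct mimics a
kernel-side layout with a size_t count, the other the POSIX-typed layout
with an int count plus an explicit pad.  The sketch assumes a compiler
that predefines __BYTE_ORDER__ (GCC and Clang do) and that size_t is
either the same width as int or exactly twice as wide.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical kernel-side view: the count is a full size_t.  */
struct kernel_view
{
  void *iov;
  size_t iovlen;
};

/* Hypothetical POSIX-side view: the count is an int.  Where size_t is
   wider than int, an explicit pad keeps the int over the low-order
   bytes of the kernel-side size_t.  */
struct posix_view
{
  void *iov;
#if SIZE_MAX > UINT_MAX
# if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  int pad;
  int iovlen;
# else
  int iovlen;
  int pad;
# endif
#else
  int iovlen;
#endif
};

/* The overlay only works if both views have the same size.  */
_Static_assert (sizeof (struct posix_view) == sizeof (struct kernel_view),
		"views must overlay exactly");

/* Fill in the POSIX view with the pad zeroed and report the count as
   the kernel-side view would read it.  */
static size_t
count_seen_by_kernel (int count)
{
  struct posix_view user;
  struct kernel_view kern;

  assert (count >= 0);
  memset (&user, 0, sizeof user);	/* Zeroes the pad as well.  */
  user.iovlen = count;
  memcpy (&kern, &user, sizeof kern);
  return kern.iovlen;
}
```

If the pad held stale bytes instead of zeros, the kernel-side view would
see a huge count.  The same concern is why the callers changed by this
patch move from positional to designated initializers: a positional
initializer would otherwise land in a pad field.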

diff --git a/conform/data/sys/socket.h-data b/conform/data/sys/socket.h-data
index 442d4d2..3a6cf7c 100644
--- a/conform/data/sys/socket.h-data
+++ b/conform/data/sys/socket.h-data
@@ -22,10 +22,9 @@  type {struct msghdr}
 element {struct msghdr} {void*} msg_name
 element {struct msghdr} socklen_t msg_namelen
 element {struct msghdr} {struct iovec*} msg_iov
-// Bug 16919: wrong type for msg_iovlen and msg_controllen members.
-xfail-element {struct msghdr} int msg_iovlen
+element {struct msghdr} int msg_iovlen
 element {struct msghdr} {void*} msg_control
-xfail-element {struct msghdr} socklen_t msg_controllen
+element {struct msghdr} socklen_t msg_controllen
 element {struct msghdr} int msg_flags
 
 type {struct iovec}
@@ -35,8 +34,7 @@  element {struct iovec} size_t iov_len
 
 type {struct cmsghdr}
 
-// Bug 16919: wrong type for cmsg_len member.
-xfail-element {struct cmsghdr} socklen_t cmsg_len
+element {struct cmsghdr} socklen_t cmsg_len
 element {struct cmsghdr} int cmsg_level
 element {struct cmsghdr} int cmsg_type
 
diff --git a/nptl/Makefile b/nptl/Makefile
index dc3ccab..4240928 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -109,12 +109,13 @@  libpthread-routines = nptl-init vars events version pt-interp \
 		      lll_timedlock_wait lll_timedwait_tid \
 		      pt-fork pt-vfork \
 		      ptw-write ptw-read ptw-close ptw-fcntl ptw-accept \
-		      ptw-connect ptw-recv ptw-recvfrom ptw-recvmsg ptw-send \
-		      ptw-sendmsg ptw-sendto ptw-fsync ptw-lseek ptw-llseek \
+		      ptw-connect ptw-recv ptw-recvfrom ptw-send \
+		      ptw-sendto ptw-fsync ptw-lseek ptw-llseek \
 		      ptw-msync ptw-nanosleep ptw-open ptw-open64 ptw-pause \
 		      ptw-pread ptw-pread64 ptw-pwrite ptw-pwrite64 \
 		      ptw-tcdrain ptw-wait ptw-waitpid ptw-msgrcv ptw-msgsnd \
 		      ptw-sigwait ptw-sigsuspend \
+		      ptw-oldrecvmsg ptw-oldsendmsg \
 		      pt-raise pt-system \
 		      flockfile ftrylockfile funlockfile \
 		      sigaction \
@@ -204,10 +205,10 @@  CFLAGS-recv.c = -fexceptions -fasynchronous-unwind-tables
 CFLAGS-send.c = -fexceptions -fasynchronous-unwind-tables
 CFLAGS-accept.c = -fexceptions -fasynchronous-unwind-tables
 CFLAGS-sendto.c = -fexceptions -fasynchronous-unwind-tables
-CFLAGS-sendmsg.c = -fexceptions -fasynchronous-unwind-tables
 CFLAGS-connect.c = -fexceptions -fasynchronous-unwind-tables
-CFLAGS-recvmsg.c = -fexceptions -fasynchronous-unwind-tables
 CFLAGS-recvfrom.c = -fexceptions -fasynchronous-unwind-tables
+CFLAGS-oldrecvmsg.c = -fexceptions -fasynchronous-unwind-tables
+CFLAGS-oldsendmsg.c = -fexceptions -fasynchronous-unwind-tables
 
 CFLAGS-pt-system.c = -fexceptions
 
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 9999600..af8e13a 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -124,7 +124,11 @@  ifeq ($(subdir),socket)
 sysdep_headers += net/if_ppp.h net/ppp-comp.h \
 		  net/ppp_defs.h net/if_arp.h net/route.h net/ethernet.h \
 		  net/if_slip.h net/if_packet.h net/if_shaper.h
-sysdep_routines += cmsg_nxthdr
+sysdep_routines += cmsg_nxthdr oldrecvmsg oldsendmsg
+CFLAGS-recvmsg.c = -fexceptions -fasynchronous-unwind-tables
+CFLAGS-sendmsg.c = -fexceptions -fasynchronous-unwind-tables
+CFLAGS-oldrecvmsg.c = -fexceptions -fasynchronous-unwind-tables
+CFLAGS-oldsendmsg.c = -fexceptions -fasynchronous-unwind-tables
 endif
 
 ifeq ($(subdir),sunrpc)
diff --git a/sysdeps/unix/sysv/linux/aarch64/Versions b/sysdeps/unix/sysv/linux/aarch64/Versions
index 9bd87fe..ae3742c 100644
--- a/sysdeps/unix/sysv/linux/aarch64/Versions
+++ b/sysdeps/unix/sysv/linux/aarch64/Versions
@@ -5,6 +5,10 @@  ld {
   }
 }
 libc {
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
+
   GLIBC_PRIVATE {
     __vdso_clock_gettime;
     __vdso_clock_getres;
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index 5799239..c3f2346 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2087,3 +2087,6 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
diff --git a/sysdeps/unix/sysv/linux/alpha/Versions b/sysdeps/unix/sysv/linux/alpha/Versions
index 29b82f9..31abb22 100644
--- a/sysdeps/unix/sysv/linux/alpha/Versions
+++ b/sysdeps/unix/sysv/linux/alpha/Versions
@@ -85,6 +85,9 @@  libc {
     #errlist-compat	140
     _sys_errlist; sys_errlist; _sys_nerr; sys_nerr;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
   GLIBC_PRIVATE {
     __libc_alpha_cache_shape;
   }
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index 0fa4ee9..7822242 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -1998,6 +1998,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/arm/Versions b/sysdeps/unix/sysv/linux/arm/Versions
index 5ff2225..7e5ba53 100644
--- a/sysdeps/unix/sysv/linux/arm/Versions
+++ b/sysdeps/unix/sysv/linux/arm/Versions
@@ -7,6 +7,9 @@  libc {
   GLIBC_2.11 {
     fallocate64;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
   GLIBC_PRIVATE {
     # A copy of sigaction lives in libpthread, and needs these.
     __default_sa_restorer; __default_rt_sa_restorer;
diff --git a/sysdeps/unix/sysv/linux/arm/libc.abilist b/sysdeps/unix/sysv/linux/arm/libc.abilist
index db9fa35..2b2f9f0 100644
--- a/sysdeps/unix/sysv/linux/arm/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/libc.abilist
@@ -88,6 +88,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0xa0
diff --git a/sysdeps/unix/sysv/linux/bits/socket.h b/sysdeps/unix/sysv/linux/bits/socket.h
index 0581c79..9c284a4 100644
--- a/sysdeps/unix/sysv/linux/bits/socket.h
+++ b/sysdeps/unix/sysv/linux/bits/socket.h
@@ -27,6 +27,8 @@ 
 #include <stddef.h>
 
 #include <sys/types.h>
+#include <endian.h>
+#include <bits/wordsize.h>
 
 /* Type for length arguments in socket calls.  */
 #ifndef __socklen_t_defined
@@ -231,13 +233,32 @@  struct msghdr
     socklen_t msg_namelen;	/* Length of address data.  */
 
     struct iovec *msg_iov;	/* Vector of data to send/receive into.  */
-    size_t msg_iovlen;		/* Number of elements in the vector.  */
+#if __WORDSIZE == 64
+# if __BYTE_ORDER == __BIG_ENDIAN
+    int __glibc_reserved1;	/* Pad to adjust Linux size to POSIX defined
+				   size for msg_iovlen.  */
+    int msg_iovlen;		/* Number of elements in the vector.  */
+# else
+    int msg_iovlen;
+    int __glibc_reserved1;
+# endif
+#else
+    int msg_iovlen;
+#endif
 
     void *msg_control;		/* Ancillary data (eg BSD filedesc passing). */
-    size_t msg_controllen;	/* Ancillary data buffer length.
-				   !! The type should be socklen_t but the
-				   definition of the kernel is incompatible
-				   with this.  */
+#if __WORDSIZE == 64
+# if __BYTE_ORDER == __BIG_ENDIAN
+    int __glibc_reserved2;	/* Pad to adjust Linux size to POSIX defined
+				   size for msg_controllen.  */
+    socklen_t msg_controllen;	/* Ancillary data buffer length.  */
+# else
+    socklen_t msg_controllen;
+    int __glibc_reserved2;
+# endif
+#else
+    socklen_t msg_controllen;
+#endif
 
     int msg_flags;		/* Flags on received message.  */
   };
@@ -245,11 +266,19 @@  struct msghdr
 /* Structure used for storage of ancillary data object information.  */
 struct cmsghdr
   {
-    size_t cmsg_len;		/* Length of data in cmsg_data plus length
-				   of cmsghdr structure.
-				   !! The type should be socklen_t but the
-				   definition of the kernel is incompatible
-				   with this.  */
+#if __WORDSIZE == 64
+# if __BYTE_ORDER == __BIG_ENDIAN
+    int __glibc_reserved1;	/* Pad to adjust Linux size to POSIX defined
+				   size for cmsg_len.  */
+    socklen_t cmsg_len;		/* Length of data in cmsg_data plus length
+				   of cmsghdr structure.  */
+# else
+    socklen_t cmsg_len;
+    int __glibc_reserved1;
+# endif
+#else
+    socklen_t cmsg_len;
+#endif
     int cmsg_level;		/* Originating protocol.  */
     int cmsg_type;		/* Protocol specific type.  */
 #if (!defined __STRICT_ANSI__ && __GNUC__ >= 2) || __STDC_VERSION__ >= 199901L
diff --git a/sysdeps/unix/sysv/linux/check_native.c b/sysdeps/unix/sysv/linux/check_native.c
index b3cbbe3..a8e447e 100644
--- a/sysdeps/unix/sysv/linux/check_native.c
+++ b/sysdeps/unix/sysv/linux/check_native.c
@@ -111,10 +111,13 @@  __check_native (uint32_t a1_index, int *a1_native,
     {
       struct msghdr msg =
 	{
-	  (void *) &nladdr, sizeof (nladdr),
-	  &iov, 1,
-	  NULL, 0,
-	  0
+	  .msg_name = (void *) &nladdr,
+	  .msg_namelen =  sizeof (nladdr),
+	  .msg_iov = &iov,
+	  .msg_iovlen = 1,
+	  .msg_control = NULL,
+	  .msg_controllen = 0,
+	  .msg_flags = 0
 	};
 
       ssize_t read_len = TEMP_FAILURE_RETRY (__recvmsg (fd, &msg, 0));
diff --git a/sysdeps/unix/sysv/linux/check_pf.c b/sysdeps/unix/sysv/linux/check_pf.c
index d55953a..89e9031 100644
--- a/sysdeps/unix/sysv/linux/check_pf.c
+++ b/sysdeps/unix/sysv/linux/check_pf.c
@@ -158,10 +158,13 @@  make_request (int fd, pid_t pid)
     {
       struct msghdr msg =
 	{
-	  (void *) &nladdr, sizeof (nladdr),
-	  &iov, 1,
-	  NULL, 0,
-	  0
+	  .msg_name = (void *) &nladdr,
+	  .msg_namelen =  sizeof (nladdr),
+	  .msg_iov = &iov,
+	  .msg_iovlen = 1,
+	  .msg_control = NULL,
+	  .msg_controllen = 0,
+	  .msg_flags = 0
 	};
 
       ssize_t read_len = TEMP_FAILURE_RETRY (__recvmsg (fd, &msg, 0));
diff --git a/sysdeps/unix/sysv/linux/hppa/Versions b/sysdeps/unix/sysv/linux/hppa/Versions
index b5098b2..895696e 100644
--- a/sysdeps/unix/sysv/linux/hppa/Versions
+++ b/sysdeps/unix/sysv/linux/hppa/Versions
@@ -35,4 +35,7 @@  libc {
   GLIBC_2.19 {
     fanotify_mark;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index 1d30644..84e8431 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -1852,6 +1852,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/i386/Versions b/sysdeps/unix/sysv/linux/i386/Versions
index f3544ac..55d1277 100644
--- a/sysdeps/unix/sysv/linux/i386/Versions
+++ b/sysdeps/unix/sysv/linux/i386/Versions
@@ -45,6 +45,9 @@  libc {
     # f*
     fallocate64;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
   GLIBC_PRIVATE {
     __modify_ldt;
   }
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index 8f3502d..0229cd6 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2010,6 +2010,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/ia64/Versions b/sysdeps/unix/sysv/linux/ia64/Versions
index b38d6ef..116f4e8 100644
--- a/sysdeps/unix/sysv/linux/ia64/Versions
+++ b/sysdeps/unix/sysv/linux/ia64/Versions
@@ -22,6 +22,9 @@  libc {
   GLIBC_2.2.6 {
     getunwind;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
 libpthread {
   GLIBC_2.3.3 {
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index 921ec55..f5739b4 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -1874,6 +1874,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/ifaddrs.c b/sysdeps/unix/sysv/linux/ifaddrs.c
index ca38d1a..54f1124 100644
--- a/sysdeps/unix/sysv/linux/ifaddrs.c
+++ b/sysdeps/unix/sysv/linux/ifaddrs.c
@@ -161,10 +161,13 @@  __netlink_request (struct netlink_handle *h, int type)
     {
       struct msghdr msg =
 	{
-	  (void *) &nladdr, sizeof (nladdr),
-	  &iov, 1,
-	  NULL, 0,
-	  0
+	  .msg_name = (void *) &nladdr,
+	  .msg_namelen =  sizeof (nladdr),
+	  .msg_iov = &iov,
+	  .msg_iovlen = 1,
+	  .msg_control = NULL,
+	  .msg_controllen = 0,
+	  .msg_flags = 0
 	};
 
       read_len = TEMP_FAILURE_RETRY (__recvmsg (h->fd, &msg, 0));
diff --git a/sysdeps/unix/sysv/linux/m68k/Versions b/sysdeps/unix/sysv/linux/m68k/Versions
index 7ecc96e..eceb89a 100644
--- a/sysdeps/unix/sysv/linux/m68k/Versions
+++ b/sysdeps/unix/sysv/linux/m68k/Versions
@@ -40,6 +40,9 @@  libc {
   GLIBC_2.12 {
     __m68k_read_tp;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
   GLIBC_PRIVATE {
     __vdso_atomic_cmpxchg_32; __vdso_atomic_barrier;
   }
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index 019095b..3a498cb 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -89,6 +89,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.4 GLIBC_2.4 A
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0x98
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index a999a48..948b050 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -1966,6 +1966,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/microblaze/Versions b/sysdeps/unix/sysv/linux/microblaze/Versions
index aa48a3c..c2f4505 100644
--- a/sysdeps/unix/sysv/linux/microblaze/Versions
+++ b/sysdeps/unix/sysv/linux/microblaze/Versions
@@ -2,4 +2,7 @@  libc {
   GLIBC_2.18 {
     fallocate64;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
diff --git a/sysdeps/unix/sysv/linux/microblaze/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/libc.abilist
index 0a08bba..d7ba0be 100644
--- a/sysdeps/unix/sysv/linux/microblaze/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/libc.abilist
@@ -2087,3 +2087,6 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/Versions b/sysdeps/unix/sysv/linux/mips/mips32/Versions
index 9621fb5..c4f38d8 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/Versions
+++ b/sysdeps/unix/sysv/linux/mips/mips32/Versions
@@ -3,4 +3,7 @@  libc {
     getrlimit64;
     setrlimit64;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index 2ab9e94..87bb49b 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -1941,6 +1941,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index b9b4b74..1a415ab 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -1939,6 +1939,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/Versions b/sysdeps/unix/sysv/linux/mips/mips64/n32/Versions
index 9621fb5..c4f38d8 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/Versions
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/Versions
@@ -3,4 +3,7 @@  libc {
     getrlimit64;
     setrlimit64;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index 14e1236..949761b 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -1937,6 +1937,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/Versions b/sysdeps/unix/sysv/linux/mips/mips64/n64/Versions
new file mode 100644
index 0000000..517d79a
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/Versions
@@ -0,0 +1,5 @@ 
+libc {
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
+}
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index 53e0c9a..6722f90 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -1932,6 +1932,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/nios2/Versions b/sysdeps/unix/sysv/linux/nios2/Versions
index e42c85f..93458f5 100644
--- a/sysdeps/unix/sysv/linux/nios2/Versions
+++ b/sysdeps/unix/sysv/linux/nios2/Versions
@@ -3,4 +3,7 @@  libc {
     _flush_cache;
     cacheflush;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index dff1ee9..75ef1ab 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2128,3 +2128,6 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
diff --git a/sysdeps/unix/sysv/linux/oldrecvmsg.c b/sysdeps/unix/sysv/linux/oldrecvmsg.c
new file mode 100644
index 0000000..01c596e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/oldrecvmsg.c
@@ -0,0 +1,40 @@ 
+/* Compatibility version of recvmsg.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sys/socket.h>
+#include <sysdep-cancel.h>
+#include <socketcall.h>
+#include <shlib-compat.h>
+
+/* Both libc.so and libpthread.so provide recvmsg, so we need to
+   provide the compat symbol for both libraries.  */
+#if SHLIB_COMPAT (MODULE_NAME, GLIBC_2_0, GLIBC_2_24)
+
+/* We can use the same struct layout for the old symbol version since
+   its size is the same.  */
+ssize_t
+__old_recvmsg (int fd, struct msghdr *msg, int flags)
+{
+# ifdef __ASSUME_RECVMSG_SYSCALL
+  return SYSCALL_CANCEL (recvmsg, fd, msg, flags);
+# else
+  return SOCKETCALL_CANCEL (recvmsg, fd, msg, flags);
+# endif
+}
+compat_symbol (MODULE_NAME, __old_recvmsg, recvmsg, GLIBC_2_0);
+#endif
diff --git a/sysdeps/unix/sysv/linux/oldsendmsg.c b/sysdeps/unix/sysv/linux/oldsendmsg.c
new file mode 100644
index 0000000..a96790a
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/oldsendmsg.c
@@ -0,0 +1,40 @@ 
+/* Compatibility implementation of sendmsg.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sys/socket.h>
+#include <sysdep-cancel.h>
+#include <socketcall.h>
+#include <shlib-compat.h>
+
+/* Both libc.so and libpthread.so provide sendmsg, so we need to
+   provide the compat symbol for both libraries.  */
+#if SHLIB_COMPAT (MODULE_NAME, GLIBC_2_0, GLIBC_2_24)
+
+/* We can use the same struct layout for the old symbol version since
+   its size is the same.  */
+ssize_t
+__old_sendmsg (int fd, const struct msghdr *msg, int flags)
+{
+# ifdef __ASSUME_SENDMSG_SYSCALL
+  return SYSCALL_CANCEL (sendmsg, fd, msg, flags);
+# else
+  return SOCKETCALL_CANCEL (sendmsg, fd, msg, flags);
+# endif
+}
+compat_symbol (MODULE_NAME, __old_sendmsg, sendmsg, GLIBC_2_0);
+#endif
diff --git a/sysdeps/unix/sysv/linux/powerpc/Versions b/sysdeps/unix/sysv/linux/powerpc/Versions
index 8ebeea1..ab0db57 100644
--- a/sysdeps/unix/sysv/linux/powerpc/Versions
+++ b/sysdeps/unix/sysv/linux/powerpc/Versions
@@ -5,6 +5,9 @@  ld {
   }
 }
 libc {
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
   GLIBC_PRIVATE {
     __vdso_get_tbfreq;
     __vdso_clock_gettime;
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index 6861846..5a0890e 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -1970,6 +1970,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index fd611aa..adbe736 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -1975,6 +1975,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/Versions b/sysdeps/unix/sysv/linux/powerpc/powerpc64/Versions
index a8e88b8..53e5527 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/Versions
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/Versions
@@ -22,6 +22,9 @@  libc {
   GLIBC_2.17 {
     __ppc_get_timebase_freq;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
 
 librt {
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
index a97bd43..7839b5a 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
@@ -2175,3 +2175,6 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
index 00772cb..20d5a19 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
@@ -89,6 +89,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 _Exit F
 GLIBC_2.3 _IO_2_1_stderr_ D 0xe0
diff --git a/sysdeps/unix/sysv/linux/recvmsg.c b/sysdeps/unix/sysv/linux/recvmsg.c
index 4caf22e..25a3193 100644
--- a/sysdeps/unix/sysv/linux/recvmsg.c
+++ b/sysdeps/unix/sysv/linux/recvmsg.c
@@ -15,23 +15,43 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <errno.h>
-#include <signal.h>
 #include <sys/socket.h>
-
 #include <sysdep-cancel.h>
 #include <socketcall.h>
-#include <kernel-features.h>
-#include <sys/syscall.h>
+#include <shlib-compat.h>
 
 ssize_t
 __libc_recvmsg (int fd, struct msghdr *msg, int flags)
 {
+  ssize_t ret;
+
+  /* POSIX specifies that msghdr::msg_iovlen and msghdr::msg_controllen
+     are int and socklen_t respectively.  However, Linux defines both as
+     size_t.  So on 64-bit the header is copied to a temporary one and
+     the pad fields are zeroed.  */
+#if __WORDSIZE == 64
+  struct msghdr hdr, *orig = msg;
+  if (msg != NULL)
+    {
+      hdr = *msg;
+      hdr.__glibc_reserved1 = 0;
+      hdr.__glibc_reserved2 = 0;
+      msg = &hdr;
+    }
+#endif
+
 #ifdef __ASSUME_RECVMSG_SYSCALL
-  return SYSCALL_CANCEL (recvmsg, fd, msg, flags);
+  ret = SYSCALL_CANCEL (recvmsg, fd, msg, flags);
 #else
-  return SOCKETCALL_CANCEL (recvmsg, fd, msg, flags);
+  ret = SOCKETCALL_CANCEL (recvmsg, fd, msg, flags);
 #endif
+
+#if __WORDSIZE == 64
+  if (orig != NULL)
+    *orig = hdr;
+#endif
+
+  return ret;
 }
-weak_alias (__libc_recvmsg, recvmsg)
 weak_alias (__libc_recvmsg, __recvmsg)
+versioned_symbol (libc, __libc_recvmsg, recvmsg, GLIBC_2_24);
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/Versions b/sysdeps/unix/sysv/linux/s390/s390-32/Versions
index 1c120e8..afcc3fe 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/Versions
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/Versions
@@ -49,6 +49,9 @@  libc {
   GLIBC_2.11 {
     fallocate64;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
 
 libutil {
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index 05cb85e..03983df 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -1970,6 +1970,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/Versions b/sysdeps/unix/sysv/linux/s390/s390-64/Versions
index 3f4d960..fde5aee 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/Versions
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/Versions
@@ -4,6 +4,9 @@  libc {
     __register_frame; __register_frame_table; __deregister_frame;
     __frame_state_for; __register_frame_info_table;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
 
 librt {
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index 1af185f..5892fcd 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -1871,6 +1871,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/sendmsg.c b/sysdeps/unix/sysv/linux/sendmsg.c
index 5b2741a..a5ef238 100644
--- a/sysdeps/unix/sysv/linux/sendmsg.c
+++ b/sysdeps/unix/sysv/linux/sendmsg.c
@@ -15,23 +15,34 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <errno.h>
-#include <signal.h>
 #include <sys/socket.h>
-
 #include <sysdep-cancel.h>
 #include <socketcall.h>
-#include <kernel-features.h>
-#include <sys/syscall.h>
+#include <shlib-compat.h>
 
 ssize_t
 __libc_sendmsg (int fd, const struct msghdr *msg, int flags)
 {
+  /* POSIX specifies that msghdr::msg_iovlen and msghdr::msg_controllen
+     are int and socklen_t respectively.  However, Linux defines both as
+     size_t.  So on 64-bit the header is copied to a temporary one and
+     the pad fields are zeroed.  */
+#if __WORDSIZE == 64
+  struct msghdr hdr;
+  if (msg != NULL)
+    {
+      hdr = *msg;
+      hdr.__glibc_reserved1 = 0;
+      hdr.__glibc_reserved2 = 0;
+      msg = &hdr;
+    }
+#endif
+
 #ifdef __ASSUME_SENDMSG_SYSCALL
   return SYSCALL_CANCEL (sendmsg, fd, msg, flags);
 #else
   return SOCKETCALL_CANCEL (sendmsg, fd, msg, flags);
 #endif
 }
-weak_alias (__libc_sendmsg, sendmsg)
 weak_alias (__libc_sendmsg, __sendmsg)
+versioned_symbol (libc, __libc_sendmsg, sendmsg, GLIBC_2_24);
diff --git a/sysdeps/unix/sysv/linux/sh/Versions b/sysdeps/unix/sysv/linux/sh/Versions
index e0938c4..ae5a00e 100644
--- a/sysdeps/unix/sysv/linux/sh/Versions
+++ b/sysdeps/unix/sysv/linux/sh/Versions
@@ -30,4 +30,7 @@  libc {
   GLIBC_2.16 {
     fanotify_mark;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
diff --git a/sysdeps/unix/sysv/linux/sh/libc.abilist b/sysdeps/unix/sysv/linux/sh/libc.abilist
index e128692..a2d85e6 100644
--- a/sysdeps/unix/sysv/linux/sh/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/libc.abilist
@@ -1856,6 +1856,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/sparc/Versions b/sysdeps/unix/sysv/linux/sparc/Versions
index 4dc1cd7..adbdec5 100644
--- a/sysdeps/unix/sysv/linux/sparc/Versions
+++ b/sysdeps/unix/sysv/linux/sparc/Versions
@@ -29,6 +29,9 @@  libc {
 
     __getshmlba;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
 
 libpthread {
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index eb14113..c51e790 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -1962,6 +1962,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/Versions b/sysdeps/unix/sysv/linux/sparc/sparc64/Versions
index fbea1bb..f950070 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/Versions
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/Versions
@@ -8,6 +8,9 @@  libc {
     # w*
     wordexp;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
 
 librt {
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index 91b97ef..015a2f1 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -1900,6 +1900,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/tile/Versions b/sysdeps/unix/sysv/linux/tile/Versions
index 13da68f..a68e181 100644
--- a/sysdeps/unix/sysv/linux/tile/Versions
+++ b/sysdeps/unix/sysv/linux/tile/Versions
@@ -11,6 +11,9 @@  libc {
     fallocate64;
     set_dataplane;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
   GLIBC_PRIVATE {
     __syscall_error;
     __vdso_clock_gettime;
diff --git a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
index ffcc4a0..cd48be1 100644
--- a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
@@ -2094,3 +2094,6 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
diff --git a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/Versions b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/Versions
new file mode 100644
index 0000000..517d79a
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/Versions
@@ -0,0 +1,5 @@ 
+libc {
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
+}
diff --git a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
index a66e8ec..1e160bd 100644
--- a/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
@@ -2094,3 +2094,6 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
diff --git a/sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist b/sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
index ffcc4a0..cd48be1 100644
--- a/sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
+++ b/sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
@@ -2094,3 +2094,6 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/Versions b/sysdeps/unix/sysv/linux/x86_64/64/Versions
new file mode 100644
index 0000000..517d79a
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86_64/64/Versions
@@ -0,0 +1,5 @@ 
+libc {
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
+}
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index c6e3cd4..175339e 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -1851,6 +1851,9 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F
 GLIBC_2.3 GLIBC_2.3 A
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/x86_64/Versions b/sysdeps/unix/sysv/linux/x86_64/Versions
index 2a7ed28..bbef7e0 100644
--- a/sysdeps/unix/sysv/linux/x86_64/Versions
+++ b/sysdeps/unix/sysv/linux/x86_64/Versions
@@ -6,6 +6,9 @@  libc {
 
     modify_ldt;
   }
+  GLIBC_2.24 {
+    recvmsg; sendmsg;
+  }
 }
 
 librt {
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index 04dc8e4..4f52e2e 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2094,3 +2094,6 @@  GLIBC_2.23 fts64_close F
 GLIBC_2.23 fts64_open F
 GLIBC_2.23 fts64_read F
 GLIBC_2.23 fts64_set F
+GLIBC_2.24 GLIBC_2.24 A
+GLIBC_2.24 recvmsg F
+GLIBC_2.24 sendmsg F