Patchwork iproute uses too small of a receive buffer

login
register
mail settings
Submitter Ben Greear
Date Oct. 27, 2009, 11:16 p.m.
Message ID <4AE77F64.3090302@candelatech.com>
Download mbox | patch
Permalink /patch/37028/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Ben Greear - Oct. 27, 2009, 11:16 p.m.
I have a very busy system with a bunch of xorp router processes (mis)configured.

This thing is rapidly making route changes for whatever reason.

The 'ip monitor route' command was failing:

[root@i7-dqc-1 ]# ip monitor route
netlink receive error No buffer space available (105)
Dump terminated


It is only using a 32k rcv buffer, and it seems the OS was
overdriving it.

Please consider making the rcv buffer larger, perhaps something
like this (inline is white-space damaged...attachment should apply
if deemed useful.):

Signed-off-by:  Ben Greear <greearb@candelatech.com>


Thanks,
Ben
stephen hemminger - Oct. 27, 2009, 11:24 p.m.
On Tue, 27 Oct 2009 16:16:52 -0700
Ben Greear <greearb@candelatech.com> wrote:

> I have a very busy system with a bunch of xorp router processes (mis)configured.
> 
> This thing is rapidly making route changes for whatever reason.
> 
> The 'ip monitor route' command was failing:
> 
> [root@i7-dqc-1 ]# ip monitor route
> netlink receive error No buffer space available (105)
> Dump terminated
> 
> 
> It is only using a 32k rcv buffer, and it seems the OS was
> overdriving it.
> 
> Please consider making the rcv buffer larger, perhaps something
> like this (inline is white-space damaged...attachment should apply
> if deemed useful.):
> 
> Signed-off-by:  Ben Greear <greearb@candelatech.com>
> 
> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
> index b68e2fd..95a7d1d 100644
> --- a/lib/libnetlink.c
> +++ b/lib/libnetlink.c
> @@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
>   {
>          socklen_t addr_len;
>          int sndbuf = 32768;
> -       int rcvbuf = 32768;
> +       int rcvbuf = 3276800;
> 
>          memset(rth, 0, sizeof(*rth));
> 
> 
> Thanks,
> Ben
> 

Just having larger buffer isn't guarantee of success. Allocating
a huge buffer is not going to work on embedded.

Why not have it continue after one error.
Ben Greear - Oct. 27, 2009, 11:30 p.m.
On 10/27/2009 04:24 PM, Stephen Hemminger wrote:
> On Tue, 27 Oct 2009 16:16:52 -0700
> Ben Greear<greearb@candelatech.com>  wrote:
>
>> I have a very busy system with a bunch of xorp router processes (mis)configured.
>>
>> This thing is rapidly making route changes for whatever reason.
>>
>> The 'ip monitor route' command was failing:
>>
>> [root@i7-dqc-1 ]# ip monitor route
>> netlink receive error No buffer space available (105)
>> Dump terminated
>>
>>
>> It is only using a 32k rcv buffer, and it seems the OS was
>> overdriving it.
>>
>> Please consider making the rcv buffer larger, perhaps something
>> like this (inline is white-space damaged...attachment should apply
>> if deemed useful.):
>>
>> Signed-off-by:  Ben Greear<greearb@candelatech.com>
>>
>> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
>> index b68e2fd..95a7d1d 100644
>> --- a/lib/libnetlink.c
>> +++ b/lib/libnetlink.c
>> @@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
>>    {
>>           socklen_t addr_len;
>>           int sndbuf = 32768;
>> -       int rcvbuf = 32768;
>> +       int rcvbuf = 3276800;
>>
>>           memset(rth, 0, sizeof(*rth));
>>
>>
>> Thanks,
>> Ben
>>
>
> Just having larger buffer isn't guarantee of success. Allocating
> a huge buffer is not going to work on embedded.
>
> Why not have it continue after one error.

Probably the right way is to give a cmd-line arg to set the buffer size
and also continue if the error is ENOBUFs (but print some error out
so users know they have issues).  I can make the attempt if that
sounds good to you.

Thanks,
Ben
Eric Dumazet - Oct. 28, 2009, 7:52 a.m.
Stephen Hemminger a écrit :
> 
> Just having larger buffer isn't guarantee of success. Allocating
> a huge buffer is not going to work on embedded.
> 

Please note we do not allocate a big buffer, only allow more small skbs
to be queued on socket receive queue.

If memory is not available, skb allocation will eventually fail
and be reported as well, embedded or not.

I vote for allowing 1024*1024 bytes instead of 32768,
and eventually user should be warned that it is capped by 
/proc/sys/net/core/rmem_max


> Why not have it continue after one error.

Yes, but caller of 'ip monitor' just restart it anyway
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Oct. 28, 2009, 7:55 a.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Oct 2009 08:52:57 +0100

> Stephen Hemminger a écrit :
>> 
>> Just having larger buffer isn't guarantee of success. Allocating
>> a huge buffer is not going to work on embedded.
>> 
> 
> Please note we do not allocate a big buffer, only allow more small skbs
> to be queued on socket receive queue.
> 
> If memory is not available, skb allocation will eventually fail
> and be reported as well, embedded or not.
> 
> I vote for allowing 1024*1024 bytes instead of 32768,
> and eventually user should be warned that it is capped by 
> /proc/sys/net/core/rmem_max

This discussion constantly reminds me of:

/*
 *	skb should fit one page. This choice is good for headerless malloc.
 *	But we should limit to 8K so that userspace does not have to
 *	use enormous buffer sizes on recvmsg() calls just to avoid
 *	MSG_TRUNC when PAGE_SIZE is very large.
 */
#if PAGE_SIZE < 8192UL
#define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(PAGE_SIZE)
#else
#define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(8192UL)
#endif

#define NLMSG_DEFAULT_SIZE (NLMSG_GOODSIZE - NLMSG_HDRLEN)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..95a7d1d 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -38,7 +38,7 @@  int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
  {
         socklen_t addr_len;
         int sndbuf = 32768;
-       int rcvbuf = 32768;
+       int rcvbuf = 3276800;

         memset(rth, 0, sizeof(*rth));