Patchwork iproute uses too small of a receive buffer

Submitter Eric Dumazet
Date Oct. 28, 2009, 7:01 a.m.
Message ID <4AE7EC65.8000600@gmail.com>
Permalink /patch/37054/
State Not Applicable
Delegated to: David Miller

Comments

Eric Dumazet - Oct. 28, 2009, 7:01 a.m.
Ben Greear wrote:
> 
> Probably the right way is to give a cmd-line arg to set the buffer size
> and also continue if the error is ENOBUFs (but print some error out
> so users know they have issues).  I can make the attempt if that
> sounds good to you.

Real fix is to realloc buffer at receive time, no need for user setting.

In my testing I saw it reach 1 Mbyte:
write(2, "REALLOC buflen 8192\n"..., 20) = 20
write(2, "REALLOC buflen 16384\n"..., 21) = 21
write(2, "REALLOC buflen 32768\n"..., 21) = 21
write(2, "REALLOC buflen 65536\n"..., 21) = 21
write(2, "REALLOC buflen 131072\n"..., 22) = 22
write(2, "REALLOC buflen 262144\n"..., 22) = 22
write(2, "REALLOC buflen 524288\n"..., 22) = 22


[iproute2] realloc buffer in rtnl_listen

# ip monitor route
netlink receive error No buffer space available (105)
Dump terminated 

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Oct. 28, 2009, 7:09 a.m.
Eric Dumazet wrote:
> Ben Greear wrote:
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFs (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
> 
> Real fix is to realloc buffer at receive time, no need for user setting.
> 

Another problem is that the kernel can drop information when the socket
rcvbuf is full (ip monitor is too slow to read its socket).

That's hard to fix, because you need to tweak /proc/sys/net/core/rmem_max.
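
As a rough illustration of that limit: a program can ask for a larger socket receive buffer with SO_RCVBUF, but on Linux an unprivileged request is capped by /proc/sys/net/core/rmem_max. The sketch below requests a size and reads back what the kernel actually granted. request_rcvbuf is a hypothetical helper name, not an iproute2 function.

```c
#include <sys/socket.h>

/*
 * Ask the kernel for a receive buffer of `bytes` on socket `fd` and return
 * the size it actually granted, or -1 on error.
 * request_rcvbuf is a hypothetical helper for this sketch.
 */
int request_rcvbuf(int fd, int bytes)
{
	int granted = 0;
	socklen_t len = sizeof(granted);

	/* Unprivileged requests are silently capped at rmem_max. */
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
		return -1;

	/* Read the effective value back instead of trusting the request. */
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
		return -1;

	return granted;
}
```

On Linux the value read back is normally double the request, because the kernel reserves room for bookkeeping overhead. A process with CAP_NET_ADMIN can use SO_RCVBUFFORCE to exceed rmem_max without changing the sysctl.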


Eric Dumazet - Oct. 28, 2009, 7:37 a.m.
Eric Dumazet wrote:
> Ben Greear wrote:
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFs (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
> 
> Real fix is to realloc buffer at receive time, no need for user setting.

Oops, this was wrong, Ben was right, sorry...

An ENOBUFS error is a flag that tells the user some information was dropped,
not that the user-supplied buffer at recv() time is too small.

I was surprised that the buffer could reach 1 Mbyte while RCVBUF was 32768 or so.
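
To make the distinction concrete, here is a minimal sketch (not the iproute2 code) that treats ENOBUFS as "messages were lost, keep listening" and detects a too-small user buffer through MSG_TRUNC in msg_flags instead. The enum and the function names classify_recv_errno and recv_one are assumptions made for this example.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical result codes for this sketch; not part of libnetlink. */
enum recv_result {
	RECV_OK,        /* a complete message is in buf */
	RECV_RETRY,     /* EINTR/EAGAIN: call again */
	RECV_LOST,      /* ENOBUFS: kernel dropped messages, keep listening */
	RECV_TRUNCATED, /* MSG_TRUNC: buf was smaller than the message */
	RECV_FATAL      /* any other error */
};

/* Map an errno value from recvmsg() to an action. */
enum recv_result classify_recv_errno(int err)
{
	if (err == EINTR || err == EAGAIN)
		return RECV_RETRY;
	if (err == ENOBUFS)
		return RECV_LOST;
	return RECV_FATAL;
}

/* Receive one datagram; *len_out gets recvmsg()'s return value. */
enum recv_result recv_one(int fd, void *buf, size_t buflen, ssize_t *len_out)
{
	struct iovec iov = { .iov_base = buf, .iov_len = buflen };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	ssize_t status = recvmsg(fd, &msg, 0);

	*len_out = status;
	if (status < 0) {
		enum recv_result r = classify_recv_errno(errno);

		if (r == RECV_LOST)
			fprintf(stderr, "netlink: messages lost (ENOBUFS), continuing\n");
		return r;
	}
	/* A short user buffer is reported here, not through errno. */
	if (msg.msg_flags & MSG_TRUNC)
		return RECV_TRUNCATED;
	return RECV_OK;
}
```

A listener loop would call recv_one() repeatedly, continue on RECV_RETRY and RECV_LOST, enlarge its buffer only on RECV_TRUNCATED, and stop on RECV_FATAL.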



Patch

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..134ce7f 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -392,8 +392,14 @@  int rtnl_listen(struct rtnl_handle *rtnl,
 		.msg_iov = &iov,
 		.msg_iovlen = 1,
 	};
-	char   buf[8192];
+	char   *buf;
+	size_t buflen = 8192;
 
+	buf = malloc(buflen);
+	if (buf == NULL) {
+		fprintf(stderr, "netlink could not alloc %zu bytes\n", buflen);
+		return -1;
+	}
 	memset(&nladdr, 0, sizeof(nladdr));
 	nladdr.nl_family = AF_NETLINK;
 	nladdr.nl_pid = 0;
@@ -401,12 +407,20 @@  int rtnl_listen(struct rtnl_handle *rtnl,
 
 	iov.iov_base = buf;
 	while (1) {
-		iov.iov_len = sizeof(buf);
+		iov.iov_len = buflen;
 		status = recvmsg(rtnl->fd, &msg, 0);
 
 		if (status < 0) {
 			if (errno == EINTR || errno == EAGAIN)
 				continue;
+			if (errno == ENOBUFS) {
+				buf = realloc(buf, buflen * 2);
+				if (buf) {
+					buflen *= 2;
+					iov.iov_base = buf;
+					continue;
+				}
+			}
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
 			return -1;
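
The realloc() call in the patch assigns its result straight back to buf, so a failed realloc() loses the only pointer to the old block before the error path returns. Separately from the question of whether growing the buffer is the right reaction to ENOBUFS, a safer growth step could be sketched as follows. grow_buffer is a hypothetical helper, not part of libnetlink.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Double the buffer at *buf. Returns 0 on success. On failure returns -1
 * and leaves *buf and *buflen untouched, so the caller can keep using the
 * old block or free it.
 * grow_buffer is a hypothetical helper for this sketch.
 */
int grow_buffer(char **buf, size_t *buflen)
{
	char *bigger;

	if (*buflen == 0 || *buflen > SIZE_MAX / 2)
		return -1;	/* nothing to double, or doubling would overflow */

	bigger = realloc(*buf, *buflen * 2);
	if (bigger == NULL)
		return -1;	/* the old block is still valid */

	*buf = bigger;
	*buflen *= 2;
	return 0;
}
```

The caller would then update iov.iov_base and iov.iov_len from the new values, and free the buffer on every return path.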