iproute uses too small of a receive buffer

Message ID 4AE7EC65.8000600@gmail.com
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Oct. 28, 2009, 7:01 a.m. UTC
Ben Greear wrote:
> 
> Probably the right way is to give a cmd-line arg to set the buffer size
> and also continue if the error is ENOBUFs (but print some error out
> so users know they have issues).  I can make the attempt if that
> sounds good to you.

The real fix is to realloc the buffer at receive time; no user setting is needed.

In my testing I saw it reach 1 Mbyte:
write(2, "REALLOC buflen 8192\n"..., 20) = 20
write(2, "REALLOC buflen 16384\n"..., 21) = 21
write(2, "REALLOC buflen 32768\n"..., 21) = 21
write(2, "REALLOC buflen 65536\n"..., 21) = 21
write(2, "REALLOC buflen 131072\n"..., 22) = 22
write(2, "REALLOC buflen 262144\n"..., 22) = 22
write(2, "REALLOC buflen 524288\n"..., 22) = 22


[iproute2] realloc buffer in rtnl_listen

# ip monitor route
netlink receive error No buffer space available (105)
Dump terminated 

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Oct. 28, 2009, 7:09 a.m. UTC | #1
Eric Dumazet wrote:
> Ben Greear wrote:
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFs (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
> 
> Real fix is to realloc buffer at receive time, no need for user setting.
> 

Then another problem is that some information can be dropped at kernel level
when the socket rcvbuf is full (ip monitor is too slow to read its socket).

That's hard to fix because you need to tweak /proc/sys/net/core/rmem_max.
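
For illustration, a minimal sketch of asking for a larger socket receive buffer from user space. The helper name request_rcvbuf is hypothetical and is not part of libnetlink. On Linux an unprivileged SO_RCVBUF request is silently capped by net.core.rmem_max, which is why the sketch reads the value back instead of trusting the request.

```c
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from libnetlink): ask for a receive buffer of
 * `want` bytes on `fd` and return the size the kernel reports afterwards,
 * or -1 on error.
 */
int request_rcvbuf(int fd, int want)
{
	int got = 0;
	socklen_t len = sizeof(got);

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) < 0)
		return -1;

	/* A smaller value than requested means the system limit applied. */
	if (got < want)
		fprintf(stderr, "rcvbuf limited to %d bytes (asked for %d)\n",
			got, want);
	return got;
}
```

A process with CAP_NET_ADMIN can use SO_RCVBUFFORCE to exceed the limit; everyone else still depends on the rmem_max setting.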


Eric Dumazet Oct. 28, 2009, 7:37 a.m. UTC | #2
Eric Dumazet wrote:
> Ben Greear wrote:
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFs (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
> 
> Real fix is to realloc buffer at receive time, no need for user setting.
> 
> In my testing I saw it reach 1 Mbyte:
> write(2, "REALLOC buflen 8192\n"..., 20) = 20
> write(2, "REALLOC buflen 16384\n"..., 21) = 21
> write(2, "REALLOC buflen 32768\n"..., 21) = 21
> write(2, "REALLOC buflen 65536\n"..., 21) = 21
> write(2, "REALLOC buflen 131072\n"..., 22) = 22
> write(2, "REALLOC buflen 262144\n"..., 22) = 22
> write(2, "REALLOC buflen 524288\n"..., 22) = 22
> 
> 
> [iproute2] realloc buffer in rtnl_listen
> 
> # ip monitor route
> netlink receive error No buffer space available (105)
> Dump terminated 
> 
> Reported-by: Ben Greear<greearb@candelatech.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Oops, this was wrong. Ben was right, sorry...

An ENOBUFS error is a flag telling the user that some information was dropped,
not that the buffer supplied at recv() time was too small.

I was surprised that the buffer could reach 1 Mbyte while RCVBUF was 32768 or so.
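
To make the distinction concrete, here is a rough sketch of how a receive loop could sort recvmsg() results. The enum and the classify_recv name are hypothetical and are not taken from libnetlink. The assumption is that a too-small user buffer shows up as MSG_TRUNC in msg_flags on a successful read, while ENOBUFS only means the kernel dropped messages, so the listener can warn and keep going, as Ben suggested.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical outcome codes for one recvmsg() call. */
enum recv_outcome {
	RECV_OK,        /* a complete message was read */
	RECV_RETRY,     /* EINTR or EAGAIN: just call recvmsg() again */
	RECV_LOST,      /* ENOBUFS: kernel dropped messages; warn, continue */
	RECV_TRUNCATED, /* MSG_TRUNC: user buffer too small for this message */
	RECV_EOF,       /* recvmsg() returned 0 */
	RECV_FATAL      /* any other error */
};

/*
 * status:    return value of recvmsg()
 * err:       errno captured right after the call
 * msg_flags: msg.msg_flags as filled in by recvmsg()
 */
enum recv_outcome classify_recv(ssize_t status, int err, int msg_flags)
{
	if (status < 0) {
		if (err == EINTR || err == EAGAIN)
			return RECV_RETRY;
		if (err == ENOBUFS)
			return RECV_LOST;
		return RECV_FATAL;
	}
	if (status == 0)
		return RECV_EOF;
	if (msg_flags & MSG_TRUNC)
		return RECV_TRUNCATED;
	return RECV_OK;
}
```

With this split, growing the buffer would only make sense for RECV_TRUNCATED; doubling it on RECV_LOST does not bring the dropped messages back.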



Patch

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..134ce7f 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -392,8 +392,14 @@  int rtnl_listen(struct rtnl_handle *rtnl,
 		.msg_iov = &iov,
 		.msg_iovlen = 1,
 	};
-	char   buf[8192];
+	char   *buf;
+	size_t buflen = 8192;
 
+	buf = malloc(buflen);
+	if (buf == NULL) {
+		fprintf(stderr, "netlink could not alloc %zu bytes\n", buflen);
+		return -1;
+	}
 	memset(&nladdr, 0, sizeof(nladdr));
 	nladdr.nl_family = AF_NETLINK;
 	nladdr.nl_pid = 0;
@@ -401,12 +407,20 @@  int rtnl_listen(struct rtnl_handle *rtnl,
 
 	iov.iov_base = buf;
 	while (1) {
-		iov.iov_len = sizeof(buf);
+		iov.iov_len = buflen;
 		status = recvmsg(rtnl->fd, &msg, 0);
 
 		if (status < 0) {
 			if (errno == EINTR || errno == EAGAIN)
 				continue;
+			if (errno == ENOBUFS) {
+				buf = realloc(buf, buflen * 2);
+				if (buf) {
+					buflen *= 2;
+					iov.iov_base = buf;
+					continue;
+				}
+			}
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
 			return -1;