diff mbox

[RFC] iproute2: gracefully exit from rtnl_listen()

Message ID 20090901155955.7448c1d6@nehalam
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

stephen hemminger Sept. 1, 2009, 10:59 p.m. UTC
On Tue,  1 Sep 2009 17:41:26 -0400
David Ward <david.ward@ll.mit.edu> wrote:

> rtnl_listen() (in iproute2's libnetlink) allows a userspace application to
> monitor an RTNETLINK multicast group, so that the application can react to
> changes in the kernel's routing table, neighbor cache, etc. as they occur.
> (This is in contrast with using RTNETLINK to solicit the kernel for a full
> copy of the routing table or neighbor cache at a specific point in time.)
> 
> However rtnl_listen() creates an infinite loop, which does not break in the
> absence of errors, so there does not seem to be any way to gracefully exit
> from it. Existing applications that call rtnl_listen(), such as rtmon, break
> from this loop by terminating the entire application. There should be some
> mechanism for stopping rtnl_listen() and continuing program execution.
> 
> If you assume that rtnl_listen() will only need to be stopped because the
> application has received a signal (such as SIGINT), then the application's
> signal handler can call rtnl_close() to close the RTNETLINK socket.
> Afterwards, with this patch, rtnl_listen() can detect that the socket was
> closed during the interrupt and exit.
> 
> I am offering this as a starting point for consideration, but I feel that
> this is at best a poor solution. What is a better way to fix rtnl_listen()
> (and similar libnetlink functions for consistency)?
> ---
>  lib/libnetlink.c |   15 ++++++++++++---
>  1 files changed, 12 insertions(+), 3 deletions(-)

Good idea, but why not just change the while(1) loop?



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Ward Sept. 2, 2009, 2:27 a.m. UTC | #1
On 09/01/2009 06:59 PM, Stephen Hemminger wrote:
>
> On Tue,  1 Sep 2009 17:41:26 -0400
> David Ward <david.ward@ll.mit.edu> wrote:
>>
>> There should be some mechanism for stopping rtnl_listen() and
>> continuing program execution.
>
> Good idea, but why not just change the while(1) loop?

Stephen, thanks for your response. I agree that checking a condition on 
the while loop, as you showed, can be used to exit the loop cleanly.

Here are my concerns though:

    * rtnl_listen() is the function that I believe needs to be fixed.
      It is what allows applications to passively observe messages sent
      to the rtnetlink multicast groups -- messages which are not
      prompted by any request from the application, where there is no
      defined "end" to the sequence of messages being received. This
      function currently has no condition in the loop where it will stop
      trying to receive messages and exit cleanly.

      rtnl_dump_filter(), which your patch modifies, appears to be for
      processing the response to an rtnetlink dump request from the
      application. It is at least able to exit cleanly when it sees the
      NLMSG_DONE message at the end of the kernel's response.

      However, I think that any changes that are made to rtnl_listen()
      which allow it to exit gracefully could potentially be applied to
      the loops in rtnl_dump_filter() and rtnl_talk() as well.

    * With your patch, rtnl_dump_filter() returns -1 after the rtnetlink
      socket is closed. This indicates than an error has occurred.
      However, because trying to exit the loop by closing the socket is
      intentional, the function should be able to return 0.

      But could there be other cases where the rtnetlink socket is closed
      unintentionally, and -1 actually needs to be returned?

    * Is it really necessary to close the rtnetlink socket in order to
      break out of rtnl_listen()? What if the user wants to continue
      using the socket -- should the socket have to be closed to exit
      from rtnl_listen(), and then reopened? Is there a better way to
      notify the loop to exit instead?

    * Furthermore, should you have to rely on a signal like SIGINT to be
      received by the application in order to check for the condition
      under which to break out of the loop? Shouldn't it be possible to
      break out of the loop without a signal? But recvmsg() is blocking.


Again, I am raising these questions because I'm not sure what a 
"correct" solution to breaking out of rtnl_listen() would look like.

Thanks,

David
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..cecfa34 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -188,7 +188,7 @@  int rtnl_dump_filter(struct rtnl_handle *rth,
 	char buf[16384];
 
 	iov.iov_base = buf;
-	while (1) {
+	while (rtnl->fd >= 0) {
 		int status;
 		struct nlmsghdr *h;
 
@@ -200,12 +200,12 @@  int rtnl_dump_filter(struct rtnl_handle *rth,
 				continue;
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
-			return -1;
+			break;
 		}
 
 		if (status == 0) {
 			fprintf(stderr, "EOF on netlink\n");
-			return -1;
+			break;
 		}
 
 		h = (struct nlmsghdr*)buf;
@@ -251,6 +251,8 @@  skip_it:
 			exit(1);
 		}
 	}
+
+	return -1;
 }
 
 int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n, pid_t peer,