From patchwork Sun Feb 1 13:33:57 2009
X-Patchwork-Submitter: Pablo Neira Ayuso
X-Patchwork-Id: 21460
X-Patchwork-Delegate: davem@davemloft.net
Message-ID: <4985A4C5.4050908@netfilter.org>
Date: Sun, 01 Feb 2009 14:33:57 +0100
From: Pablo Neira Ayuso
To: netdev@vger.kernel.org
CC: Patrick McHardy, Netfilter Development Mailinglist
Subject: [RFC] netlink broadcast return value
List-ID: X-Mailing-List: netdev@vger.kernel.org

Currently, according to my reading of the source code, the return
value of netlink_broadcast() reports an error to the caller only if
no messages at all were delivered:

1) If at least one message has been delivered correctly, it returns 0.
2) Otherwise, if no messages at all were delivered due to skb_clone()
   failure, it returns -ENOBUFS.
3) Otherwise, if there are no listeners, it returns -ESRCH.

I need the caller to know whether delivery to any of the listeners
has failed, as follows:

1) If delivery fails for any listener (for whatever reason), return
   -ENOBUFS.
2) If all messages were delivered OK, return 0.
3) If there are no listeners, return -ESRCH.

In the current ctnetlink code, and in Netfilter in general, this would
let us add reliable logging and connection tracking event delivery by
dropping the packets whose events were not successfully delivered over
Netlink. Of course, this option would be settable via /proc, since the
approach reduces performance (in terms of filtered connections per
second for a stateful firewall) in exchange for reliable logging and
event delivery (for conntrackd).

I have checked the whole kernel tree for current users of
netlink_broadcast() to see how they handle the errors it reports and
how a change in the return value would affect them.
Here is a short summary:

= current list of clients of netlink_broadcast() =

== netlink_broadcast() ==               Handling

drivers/scsi/scsi_transport_iscsi.c   : printk error
drivers/connector/connector.c         : cn_netlink_send() return value
include/net/netlink.h                 : nlmsg_multicast() return value
lib/kobject_uevent.c                  : ignores return value
net/core/rtnetlink.c                  : ignores return value
net/ipv4/netfilter/ipt_ULOG.c         : ignores return value
net/bridge/netfilter/ebt_ulog.c       : ignores return value
net/decnet/netfilter/dn_rtmsg.c       : ignores return value
security/selinux/netlink.c            : ignores return value

== cn_netlink_send (uses netlink_broadcast return value) ==

drivers/w1/w1_netlink.c               : ignores return value
drivers/video/uvesafb.c               : printk error (if err != ESRCH)

== nlmsg_multicast (calls netlink_broadcast) ==

drivers/scsi/scsi_transport_fc.c      : printk error (if err != -ESRCH)
include/net/genetlink.h               : genlmsg_multicast() return value
net/xfrm/xfrm_user.c                  : xfrm_send_migrate() return value
                                        xfrm_exp_state_notify() return value
                                        xfrm_aevent_state_notify() return value
                                        xfrm_notify_sa_flush() return value
                                        xfrm_notify_sa() return value
                                        xfrm_send_acquire() return value
                                        xfrm_exp_policy_notify() return value
                                        xfrm_notify_policy() return value
                                        xfrm_notify_policy_flush() return value
                                        xfrm_send_report() return value
                                        xfrm_send_mapping() return value
                                        ... later they all ignore the
                                        return value

== genlmsg_multicast (calls nlmsg_multicast) ==

net/netlink/genetlink.c               : ignores return value
drivers/acpi/event.c                  : printk error
fs/dquot.c                            : printk error (if err != -ESRCH)
net/wireless/nl80211.c                : ignores return value

In short, I think the change I'm proposing would also require fixing
some netlink_broadcast() clients so that they skip ENOBUFS errors.
Those errors are not meaningful to them: they assume that Netlink is
unreliable, so the return value gives them no useful information.

Please let me know how crazy this idea is ;).
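To make the difference between the two sets of return-value rules
concrete, here is a minimal userspace sketch. It is not kernel code:
struct bcast_result, retval_current() and retval_proposed() are names
made up for illustration, and only the three flags are meant to mirror
the bookkeeping that netlink_broadcast() does.

```c
#include <errno.h>

/*
 * Hypothetical userspace model of the flags collected while
 * broadcasting to all listeners. Not the kernel structure itself.
 */
struct bcast_result {
	int failure;          /* skb_clone() failed for some listener */
	int delivery_failure; /* delivery to some listener failed */
	int delivered;        /* at least one listener got the message */
};

/* Current rules: an error is reported only if nothing was delivered. */
static int retval_current(const struct bcast_result *r)
{
	if (r->delivered)
		return 0;
	if (r->failure)
		return -ENOBUFS;
	return -ESRCH;
}

/* Proposed rules: any failure is reported, even after partial delivery. */
static int retval_proposed(const struct bcast_result *r)
{
	if (r->delivery_failure || r->failure)
		return -ENOBUFS;
	if (r->delivered)
		return 0;
	return -ESRCH;
}
```

The interesting case is partial delivery: with delivered = 1 and
delivery_failure = 1, retval_current() yields 0 while
retval_proposed() yields -ENOBUFS. With no listeners at all, both
yield -ESRCH.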
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 480184a..26e1a89 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -943,6 +943,7 @@ struct netlink_broadcast_data {
 	u32 pid;
 	u32 group;
 	int failure;
+	int delivery_failure;
 	int congested;
 	int delivered;
 	gfp_t allocation;
@@ -992,6 +993,7 @@ static inline int do_one_broadcast(struct sock *sk,
 		p->skb2 = NULL;
 	} else if ((val = netlink_broadcast_deliver(sk, p->skb2)) < 0) {
 		netlink_overrun(sk);
+		p->delivery_failure = 1;
 	} else {
 		p->congested |= val;
 		p->delivered = 1;
@@ -1018,6 +1020,7 @@ int netlink_broadcast(struct sock *ssk, struct sk_buff *skb, u32 pid,
 	info.pid = pid;
 	info.group = group;
 	info.failure = 0;
+	info.delivery_failure = 0;
 	info.congested = 0;
 	info.delivered = 0;
 	info.allocation = allocation;
@@ -1038,13 +1041,14 @@ int netlink_broadcast(struct sock *ssk, struct sk_buff *skb, u32 pid,
 	if (info.skb2)
 		kfree_skb(info.skb2);
 
+	if (info.delivery_failure || info.failure)
+		return -ENOBUFS;
+
 	if (info.delivered) {
 		if (info.congested && (allocation & __GFP_WAIT))
 			yield();
 		return 0;
 	}
-	if (info.failure)
-		return -ENOBUFS;
 	return -ESRCH;
 }
 EXPORT_SYMBOL(netlink_broadcast);
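
On the caller side, the two kinds of clients discussed in the message
might treat the new return value roughly as sketched below. This is
only an illustration under assumptions: both helper names are
hypothetical and exist nowhere in the kernel, and the "reliable" flag
stands in for the /proc setting mentioned above.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper for clients that treat Netlink as unreliable:
 * decide whether a broadcast return value is worth a printk.
 * -ESRCH (no listeners) and -ENOBUFS (some listener missed the
 * message) carry no useful information for such clients.
 */
static bool bcast_error_is_reportable(int err)
{
	switch (err) {
	case 0:
	case -ESRCH:
	case -ENOBUFS:
		return false;
	default:
		return true;
	}
}

/*
 * Hypothetical policy for a client that wants reliable delivery:
 * when reliable mode is enabled, -ENOBUFS means the event was lost
 * for at least one listener, so the packet would be dropped.
 */
static bool should_drop_packet(int err, bool reliable)
{
	return reliable && err == -ENOBUFS;
}
```

With reliable mode off, should_drop_packet() never asks for a drop,
which keeps today's behaviour for everyone who does not opt in.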