From patchwork Fri Jul 22 17:42:33 2011
X-Patchwork-Submitter: Andy Whitcroft
X-Patchwork-Id: 106356
From: Andy Whitcroft
To: kernel-team@lists.ubuntu.com
Subject: [lucid/fsl-imx51 CVE 04/12] ipv4: udp: Optimise multicast reception
Date: Fri, 22 Jul 2011 18:42:33 +0100
Message-Id: <1311356561-11988-5-git-send-email-apw@canonical.com>
X-Mailer: git-send-email 1.7.4.1
In-Reply-To: <1311356561-11988-1-git-send-email-apw@canonical.com>
References: <1311356561-11988-1-git-send-email-apw@canonical.com>

From: Eric Dumazet

BugLink: http://bugs.launchpad.net/bugs/807462

Backported from upstream commit 1240d1373cd7f874dd0f3057c3e9643e71ef75c6
(patch necessary for 6ed41136ab20c99d47792b3f19171ab9e523a97f).

The UDP multicast rx path is a bit complex and can hold a spinlock for a
long time. Using a small (32 or 64 entries) stack of socket pointers lets
us perform the expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock in most cases.

It's also a base for a future RCU conversion of multicast reception.

Signed-off-by: Eric Dumazet
Signed-off-by: Lucian Adrian Grijincu
Signed-off-by: David S. Miller
Signed-off-by: Paolo Pisati
Signed-off-by: Tim Gardner
Signed-off-by: Andy Whitcroft
---
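For reference, the core pattern of the change is: collect the candidate
sockets under the hash-chain spinlock, take a reference on each
(sock_hold()), drop the lock, then do the expensive per-socket work and
release the references. A minimal user-space sketch of that pattern
follows; it is illustrative only, every name in it (consumer,
batch_deliver, deliver) is invented for the example, a pthread mutex
stands in for the kernel spinlock, a plain counter stands in for
sock_hold()/sock_put(), and the mid-scan overflow flush the real patch
performs is omitted.

#include <pthread.h>
#include <stdio.h>

#define STACK_SIZE 8			/* the patch uses 256 / sizeof(ptr) */

struct consumer {
	int refcnt;			/* sock_hold()/sock_put() analogue */
	int id;
};

static struct consumer table[32];	/* stands in for the hash chain */
static int table_len;
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

static void deliver(struct consumer *c, const char *msg)
{
	/* the expensive work (skb_clone()/udp_queue_rcv_skb() in the patch) */
	printf("consumer %d got: %s\n", c->id, msg);
}

static void batch_deliver(const char *msg)
{
	struct consumer *stack[STACK_SIZE];
	int i, count = 0;

	pthread_mutex_lock(&table_lock);
	for (i = 0; i < table_len && count < STACK_SIZE; i++) {
		table[i].refcnt++;	/* keep it alive after unlock */
		stack[count++] = &table[i];
	}
	pthread_mutex_unlock(&table_lock);

	/* the slow work now runs with no lock held */
	for (i = 0; i < count; i++) {
		deliver(stack[i], msg);
		stack[i]->refcnt--;	/* drop our temporary reference */
	}
}

int main(void)
{
	table_len = 3;
	for (int i = 0; i < table_len; i++)
		table[i].id = i;
	batch_deliver("hello");
	return 0;
}

The point of the split is that the delivery step can be slow; the
reference taken under the lock is what keeps each consumer valid after
the lock has been dropped.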
 net/ipv4/udp.c |   76 ++++++++++++++++++++++++++++++++++++-------------------
 1 files changed, 50 insertions(+), 26 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 00d1f6d..9715a30 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1173,49 +1173,73 @@ drop:
 	return -1;
 }
 
+
+static void flush_stack(struct sock **stack, unsigned int count,
+			struct sk_buff *skb, unsigned int final)
+{
+	unsigned int i;
+	struct sk_buff *skb1 = NULL;
+
+	for (i = 0; i < count; i++) {
+		if (likely(skb1 == NULL))
+			skb1 = (i == final) ? skb : skb_clone(skb, GFP_ATOMIC);
+
+		if (skb1 && udp_queue_rcv_skb(stack[i], skb1) <= 0)
+			skb1 = NULL;
+	}
+	if (unlikely(skb1))
+		kfree_skb(skb1);
+}
+
 /*
  *	Multicasts and broadcasts go to each listener.
  *
- *	Note: called only from the BH handler context,
- *	so we don't need to lock the hashes.
+ *	Note: called only from the BH handler context.
  */
 static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 				    struct udphdr *uh,
 				    __be32 saddr, __be32 daddr,
 				    struct udp_table *udptable)
 {
-	struct sock *sk;
+	struct sock *sk, *stack[256 / sizeof(struct sock *)];
 	struct udp_hslot *hslot = &udptable->hash[udp_hashfn(net, ntohs(uh->dest))];
 	int dif;
+	unsigned int i, count = 0;
 
 	spin_lock(&hslot->lock);
 	sk = sk_nulls_head(&hslot->head);
 	dif = skb->dev->ifindex;
 	sk = udp_v4_mcast_next(net, sk, uh->dest, daddr, uh->source, saddr, dif);
-	if (sk) {
-		struct sock *sknext = NULL;
-
-		do {
-			struct sk_buff *skb1 = skb;
-
-			sknext = udp_v4_mcast_next(net, sk_nulls_next(sk), uh->dest,
-						   daddr, uh->source, saddr,
-						   dif);
-			if (sknext)
-				skb1 = skb_clone(skb, GFP_ATOMIC);
-
-			if (skb1) {
-				int ret = udp_queue_rcv_skb(sk, skb1);
-				if (ret > 0)
-					/* we should probably re-process instead
-					 * of dropping packets here. */
-					kfree_skb(skb1);
-			}
-			sk = sknext;
-		} while (sknext);
-	} else
-		consume_skb(skb);
+	while (sk) {
+		stack[count++] = sk;
+		sk = udp_v4_mcast_next(net, sk_nulls_next(sk), uh->dest,
+				       daddr, uh->source, saddr, dif);
+		if (unlikely(count == ARRAY_SIZE(stack))) {
+			if (!sk)
+				break;
+			flush_stack(stack, count, skb, ~0);
+			count = 0;
+		}
+	}
+	/*
+	 * before releasing chain lock, we must take a reference on sockets
+	 */
+	for (i = 0; i < count; i++)
+		sock_hold(stack[i]);
+
 	spin_unlock(&hslot->lock);
 
+	/*
+	 * do the slow work with no lock held
+	 */
+	if (count) {
+		flush_stack(stack, count, skb, count - 1);
+
+		for (i = 0; i < count; i++)
+			sock_put(stack[i]);
+	} else {
+		kfree_skb(skb);
+	}
 	return 0;
 }
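One detail of flush_stack() worth calling out: for N queued sockets it
allocates at most N - 1 clones, because the receiver at index 'final' is
handed the original skb, and a buffer left over after a failed queue
attempt is freed at the end. When the on-stack array overflows mid-scan,
~0 is passed as 'final', so every receiver gets a clone and the original
skb survives for the next batch. Below is a stand-alone user-space
sketch of that ownership rule; the names (buf_dup, try_queue, flush) are
invented for the example.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* skb_clone() analogue: make a private copy of the buffer */
static char *buf_dup(const char *buf)
{
	return strdup(buf);
}

/* udp_queue_rcv_skb() analogue: <= 0 means the receiver consumed buf */
static int try_queue(int id, char *buf)
{
	printf("receiver %d queued: %s\n", id, buf);
	free(buf);			/* the receiver owns it from here */
	return 0;
}

/* flush_stack() analogue: receiver 'final' gets the original buffer,
 * every other receiver gets a fresh copy */
static void flush(int count, char *buf, int final)
{
	char *b1 = NULL;
	int i;

	for (i = 0; i < count; i++) {
		if (b1 == NULL)
			b1 = (i == final) ? buf : buf_dup(buf);
		if (b1 && try_queue(i, b1) <= 0)
			b1 = NULL;	/* consumed; next pass allocates */
	}
	if (b1)				/* nobody took the last buffer */
		free(b1);
}

int main(void)
{
	flush(3, strdup("payload"), 2);	/* 3 receivers, only 2 copies made */
	return 0;
}

As in the patch, a buffer that a receiver refuses (try_queue() returning
> 0) is simply reused for the next receiver rather than freed and
reallocated.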