From patchwork Mon Jul 11 14:20:12 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Pisati X-Patchwork-Id: 104219 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from chlorine.canonical.com (chlorine.canonical.com [91.189.94.204]) by ozlabs.org (Postfix) with ESMTP id 6A714B6F85 for ; Tue, 12 Jul 2011 00:20:43 +1000 (EST) Received: from localhost ([127.0.0.1] helo=chlorine.canonical.com) by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from ) id 1QgHLZ-0007MF-JJ; Mon, 11 Jul 2011 14:20:29 +0000 Received: from adelie.canonical.com ([91.189.90.139]) by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from ) id 1QgHLW-0007Km-QY for kernel-team@lists.ubuntu.com; Mon, 11 Jul 2011 14:20:26 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by adelie.canonical.com with esmtp (Exim 4.71 #1 (Debian)) id 1QgHLW-00037Z-NU for ; Mon, 11 Jul 2011 14:20:26 +0000 Received: from dynamic-adsl-94-36-117-159.clienti.tiscali.it ([94.36.117.159] helo=canonical.com) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1QgHLW-0004f7-Dq for kernel-team@lists.ubuntu.com; Mon, 11 Jul 2011 14:20:26 +0000 From: Paolo Pisati To: kernel-team@lists.ubuntu.com Subject: [PATCH 03/11] ipv6: udp: Optimise multicast reception Date: Mon, 11 Jul 2011 16:20:12 +0200 Message-Id: <1310394020-4846-4-git-send-email-paolo.pisati@canonical.com> X-Mailer: git-send-email 1.7.5.4 In-Reply-To: <1310394020-4846-1-git-send-email-paolo.pisati@canonical.com> References: <1310394020-4846-1-git-send-email-paolo.pisati@canonical.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.13 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: kernel-team-bounces@lists.ubuntu.com Errors-To: kernel-team-bounces@lists.ubuntu.com From: Eric Dumazet BugLink: http://bugs.launchpad.net/bugs/807462 IPV6 UDP multicast rx path is a bit complex and can hold a spinlock for a long time. Using a small (32 or 64 entries) stack of socket pointers can help to perform expensive operations (skb_clone(), udp_queue_rcv_skb()) outside of the lock, in most cases. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller (cherry picked from commit a1ab77f97ed03f5dae66ae4c64375beffab83772) Signed-off-by: Paolo Pisati --- net/ipv6/udp.c | 71 +++++++++++++++++++++++++++++++++++++------------------- 1 files changed, 47 insertions(+), 24 deletions(-) diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index cf538ed..ede7a73 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -440,6 +440,27 @@ static struct sock *udp_v6_mcast_next(struct net *net, struct sock *sk, return NULL; } +static void flush_stack(struct sock **stack, unsigned int count, + struct sk_buff *skb, unsigned int final) +{ + unsigned int i; + struct sock *sk; + struct sk_buff *skb1; + + for (i = 0; i < count; i++) { + skb1 = (i == final) ? skb : skb_clone(skb, GFP_ATOMIC); + + if (skb1) { + sk = stack[i]; + bh_lock_sock(sk); + if (!sock_owned_by_user(sk)) + udpv6_queue_rcv_skb(sk, skb1); + else + sk_add_backlog(sk, skb1); + bh_unlock_sock(sk); + } + } +} /* * Note: called only from the BH handler context, * so we don't need to lock the hashes. @@ -448,41 +469,43 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb, struct in6_addr *saddr, struct in6_addr *daddr, struct udp_table *udptable) { - struct sock *sk, *sk2; + struct sock *sk, *stack[256 / sizeof(struct sock *)]; const struct udphdr *uh = udp_hdr(skb); struct udp_hslot *hslot = &udptable->hash[udp_hashfn(net, ntohs(uh->dest))]; int dif; + unsigned int i, count = 0; spin_lock(&hslot->lock); sk = sk_nulls_head(&hslot->head); dif = inet6_iif(skb); sk = udp_v6_mcast_next(net, sk, uh->dest, daddr, uh->source, saddr, dif); - if (!sk) { - kfree_skb(skb); - goto out; - } - - sk2 = sk; - while ((sk2 = udp_v6_mcast_next(net, sk_nulls_next(sk2), uh->dest, daddr, - uh->source, saddr, dif))) { - struct sk_buff *buff = skb_clone(skb, GFP_ATOMIC); - if (buff) { - bh_lock_sock(sk2); - if (!sock_owned_by_user(sk2)) - udpv6_queue_rcv_skb(sk2, buff); - else - sk_add_backlog(sk2, buff); - bh_unlock_sock(sk2); + while (sk) { + stack[count++] = sk; + sk = udp_v6_mcast_next(net, sk_nulls_next(sk), uh->dest, daddr, + uh->source, saddr, dif); + if (unlikely(count == ARRAY_SIZE(stack))) { + if (!sk) + break; + flush_stack(stack, count, skb, ~0); + count = 0; } } - bh_lock_sock(sk); - if (!sock_owned_by_user(sk)) - udpv6_queue_rcv_skb(sk, skb); - else - sk_add_backlog(sk, skb); - bh_unlock_sock(sk); -out: + /* + * before releasing the lock, we must take reference on sockets + */ + for (i = 0; i < count; i++) + sock_hold(stack[i]); + spin_unlock(&hslot->lock); + + if (count) { + flush_stack(stack, count, skb, count - 1); + + for (i = 0; i < count; i++) + sock_put(stack[i]); + } else { + kfree_skb(skb); + } return 0; }