From patchwork Tue Oct 1 19:33:45 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shawn Bohrer
X-Patchwork-Id: 279561
X-Patchwork-Delegate: davem@davemloft.net
From: Shawn Bohrer
To: David Miller
Cc: Eric Dumazet, tomk@rgmadvisors.com, netdev, Shawn Bohrer
Subject: [net-next 3/3] net: ipv4 only populate IP_PKTINFO when needed
Date: Tue, 1 Oct 2013 14:33:45 -0500
Message-Id: <1380656025-8847-4-git-send-email-sbohrer@rgmadvisors.com>
X-Mailer: git-send-email 1.7.7.6
In-Reply-To: <1380656025-8847-1-git-send-email-sbohrer@rgmadvisors.com>
References: <1380656025-8847-1-git-send-email-sbohrer@rgmadvisors.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Since the removal of the routing cache, fib_compute_spec_dst() does a
fib_table lookup for each UDP multicast packet received.  This has
introduced a performance regression for some UDP workloads.

This change skips populating the packet info for sockets that do not
have IP_PKTINFO set.

Benchmark results from a netperf UDP_RR test:
Before 91296.97 transactions/s
After  91792.70 transactions/s

Benchmark results from a fio 1 byte UDP multicast pingpong test
(Multicast one way unicast response):
Before 12.647us RTT
After  12.233us RTT

Signed-off-by: Shawn Bohrer
---
 include/net/ip.h       |    2 +-
 net/ipv4/ip_sockglue.c |    5 +++--
 net/ipv4/raw.c         |    2 +-
 net/ipv4/udp.c         |    2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index 16078f4..bc98241 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -459,7 +459,7 @@ int ip_options_rcv_srr(struct sk_buff *skb);
  *	Functions provided by ip_sockglue.c
  */
 
-void ipv4_pktinfo_prepare(struct sk_buff *skb);
+void ipv4_pktinfo_prepare(struct sock *sk, struct sk_buff *skb);
 void ip_cmsg_recv(struct msghdr *msg, struct sk_buff *skb);
 int ip_cmsg_send(struct net *net, struct msghdr *msg, struct ipcm_cookie *ipc);
 int ip_setsockopt(struct sock *sk, int level, int optname, char __user *optval,
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 56e3445..dda9866 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -1052,11 +1052,12 @@ e_inval:
  * destination in skb->cb[] before dst drop.
  * This way, receiver doesnt make cache line misses to read rtable.
  */
-void ipv4_pktinfo_prepare(struct sk_buff *skb)
+void ipv4_pktinfo_prepare(struct sock *sk, struct sk_buff *skb)
 {
 	struct in_pktinfo *pktinfo = PKTINFO_SKB_CB(skb);
 
-	if (skb_rtable(skb)) {
+	if ((inet_sk(sk)->cmsg_flags & IP_CMSG_PKTINFO) &&
+	    skb_rtable(skb)) {
 		pktinfo->ipi_ifindex = inet_iif(skb);
 		pktinfo->ipi_spec_dst.s_addr = fib_compute_spec_dst(skb);
 	} else {
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index a3fe534..28694f8 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -297,7 +297,7 @@ static int raw_rcv_skb(struct sock *sk, struct sk_buff *skb)
 {
 	/* Charge it to the socket. */
 
-	ipv4_pktinfo_prepare(skb);
+	ipv4_pktinfo_prepare(sk, skb);
 	if (sock_queue_rcv_skb(sk, skb) < 0) {
 		kfree_skb(skb);
 		return NET_RX_DROP;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ca54886..02185a5 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1543,7 +1543,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 
 	rc = 0;
 
-	ipv4_pktinfo_prepare(skb);
+	ipv4_pktinfo_prepare(sk, skb);
 	bh_lock_sock(sk);
 	if (!sock_owned_by_user(sk))
 		rc = __udp_queue_rcv_skb(sk, skb);
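
For readers who want to see the effect of the new check in isolation,
here is a small userspace model of it.  This is only a sketch, not
kernel code: every model_* name and MODEL_CMSG_PKTINFO is hypothetical,
the structs are heavily reduced stand-ins for struct sock and struct
sk_buff, and the cost of fib_compute_spec_dst() is represented by a
counter.  The zeroing in the else branch is an assumption, since that
part of the function is not visible in the hunk above.  In the real
stack a socket opts in with setsockopt(fd, IPPROTO_IP, IP_PKTINFO, ...).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's IP_CMSG_PKTINFO bit. */
#define MODEL_CMSG_PKTINFO 0x1u

/* Hypothetical, heavily reduced models of struct sock and struct sk_buff. */
struct model_sock {
	unsigned int cmsg_flags;	/* opt-in bits, set via setsockopt() in the real stack */
};

struct model_skb {
	bool has_route;			/* models skb_rtable(skb) != NULL */
	int iif;			/* models inet_iif(skb) */
	uint32_t daddr;			/* destination address of the packet */
	int ipi_ifindex;		/* models pktinfo->ipi_ifindex */
	uint32_t ipi_spec_dst;		/* models pktinfo->ipi_spec_dst.s_addr */
};

/* Number of times the modelled expensive lookup has run. */
static unsigned long model_fib_lookups;

/*
 * Models fib_compute_spec_dst().  The real function may do a FIB lookup
 * to pick an address; returning daddr here just keeps the model trivial.
 */
uint32_t model_compute_spec_dst(const struct model_skb *skb)
{
	model_fib_lookups++;
	return skb->daddr;
}

/* Same shape as the patched check: the socket's flag is tested first. */
void model_pktinfo_prepare(const struct model_sock *sk, struct model_skb *skb)
{
	if ((sk->cmsg_flags & MODEL_CMSG_PKTINFO) && skb->has_route) {
		skb->ipi_ifindex = skb->iif;
		skb->ipi_spec_dst = model_compute_spec_dst(skb);
	} else {
		/* Assumed behaviour: leave the packet info zeroed. */
		skb->ipi_ifindex = 0;
		skb->ipi_spec_dst = 0;
	}
}

/* Deliver n routed packets to one socket; return the lookups that cost. */
unsigned long model_deliver(unsigned int cmsg_flags, int n)
{
	struct model_sock sk = { .cmsg_flags = cmsg_flags };
	unsigned long before = model_fib_lookups;
	int i;

	for (i = 0; i < n; i++) {
		struct model_skb skb = {
			.has_route = true,
			.iif = 2,
			.daddr = 0xe0000001u,	/* 224.0.0.1 */
		};

		model_pktinfo_prepare(&sk, &skb);
	}
	return model_fib_lookups - before;
}
```

In this model, model_deliver(0, 1000) costs no lookups, while
model_deliver(MODEL_CMSG_PKTINFO, 1000) costs one lookup per packet,
which is the per-packet work the patch avoids for sockets that never
asked for IP_PKTINFO.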