From patchwork Thu Aug 16 22:48:35 2012
X-Patchwork-Submitter: "Waskiewicz Jr, Peter P"
X-Patchwork-Id: 178119
X-Patchwork-Delegate: davem@davemloft.net
From: Peter P Waskiewicz Jr
To: davem@davemloft.net
Cc: Alexander Duyck, netdev@vger.kernel.org, gospo@redhat.com,
	sassmann@redhat.com, Peter P Waskiewicz Jr
Subject: [net-next 6/9] ixgbe: Copybreak sooner to avoid get_page/put_page and offset change overhead
Date: Thu, 16 Aug 2012 15:48:35 -0700
Message-Id: <1345157318-23731-7-git-send-email-peter.p.waskiewicz.jr@intel.com>
X-Mailer: git-send-email 1.7.11.2
In-Reply-To: <1345157318-23731-1-git-send-email-peter.p.waskiewicz.jr@intel.com>
References: <1345157318-23731-1-git-send-email-peter.p.waskiewicz.jr@intel.com>
X-Mailing-List: netdev@vger.kernel.org

From: Alexander Duyck

This change makes it so that if only the first 256 bytes of a buffer are
used, we just copy the data out and leave the offset and page count
unchanged.  There are multiple advantages to this.  First, it allows us
to reuse the page much more in the case of pages larger than 4K.  It
also allows us to avoid some expensive atomic operations in the form of
get_page/put_page.  In perf I have seen CPU utilization for put_page
drop from 3.5% to 1.8% as a result of this patch when doing small packet
routing, and packet rates increased by about 3%.  (A standalone sketch
of this copybreak path follows the diff.)

Signed-off-by: Alexander Duyck
Tested-by: Phil Schmitt
Signed-off-by: Peter P Waskiewicz Jr
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 32 ++++++++++++++-------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index d926973..d11fac5 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1487,9 +1487,7 @@ static void ixgbe_pull_tail(struct ixgbe_ring *rx_ring,
 	 * we need the header to contain the greater of either ETH_HLEN or
 	 * 60 bytes if the skb->len is less than 60 for skb_pad.
 	 */
-	pull_len = skb_frag_size(frag);
-	if (pull_len > IXGBE_RX_HDR_SIZE)
-		pull_len = ixgbe_get_headlen(va, IXGBE_RX_HDR_SIZE);
+	pull_len = ixgbe_get_headlen(va, IXGBE_RX_HDR_SIZE);
 
 	/* align pull length to size of long to optimize memcpy performance */
 	skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));
@@ -1499,17 +1497,6 @@ static void ixgbe_pull_tail(struct ixgbe_ring *rx_ring,
 	frag->page_offset += pull_len;
 	skb->data_len -= pull_len;
 	skb->tail += pull_len;
-
-	/*
-	 * if we sucked the frag empty then we should free it,
-	 * if there are other frags here something is screwed up in hardware
-	 */
-	if (skb_frag_size(frag) == 0) {
-		BUG_ON(skb_shinfo(skb)->nr_frags != 1);
-		skb_shinfo(skb)->nr_frags = 0;
-		__skb_frag_unref(frag);
-		skb->truesize -= ixgbe_rx_bufsz(rx_ring);
-	}
 }
 
 /**
@@ -1575,7 +1562,8 @@ static bool ixgbe_cleanup_headers(struct ixgbe_ring *rx_ring,
 	}
 
 	/* place header in linear portion of buffer */
-	ixgbe_pull_tail(rx_ring, skb);
+	if (skb_is_nonlinear(skb))
+		ixgbe_pull_tail(rx_ring, skb);
 
 #ifdef IXGBE_FCOE
 	/* do not attempt to pad FCoE Frames as this will disrupt DDP */
@@ -1656,6 +1644,20 @@ static bool ixgbe_add_rx_frag(struct ixgbe_ring *rx_ring,
 			ixgbe_rx_bufsz(rx_ring);
 #endif
 
+	if ((size <= IXGBE_RX_HDR_SIZE) && !skb_is_nonlinear(skb)) {
+		unsigned char *va = page_address(page) + rx_buffer->page_offset;
+
+		memcpy(__skb_put(skb, size), va, ALIGN(size, sizeof(long)));
+
+		/* we can reuse buffer as-is, just make sure it is local */
+		if (likely(page_to_nid(page) == numa_node_id()))
+			return true;
+
+		/* this page cannot be reused so discard it */
+		put_page(page);
+		return false;
+	}
+
 	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
 			rx_buffer->page_offset, size, truesize);
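
Below is a minimal, user-space sketch of the copybreak decision the patch adds
to ixgbe_add_rx_frag(), for illustration only.  RX_HDR_SIZE, struct rx_buffer,
enum rx_action and add_rx_frag() are made-up stand-ins, not driver symbols;
the real code works on struct sk_buff and struct page, only takes this path
while the skb is still linear, rounds the copy length up to sizeof(long) with
ALIGN() to help memcpy, and uses page_to_nid(page) == numa_node_id() for the
locality test.

/* Stand-alone sketch (not driver code): model of the copybreak path. */
#include <stdio.h>
#include <string.h>

#define RX_HDR_SIZE 256			/* mirrors IXGBE_RX_HDR_SIZE */

struct rx_buffer {
	unsigned char data[4096];	/* stands in for the receive page */
	unsigned int page_offset;	/* offset of this frame in the page */
	int page_node;			/* NUMA node the page lives on */
};

enum rx_action {
	RX_COPYBREAK_REUSE,	/* frame copied out, page kept for reuse */
	RX_COPYBREAK_FREE,	/* frame copied out, remote page dropped */
	RX_ATTACH_FRAG,		/* frame too big, attach page as a frag */
};

/*
 * Small frames are copied into the linear buffer so the page refcount
 * and offset never change; the page is only reused when it is local to
 * the CPU's NUMA node.  Larger frames fall through to the frag path.
 */
static enum rx_action add_rx_frag(const struct rx_buffer *buf,
				  unsigned int size, int local_node,
				  unsigned char *linear, unsigned int *len)
{
	if (size <= RX_HDR_SIZE) {
		memcpy(linear + *len, buf->data + buf->page_offset, size);
		*len += size;
		return buf->page_node == local_node ? RX_COPYBREAK_REUSE
						    : RX_COPYBREAK_FREE;
	}
	return RX_ATTACH_FRAG;		/* would call skb_add_rx_frag() */
}

int main(void)
{
	struct rx_buffer buf = { .page_offset = 0, .page_node = 0 };
	unsigned char linear[RX_HDR_SIZE];
	unsigned int len = 0;

	memcpy(buf.data, "small ping frame", 17);

	/* a 17-byte frame on a local page: copybreak, page reused */
	enum rx_action act = add_rx_frag(&buf, 17, 0, linear, &len);
	printf("action=%d linear_len=%u\n", act, len);
	return 0;
}

The point of the sketch is that the small-frame path touches no page
reference count at all, which is where the put_page savings quoted in the
commit message come from.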