From patchwork Fri Nov 15 09:11:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 1195475 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=dlink.ru Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=dlink.ru header.i=@dlink.ru header.b="V8uZqTp7"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47Dt0M3nxTz9sPL for ; Fri, 15 Nov 2019 20:12:19 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727243AbfKOJMQ (ORCPT ); Fri, 15 Nov 2019 04:12:16 -0500 Received: from mail.dlink.ru ([178.170.168.18]:33776 "EHLO fd.dlink.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726996AbfKOJMQ (ORCPT ); Fri, 15 Nov 2019 04:12:16 -0500 Received: by fd.dlink.ru (Postfix, from userid 5000) id 69EB41B21157; Fri, 15 Nov 2019 12:12:12 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 fd.dlink.ru 69EB41B21157 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dlink.ru; s=mail; t=1573809132; bh=CtxYuY+GbtGqv5wTXxa7arIIy/o1DNl/HAFZOiotkTU=; h=From:To:Cc:Subject:Date; b=V8uZqTp7MX2P/2kNdeX0wta4JIxkank0K0AVppscD0C9XbbHY17EAnzP3ZanbOIAP fMFiB5bR6qvBMctPGnVBTmvUmMkBzoqjd/+1RDNgaxwd6OSG8byt3kgQJdNsBUCF0A fTZbK4JUm+IIavyNjXYWRDZkfvKBOCSsBqYIQ+fo= X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on mail.dlink.ru X-Spam-Level: X-Spam-Status: No, score=-99.2 required=7.5 tests=BAYES_50,URIBL_BLOCKED, USER_IN_WHITELIST autolearn=disabled version=3.4.2 Received: from mail.rzn.dlink.ru (mail.rzn.dlink.ru [178.170.168.13]) by fd.dlink.ru (Postfix) with ESMTP id B5FE21B20B5F; Fri, 15 Nov 2019 12:12:02 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 fd.dlink.ru B5FE21B20B5F Received: from mail.rzn.dlink.ru (localhost [127.0.0.1]) by mail.rzn.dlink.ru (Postfix) with ESMTP id 6EF761B21209; Fri, 15 Nov 2019 12:12:01 +0300 (MSK) Received: from localhost.localdomain (unknown [196.196.203.126]) by mail.rzn.dlink.ru (Postfix) with ESMTPA; Fri, 15 Nov 2019 12:12:01 +0300 (MSK) From: Alexander Lobakin To: "David S. Miller" Cc: Edward Cree , Jiri Pirko , Eric Dumazet , Ido Schimmel , Paolo Abeni , Petr Machata , Sabrina Dubroca , Florian Fainelli , Jassi Brar , Manish Chopra , GR-Linux-NIC-Dev@marvell.com, Johannes Berg , Emmanuel Grumbach , Luca Coelho , Intel Linux Wireless , Kalle Valo , Alexander Lobakin , netdev@vger.kernel.org, linux-wireless@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 net-next] net: core: allow fast GRO for skbs with Ethernet header in head Date: Fri, 15 Nov 2019 12:11:35 +0300 Message-Id: <20191115091135.13487-1-alobakin@dlink.ru> X-Mailer: git-send-email 2.24.0 MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Commit 78d3fd0b7de8 ("gro: Only use skb_gro_header for completely non-linear packets") back in May'09 (v2.6.31-rc1) has changed the original condition '!skb_headlen(skb)' to 'skb->mac_header == skb->tail' in gro_reset_offset() saying: "Since the drivers that need this optimisation all provide completely non-linear packets" (note that this condition has become the current 'skb_mac_header(skb) == skb_tail_pointer(skb)' later with commmit ced14f6804a9 ("net: Correct comparisons and calculations using skb->tail and skb-transport_header") without any functional changes). For now, we have the following rough statistics for v5.4-rc7: 1) napi_gro_frags: 14 2) napi_gro_receive with skb->head containing (most of) payload: 83 3) napi_gro_receive with skb->head containing all the headers: 20 4) napi_gro_receive with skb->head containing only Ethernet header: 2 With the current condition, fast GRO with the usage of NAPI_GRO_CB(skb)->frag0 is available only in the [1] case. Packets pushed by [2] and [3] go through the 'slow' path, but it's not a problem for them as they already contain all the needed headers in skb->head, so pskb_may_pull() only moves skb->data. The layout of skbs in the fourth [4] case at the moment of dev_gro_receive() is identical to skbs that have come through [1], as napi_frags_skb() pulls Ethernet header to skb->head. The only difference is that the mentioned condition is always false for them, because skb_put() and friends irreversibly alter the tail pointer. They also go through the 'slow' path, but now every single pskb_may_pull() in every single .gro_receive() will call the *really* slow __pskb_pull_tail() to pull headers to head. This significantly decreases the overall performance for no visible reasons. The only two users of method [4] is: * drivers/staging/qlge * drivers/net/wireless/iwlwifi (all three variants: dvm, mvm, mvm-mq) Note that in case with wireless drivers we can't use [1] (napi_gro_frags()) at least for now and mac80211 stack always performs pushes and pulls anyways, so performance hit is inavoidable. At the moment of v2.6.31 the mentioned change was necessary (that's why I don't add the "Fixes:" tag), but it became obsolete since skb_gro_mac_header() has gone in commit a50e233c50db ("net-gro: restore frag0 optimization"), so we can simply revert the condition in gro_reset_offset() to allow skbs from [4] go through the 'fast' path just like in case [1]. This was tested on a 600 MHz MIPS CPU and a custom driver and this patch gave boosts up to 40 Mbps to method [4] in both directions comparing to net-next, which made overall performance relatively close to [1] (without it, [4] is the slowest). v2: - Add more references and explanations to commit message - Fix some typos ibid - No functional changes Signed-off-by: Alexander Lobakin --- net/core/dev.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 1c799d486623..da78a433c10c 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5611,8 +5611,7 @@ static void skb_gro_reset_offset(struct sk_buff *skb) NAPI_GRO_CB(skb)->frag0 = NULL; NAPI_GRO_CB(skb)->frag0_len = 0; - if (skb_mac_header(skb) == skb_tail_pointer(skb) && - pinfo->nr_frags && + if (!skb_headlen(skb) && pinfo->nr_frags && !PageHighMem(skb_frag_page(frag0))) { NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0); NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,