From patchwork Tue Oct 29 13:50:50 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matteo Croce
X-Patchwork-Id: 1186122
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
 helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org;
 receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none)
 header.from=redhat.com
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected)
 header.d=redhat.com header.i=@redhat.com header.b="KfYIdTLG";
 dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 472Y0532dHz9sPc
 for ; Wed, 30 Oct 2019 00:51:17 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2389044AbfJ2NvQ (ORCPT ); Tue, 29 Oct 2019 09:51:16 -0400
Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:48916
 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL)
 by vger.kernel.org with ESMTP id S2388983AbfJ2NvH (ORCPT );
 Tue, 29 Oct 2019 09:51:07 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1572357066;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=tIRfkARawuYx5E17weGS9sdJ1VOiUwg/d8BMqTS+xF4=;
 b=KfYIdTLGO9WHu6aAwygoMR35rdPkxUGAll58jFy29U1CtvY6wsP1hOl4/YAWrSyPXAQQiv
 CQlvFIy/Bwj2+gj/w/HpHaZnLSEMECxz3V3MCBi/swb1yM4+WLJDUeLc2GjSjwSho3IJH7
 tzT1lzeiLM0Gc7e1PkScu37rOpB5XtE=
Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com
 [209.85.128.69]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-373-ZFe2rKfbOe2p_5iy3UQYog-1; Tue, 29 Oct 2019 09:51:04 -0400
Received: by mail-wm1-f69.google.com with SMTP id a81so891035wma.4
 for ; Tue, 29 Oct 2019 06:51:04 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=SMEGf4c6Ma+Bze2+ZiGp43bjN8huF03EIdt6wvA9wXw=;
 b=FE3D63yl9mUHt/WZb7o8XbfHEPWTaiKPPsWFTSGEcReLquWggO7pLGb/XApi6nQoX4
 Q8dJV7MqCxt5SuH8gp1WL0r4WudPayT84+WMuOi9egW+eBYQ8fM6ajHuk9ziNVFA3YLL
 3SEIT1a1K5QLQrfewu0wfEYLV6iBQImRkNLPfSbcTUeG798knKOxOLQnD1dcYjzbIFtr
 94TM8G0iD2nu9UPUCz5+DARCySCxFR5a9JBTEe2ZAVfm/6QYT1j9qNjTB6XvPP5FZm7w
 Kw6O/5/Stg6UnS/HDQ4sfAPdKecGUGwGuK1IOQNS5DY1nqj5Xt9SLpilBlWpszuLqhBp
 J6Zw==
X-Gm-Message-State: APjAAAXFxx7SDUutR/DmA0tBuUazXLYsnBjkyiLfX7YeIg1nd4WqcoyA
 dFumIjcdTlRe8H8wEMArA9hqZzZHym5m++MBFyvYQ0XDsTcw0s1I+j51MuCyKmT2EbiscEGzRid
 oHYyXw0ml8fVc/SC1
X-Received: by 2002:adf:828c:: with SMTP id
 12mr19790479wrc.40.1572357063581; Tue, 29 Oct 2019 06:51:03 -0700 (PDT)
X-Google-Smtp-Source: APXvYqzjCJGyJ/tvm+foMGtEVPFNTayDFfATM7pwPRhzRETLIFTFzBxqPMY9xxiE5XklQiPQGzHwRQ==
X-Received: by 2002:adf:828c:: with SMTP id
 12mr19790460wrc.40.1572357063387; Tue, 29 Oct 2019 06:51:03 -0700 (PDT)
Received: from mcroce-redhat.mxp.redhat.com (nat-pool-mxp-t.redhat.com.
 [149.6.153.186]) by smtp.gmail.com with ESMTPSA id
 189sm2556920wmc.7.2019.10.29.06.51.02
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 29 Oct 2019 06:51:02 -0700 (PDT)
From: Matteo Croce
To: netdev@vger.kernel.org
Cc: Jay Vosburgh , Veaceslav Falico , Andy Gospodarek , "David S .
 Miller " , Stanislav Fomichev , Daniel Borkmann , Song Liu , Alexei Starovoitov , Paul Blakey , linux-kernel@vger.kernel.org
Subject: [PATCH net-next v2 1/4] flow_dissector: add meaningful comments
Date: Tue, 29 Oct 2019 14:50:50 +0100
Message-Id: <20191029135053.10055-2-mcroce@redhat.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20191029135053.10055-1-mcroce@redhat.com>
References: <20191029135053.10055-1-mcroce@redhat.com>
MIME-Version: 1.0
X-MC-Unique: ZFe2rKfbOe2p_5iy3UQYog-1
X-Mimecast-Spam-Score: 0
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Document two pieces of code which can't be understood at a glance.

Signed-off-by: Matteo Croce
---
 include/net/flow_dissector.h | 1 +
 net/core/flow_dissector.c    | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 90bd210be060..7747af3cc500 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -282,6 +282,7 @@ struct flow_keys {
 	struct flow_dissector_key_vlan cvlan;
 	struct flow_dissector_key_keyid keyid;
 	struct flow_dissector_key_ports ports;
+	/* 'addrs' must be the last member */
 	struct flow_dissector_key_addrs addrs;
 };
 
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 7c09d87d3269..affde70dad47 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1374,6 +1374,9 @@ static inline size_t flow_keys_hash_length(const struct flow_keys *flow)
 {
 	size_t diff = FLOW_KEYS_HASH_OFFSET + sizeof(flow->addrs);
 	BUILD_BUG_ON((sizeof(*flow) - FLOW_KEYS_HASH_OFFSET) % sizeof(u32));
+	/* flow.addrs MUST be the last member in struct flow_keys because
+	 * different L3 protocols have different address length
+	 */
 	BUILD_BUG_ON(offsetof(typeof(*flow), addrs) !=
 		     sizeof(*flow) - sizeof(flow->addrs));
 
@@ -1421,6 +1424,9 @@ __be32 flow_get_u32_dst(const struct flow_keys *flow)
 }
 EXPORT_SYMBOL(flow_get_u32_dst);
 
+/* Sort the source and destination IP (and the ports if the IP are the same),
+ * to have consistent hash within the two directions
+ */
 static inline void __flow_hash_consistentify(struct flow_keys *keys)
 {
 	int addr_diff, i;

From patchwork Tue Oct 29 13:50:51 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matteo Croce
X-Patchwork-Id: 1186121
X-Patchwork-Delegate: davem@davemloft.net
From: Matteo Croce
To: netdev@vger.kernel.org
Cc: Jay Vosburgh , Veaceslav Falico , Andy Gospodarek , "David S . Miller " , Stanislav Fomichev , Daniel Borkmann , Song Liu , Alexei Starovoitov , Paul Blakey , linux-kernel@vger.kernel.org
Subject: [PATCH net-next v2 2/4] flow_dissector: skip the ICMP dissector for non ICMP packets
Date: Tue, 29 Oct 2019 14:50:51 +0100
Message-Id: <20191029135053.10055-3-mcroce@redhat.com>
In-Reply-To: <20191029135053.10055-1-mcroce@redhat.com>
References: <20191029135053.10055-1-mcroce@redhat.com>

FLOW_DISSECTOR_KEY_ICMP is checked for every packet, not only ICMP ones.
Even if the test overhead is probably negligible, move the ICMP dissector
code under the big 'switch(ip_proto)' so it gets called only for ICMP
packets.

Signed-off-by: Matteo Croce
---
 net/core/flow_dissector.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index affde70dad47..6443fac65ce8 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -203,6 +203,25 @@ __be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
 }
 EXPORT_SYMBOL(__skb_flow_get_ports);
 
+/* If FLOW_DISSECTOR_KEY_ICMP is set, get the Type and Code from an ICMP packet
+ * using skb_flow_get_be16().
+ */
+static void __skb_flow_dissect_icmp(const struct sk_buff *skb,
+				    struct flow_dissector *flow_dissector,
+				    void *target_container,
+				    void *data, int thoff, int hlen)
+{
+	struct flow_dissector_key_icmp *key_icmp;
+
+	if (!dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_ICMP))
+		return;
+
+	key_icmp = skb_flow_dissector_target(flow_dissector,
+					     FLOW_DISSECTOR_KEY_ICMP,
+					     target_container);
+	key_icmp->icmp = skb_flow_get_be16(skb, thoff, data, hlen);
+}
+
 void skb_flow_dissect_meta(const struct sk_buff *skb,
 			   struct flow_dissector *flow_dissector,
 			   void *target_container)
@@ -853,7 +872,6 @@ bool __skb_flow_dissect(const struct net *net,
 	struct flow_dissector_key_basic *key_basic;
 	struct flow_dissector_key_addrs *key_addrs;
 	struct flow_dissector_key_ports *key_ports;
-	struct flow_dissector_key_icmp *key_icmp;
 	struct flow_dissector_key_tags *key_tags;
 	struct flow_dissector_key_vlan *key_vlan;
 	struct bpf_prog *attached = NULL;
@@ -1295,6 +1313,12 @@ bool __skb_flow_dissect(const struct net *net,
 				       data, nhoff, hlen);
 		break;
 
+	case IPPROTO_ICMP:
+	case IPPROTO_ICMPV6:
+		__skb_flow_dissect_icmp(skb, flow_dissector, target_container,
+					data, nhoff, hlen);
+		break;
+
 	default:
 		break;
 	}
@@ -1308,14 +1332,6 @@ bool __skb_flow_dissect(const struct net *net,
 							data, hlen);
 	}
 
-	if (dissector_uses_key(flow_dissector,
-			       FLOW_DISSECTOR_KEY_ICMP)) {
-		key_icmp = skb_flow_dissector_target(flow_dissector,
-						     FLOW_DISSECTOR_KEY_ICMP,
-						     target_container);
-		key_icmp->icmp = skb_flow_get_be16(skb, nhoff, data, hlen);
-	}
-
 	/* Process result of IP proto processing */
 	switch (fdret) {
 	case FLOW_DISSECT_RET_PROTO_AGAIN:

From patchwork Tue Oct 29 13:50:52 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matteo Croce
X-Patchwork-Id: 1186124
X-Patchwork-Delegate: davem@davemloft.net
From: Matteo Croce
To: netdev@vger.kernel.org
Cc: Jay Vosburgh , Veaceslav Falico , Andy Gospodarek , "David S . Miller " , Stanislav Fomichev , Daniel Borkmann , Song Liu , Alexei Starovoitov , Paul Blakey , linux-kernel@vger.kernel.org
Subject: [PATCH net-next v2 3/4] flow_dissector: extract more ICMP information
Date: Tue, 29 Oct 2019 14:50:52 +0100
Message-Id: <20191029135053.10055-4-mcroce@redhat.com>
In-Reply-To: <20191029135053.10055-1-mcroce@redhat.com>
References: <20191029135053.10055-1-mcroce@redhat.com>

The ICMP flow dissector currently parses only the Type and Code fields.
Some ICMP packets (echo, timestamp) have a 16-bit Identifier field which
is used to correlate packets.
Add such a field to flow_dissector_key_icmp and replace
skb_flow_get_be16() with a more complex function which populates this
field.

Signed-off-by: Matteo Croce
---
 include/net/flow_dissector.h | 19 +++++----
 net/core/flow_dissector.c    | 74 ++++++++++++++++++++++++------------
 2 files changed, 61 insertions(+), 32 deletions(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 7747af3cc500..f8541d018848 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -6,6 +6,8 @@
 #include
 #include
 
+struct sk_buff;
+
 /**
  * struct flow_dissector_key_control:
  * @thoff: Transport header offset
@@ -156,19 +158,16 @@ struct flow_dissector_key_ports {
 
 /**
  * flow_dissector_key_icmp:
- *	@ports: type and code of ICMP header
- *		icmp: ICMP type (high) and code (low)
  *		type: ICMP type
  *		code: ICMP code
+ *		id: session identifier
  */
 struct flow_dissector_key_icmp {
-	union {
-		__be16 icmp;
-		struct {
-			u8 type;
-			u8 code;
-		};
+	struct {
+		u8 type;
+		u8 code;
 	};
+	u16 id;
 };
 
 /**
@@ -282,6 +281,7 @@ struct flow_keys {
 	struct flow_dissector_key_vlan cvlan;
 	struct flow_dissector_key_keyid keyid;
 	struct flow_dissector_key_ports ports;
+	struct flow_dissector_key_icmp icmp;
 	/* 'addrs' must be the last member */
 	struct flow_dissector_key_addrs addrs;
 };
@@ -316,6 +316,9 @@ static inline bool flow_keys_have_l4(const struct flow_keys *keys)
 }
 
 u32 flow_hash_from_keys(struct flow_keys *keys);
+void skb_flow_get_icmp_tci(const struct sk_buff *skb,
+			   struct flow_dissector_key_icmp *key_icmp,
+			   void *data, int thoff, int hlen);
 
 static inline bool dissector_uses_key(const struct flow_dissector *flow_dissector,
 				      enum flow_dissector_key_id key_id)
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 6443fac65ce8..0d014b81b269 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -147,27 +147,6 @@ int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
 	mutex_unlock(&flow_dissector_mutex);
 	return 0;
 }
-/**
- * skb_flow_get_be16 - extract be16 entity
- * @skb: sk_buff to extract from
- * @poff: offset to extract at
- * @data: raw buffer pointer to the packet
- * @hlen: packet header length
- *
- * The function will try to retrieve a be32 entity at
- * offset poff
- */
-static __be16 skb_flow_get_be16(const struct sk_buff *skb, int poff,
-				void *data, int hlen)
-{
-	__be16 *u, _u;
-
-	u = __skb_header_pointer(skb, poff, sizeof(_u), data, hlen, &_u);
-	if (u)
-		return *u;
-
-	return 0;
-}
 
 /**
  * __skb_flow_get_ports - extract the upper layer ports and return them
@@ -203,8 +182,54 @@ __be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
 }
 EXPORT_SYMBOL(__skb_flow_get_ports);
 
-/* If FLOW_DISSECTOR_KEY_ICMP is set, get the Type and Code from an ICMP packet
- * using skb_flow_get_be16().
+static bool icmp_has_id(u8 type)
+{
+	switch (type) {
+	case ICMP_ECHO:
+	case ICMP_ECHOREPLY:
+	case ICMP_TIMESTAMP:
+	case ICMP_TIMESTAMPREPLY:
+	case ICMPV6_ECHO_REQUEST:
+	case ICMPV6_ECHO_REPLY:
+		return true;
+	}
+
+	return false;
+}
+
+/**
+ * skb_flow_get_icmp_tci - extract ICMP(6) Type, Code and Identifier fields
+ * @skb: sk_buff to extract from
+ * @key_icmp: struct flow_dissector_key_icmp to fill
+ * @data: raw buffer pointer to the packet
+ * @thoff: offset to extract at
+ * @hlen: packet header length
+ */
+void skb_flow_get_icmp_tci(const struct sk_buff *skb,
+			   struct flow_dissector_key_icmp *key_icmp,
+			   void *data, int thoff, int hlen)
+{
+	struct icmphdr *ih, _ih;
+
+	ih = __skb_header_pointer(skb, thoff, sizeof(_ih), data, hlen, &_ih);
+	if (!ih)
+		return;
+
+	key_icmp->type = ih->type;
+	key_icmp->code = ih->code;
+
+	/* As we use 0 to signal that the Id field is not present,
+	 * avoid confusion with packets without such field
+	 */
+	if (icmp_has_id(ih->type))
+		key_icmp->id = ih->un.echo.id ? : 1;
+	else
+		key_icmp->id = 0;
+}
+EXPORT_SYMBOL(skb_flow_get_icmp_tci);
+
+/* If FLOW_DISSECTOR_KEY_ICMP is set, dissect an ICMP packet
+ * using skb_flow_get_icmp_tci().
  */
 static void __skb_flow_dissect_icmp(const struct sk_buff *skb,
 				    struct flow_dissector *flow_dissector,
@@ -219,7 +244,8 @@ static void __skb_flow_dissect_icmp(const struct sk_buff *skb,
 	key_icmp = skb_flow_dissector_target(flow_dissector,
 					     FLOW_DISSECTOR_KEY_ICMP,
 					     target_container);
-	key_icmp->icmp = skb_flow_get_be16(skb, thoff, data, hlen);
+
+	skb_flow_get_icmp_tci(skb, key_icmp, data, thoff, hlen);
 }
 
 void skb_flow_dissect_meta(const struct sk_buff *skb,

From patchwork Tue Oct 29 13:50:53 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matteo Croce
X-Patchwork-Id: 1186123
X-Patchwork-Delegate: davem@davemloft.net
From: Matteo Croce
To: netdev@vger.kernel.org
Cc: Jay Vosburgh , Veaceslav Falico , Andy Gospodarek , "David S . Miller " , Stanislav Fomichev , Daniel Borkmann , Song Liu , Alexei Starovoitov , Paul Blakey , linux-kernel@vger.kernel.org
Subject: [PATCH net-next v2 4/4] bonding: balance ICMP echoes in layer3+4 mode
Date: Tue, 29 Oct 2019 14:50:53 +0100
Message-Id: <20191029135053.10055-5-mcroce@redhat.com>
In-Reply-To: <20191029135053.10055-1-mcroce@redhat.com>
References: <20191029135053.10055-1-mcroce@redhat.com>

The bonding driver uses the L4 ports to balance flows between slaves. As
the ICMP protocol has no ports, those packets are all sent to the same
device:

    # tcpdump -qltnni veth0 ip |sed 's/^/0: /' &
    # tcpdump -qltnni veth1 ip |sed 's/^/1: /' &
    # ping -qc1 192.168.0.2
    1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 315, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 315, seq 1, length 64
    # ping -qc1 192.168.0.2
    1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 316, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 316, seq 1, length 64
    # ping -qc1 192.168.0.2
    1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 317, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 317, seq 1, length 64

But some ICMP packets have an Identifier field which is used to match
packets within sessions. Let's use this value in the hash function to
balance these packets between bond slaves:

    # ping -qc1 192.168.0.2
    0: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 303, seq 1, length 64
    0: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 303, seq 1, length 64
    # ping -qc1 192.168.0.2
    1: IP 192.168.0.1 > 192.168.0.2: ICMP echo request, id 304, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: ICMP echo reply, id 304, seq 1, length 64

Also, let's use a flow_dissector_key which defines FLOW_DISSECTOR_KEY_ICMP,
so we can balance pings encapsulated in a tunnel when using mode encap3+4:

    # ping -q 192.168.1.2 -c1
    0: IP 192.168.0.1 > 192.168.0.2: GREv0, length 102: IP 192.168.1.1 > 192.168.1.2: ICMP echo request, id 585, seq 1, length 64
    0: IP 192.168.0.2 > 192.168.0.1: GREv0, length 102: IP 192.168.1.2 > 192.168.1.1: ICMP echo reply, id 585, seq 1, length 64
    # ping -q 192.168.1.2 -c1
    1: IP 192.168.0.1 > 192.168.0.2: GREv0, length 102: IP 192.168.1.1 > 192.168.1.2: ICMP echo request, id 586, seq 1, length 64
    1: IP 192.168.0.2 > 192.168.0.1: GREv0, length 102: IP 192.168.1.2 > 192.168.1.1: ICMP echo reply, id 586, seq 1, length 64

Signed-off-by: Matteo Croce
Reviewed-by: Nikolay Aleksandrov
---
 drivers/net/bonding/bond_main.c | 77 ++++++++++++++++++++++++++++++---
 1 file changed, 70 insertions(+), 7 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 21d8fcc83c9c..3e496e746cc6 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -200,6 +200,51 @@ atomic_t netpoll_block_tx = ATOMIC_INIT(0);
 
 unsigned int bond_net_id __read_mostly;
 
+static const struct flow_dissector_key flow_keys_bonding_keys[] = {
+	{
+		.key_id = FLOW_DISSECTOR_KEY_CONTROL,
+		.offset = offsetof(struct flow_keys, control),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_BASIC,
+		.offset = offsetof(struct flow_keys, basic),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+		.offset = offsetof(struct flow_keys, addrs.v4addrs),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+		.offset = offsetof(struct flow_keys, addrs.v6addrs),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_TIPC,
+		.offset = offsetof(struct flow_keys, addrs.tipckey),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_PORTS,
+		.offset = offsetof(struct flow_keys, ports),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_ICMP,
+		.offset = offsetof(struct flow_keys, icmp),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_VLAN,
+		.offset = offsetof(struct flow_keys, vlan),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_FLOW_LABEL,
+		.offset = offsetof(struct flow_keys, tags),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_GRE_KEYID,
+		.offset = offsetof(struct flow_keys, keyid),
+	},
+};
+
+static struct flow_dissector flow_keys_bonding __read_mostly;
+
 /*-------------------------- Forward declarations ---------------------------*/
 
 static int bond_init(struct net_device *bond_dev);
@@ -3263,10 +3308,14 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
 	const struct iphdr *iph;
 	int noff, proto = -1;
 
-	if (bond->params.xmit_policy > BOND_XMIT_POLICY_LAYER23)
-		return skb_flow_dissect_flow_keys(skb, fk, 0);
+	if (bond->params.xmit_policy > BOND_XMIT_POLICY_LAYER23) {
+		memset(fk, 0, sizeof(*fk));
+		return __skb_flow_dissect(NULL, skb, &flow_keys_bonding,
+					  fk, NULL, 0, 0, 0, 0);
+	}
 
 	fk->ports.ports = 0;
+	memset(&fk->icmp, 0, sizeof(fk->icmp));
 	noff = skb_network_offset(skb);
 	if (skb->protocol == htons(ETH_P_IP)) {
 		if (unlikely(!pskb_may_pull(skb, noff + sizeof(*iph))))
@@ -3286,8 +3335,14 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
 	} else {
 		return false;
 	}
-	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 && proto >= 0)
-		fk->ports.ports = skb_flow_get_ports(skb, noff, proto);
+	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34 && proto >= 0) {
+		if (proto == IPPROTO_ICMP || proto == IPPROTO_ICMPV6)
+			skb_flow_get_icmp_tci(skb, &fk->icmp, skb->data,
+					      skb_transport_offset(skb),
+					      skb_headlen(skb));
+		else
+			fk->ports.ports = skb_flow_get_ports(skb, noff, proto);
+	}
 
 	return true;
 }
@@ -3314,10 +3369,14 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
 		return bond_eth_hash(skb);
 
 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER23 ||
-	    bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23)
+	    bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23) {
 		hash = bond_eth_hash(skb);
-	else
-		hash = (__force u32)flow.ports.ports;
+	} else {
+		if (flow.icmp.id)
+			memcpy(&hash, &flow.icmp, sizeof(hash));
+		else
+			memcpy(&hash, &flow.ports.ports, sizeof(hash));
+	}
 	hash ^= (__force u32)flow_get_u32_dst(&flow) ^
 		(__force u32)flow_get_u32_src(&flow);
 	hash ^= (hash >> 16);
@@ -4901,6 +4960,10 @@ static int __init bonding_init(void)
 			goto err;
 	}
 
+	skb_flow_dissector_init(&flow_keys_bonding,
+				flow_keys_bonding_keys,
+				ARRAY_SIZE(flow_keys_bonding_keys));
+
 	register_netdevice_notifier(&bond_netdev_notifier);
 out:
 	return res;
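
Illustration only, not part of the series: the Identifier handling from
patch 3/4 and the hash seed choice from patch 4/4 can be modelled in
userspace. The block below is a minimal sketch under stated assumptions.
Every sketch_* name is hypothetical and does not exist in the kernel, the
ICMP type values are taken from the ICMP and ICMPv6 protocol definitions,
and the Identifier is kept as an opaque 16-bit value without any byte
order conversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel's ICMP type constants. */
enum {
	SKETCH_ICMP_ECHOREPLY = 0,
	SKETCH_ICMP_ECHO = 8,
	SKETCH_ICMP_TIMESTAMP = 13,
	SKETCH_ICMP_TIMESTAMPREPLY = 14,
	SKETCH_ICMPV6_ECHO_REQUEST = 128,
	SKETCH_ICMPV6_ECHO_REPLY = 129,
};

/* Same field layout as struct flow_dissector_key_icmp after patch 3/4. */
struct sketch_icmp_key {
	uint8_t type;
	uint8_t code;
	uint16_t id;
};

/* Only echo and timestamp messages carry an Identifier field. */
static bool sketch_icmp_has_id(uint8_t type)
{
	switch (type) {
	case SKETCH_ICMP_ECHO:
	case SKETCH_ICMP_ECHOREPLY:
	case SKETCH_ICMP_TIMESTAMP:
	case SKETCH_ICMP_TIMESTAMPREPLY:
	case SKETCH_ICMPV6_ECHO_REQUEST:
	case SKETCH_ICMPV6_ECHO_REPLY:
		return true;
	}

	return false;
}

/* Fill the key; id 0 is reserved to mean "no Identifier present",
 * so a real Identifier of 0 is stored as 1.
 */
static void sketch_fill_icmp_key(struct sketch_icmp_key *key, uint8_t type,
				 uint8_t code, uint16_t wire_id)
{
	assert(key != NULL);

	key->type = type;
	key->code = code;

	if (sketch_icmp_has_id(type))
		key->id = wire_id ? wire_id : 1;
	else
		key->id = 0;
}

/* Pick the initial hash value: the ICMP key when an Identifier is set,
 * otherwise the L4 ports word.
 */
static uint32_t sketch_hash_seed(const struct sketch_icmp_key *key,
				 uint32_t ports)
{
	uint32_t hash;

	assert(key != NULL);
	assert(sizeof(*key) == sizeof(hash));

	if (key->id)
		memcpy(&hash, key, sizeof(hash));
	else
		hash = ports;

	return hash;
}
```

With these helpers, two echo requests to the same peer that differ only
in Identifier (such as 315 and 316) give different seeds, while a message
type without an Identifier (for example type 3, destination unreachable)
falls back to the ports word.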