{"id":833933,"url":"http://patchwork.ozlabs.org/api/1.2/patches/833933/?format=json","web_url":"http://patchwork.ozlabs.org/project/netdev/patch/20171103152636.9967-6-pablo@netfilter.org/","project":{"id":7,"url":"http://patchwork.ozlabs.org/api/1.2/projects/7/?format=json","name":"Linux network development","link_name":"netdev","list_id":"netdev.vger.kernel.org","list_email":"netdev@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<20171103152636.9967-6-pablo@netfilter.org>","list_archive_url":null,"date":"2017-11-03T15:26:36","name":"[RFC,WIP,5/5] netfilter: nft_flow_offload: add ndo hooks for hardware offload","commit_ref":null,"pull_url":null,"state":"rfc","archived":true,"hash":"80dd3825cb9b7ef0652aa89bfc7b9ef06035f86d","submitter":{"id":1315,"url":"http://patchwork.ozlabs.org/api/1.2/people/1315/?format=json","name":"Pablo Neira Ayuso","email":"pablo@netfilter.org"},"delegate":{"id":34,"url":"http://patchwork.ozlabs.org/api/1.2/users/34/?format=json","username":"davem","first_name":"David","last_name":"Miller","email":"davem@davemloft.net"},"mbox":"http://patchwork.ozlabs.org/project/netdev/patch/20171103152636.9967-6-pablo@netfilter.org/mbox/","series":[{"id":11752,"url":"http://patchwork.ozlabs.org/api/1.2/series/11752/?format=json","web_url":"http://patchwork.ozlabs.org/project/netdev/list/?series=11752","date":"2017-11-03T15:26:31","name":"Flow offload infrastructure","version":1,"mbox":"http://patchwork.ozlabs.org/series/11752/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/833933/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/833933/checks/","tags":{},"related":[],"headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) 
smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3yT5RG3PzSz9ryT\n\tfor <patchwork-incoming@ozlabs.org>;\n\tSat,  4 Nov 2017 02:27:06 +1100 (AEDT)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1755925AbdKCP05 (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tFri, 3 Nov 2017 11:26:57 -0400","from mail.us.es ([193.147.175.20]:43342 \"EHLO mail.us.es\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1755883AbdKCP0u (ORCPT <rfc822;netdev@vger.kernel.org>);\n\tFri, 3 Nov 2017 11:26:50 -0400","from antivirus1-rhel7.int (unknown [192.168.2.11])\n\tby mail.us.es (Postfix) with ESMTP id 63A7DC0B46\n\tfor <netdev@vger.kernel.org>; Fri,  3 Nov 2017 16:26:48 +0100 (CET)","from antivirus1-rhel7.int (localhost [127.0.0.1])\n\tby antivirus1-rhel7.int (Postfix) with ESMTP id 542B7B7FE7\n\tfor <netdev@vger.kernel.org>; Fri,  3 Nov 2017 16:26:48 +0100 (CET)","by antivirus1-rhel7.int (Postfix, from userid 99)\n\tid 499FBB7FE2; Fri,  3 Nov 2017 16:26:48 +0100 (CET)","from antivirus1-rhel7.int (localhost [127.0.0.1])\n\tby antivirus1-rhel7.int (Postfix) with ESMTP id E1FBCB7FE7;\n\tFri,  3 Nov 2017 16:26:45 +0100 (CET)","from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int\n\t(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); \n\tFri, 03 Nov 2017 16:26:45 +0100 (CET)","from salvia.here (unknown [31.4.245.115])\n\t(Authenticated sender: pneira@us.es)\n\tby entrada.int (Postfix) with ESMTPA id AA024403DFA1;\n\tFri,  3 Nov 2017 16:26:45 +0100 (CET)"],"X-Spam-Checker-Version":"SpamAssassin 3.4.1 (2015-04-28) on\n\tantivirus1-rhel7.int","X-Spam-Level":"","X-Spam-Status":"No, score=-108.2 required=7.5 tests=ALL_TRUSTED,BAYES_50,\n\tSMTPAUTH_US2,USER_IN_WHITELIST autolearn=disabled 
version=3.4.1","X-Virus-Status":"clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int)","X-SMTPAUTHUS":"auth mail.us.es","From":"Pablo Neira Ayuso <pablo@netfilter.org>","To":"netfilter-devel@vger.kernel.org","Cc":"netdev@vger.kernel.org","Subject":"[PATCH RFC,\n\tWIP 5/5] netfilter: nft_flow_offload: add ndo hooks for hardware\n\toffload","Date":"Fri,  3 Nov 2017 16:26:36 +0100","Message-Id":"<20171103152636.9967-6-pablo@netfilter.org>","X-Mailer":"git-send-email 2.11.0","In-Reply-To":"<20171103152636.9967-1-pablo@netfilter.org>","References":"<20171103152636.9967-1-pablo@netfilter.org>","X-Virus-Scanned":"ClamAV using ClamSMTP","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"},"content":"This patch adds the infrastructure to offload flows to hardware, in case\nthe NIC/switch comes with built-in flow table capabilities.\n\nIf the hardware comes with no hardware flow tables or they have\nlimitations in terms of features, this falls back to the software\ngeneric flow table implementation.\n\nThe software flow table aging thread skips entries that reside in the\nhardware, so the hardware will be responsible for releasing this flow\ntable entry too.\n\nSigned-off-by: Pablo Neira Ayuso <pablo@netfilter.org>\n---\n include/linux/netdevice.h        |  4 ++\n net/netfilter/nf_flow_offload.c  |  3 ++\n net/netfilter/nft_flow_offload.c | 99 ++++++++++++++++++++++++++++++++++++++++\n 3 files changed, 106 insertions(+)","diff":"diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h\nindex f535779d9dc1..0787f53374b3 100644\n--- a/include/linux/netdevice.h\n+++ b/include/linux/netdevice.h\n@@ -826,6 +826,8 @@ struct xfrmdev_ops {\n };\n #endif\n \n+struct flow_offload;\n+\n /*\n  * This structure defines the management hooks for network devices.\n  * The following hooks can be defined; unless noted otherwise, they are\n@@ -1281,6 +1283,8 @@ struct net_device_ops {\n 
\tint\t\t\t(*ndo_bridge_dellink)(struct net_device *dev,\n \t\t\t\t\t\t      struct nlmsghdr *nlh,\n \t\t\t\t\t\t      u16 flags);\n+\tint\t\t\t(*ndo_flow_add)(struct flow_offload *flow);\n+\tint\t\t\t(*ndo_flow_del)(struct flow_offload *flow);\n \tint\t\t\t(*ndo_change_carrier)(struct net_device *dev,\n \t\t\t\t\t\t      bool new_carrier);\n \tint\t\t\t(*ndo_get_phys_port_id)(struct net_device *dev,\ndiff --git a/net/netfilter/nf_flow_offload.c b/net/netfilter/nf_flow_offload.c\nindex f4a3fbe11b69..ac5786976dbb 100644\n--- a/net/netfilter/nf_flow_offload.c\n+++ b/net/netfilter/nf_flow_offload.c\n@@ -147,6 +147,9 @@ static void nf_flow_offload_work_gc(struct work_struct *work)\n \n \t\tflow = container_of(tuplehash, struct flow_offload, tuplehash[0]);\n \n+\t\tif (flow->flags & FLOW_OFFLOAD_HW)\n+\t\t\tcontinue;\n+\n \t\tif (nf_flow_has_expired(flow)) {\n \t\t\tflow_offload_del(flow);\n \t\t\tnf_flow_release_ct(tuplehash);\ndiff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c\nindex d38d185a19a5..0cb194a0aaab 100644\n--- a/net/netfilter/nft_flow_offload.c\n+++ b/net/netfilter/nft_flow_offload.c\n@@ -17,6 +17,22 @@ union flow_gateway {\n \tstruct in6_addr\tip6;\n };\n \n+static void flow_hw_offload_del(struct flow_offload *flow)\n+{\n+\tstruct net_device *indev;\n+\tint ret;\n+\n+\trtnl_lock();\n+\tindev = __dev_get_by_index(&init_net, flow->tuplehash[0].tuple.iifidx);\n+\tWARN_ON(!indev);\n+\n+\tif (indev->netdev_ops->ndo_flow_del) {\n+\t\tret = indev->netdev_ops->ndo_flow_del(flow);\n+\t\tWARN_ON(ret < 0);\n+\t}\n+\trtnl_unlock();\n+}\n+\n static int flow_offload_iterate_cleanup(struct nf_conn *ct, void *data)\n {\n \tstruct flow_offload_tuple_rhash *tuplehash;\n@@ -44,14 +60,40 @@ static int flow_offload_iterate_cleanup(struct nf_conn *ct, void *data)\n \t\t\t    tuplehash[tuplehash->tuple.dir]);\n \n \tflow_offload_del(flow);\n+\tif (flow->flags & FLOW_OFFLOAD_HW)\n+\t\tflow_hw_offload_del(flow);\n \n \t/* Do not remove this conntrack 
from table. */\n \treturn 0;\n }\n \n+static LIST_HEAD(flow_hw_offload_pending_list);\n+static DEFINE_SPINLOCK(flow_hw_offload_lock);\n+\n+struct flow_hw_offload {\n+\tstruct list_head\tlist;\n+\tstruct flow_offload\t*flow;\n+\tstruct nf_conn\t\t*ct;\n+};\n+\n static void flow_offload_cleanup(struct net *net,\n \t\t\t\t const struct net_device *dev)\n {\n+\tstruct flow_hw_offload *offload, *next;\n+\n+\tspin_lock_bh(&flow_hw_offload_lock);\n+\tlist_for_each_entry_safe(offload, next, &flow_hw_offload_pending_list, list) {\n+\t\tif (dev == NULL ||\n+\t\t    offload->flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.iifidx == dev->ifindex ||\n+\t\t    offload->flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.oifidx == dev->ifindex)\n+\t\t\tcontinue;\n+\n+\t\tnf_conntrack_put(&offload->ct->ct_general);\n+\t\tlist_del(&offload->list);\n+\t\tkfree(offload);\n+\t}\n+\tspin_unlock_bh(&flow_hw_offload_lock);\n+\n \tnf_ct_iterate_cleanup_net(net, flow_offload_iterate_cleanup,\n \t\t\t\t  (void *)dev, 0, 0);\n }\n@@ -156,6 +198,43 @@ flow_offload_alloc(const struct nf_conn *ct, int iifindex, int oifindex,\n \treturn flow;\n }\n \n+static int do_flow_offload(struct flow_offload *flow)\n+{\n+\tstruct net_device *indev;\n+\tint ret, ifindex;\n+\n+\trtnl_lock();\n+\tifindex = flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.iifidx;\n+\tindev = __dev_get_by_index(&init_net, ifindex);\n+\tWARN_ON(!indev);\n+\n+\tret = indev->netdev_ops->ndo_flow_add(flow);\n+\trtnl_unlock();\n+\n+\tif (ret >= 0)\n+\t\tflow->flags |= FLOW_OFFLOAD_HW;\n+\n+\treturn ret;\n+}\n+\n+static struct delayed_work nft_flow_offload_dwork;\n+\n+static void flow_offload_work(struct work_struct *work)\n+{\n+\tstruct flow_hw_offload *offload, *next;\n+\n+\tspin_lock_bh(&flow_hw_offload_lock);\n+\tlist_for_each_entry_safe(offload, next, &flow_hw_offload_pending_list, list) 
{\n+\t\tdo_flow_offload(offload->flow);\n+\t\tnf_conntrack_put(&offload->ct->ct_general);\n+\t\tlist_del(&offload->list);\n+\t\tkfree(offload);\n+\t}\n+\tspin_unlock_bh(&flow_hw_offload_lock);\n+\n+\tqueue_delayed_work(system_power_efficient_wq, &nft_flow_offload_dwork, HZ);\n+}\n+\n static int nft_flow_route(const struct nft_pktinfo *pkt,\n \t\t\t  const struct nf_conn *ct,\n \t\t\t  union flow_gateway *orig_gw,\n@@ -211,6 +290,7 @@ static void nft_flow_offload_eval(const struct nft_expr *expr,\n \tunion flow_gateway orig_gateway, reply_gateway;\n \tstruct net_device *outdev = pkt->xt.state->out;\n \tstruct net_device *indev = pkt->xt.state->in;\n+\tstruct flow_hw_offload *offload;\n \tenum ip_conntrack_info ctinfo;\n \tstruct flow_offload *flow;\n \tstruct nf_conn *ct;\n@@ -250,6 +330,21 @@ static void nft_flow_offload_eval(const struct nft_expr *expr,\n \tif (ret < 0)\n \t\tgoto err2;\n \n+\tif (!indev->netdev_ops->ndo_flow_add)\n+\t\treturn;\n+\n+\toffload = kmalloc(sizeof(struct flow_hw_offload), GFP_ATOMIC);\n+\tif (!offload)\n+\t\treturn;\n+\n+\tnf_conntrack_get(&ct->ct_general);\n+\toffload->ct = ct;\n+\toffload->flow = flow;\n+\n+\tspin_lock_bh(&flow_hw_offload_lock);\n+\tlist_add_tail(&offload->list, &flow_hw_offload_pending_list);\n+\tspin_unlock_bh(&flow_hw_offload_lock);\n+\n \treturn;\n err2:\n \tkfree(flow);\n@@ -308,6 +403,9 @@ static int __init nft_flow_offload_module_init(void)\n {\n \tregister_netdevice_notifier(&flow_offload_netdev_notifier);\n \n+\tINIT_DEFERRABLE_WORK(&nft_flow_offload_dwork, flow_offload_work);\n+\tqueue_delayed_work(system_power_efficient_wq, &nft_flow_offload_dwork, HZ);\n+\n \treturn nft_register_expr(&nft_flow_offload_type);\n }\n \n@@ -316,6 +414,7 @@ static void __exit nft_flow_offload_module_exit(void)\n \tstruct net *net;\n \n \tnft_unregister_expr(&nft_flow_offload_type);\n+\tcancel_delayed_work_sync(&nft_flow_offload_dwork);\n \tunregister_netdevice_notifier(&flow_offload_netdev_notifier);\n \trtnl_lock();\n 
\tfor_each_net(net)\n","prefixes":["RFC","WIP","5/5"]}