Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
GET /api/1.2/patches/811739/?format=api
{ "id": 811739, "url": "http://patchwork.ozlabs.org/api/1.2/patches/811739/?format=api", "web_url": "http://patchwork.ozlabs.org/project/netfilter-devel/patch/1504892748-1605-6-git-send-email-pablo@netfilter.org/", "project": { "id": 26, "url": "http://patchwork.ozlabs.org/api/1.2/projects/26/?format=api", "name": "Netfilter Development", "link_name": "netfilter-devel", "list_id": "netfilter-devel.vger.kernel.org", "list_email": "netfilter-devel@vger.kernel.org", "web_url": null, "scm_url": null, "webscm_url": null, "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<1504892748-1605-6-git-send-email-pablo@netfilter.org>", "list_archive_url": null, "date": "2017-09-08T17:45:44", "name": "[5/9] netfilter: nat: Revert \"netfilter: nat: convert nat bysrc hash to rhashtable\"", "commit_ref": null, "pull_url": null, "state": "accepted", "archived": false, "hash": "639ffaf1d629705d1a80503302759b19179ea499", "submitter": { "id": 1315, "url": "http://patchwork.ozlabs.org/api/1.2/people/1315/?format=api", "name": "Pablo Neira Ayuso", "email": "pablo@netfilter.org" }, "delegate": { "id": 6139, "url": "http://patchwork.ozlabs.org/api/1.2/users/6139/?format=api", "username": "pablo", "first_name": "Pablo", "last_name": "Neira", "email": "pablo@netfilter.org" }, "mbox": "http://patchwork.ozlabs.org/project/netfilter-devel/patch/1504892748-1605-6-git-send-email-pablo@netfilter.org/mbox/", "series": [ { "id": 2261, "url": "http://patchwork.ozlabs.org/api/1.2/series/2261/?format=api", "web_url": "http://patchwork.ozlabs.org/project/netfilter-devel/list/?series=2261", "date": "2017-09-08T17:45:46", "name": "[1/9] netfilter: ipvs: fix the issue that sctp_conn_schedule drops non-INIT packet", "version": 1, "mbox": "http://patchwork.ozlabs.org/series/2261/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/patches/811739/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/811739/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<netfilter-devel-owner@vger.kernel.org>", "X-Original-To": "incoming@patchwork.ozlabs.org", "Delivered-To": "patchwork-incoming@bilbo.ozlabs.org", "Authentication-Results": "ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netfilter-devel-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)", "Received": [ "from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xplBS5WwSz9s8J\n\tfor <incoming@patchwork.ozlabs.org>;\n\tSat, 9 Sep 2017 03:46:56 +1000 (AEST)", "(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1756764AbdIHRqu (ORCPT <rfc822;incoming@patchwork.ozlabs.org>);\n\tFri, 8 Sep 2017 13:46:50 -0400", "from mail.us.es ([193.147.175.20]:54480 \"EHLO mail.us.es\"\n\trhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP\n\tid S1756723AbdIHRqD (ORCPT <rfc822; netfilter-devel@vger.kernel.org>);\n\tFri, 8 Sep 2017 13:46:03 -0400", "from antivirus1-rhel7.int (unknown [192.168.2.11])\n\tby mail.us.es (Postfix) with ESMTP id 482DD8D0505\n\tfor <netfilter-devel@vger.kernel.org>;\n\tFri, 8 Sep 2017 19:45:32 +0200 (CEST)", "from antivirus1-rhel7.int (localhost [127.0.0.1])\n\tby antivirus1-rhel7.int (Postfix) with ESMTP id 31AB1B502E\n\tfor <netfilter-devel@vger.kernel.org>;\n\tFri, 8 Sep 2017 19:45:32 +0200 (CEST)", "by antivirus1-rhel7.int (Postfix, from userid 99)\n\tid 271E0B502C; Fri, 8 Sep 2017 19:45:32 +0200 (CEST)", "from antivirus1-rhel7.int (localhost [127.0.0.1])\n\tby antivirus1-rhel7.int (Postfix) with ESMTP id 95B26DA2DA;\n\tFri, 8 Sep 2017 19:45:29 +0200 (CEST)", "from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int\n\t(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); \n\tFri, 08 Sep 2017 19:45:29 +0200 (CEST)", "from salvia.here (129.166.216.87.static.jazztel.es\n\t[87.216.166.129]) (Authenticated sender: pneira@us.es)\n\tby entrada.int (Postfix) with ESMTPA id 38A2D44581E0;\n\tFri, 8 Sep 2017 19:45:29 +0200 (CEST)" ], "X-Spam-Checker-Version": "SpamAssassin 3.4.1 (2015-04-28) on\n\tantivirus1-rhel7.int", "X-Spam-Level": "", "X-Spam-Status": "No, score=-108.2 required=7.5 tests=ALL_TRUSTED,BAYES_50,\n\tSMTPAUTH_US2,USER_IN_WHITELIST autolearn=disabled version=3.4.1", "X-Virus-Status": "clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int)", "X-SMTPAUTHUS": "auth mail.us.es", "From": "Pablo Neira Ayuso <pablo@netfilter.org>", "To": "netfilter-devel@vger.kernel.org", "Cc": "davem@davemloft.net, netdev@vger.kernel.org", "Subject": "[PATCH 5/9] netfilter: nat: Revert \"netfilter: nat: convert nat\n\tbysrc hash to rhashtable\"", "Date": "Fri, 8 Sep 2017 19:45:44 +0200", "Message-Id": "<1504892748-1605-6-git-send-email-pablo@netfilter.org>", "X-Mailer": "git-send-email 2.1.4", "In-Reply-To": "<1504892748-1605-1-git-send-email-pablo@netfilter.org>", "References": "<1504892748-1605-1-git-send-email-pablo@netfilter.org>", "X-Virus-Scanned": "ClamAV using ClamSMTP", "Sender": "netfilter-devel-owner@vger.kernel.org", "Precedence": "bulk", "List-ID": "<netfilter-devel.vger.kernel.org>", "X-Mailing-List": "netfilter-devel@vger.kernel.org" }, "content": "From: Florian Westphal <fw@strlen.de>\n\nThis reverts commit 870190a9ec9075205c0fa795a09fa931694a3ff1.\n\nIt was not a good idea. The custom hash table was a much better\nfit for this purpose.\n\nA fast lookup is not essential, in fact for most cases there is no lookup\nat all because original tuple is not taken and can be used as-is.\nWhat needs to be fast is insertion and deletion.\n\nrhlist removal however requires a rhlist walk.\nWe can have thousands of entries in such a list if source port/addresses\nare reused for multiple flows, if this happens removal requests are so\nexpensive that deletions of a few thousand flows can take several\nseconds(!).\n\nThe advantages that we got from rhashtable are:\n1) table auto-sizing\n2) multiple locks\n\n1) would be nice to have, but it is not essential as we have at\nmost one lookup per new flow, so even a million flows in the bysource\ntable are not a problem compared to current deletion cost.\n2) is easy to add to custom hash table.\n\nI tried to add hlist_node to rhlist to speed up rhltable_remove but this\nisn't doable without changing semantics. rhltable_remove_fast will\ncheck that the to-be-deleted object is part of the table and that\nrequires a list walk that we want to avoid.\n\nFurthermore, using hlist_node increases size of struct rhlist_head, which\nin turn increases nf_conn size.\n\nLink: https://bugzilla.kernel.org/show_bug.cgi?id=196821\nReported-by: Ivan Babrou <ibobrik@gmail.com>\nSigned-off-by: Florian Westphal <fw@strlen.de>\nSigned-off-by: Pablo Neira Ayuso <pablo@netfilter.org>\n---\n include/net/netfilter/nf_conntrack.h | 3 +-\n include/net/netfilter/nf_nat.h | 1 -\n net/netfilter/nf_nat_core.c | 130 ++++++++++++++---------------------\n 3 files changed, 54 insertions(+), 80 deletions(-)", "diff": "diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h\nindex fdc9c64a1c94..8f3bd30511de 100644\n--- a/include/net/netfilter/nf_conntrack.h\n+++ b/include/net/netfilter/nf_conntrack.h\n@@ -17,7 +17,6 @@\n #include <linux/bitops.h>\n #include <linux/compiler.h>\n #include <linux/atomic.h>\n-#include <linux/rhashtable.h>\n \n #include <linux/netfilter/nf_conntrack_tcp.h>\n #include <linux/netfilter/nf_conntrack_dccp.h>\n@@ -77,7 +76,7 @@ struct nf_conn {\n \tpossible_net_t ct_net;\n \n #if IS_ENABLED(CONFIG_NF_NAT)\n-\tstruct rhlist_head nat_bysource;\n+\tstruct hlist_node\tnat_bysource;\n #endif\n \t/* all members below initialized via memset */\n \tu8 __nfct_init_offset[0];\ndiff --git a/include/net/netfilter/nf_nat.h b/include/net/netfilter/nf_nat.h\nindex 05c82a1a4267..b71701302e61 100644\n--- a/include/net/netfilter/nf_nat.h\n+++ b/include/net/netfilter/nf_nat.h\n@@ -1,6 +1,5 @@\n #ifndef _NF_NAT_H\n #define _NF_NAT_H\n-#include <linux/rhashtable.h>\n #include <linux/netfilter_ipv4.h>\n #include <linux/netfilter/nf_nat.h>\n #include <net/netfilter/nf_conntrack_tuple.h>\ndiff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c\nindex dc3519cc7209..f090419f5f97 100644\n--- a/net/netfilter/nf_nat_core.c\n+++ b/net/netfilter/nf_nat_core.c\n@@ -30,19 +30,17 @@\n #include <net/netfilter/nf_conntrack_zones.h>\n #include <linux/netfilter/nf_nat.h>\n \n+static DEFINE_SPINLOCK(nf_nat_lock);\n+\n static DEFINE_MUTEX(nf_nat_proto_mutex);\n static const struct nf_nat_l3proto __rcu *nf_nat_l3protos[NFPROTO_NUMPROTO]\n \t\t\t\t\t\t__read_mostly;\n static const struct nf_nat_l4proto __rcu **nf_nat_l4protos[NFPROTO_NUMPROTO]\n \t\t\t\t\t\t__read_mostly;\n \n-struct nf_nat_conn_key {\n-\tconst struct net *net;\n-\tconst struct nf_conntrack_tuple *tuple;\n-\tconst struct nf_conntrack_zone *zone;\n-};\n-\n-static struct rhltable nf_nat_bysource_table;\n+static struct hlist_head *nf_nat_bysource __read_mostly;\n+static unsigned int nf_nat_htable_size __read_mostly;\n+static unsigned int nf_nat_hash_rnd __read_mostly;\n \n inline const struct nf_nat_l3proto *\n __nf_nat_l3proto_find(u8 family)\n@@ -118,17 +116,19 @@ int nf_xfrm_me_harder(struct net *net, struct sk_buff *skb, unsigned int family)\n EXPORT_SYMBOL(nf_xfrm_me_harder);\n #endif /* CONFIG_XFRM */\n \n-static u32 nf_nat_bysource_hash(const void *data, u32 len, u32 seed)\n+/* We keep an extra hash for each conntrack, for fast searching. */\n+static unsigned int\n+hash_by_src(const struct net *n, const struct nf_conntrack_tuple *tuple)\n {\n-\tconst struct nf_conntrack_tuple *t;\n-\tconst struct nf_conn *ct = data;\n+\tunsigned int hash;\n+\n+\tget_random_once(&nf_nat_hash_rnd, sizeof(nf_nat_hash_rnd));\n \n-\tt = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;\n \t/* Original src, to ensure we map it consistently if poss. */\n+\thash = jhash2((u32 *)&tuple->src, sizeof(tuple->src) / sizeof(u32),\n+\t\t tuple->dst.protonum ^ nf_nat_hash_rnd ^ net_hash_mix(n));\n \n-\tseed ^= net_hash_mix(nf_ct_net(ct));\n-\treturn jhash2((const u32 *)&t->src, sizeof(t->src) / sizeof(u32),\n-\t\t t->dst.protonum ^ seed);\n+\treturn reciprocal_scale(hash, nf_nat_htable_size);\n }\n \n /* Is this tuple already taken? (not by us) */\n@@ -184,28 +184,6 @@ same_src(const struct nf_conn *ct,\n \t\tt->src.u.all == tuple->src.u.all);\n }\n \n-static int nf_nat_bysource_cmp(struct rhashtable_compare_arg *arg,\n-\t\t\t const void *obj)\n-{\n-\tconst struct nf_nat_conn_key *key = arg->key;\n-\tconst struct nf_conn *ct = obj;\n-\n-\tif (!same_src(ct, key->tuple) ||\n-\t !net_eq(nf_ct_net(ct), key->net) ||\n-\t !nf_ct_zone_equal(ct, key->zone, IP_CT_DIR_ORIGINAL))\n-\t\treturn 1;\n-\n-\treturn 0;\n-}\n-\n-static struct rhashtable_params nf_nat_bysource_params = {\n-\t.head_offset = offsetof(struct nf_conn, nat_bysource),\n-\t.obj_hashfn = nf_nat_bysource_hash,\n-\t.obj_cmpfn = nf_nat_bysource_cmp,\n-\t.nelem_hint = 256,\n-\t.min_size = 1024,\n-};\n-\n /* Only called for SRC manip */\n static int\n find_appropriate_src(struct net *net,\n@@ -216,26 +194,22 @@ find_appropriate_src(struct net *net,\n \t\t struct nf_conntrack_tuple *result,\n \t\t const struct nf_nat_range *range)\n {\n+\tunsigned int h = hash_by_src(net, tuple);\n \tconst struct nf_conn *ct;\n-\tstruct nf_nat_conn_key key = {\n-\t\t.net = net,\n-\t\t.tuple = tuple,\n-\t\t.zone = zone\n-\t};\n-\tstruct rhlist_head *hl, *h;\n-\n-\thl = rhltable_lookup(&nf_nat_bysource_table, &key,\n-\t\t\t nf_nat_bysource_params);\n \n-\trhl_for_each_entry_rcu(ct, h, hl, nat_bysource) {\n-\t\tnf_ct_invert_tuplepr(result,\n-\t\t\t\t &ct->tuplehash[IP_CT_DIR_REPLY].tuple);\n-\t\tresult->dst = tuple->dst;\n-\n-\t\tif (in_range(l3proto, l4proto, result, range))\n-\t\t\treturn 1;\n+\thlist_for_each_entry_rcu(ct, &nf_nat_bysource[h], nat_bysource) {\n+\t\tif (same_src(ct, tuple) &&\n+\t\t net_eq(net, nf_ct_net(ct)) &&\n+\t\t nf_ct_zone_equal(ct, zone, IP_CT_DIR_ORIGINAL)) {\n+\t\t\t/* Copy source part from reply tuple. */\n+\t\t\tnf_ct_invert_tuplepr(result,\n+\t\t\t\t &ct->tuplehash[IP_CT_DIR_REPLY].tuple);\n+\t\t\tresult->dst = tuple->dst;\n+\n+\t\t\tif (in_range(l3proto, l4proto, result, range))\n+\t\t\t\treturn 1;\n+\t\t}\n \t}\n-\n \treturn 0;\n }\n \n@@ -408,6 +382,7 @@ nf_nat_setup_info(struct nf_conn *ct,\n \t\t const struct nf_nat_range *range,\n \t\t enum nf_nat_manip_type maniptype)\n {\n+\tstruct net *net = nf_ct_net(ct);\n \tstruct nf_conntrack_tuple curr_tuple, new_tuple;\n \n \t/* Can't setup nat info for confirmed ct. */\n@@ -449,19 +424,14 @@ nf_nat_setup_info(struct nf_conn *ct,\n \t}\n \n \tif (maniptype == NF_NAT_MANIP_SRC) {\n-\t\tstruct nf_nat_conn_key key = {\n-\t\t\t.net = nf_ct_net(ct),\n-\t\t\t.tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple,\n-\t\t\t.zone = nf_ct_zone(ct),\n-\t\t};\n-\t\tint err;\n-\n-\t\terr = rhltable_insert_key(&nf_nat_bysource_table,\n-\t\t\t\t\t &key,\n-\t\t\t\t\t &ct->nat_bysource,\n-\t\t\t\t\t nf_nat_bysource_params);\n-\t\tif (err)\n-\t\t\treturn NF_DROP;\n+\t\tunsigned int srchash;\n+\n+\t\tsrchash = hash_by_src(net,\n+\t\t\t\t &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);\n+\t\tspin_lock_bh(&nf_nat_lock);\n+\t\thlist_add_head_rcu(&ct->nat_bysource,\n+\t\t\t\t &nf_nat_bysource[srchash]);\n+\t\tspin_unlock_bh(&nf_nat_lock);\n \t}\n \n \t/* It's done. */\n@@ -570,8 +540,9 @@ static int nf_nat_proto_clean(struct nf_conn *ct, void *data)\n \t * will delete entry from already-freed table.\n \t */\n \tclear_bit(IPS_SRC_NAT_DONE_BIT, &ct->status);\n-\trhltable_remove(&nf_nat_bysource_table, &ct->nat_bysource,\n-\t\t\tnf_nat_bysource_params);\n+\tspin_lock_bh(&nf_nat_lock);\n+\thlist_del_rcu(&ct->nat_bysource);\n+\tspin_unlock_bh(&nf_nat_lock);\n \n \t/* don't delete conntrack. Although that would make things a lot\n \t * simpler, we'd end up flushing all conntracks on nat rmmod.\n@@ -699,9 +670,11 @@ EXPORT_SYMBOL_GPL(nf_nat_l3proto_unregister);\n /* No one using conntrack by the time this called. */\n static void nf_nat_cleanup_conntrack(struct nf_conn *ct)\n {\n-\tif (ct->status & IPS_SRC_NAT_DONE)\n-\t\trhltable_remove(&nf_nat_bysource_table, &ct->nat_bysource,\n-\t\t\t\tnf_nat_bysource_params);\n+\tif (ct->status & IPS_SRC_NAT_DONE) {\n+\t\tspin_lock_bh(&nf_nat_lock);\n+\t\thlist_del_rcu(&ct->nat_bysource);\n+\t\tspin_unlock_bh(&nf_nat_lock);\n+\t}\n }\n \n static struct nf_ct_ext_type nat_extend __read_mostly = {\n@@ -825,13 +798,16 @@ static int __init nf_nat_init(void)\n {\n \tint ret;\n \n-\tret = rhltable_init(&nf_nat_bysource_table, &nf_nat_bysource_params);\n-\tif (ret)\n-\t\treturn ret;\n+\t/* Leave them the same for the moment. */\n+\tnf_nat_htable_size = nf_conntrack_htable_size;\n+\n+\tnf_nat_bysource = nf_ct_alloc_hashtable(&nf_nat_htable_size, 0);\n+\tif (!nf_nat_bysource)\n+\t\treturn -ENOMEM;\n \n \tret = nf_ct_extend_register(&nat_extend);\n \tif (ret < 0) {\n-\t\trhltable_destroy(&nf_nat_bysource_table);\n+\t\tnf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);\n \t\tprintk(KERN_ERR \"nf_nat_core: Unable to register extension\\n\");\n \t\treturn ret;\n \t}\n@@ -865,8 +841,8 @@ static void __exit nf_nat_cleanup(void)\n \n \tfor (i = 0; i < NFPROTO_NUMPROTO; i++)\n \t\tkfree(nf_nat_l4protos[i]);\n-\n-\trhltable_destroy(&nf_nat_bysource_table);\n+\tsynchronize_net();\n+\tnf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);\n }\n \n MODULE_LICENSE(\"GPL\");\n", "prefixes": [ "5/9" ] }