From patchwork Thu May 5 14:00:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "wenxu@chinatelecom.cn"
X-Patchwork-Id: 1626986
From: wenxu@chinatelecom.cn
To: dev@openvswitch.org, pvalerio@redhat.com, i.maximets@ovn.org
Date: Thu, 5 May 2022 10:00:07 -0400
Message-Id: <1651759208-28638-1-git-send-email-wenxu@chinatelecom.cn>
X-Mailer: git-send-email 1.8.3.1
Subject: [ovs-dev] [PATCH v3 1/2] conntrack: remove the IP iterations in nat_get_unique_l4

From: wenxu

Remove the IP iterations and just pick the IP address using the hash,
based on the least-used src-ip/dst-ip/proto triple.

Signed-off-by: wenxu
Acked-by: Paolo Valerio
---
 lib/conntrack.c | 86 +++++++++------------------------------------------------
 1 file changed, 13 insertions(+), 73 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 08da4dd..6b63fe6 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2297,7 +2297,7 @@ set_dport_range(const struct nat_action_info_t *ni, const struct conn_key *k,
     }
 }
 
-/* Gets the initial in range address based on the hash.
+/* Gets an in range address based on the hash.
  * Addresses are kept in network order. */
 static void
 get_addr_in_range(union ct_addr *min, union ct_addr *max,
@@ -2322,10 +2322,10 @@ get_addr_in_range(union ct_addr *min, union ct_addr *max,
 }
 
 static void
-get_initial_addr(const struct conn *conn, union ct_addr *min,
-                 union ct_addr *max, union ct_addr *curr,
-                 uint32_t hash, bool ipv4,
-                 const struct nat_action_info_t *nat_info)
+find_addr(const struct conn *conn, union ct_addr *min,
+          union ct_addr *max, union ct_addr *curr,
+          uint32_t hash, bool ipv4,
+          const struct nat_action_info_t *nat_info)
 {
     const union ct_addr zero_ip = {0};
 
@@ -2352,51 +2352,6 @@ store_addr_to_key(union ct_addr *addr, struct conn_key *key,
     }
 }
 
-static void
-next_addr_in_range(union ct_addr *curr, union ct_addr *min,
-                   union ct_addr *max, bool ipv4)
-{
-    if (ipv4) {
-        /* This check could be unified with IPv6, but let's avoid
-         * an unneeded memcmp() in case of IPv4. */
-        if (min->ipv4 == max->ipv4) {
-            return;
-        }
-
-        curr->ipv4 = (curr->ipv4 == max->ipv4) ? min->ipv4
-                                               : htonl(ntohl(curr->ipv4) + 1);
-    } else {
-        if (!memcmp(min, max, sizeof *min)) {
-            return;
-        }
-
-        if (!memcmp(curr, max, sizeof *curr)) {
-            *curr = *min;
-            return;
-        }
-
-        nat_ipv6_addr_increment(&curr->ipv6, 1);
-    }
-}
-
-static bool
-next_addr_in_range_guarded(union ct_addr *curr, union ct_addr *min,
-                           union ct_addr *max, union ct_addr *guard,
-                           bool ipv4)
-{
-    bool exhausted;
-
-    next_addr_in_range(curr, min, max, ipv4);
-
-    if (ipv4) {
-        exhausted = (curr->ipv4 == guard->ipv4);
-    } else {
-        exhausted = !memcmp(curr, guard, sizeof *curr);
-    }
-
-    return exhausted;
-}
-
 static bool
 nat_get_unique_l4(struct conntrack *ct, struct conn *nat_conn,
                   ovs_be16 *port, uint16_t curr, uint16_t min,
@@ -2422,7 +2377,7 @@ nat_get_unique_l4(struct conntrack *ct, struct conn *nat_conn,
  * collide with any existing one.
  *
  * In case of SNAT:
- *    - For each src IP address in the range (if any).
+ *    - Pick a src IP address in the range.
  *      - Try to find a source port in range (if any).
  *      - If no port range exists, use the whole
  *        ephemeral range (after testing the port
@@ -2430,7 +2385,7 @@ nat_get_unique_l4(struct conntrack *ct, struct conn *nat_conn,
  *        specified range.
  *
  * In case of DNAT:
- *    - For each dst IP address in the range (if any).
+ *    - Pick a dst IP address in the range.
  *      - For each dport in range (if any) tries to find
  *        an unique tuple.
  *      - Eventually, if the previous attempt fails,
@@ -2443,9 +2398,8 @@ nat_get_unique_tuple(struct conntrack *ct, const struct conn *conn,
                      struct conn *nat_conn,
                      const struct nat_action_info_t *nat_info)
 {
-    union ct_addr min_addr = {0}, max_addr = {0}, curr_addr = {0},
-                  guard_addr = {0};
     uint32_t hash = nat_range_hash(conn, ct->hash_basis, nat_info);
+    union ct_addr min_addr = {0}, max_addr = {0}, addr = {0};
     bool pat_proto = conn->key.nw_proto == IPPROTO_TCP ||
                      conn->key.nw_proto == IPPROTO_UDP;
     uint16_t min_dport, max_dport, curr_dport;
@@ -2454,12 +2408,8 @@ nat_get_unique_tuple(struct conntrack *ct, const struct conn *conn,
     min_addr = nat_info->min_addr;
     max_addr = nat_info->max_addr;
 
-    get_initial_addr(conn, &min_addr, &max_addr, &curr_addr, hash,
-                     (conn->key.dl_type == htons(ETH_TYPE_IP)), nat_info);
-
-    /* Save the address we started from so that
-     * we can stop once we reach it. */
-    guard_addr = curr_addr;
+    find_addr(conn, &min_addr, &max_addr, &addr, hash,
+              (conn->key.dl_type == htons(ETH_TYPE_IP)), nat_info);
 
     set_sport_range(nat_info, &conn->key, hash, &curr_sport,
                     &min_sport, &max_sport);
@@ -2471,8 +2421,7 @@ nat_get_unique_tuple(struct conntrack *ct, const struct conn *conn,
         nat_conn->rev_key.dst.port = htons(curr_sport);
     }
 
-another_round:
-    store_addr_to_key(&curr_addr, &nat_conn->rev_key,
+    store_addr_to_key(&addr, &nat_conn->rev_key,
                       nat_info->nat_action);
 
     if (!pat_proto) {
@@ -2481,7 +2430,7 @@ another_round:
             return true;
         }
 
-        goto next_addr;
+        return false;
     }
 
     bool found = false;
@@ -2499,16 +2448,7 @@ another_round:
         return true;
     }
 
-    /* Check if next IP is in range and respin. Otherwise, notify
-     * exhaustion to the caller. */
-next_addr:
-    if (next_addr_in_range_guarded(&curr_addr, &min_addr,
-                                   &max_addr, &guard_addr,
-                                   conn->key.dl_type == htons(ETH_TYPE_IP))) {
-        return false;
-    }
-
-    goto another_round;
+    return false;
 }
 
 static enum ct_update_res

From patchwork Thu May 5 14:00:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "wenxu@chinatelecom.cn"
X-Patchwork-Id: 1626987
From: wenxu@chinatelecom.cn
To: dev@openvswitch.org, pvalerio@redhat.com, i.maximets@ovn.org
Date: Thu, 5 May 2022 10:00:08 -0400
Message-Id: <1651759208-28638-2-git-send-email-wenxu@chinatelecom.cn>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1651759208-28638-1-git-send-email-wenxu@chinatelecom.cn>
References: <1651759208-28638-1-git-send-email-wenxu@chinatelecom.cn>
Subject: [ovs-dev] [PATCH v3 2/2] conntrack: limit port clash resolution attempts

From: wenxu

When all or almost all available ports are taken, clash resolution can
take a very long time, resulting in a PMD lockup.

This can happen when many to-be-NATed hosts connect to the same
destination:port (e.g. a proxy) and all connections pass through the
same SNAT.

Pick a random offset in the acceptable range, then try an ever smaller
number of adjacent port numbers, until either the limit is reached or a
usable port is found. This results in at most 248 attempts
(128 + 64 + 32 + 16 + 8, i.e. 4 restarts with a new search offset)
instead of 64000+.

Signed-off-by: wenxu
Acked-by: Paolo Valerio
---
 lib/conntrack.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 6b63fe6..b243370 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2357,9 +2357,24 @@ nat_get_unique_l4(struct conntrack *ct, struct conn *nat_conn,
                   ovs_be16 *port, uint16_t curr, uint16_t min,
                   uint16_t max)
 {
+    static const unsigned int max_attempts = 128;
+    uint16_t range = max - min + 1;
+    unsigned int attempts;
     uint16_t orig = curr;
+    unsigned int i = 0;
 
+    attempts = range;
+    if (attempts > max_attempts) {
+        attempts = max_attempts;
+    }
+
+another_round:
+    i = 0;
     FOR_EACH_PORT_IN_RANGE (curr, min, max) {
+        if (i++ >= attempts) {
+            break;
+        }
+
         *port = htons(curr);
         if (!conn_lookup(ct, &nat_conn->rev_key,
                          time_msec(), NULL, NULL)) {
@@ -2367,6 +2382,12 @@ nat_get_unique_l4(struct conntrack *ct, struct conn *nat_conn,
         }
     }
 
+    if (attempts < range && attempts >= 16) {
+        attempts /= 2;
+        curr = min + (random_uint32() % range);
+        goto another_round;
+    }
+
     *port = htons(orig);
 
     return false;
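
The bound quoted in the commit message of patch 2/2 can be checked with a
small standalone C sketch. This is not OVS code: worst_case_probes() is a
hypothetical helper that mirrors only the attempts/halving arithmetic of the
patched nat_get_unique_l4(). It assumes every probe collides; conn_lookup()
and the random restart offset are left out because they do not affect the
count.

```c
#include <assert.h>

/* Hypothetical helper, not part of OVS: returns how many port probes the
 * halving schedule performs when no probed port is free.  'range' is the
 * number of ports in [min, max]; 'max_attempts' is the first-round cap
 * (128 in the patch). */
static unsigned int
worst_case_probes(unsigned int range, unsigned int max_attempts)
{
    unsigned int attempts;
    unsigned int total = 0;

    assert(range > 0 && max_attempts > 0);

    /* First round: probe at most max_attempts adjacent ports. */
    attempts = range < max_attempts ? range : max_attempts;

    for (;;) {
        total += attempts;      /* One round of adjacent probes. */

        /* Same restart condition as the patch: if the range was not
         * fully covered, halve the budget and go for another round. */
        if (attempts < range && attempts >= 16) {
            attempts /= 2;
            continue;
        }
        break;
    }

    return total;
}
```

For the ephemeral range 1024..65535 (64512 ports), worst_case_probes(64512, 128)
evaluates to 248, i.e. 128 + 64 + 32 + 16 + 8. When the range is no larger
than max_attempts, the whole range is probed once and there is no restart.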