From patchwork Wed Dec 18 20:31:05 2019
X-Patchwork-Submitter: Yi-Hung Wei
X-Patchwork-Id: 1212780
From: Yi-Hung Wei
To: dev@openvswitch.org, i.maximets@ovn.org, echaudro@redhat.com,
    u9012063@gmail.com
Date: Wed, 18 Dec 2019 12:31:05 -0800
Message-Id: <20191218203106.85695-1-yihung.wei@gmail.com>
X-Mailer: git-send-email 2.17.1
Subject: [ovs-dev] [PATCH v8 1/2] netdev-linux: Detect numa node id.

From: William Tu

The patch detects the NUMA node id of a netdev from its name by
reading '/sys/class/net/<name>/device/numa_node'.  If the file is not
available (e.g. for a virtual device) or any error occurs, NUMA id 0 is
returned.  Currently only the afxdp netdev type uses it; it is not
enabled for the other Linux netdev types because they have no use case
for it.

Signed-off-by: William Tu
Acked-by: Eelco Chaudron
---
v8:
  - Address review comments from Eelco and Ilya in patch 2.
    * Use OVS_FIND_DEPENDENCY().
    * Avoid the locking issue when calling netdev_get_numa_id().
    * Check NETDEV_NUMA_UNSPEC.
    * Use return value from netdev_get_numa_id() directly, and check
      NETDEV_NUMA_UNSPEC case.
    * Use numa_set_preferred().
  - Travis CI: https://travis-ci.org/YiHungWei/ovs/builds/626865328

v7:
  - Add numa aware memory allocation for afxdp related memory in the
    following patch.
  - Travis CI: https://travis-ci.org/YiHungWei/ovs/builds/626403984

v6: Feedback from Ilya
  - add thread safety check at netdev_linux_get_numa_id__, and pass
    netdev_linux
  - preserve numa cache on netlink updates
  - Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/605634673

v5: Feedback from Ilya
  - refactor the error handling
  - add mutex lock
  - Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/601947245

v4: Feedback from Eelco
  - Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/599308893

v3: Feedback from Ilya and Eelco
  - update doc, afxdp.rst
  - fix coding style
  - fix limit of numa node max, by using ovs_numa_numa_id_is_valid
  - move the function to netdev-linux
  - cache the result of numa_id
  - add security check for netdev name
  - use fscanf instead of read and convert to int
  - revise some error message content

v2: address feedback from Eelco
  - fix memory leak of xasprintf
  - log using INFO instead of WARN
---
 Documentation/intro/install/afxdp.rst |  1 -
 lib/netdev-afxdp.c                    | 11 -----
 lib/netdev-afxdp.h                    |  1 -
 lib/netdev-linux-private.h            |  2 +
 lib/netdev-linux.c                    | 67 +++++++++++++++++++++++++--
 5 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/Documentation/intro/install/afxdp.rst b/Documentation/intro/install/afxdp.rst
index 15e3c918f942..7b0736c96114 100644
--- a/Documentation/intro/install/afxdp.rst
+++ b/Documentation/intro/install/afxdp.rst
@@ -327,7 +327,6 @@ Below is a script using namespaces and veth peer::
 
 Limitations/Known Issues
 ------------------------
-#. Device's numa ID is always 0, need a way to find numa id from a netdev.
 #. No QoS support because AF_XDP netdev by-pass the Linux TC layer. A possible
    work-around is to use OpenFlow meter action.
 #. Most of the tests are done using i40e single port. Multiple ports and
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index bdabf97f4fdb..91b70b298e57 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -695,17 +695,6 @@ out:
     return err;
 }
 
-int
-netdev_afxdp_get_numa_id(const struct netdev *netdev)
-{
-    /* FIXME: Get netdev's PCIe device ID, then find
-     * its NUMA node id.
-     */
-    VLOG_INFO("FIXME: Device %s always use numa id 0.",
-              netdev_get_name(netdev));
-    return 0;
-}
-
 static void
 xsk_remove_xdp_program(uint32_t ifindex, enum afxdp_mode mode)
 {
diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index 8188bc669526..246a4b62fb57 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -60,7 +60,6 @@ int netdev_afxdp_batch_send(struct netdev *netdev_, int qid,
 int netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args,
                             char **errp);
 int netdev_afxdp_get_config(const struct netdev *netdev, struct smap *args);
-int netdev_afxdp_get_numa_id(const struct netdev *netdev);
 int netdev_afxdp_get_stats(const struct netdev *netdev_,
                            struct netdev_stats *stats);
 int netdev_afxdp_get_custom_stats(const struct netdev *netdev,
diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h
index 8873caa9d412..4d66b0858222 100644
--- a/lib/netdev-linux-private.h
+++ b/lib/netdev-linux-private.h
@@ -96,6 +96,8 @@ struct netdev_linux {
     /* LAG information. */
     bool is_lag_master;         /* True if the netdev is a LAG master. */
 
+    int numa_id;                /* NUMA node id. */
+
 #ifdef HAVE_AF_XDP
     /* AF_XDP information. */
     struct xsk_socket_info **xsks;
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index f8e59bacfb13..e32df970fe4b 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -236,6 +236,7 @@ enum {
     VALID_VPORT_STAT_ERROR  = 1 << 5,
     VALID_DRVINFO           = 1 << 6,
     VALID_FEATURES          = 1 << 7,
+    VALID_NUMA_ID           = 1 << 8,
 };
 
 struct linux_lag_slave {
@@ -820,9 +821,9 @@ netdev_linux_update__(struct netdev_linux *dev,
 {
     if (rtnetlink_type_is_rtnlgrp_link(change->nlmsg_type)) {
         if (change->nlmsg_type == RTM_NEWLINK) {
-            /* Keep drv-info, and ip addresses. */
+            /* Keep drv-info, ip addresses, and NUMA id. */
             netdev_linux_changed(dev, change->ifi_flags,
-                                 VALID_DRVINFO | VALID_IN);
+                                 VALID_DRVINFO | VALID_IN | VALID_NUMA_ID);
 
             /* Update netdev from rtnl-change msg. */
             if (change->mtu) {
@@ -1391,6 +1392,66 @@ netdev_linux_tap_batch_send(struct netdev *netdev_,
     return 0;
 }
 
+static int
+netdev_linux_get_numa_id__(struct netdev_linux *netdev)
+    OVS_REQUIRES(netdev->mutex)
+{
+    char *numa_node_path;
+    const char *name;
+    int node_id;
+    FILE *stream;
+
+    if (netdev->cache_valid & VALID_NUMA_ID) {
+        return netdev->numa_id;
+    }
+
+    netdev->numa_id = 0;
+    netdev->cache_valid |= VALID_NUMA_ID;
+
+    name = netdev_get_name(&netdev->up);
+    if (strpbrk(name, "/\\")) {
+        VLOG_ERR_RL(&rl, "\"%s\" is not a valid name for a port. "
+                    "A valid name must not include '/' or '\\'."
+                    "Using numa_id 0", name);
+        return 0;
+    }
+
+    numa_node_path = xasprintf("/sys/class/net/%s/device/numa_node", name);
+
+    stream = fopen(numa_node_path, "r");
+    if (!stream) {
+        /* Virtual device does not have this info. */
+        VLOG_INFO_RL(&rl, "%s: Can't open '%s': %s, using numa_id 0",
+                     name, numa_node_path, ovs_strerror(errno));
+        free(numa_node_path);
+        return 0;
+    }
+
+    if (fscanf(stream, "%d", &node_id) != 1
+        || !ovs_numa_numa_id_is_valid(node_id)) {
+        VLOG_WARN_RL(&rl, "%s: Can't detect NUMA node, using numa_id 0", name);
+        node_id = 0;
+    }
+
+    netdev->numa_id = node_id;
+    fclose(stream);
+    free(numa_node_path);
+    return node_id;
+}
+
+static int OVS_UNUSED
+netdev_linux_get_numa_id(const struct netdev *netdev_)
+{
+    struct netdev_linux *netdev = netdev_linux_cast(netdev_);
+    int numa_id;
+
+    ovs_mutex_lock(&netdev->mutex);
+    numa_id = netdev_linux_get_numa_id__(netdev);
+    ovs_mutex_unlock(&netdev->mutex);
+
+    return numa_id;
+}
+
 /* Sends 'batch' on 'netdev'. Returns 0 if successful, otherwise a positive
  * errno value. Returns EAGAIN without blocking if the packet cannot be queued
  * immediately. Returns EMSGSIZE if a partial packet was transmitted or if
@@ -3308,7 +3369,7 @@ const struct netdev_class netdev_afxdp_class = {
     .set_config = netdev_afxdp_set_config,
     .get_config = netdev_afxdp_get_config,
     .reconfigure = netdev_afxdp_reconfigure,
-    .get_numa_id = netdev_afxdp_get_numa_id,
+    .get_numa_id = netdev_linux_get_numa_id,
     .send = netdev_afxdp_batch_send,
     .rxq_construct = netdev_afxdp_rxq_construct,
     .rxq_destruct = netdev_afxdp_rxq_destruct,
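
Note for readers who want to try the sysfs lookup outside of OVS: the
following is a minimal standalone sketch of the same idea, not the code in
this patch.  The helper names numa_node_from_stream() and
numa_node_for_netdev(), the sysfs_net parameter, and the fixed-size path
buffer are all hypothetical and exist only for illustration.  The sketch
also assumes that a plain "negative means unknown" check can stand in for
ovs_numa_numa_id_is_valid(), and it leaves out the caching and locking that
the patch adds.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse a NUMA node id from an open stream.
 * Returns 0 when nothing can be parsed or the value is negative
 * (sysfs reports -1 for a device without NUMA affinity). */
static int
numa_node_from_stream(FILE *stream)
{
    int node_id;

    if (fscanf(stream, "%d", &node_id) != 1 || node_id < 0) {
        return 0;
    }
    return node_id;
}

/* Hypothetical helper: look up the NUMA node of network device 'name'
 * under 'sysfs_net' (normally "/sys/class/net").  Falls back to node 0
 * on any error, mirroring the behavior described in the commit message. */
static int
numa_node_for_netdev(const char *sysfs_net, const char *name)
{
    char path[512];
    FILE *stream;
    int node_id;
    int n;

    /* Reject names containing path separators, as the patch does. */
    if (strpbrk(name, "/\\")) {
        return 0;
    }

    n = snprintf(path, sizeof path, "%s/%s/device/numa_node",
                 sysfs_net, name);
    if (n < 0 || (size_t) n >= sizeof path) {
        return 0;               /* Path did not fit in the buffer. */
    }

    stream = fopen(path, "r");
    if (!stream) {
        return 0;               /* E.g. a virtual device. */
    }

    node_id = numa_node_from_stream(stream);
    fclose(stream);
    return node_id;
}
```

A caller would typically invoke numa_node_for_netdev("/sys/class/net",
ifname).  Passing the sysfs root as a parameter is only a convenience so
that the lookup can be pointed at a scratch directory when testing.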