From patchwork Sat Jan 4 01:13:25 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi-Hung Wei
X-Patchwork-Id: 1217472
X-Patchwork-Delegate: i.maximets@samsung.com
From: Yi-Hung Wei
To: dev@openvswitch.org, i.maximets@ovn.org, echaudro@redhat.com, u9012063@gmail.com
Date: Fri, 3 Jan 2020 17:13:25 -0800
Message-Id: <20200104011326.92161-1-yihung.wei@gmail.com>
X-Mailer: git-send-email 2.17.1
Subject: [ovs-dev] [PATCH v9 1/2] netdev-linux: Detect numa node id.

From: William Tu

This patch detects the NUMA node id of a netdev from its name, by
reading '/sys/class/net//device/numa_node'. If the file is not
available (e.g. for a virtual device) or any error occurs, NUMA id 0
is returned. Currently only the afxdp netdev type uses it; it is not
enabled for the other Linux netdev types because there is no use case.

Signed-off-by: William Tu
Acked-by: Eelco Chaudron
---
v9:
- Address review comments from Ilya in patch 2.
  * Add check on numa_available()
  * Properly restore memory policy with get/set_mempolicy.
  * Travis CI: https://travis-ci.org/YiHungWei/ovs/builds/632486578

v8:
- Address review comments from Eelco and Ilya in patch 2.
  * Use OVS_FIND_DEPENDENCY().
  * Avoid the locking issue when calling netdev_get_numa_id().
  * Check NETDEV_NUMA_UNSPEC.
  * Use return value from netdev_get_numa_id() directly, and check
    NETDEV_NUMA_UNSPEC case.
  * Use numa_set_preferred().
- Travis CI: https://travis-ci.org/YiHungWei/ovs/builds/626865328

v7:
- Add numa aware memory allocation for afxdp related memory in the
  following patch.
- Travis CI: https://travis-ci.org/YiHungWei/ovs/builds/626403984

v6: Feedback from Ilya
- add thread safety check at netdev_linux_get_numa_id__, and pass
  netdev_linux
- preserve numa cache on netlink updates
- Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/605634673

v5: Feedback from Ilya
- refactor the error handling
- add mutex lock
- Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/601947245

v4: Feedback from Eelco
- Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/599308893

v3: Feedback from Ilya and Eelco
- update doc, afxdp.rst
- fix coding style
- fix limit of numa node max, by using ovs_numa_numa_id_is_valid
- move the function to netdev-linux
- cache the result of numa_id
- add security check for netdev name
- use fscanf instead of read and convert to int
- revise some error message content

v2: Feedback from Eelco
- fix memory leak of xasprintf
- log using INFO instead of WARN
---
 Documentation/intro/install/afxdp.rst |  1 -
 lib/netdev-afxdp.c                    | 11 -----
 lib/netdev-afxdp.h                    |  1 -
 lib/netdev-linux-private.h            |  2 +
 lib/netdev-linux.c                    | 67 +++++++++++++++++++++++++--
 5 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/Documentation/intro/install/afxdp.rst b/Documentation/intro/install/afxdp.rst
index 15e3c918f942..7b0736c96114 100644
--- a/Documentation/intro/install/afxdp.rst
+++ b/Documentation/intro/install/afxdp.rst
@@ -327,7 +327,6 @@ Below is a script using namespaces and veth peer::
 
 Limitations/Known Issues
 ------------------------
-#. Device's numa ID is always 0, need a way to find numa id from a netdev.
 #. No QoS support because AF_XDP netdev by-pass the Linux TC layer. A possible
    work-around is to use OpenFlow meter action.
 #. Most of the tests are done using i40e single port. Multiple ports and
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index 58365ed483e3..426dce944977 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -703,17 +703,6 @@ out:
     return err;
 }
 
-int
-netdev_afxdp_get_numa_id(const struct netdev *netdev)
-{
-    /* FIXME: Get netdev's PCIe device ID, then find
-     * its NUMA node id.
-     */
-    VLOG_INFO("FIXME: Device %s always use numa id 0.",
-              netdev_get_name(netdev));
-    return 0;
-}
-
 static void
 xsk_remove_xdp_program(uint32_t ifindex, enum afxdp_mode mode)
 {
diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index ae6971efd113..e91cd102d284 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -61,7 +61,6 @@ int netdev_afxdp_batch_send(struct netdev *netdev_, int qid,
 int netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args,
                             char **errp);
 int netdev_afxdp_get_config(const struct netdev *netdev, struct smap *args);
-int netdev_afxdp_get_numa_id(const struct netdev *netdev);
 int netdev_afxdp_get_stats(const struct netdev *netdev_,
                            struct netdev_stats *stats);
 int netdev_afxdp_get_custom_stats(const struct netdev *netdev,
diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h
index f08159aa7b53..55108d2c2e70 100644
--- a/lib/netdev-linux-private.h
+++ b/lib/netdev-linux-private.h
@@ -96,6 +96,8 @@ struct netdev_linux {
     /* LAG information. */
     bool is_lag_master;         /* True if the netdev is a LAG master. */
 
+    int numa_id;                /* NUMA node id. */
+
 #ifdef HAVE_AF_XDP
     /* AF_XDP information. */
     struct xsk_socket_info **xsks;
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 8a62f9d741ec..b29487e7c553 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -236,6 +236,7 @@ enum {
     VALID_VPORT_STAT_ERROR  = 1 << 5,
     VALID_DRVINFO           = 1 << 6,
     VALID_FEATURES          = 1 << 7,
+    VALID_NUMA_ID           = 1 << 8,
 };
 
 struct linux_lag_slave {
@@ -820,9 +821,9 @@ netdev_linux_update__(struct netdev_linux *dev,
 {
     if (rtnetlink_type_is_rtnlgrp_link(change->nlmsg_type)) {
         if (change->nlmsg_type == RTM_NEWLINK) {
-            /* Keep drv-info, and ip addresses. */
+            /* Keep drv-info, ip addresses, and NUMA id. */
             netdev_linux_changed(dev, change->ifi_flags,
-                                 VALID_DRVINFO | VALID_IN);
+                                 VALID_DRVINFO | VALID_IN | VALID_NUMA_ID);
 
             /* Update netdev from rtnl-change msg. */
             if (change->mtu) {
@@ -1391,6 +1392,66 @@ netdev_linux_tap_batch_send(struct netdev *netdev_,
     return 0;
 }
 
+static int
+netdev_linux_get_numa_id__(struct netdev_linux *netdev)
+    OVS_REQUIRES(netdev->mutex)
+{
+    char *numa_node_path;
+    const char *name;
+    int node_id;
+    FILE *stream;
+
+    if (netdev->cache_valid & VALID_NUMA_ID) {
+        return netdev->numa_id;
+    }
+
+    netdev->numa_id = 0;
+    netdev->cache_valid |= VALID_NUMA_ID;
+
+    name = netdev_get_name(&netdev->up);
+    if (strpbrk(name, "/\\")) {
+        VLOG_ERR_RL(&rl, "\"%s\" is not a valid name for a port. "
+                    "A valid name must not include '/' or '\\'. "
+                    "Using numa_id 0", name);
+        return 0;
+    }
+
+    numa_node_path = xasprintf("/sys/class/net/%s/device/numa_node", name);
+
+    stream = fopen(numa_node_path, "r");
+    if (!stream) {
+        /* Virtual device does not have this info. */
+        VLOG_INFO_RL(&rl, "%s: Can't open '%s': %s, using numa_id 0",
+                     name, numa_node_path, ovs_strerror(errno));
+        free(numa_node_path);
+        return 0;
+    }
+
+    if (fscanf(stream, "%d", &node_id) != 1
+        || !ovs_numa_numa_id_is_valid(node_id)) {
+        VLOG_WARN_RL(&rl, "%s: Can't detect NUMA node, using numa_id 0", name);
+        node_id = 0;
+    }
+
+    netdev->numa_id = node_id;
+    fclose(stream);
+    free(numa_node_path);
+    return node_id;
+}
+
+static int OVS_UNUSED
+netdev_linux_get_numa_id(const struct netdev *netdev_)
+{
+    struct netdev_linux *netdev = netdev_linux_cast(netdev_);
+    int numa_id;
+
+    ovs_mutex_lock(&netdev->mutex);
+    numa_id = netdev_linux_get_numa_id__(netdev);
+    ovs_mutex_unlock(&netdev->mutex);
+
+    return numa_id;
+}
+
 /* Sends 'batch' on 'netdev'. Returns 0 if successful, otherwise a positive
  * errno value. Returns EAGAIN without blocking if the packet cannot be queued
  * immediately. Returns EMSGSIZE if a partial packet was transmitted or if
@@ -3310,7 +3371,7 @@ const struct netdev_class netdev_afxdp_class = {
     .set_config = netdev_afxdp_set_config,
     .get_config = netdev_afxdp_get_config,
     .reconfigure = netdev_afxdp_reconfigure,
-    .get_numa_id = netdev_afxdp_get_numa_id,
+    .get_numa_id = netdev_linux_get_numa_id,
     .send = netdev_afxdp_batch_send,
     .rxq_construct = netdev_afxdp_rxq_construct,
     .rxq_destruct = netdev_afxdp_rxq_destruct,
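
Note for readers who want to try the sysfs lookup outside of OVS: the detection logic in netdev_linux_get_numa_id__() can be approximated by the standalone sketch below. It is not part of the patch. The names read_numa_node(), netdev_name_to_numa_node() and MAX_NUMA_NODES are hypothetical, the range check is only an assumed stand-in for ovs_numa_numa_id_is_valid(), and the caching and locking done by the patch are left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical upper bound; an assumed stand-in for the check that
 * ovs_numa_numa_id_is_valid() performs inside OVS. */
#define MAX_NUMA_NODES 128

/* Reads a NUMA node id from the file at 'path'.  Returns 0 if the file
 * cannot be opened or parsed, or if the value is out of range (sysfs
 * reports -1 for a device without NUMA affinity). */
static int
read_numa_node(const char *path)
{
    FILE *stream = fopen(path, "r");
    int node_id;

    if (!stream) {
        /* E.g. a virtual device, which has no 'device/numa_node' file. */
        return 0;
    }
    if (fscanf(stream, "%d", &node_id) != 1
        || node_id < 0 || node_id >= MAX_NUMA_NODES) {
        node_id = 0;
    }
    fclose(stream);
    return node_id;
}

/* Returns the NUMA node id of the network device 'name', or 0 on any
 * error, mirroring the fallback behaviour described in the patch. */
static int
netdev_name_to_numa_node(const char *name)
{
    char path[256];
    int n;

    /* Reject names that could point outside of /sys/class/net/. */
    if (strpbrk(name, "/\\")) {
        return 0;
    }
    n = snprintf(path, sizeof path,
                 "/sys/class/net/%s/device/numa_node", name);
    if (n < 0 || (size_t) n >= sizeof path) {
        return 0;
    }
    return read_numa_node(path);
}
```

Splitting the file parsing from the path construction is a choice made for this sketch only, so that the parsing can be exercised against an ordinary file; the patch keeps both steps in one function and caches the result under the netdev mutex.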