From patchwork Mon Jan 28 23:44:53 2019
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 1032318
From: Jakub Kicinski
To: davem@davemloft.net
Cc: oss-drivers@netronome.com, netdev@vger.kernel.org, jiri@resnulli.us,
 f.fainelli@gmail.com, andrew@lunn.ch, mkubecek@suse.cz, dsahern@gmail.com,
 simon.horman@netronome.com, jesse.brandeburg@intel.com,
 maciejromanfijalkowski@gmail.com, vasundhara-v.volam@broadcom.com,
 michael.chan@broadcom.com, shalomt@mellanox.com, idosch@mellanox.com,
 Jakub Kicinski
Subject: [RFC 00/14] netlink/hierarchical stats
Date: Mon, 28 Jan 2019 15:44:53 -0800
Message-Id: <20190128234507.32028-1-jakub.kicinski@netronome.com>
X-Mailer: git-send-email 2.19.2
X-Mailing-List: netdev@vger.kernel.org

Hi!

As I tried to explain in my slides at netconf 2018, we are lacking
an expressive, standard API to report device statistics.

Networking silicon generally maintains some IEEE 802.3 and/or RMON
statistics.  Today those all end up in ethtool -S.  Here is a simple
attempt (admittedly very imprecise) at counting how many names driver
authors invented for the IETF RFC2819 etherStatsPkts512to1023Octets
statistics (RX and TX):

$ git grep '".*512.*1023.*"' -- drivers/net/ | \
    sed -e 's/.*"\(.*\)".*/\1/' | sort | uniq | wc -l
63

Interestingly, only two drivers in the tree use the name the standard
gave us (etherStatsPkts512to1023, modulo case).

I set out to work on this set in an attempt to give drivers a way
to clearly express standard-compliant counters to user space.

The second most common use for custom statistics is per-queue counters.
This is where the "hierarchical" part of this set comes in, as groups
can be nested, and user space tools can handle the aggregation inside
the groups if needed.

This set also tries to address the problem of users not knowing whether
a statistic is reported by hardware or by the driver.  Many modern
drivers use some prefix in ethtool -S to indicate MAC/PHY stats.  At a
quick glance: Netronome uses "mac.", Intel "port." and Mellanox "_phy".
In this set, netlink attributes describe whether a group of statistics
is RX or TX, and whether it is maintained by the device or the driver.

The purpose of this patch set is _not_ to replace ethtool -S.  It is
an incredibly useful tool, and we will certainly continue using it.
However, for standards-based and commonly maintained statistics a more
structured API seems warranted.

There are two things missing from these patches, which I initially
planned to address as well: filtering, and refresh rate control.

Filtering doesn't need much explanation: users should be able to request
only a subset of statistics (such as only SW stats or only a given ID).
The bitmap of statistics in each group is there for filtering later on.

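To make the nesting, the RX/TX and device/driver qualifiers, and the
per-group bitmap a bit more concrete, here is a minimal user-space
sketch of how a tool might aggregate one statistic over a hierarchy.
It is only an illustration under my own assumptions, not the API from
the patches: every name in it (ex_group, ex_dir, ex_src, ex_sum,
EX_MAX_STATS) is hypothetical, and in this sketch each group carries
its own qualifiers rather than inheriting them from its parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical qualifiers; the real attributes are defined by the patches. */
enum ex_dir { EX_DIR_RX, EX_DIR_TX };
enum ex_src { EX_SRC_DEV, EX_SRC_DRV };

#define EX_MAX_STATS 64	/* one bit per statistic ID in the bitmap */

/* Hypothetical in-memory form of one group of statistics. */
struct ex_group {
	enum ex_dir dir;			/* RX or TX */
	enum ex_src src;			/* maintained by device or driver */
	uint64_t stat_bitmap;			/* bit N set: stat ID N is present */
	uint64_t values[EX_MAX_STATS];		/* indexed by stat ID */
	const struct ex_group *children;	/* nested groups, e.g. per queue */
	size_t n_children;
};

/*
 * Sum statistic @id over @g and every group nested under it, counting
 * only groups that match the requested direction and source and that
 * actually report the statistic according to their bitmap.
 */
uint64_t ex_sum(const struct ex_group *g, unsigned int id,
		enum ex_dir dir, enum ex_src src)
{
	uint64_t total = 0;
	size_t i;

	assert(id < EX_MAX_STATS);

	if (g->dir == dir && g->src == src &&
	    (g->stat_bitmap & (UINT64_C(1) << id)))
		total += g->values[id];

	for (i = 0; i < g->n_children; i++)
		total += ex_sum(&g->children[i], id, dir, src);

	return total;
}
```

With a device-maintained root group and two driver-maintained per-queue
children, calling ex_sum() on the root with EX_SRC_DRV adds up only the
per-queue values, while EX_SRC_DEV returns only the device counter.
The bitmap check is what keeps a group that does not report a given ID
from contributing a stale zero-initialised slot.
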
By refresh control I mean the ability for user space to indicate how
"fresh" it expects the values to be.  Sometimes reading the HW counters
requires slow register reads or FW communication; in such cases drivers
may cache the result.  (Privileged) user space should be able to add
a "not older than" timestamp to indicate how fresh it expects the
statistics to be.  And vice versa, drivers can then also put the
timestamp of when the statistics were last refreshed in the dump, for
more precise bandwidth estimation.

Jakub Kicinski (14):
  nfp: remove unused structure
  nfp: constify parameter to nfp_port_from_netdev()
  net: hstats: add basic/core functionality
  net: hstats: allow hierarchies to be built
  nfp: very basic hstat support
  net: hstats: allow iterators
  net: hstats: help in iteration over directions
  nfp: hstats: make use of iteration for direction
  nfp: hstats: add driver and device per queue statistics
  net: hstats: add IEEE 802.3 and common IETF MIB/RMON stats
  nfp: hstats: add IEEE/RMON ethernet port/MAC stats
  net: hstats: add markers for partial groups
  nfp: hstats: add a partial group of per-8021Q prio stats
  Documentation: networking: describe new hstat API

 Documentation/networking/hstats.rst           | 590 +++++++++++++++
 .../networking/hstats_flow_example.dot        |  11 +
 Documentation/networking/index.rst            |   1 +
 drivers/net/ethernet/netronome/nfp/Makefile   |   1 +
 .../net/ethernet/netronome/nfp/nfp_hstat.c    | 474 ++++++++++++
 drivers/net/ethernet/netronome/nfp/nfp_main.c |   1 +
 drivers/net/ethernet/netronome/nfp/nfp_main.h |   2 +
 drivers/net/ethernet/netronome/nfp/nfp_net.h  |  10 +-
 .../ethernet/netronome/nfp/nfp_net_common.c   |   1 +
 .../net/ethernet/netronome/nfp/nfp_net_repr.h |   2 +-
 drivers/net/ethernet/netronome/nfp/nfp_port.c |   2 +-
 drivers/net/ethernet/netronome/nfp/nfp_port.h |   2 +-
 include/linux/netdevice.h                     |   9 +
 include/net/hstats.h                          | 176 +++++
 include/uapi/linux/if_link.h                  | 107 +++
 net/core/Makefile                             |   2 +-
 net/core/hstats.c                             | 682 ++++++++++++++++++
 net/core/rtnetlink.c                          |  21 +
 18 files changed, 2084 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/networking/hstats.rst
 create mode 100644 Documentation/networking/hstats_flow_example.dot
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_hstat.c
 create mode 100644 include/net/hstats.h
 create mode 100644 net/core/hstats.c