From patchwork Fri Jan 11 23:07:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: yupeng X-Patchwork-Id: 1023800 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="X2xrHRo7"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 43bz6B4W6qz9s9h for ; Sat, 12 Jan 2019 10:07:30 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726230AbfAKXH2 (ORCPT ); Fri, 11 Jan 2019 18:07:28 -0500 Received: from mail-pl1-f196.google.com ([209.85.214.196]:45724 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725536AbfAKXH2 (ORCPT ); Fri, 11 Jan 2019 18:07:28 -0500 Received: by mail-pl1-f196.google.com with SMTP id a14so7388777plm.12 for ; Fri, 11 Jan 2019 15:07:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=Pvdi5VWaZ65IspMC1A0lW6Cp8yjRWLmcbtphaT0cBXA=; b=X2xrHRo7CnxheTO42ZPOq0JBc8SYCyF5GEhdqF1IDsIkK6/ies1YGQnw7gjaFiZpHd 9kF+N3oPcrIyGkqoeI/f3Smt6QPrux9dyagcN/ZE50FTQyT4mz6eoE4v2jzX1YWrXIwd x0239jbpke/p/VAF1AUU0vj0zoD0D8nlPQ3H6ut14vC1ngg3SkWdn2rdsJWDyoqVsNmX Nwr4zej91TlOEJUWE56F8sgsD/Gc0x7Zmy/D/QqltWoF7z0p0Rr/OiNtgCZPTlkCLAug cYdXj50dMyM3iRpvSwS67Nanqj1+IowshiH64PDvOGwaH5FzVNl7u6MT9H9c004ABcmQ yxHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=Pvdi5VWaZ65IspMC1A0lW6Cp8yjRWLmcbtphaT0cBXA=; b=YIP5auNKzuTeaQEvQIkuc3nsKhNnl6vSCV0oiRw4Cj0Cso8HXPYdToeDd3KYOEqh6u j2Ct2QZeuNZIKJSqXxPxDH+N1xJJGYvMN1KXvw7LsP+qah/wyzfhqDlIDVo1b1R0MIBw iJ0/Z5IL9YaAcvgICI17gTkdVxnIXWlA497+jo6SENrbrKRD9hsKcnbw/ttWKCZjVSi8 G5EhclvLWTySYnHs4O5ARIXjJu+vGFzZDQkB78HCNB0v31Cq10o4FjJeLxPmxHOZ3Rfa X7WAry3p2KyEl7PAPQLE5EfU5aZuaSNtPu/O41SKrPEzy4oqnuOpcOl2p8pP9fA6/FMp kEJQ== X-Gm-Message-State: AJcUukf2Y3BoqNDss6u//8DLpFFtx2PKGkTok47t20lIkTtFlU9GHe9C kSjkEGm3h4Z0etUGlkVyeAsy+4c+1wU= X-Google-Smtp-Source: ALg8bN4pceFrdhancGrlMZpfBI1UOYgvhew3d1Z2/bw9ByYwDTtBrKy4eS15/fhGPwDslAqLgRwJ5Q== X-Received: by 2002:a17:902:66e6:: with SMTP id e93mr16114805plk.92.1547248045838; Fri, 11 Jan 2019 15:07:25 -0800 (PST) Received: from localhost.localdomain ([2601:602:9602:6598:a474:1a00:33d1:8b97]) by smtp.gmail.com with ESMTPSA id x27sm147218623pfe.178.2019.01.11.15.07.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 11 Jan 2019 15:07:25 -0800 (PST) From: yupeng To: netdev@vger.kernel.org, davem@davemloft.net Cc: xiyou.wangcong@gmail.com, rdunlap@infradead.org Subject: [PATCH net] add document for several snmp counters Date: Fri, 11 Jan 2019 15:07:24 -0800 Message-Id: <20190111230724.7133-1-yupeng0921@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org add document for below counters: TcpEstabResets TcpAttemptFails TcpOutRsts TcpExtTCPSACKDiscard TcpExtTCPDSACKIgnoredOld TcpExtTCPDSACKIgnoredNoUndo TcpExtTCPSackShifted TcpExtTCPSackMerged TcpExtTCPSackShiftFallback TcpExtTCPWantZeroWindowAdv TcpExtTCPToZeroWindowAdv TcpExtTCPFromZeroWindowAdv TcpExtDelayedACKs TcpExtDelayedACKLocked TcpExtDelayedACKLost TcpExtTCPLossProbes TcpExtTCPLossProbeRecovery Signed-off-by: yupeng --- Documentation/networking/snmp_counter.rst | 130 +++++++++++++++++++++- 1 file changed, 125 insertions(+), 5 deletions(-) diff --git a/Documentation/networking/snmp_counter.rst b/Documentation/networking/snmp_counter.rst index b0dfdaaca512..fe8f741193be 100644 --- a/Documentation/networking/snmp_counter.rst +++ b/Documentation/networking/snmp_counter.rst @@ -336,7 +336,26 @@ time client replies ACK, this socket will get another chance to move to the accept queue. -TCP Fast Open +* TcpEstabResets +Defined in `RFC1213 tcpEstabResets`_. + +.. _RFC1213 tcpEstabResets: https://tools.ietf.org/html/rfc1213#page-48 + +* TcpAttemptFails +Defined in `RFC1213 tcpAttemptFails`_. + +.. _RFC1213 tcpAttemptFails: https://tools.ietf.org/html/rfc1213#page-48 + +* TcpOutRsts +Defined in `RFC1213 tcpOutRsts`_. The RFC says this counter indicates +the 'segments sent containing the RST flag', but in linux kernel, this +couner indicates the segments kerenl tried to send. The sending +process might be failed due to some errors (e.g. memory alloc failed). + +.. _RFC1213 tcpOutRsts: https://tools.ietf.org/html/rfc1213#page-52 + + +TCP Fast Path ============ When kernel receives a TCP packet, it has two paths to handler the packet, one is fast path, another is slow path. The comment in kernel @@ -383,8 +402,6 @@ increase 1. TCP abort ======== - - * TcpExtTCPAbortOnData It means TCP layer has data in flight, but need to close the connection. So TCP layer sends a RST to the other side, indicate the @@ -545,7 +562,6 @@ packet yet, the sender would know packet 4 is out of order. The TCP stack of kernel will increase TcpExtTCPSACKReorder for both of the above scenarios. - DSACK ===== The DSACK is defined in `RFC2883`_. The receiver uses DSACK to report @@ -566,13 +582,63 @@ The TCP stack receives an out of order duplicate packet, so it sends a DSACK to the sender. * TcpExtTCPDSACKRecv -The TCP stack receives a DSACK, which indicate an acknowledged +The TCP stack receives a DSACK, which indicates an acknowledged duplicate packet is received. * TcpExtTCPDSACKOfoRecv The TCP stack receives a DSACK, which indicate an out of order duplicate packet is received. +invalid SACK and DSACK +==================== +When a SACK (or DSACK) block is invalid, a corresponding counter would +be updated. The validation method is base on the start/end sequence +number of the SACK block. For more details, please refer the comment +of the function tcp_is_sackblock_valid in the kernel source code. A +SACK option could have up to 4 blocks, they are checked +individually. E.g., if 3 blocks of a SACk is invalid, the +corresponding counter would be updated 3 times. The comment of the +`Add counters for discarded SACK blocks`_ patch has additional +explaination: + +.. _Add counters for discarded SACK blocks: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18f02545a9a16c9a89778b91a162ad16d510bb32 + +* TcpExtTCPSACKDiscard +This counter indicates how many SACK blocks are invalid. If the invalid +SACK block is caused by ACK recording, the TCP stack will only ignore +it and won't update this counter. + +* TcpExtTCPDSACKIgnoredOld and TcpExtTCPDSACKIgnoredNoUndo +When a DSACK block is invalid, one of these two counters would be +updated. Which counter will be updated depends on the undo_marker flag +of the TCP socket. If the undo_marker is not set, the TCP stack isn't +likely to re-transmit any packets, and we still receive an invalid +DSACK block, the reason might be that the packet is duplicated in the +middle of the network. In such scenario, TcpExtTCPDSACKIgnoredNoUndo +will be updated. If the undo_marker is set, TcpExtTCPDSACKIgnoredOld +will be updated. As implied in its name, it might be an old packet. + +SACK shift +========= +The linux networking stack stores data in sk_buff struct (skb for +short). If a SACK block acrosses multiple skb, the TCP stack will try +to re-arrange data in these skb. E.g. if a SACK block acknowledges seq +10 to 15, skb1 has seq 10 to 13, skb2 has seq 14 to 20. The seq 14 and +15 in skb2 would be moved to skb1. This operation is 'shift'. If a +SACK block acknowledges seq 10 to 20, skb1 has seq 10 to 13, skb2 has +seq 14 to 20. All data in skb2 will be moved to skb1, and skb2 will be +discard, this operation is 'merge'. + +* TcpExtTCPSackShifted +A skb is shifted + +* TcpExtTCPSackMerged +A skb is merged + +* TcpExtTCPSackShiftFallback +A skb should be shifted or merged, but the TCP stack doesn't do it for +some reasons. + TCP out of order =============== * TcpExtTCPOFOQueue @@ -662,6 +728,60 @@ unacknowledged number (more strict than `RFC 5961 section 5.2`_). .. _RFC 5961 section 4.2: https://tools.ietf.org/html/rfc5961#page-9 .. _RFC 5961 section 5.2: https://tools.ietf.org/html/rfc5961#page-11 +TCP receive window +================= +* TcpExtTCPWantZeroWindowAdv +Depending on current memory usage, the TCP stack tries to set receive +window to zero. But the receive window might still be a no-zero +value. For example, if the previous window size is 10, and the TCP +stack receives 3 bytes, the current window size would be 7 even if the +window size calculated by the memory usage is zero. + +* TcpExtTCPToZeroWindowAdv +The TCP receive window is set to zero from a no-zero value. + +* TcpExtTCPFromZeroWindowAdv +The TCP receive window is set to no-zero value from zero. + + +Delayed ACK +========== +The TCP Delayed ACK is a technique which is used for reducing the +packet count in the network. For more details, please refer the +`Delayed ACK wiki`_ + +.. _Delayed ACK wiki: https://en.wikipedia.org/wiki/TCP_delayed_acknowledgment + +* TcpExtDelayedACKs +A delayed ACK timer expires. The TCP stack will send a pure ACK packet +and exit the delayed ACK mode. + +* TcpExtDelayedACKLocked +A delayed ACK timer expires, but the TCP stack can't send an ACK +immediately due to the socket is locked by a userspace program. The +TCP stack will send a pure ACK later (after the userspace program +unlock the socket). When the TCP stack sends the pure ACK later, the +TCP stack will also update TcpExtDelayedACKs and exit the delayed ACK +mode. + +* TcpExtDelayedACKLost +It will be updated when the TCP stack receives a packet which has been +ACKed. A Delayed ACK loss might cause this issue, but it would also be +triggered by other reasons, such as a packet is duplicated in the +network. + +Tail Loss Probe (TLP) +=================== +TLP is an algorithm which is used to detect TCP packet loss. For more +details, please refer the `TLP paper`_. + +.. _TLP paper: https://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01 + +* TcpExtTCPLossProbes +A TLP probe packet is sent. + +* TcpExtTCPLossProbeRecovery +A packet loss is detected and recovered by TLP. examples =======