[v5,1/2] tcp: Add TCP_INFO counter for packets received out-of-order

Message ID: 20190913232332.44036-1-tph@fb.com
State: Accepted
Delegated to: David Miller

Commit Message

Thomas Higdon Sept. 13, 2019, 11:23 p.m. UTC
For receive-heavy cases on the server-side, we want to track the
connection quality for individual client IPs. This counter, similar to
the existing system-wide TCPOFOQueue counter in /proc/net/netstat,
tracks out-of-order packet reception. By providing this counter in
TCP_INFO, it will allow understanding to what degree receive-heavy
sockets are experiencing out-of-order delivery and packet drops
indicating congestion.

Please note that this is similar to the counter in NetBSD TCP_INFO, and
has the same name.

Also note that we avoid increasing the size of the tcp_sock struct by
taking advantage of a hole.

Signed-off-by: Thomas Higdon <tph@fb.com>
---
changes since v4:
 - optimize placement of rcv_ooopack to avoid increasing tcp_sock struct
   size

 include/linux/tcp.h      | 2 ++
 include/uapi/linux/tcp.h | 2 ++
 net/ipv4/tcp.c           | 2 ++
 net/ipv4/tcp_input.c     | 1 +
 4 files changed, 7 insertions(+)
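
The remark about "taking advantage of a hole" refers to alignment padding inside the struct. The sketch below illustrates the idea with a toy layout; it is not the real tcp_sock. The struct and member names are hypothetical, and it assumes uint32_t has 4-byte alignment, as it does on mainstream ABIs. Tools such as pahole report holes like this in real kernel structs.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy layout before the change: 2 bytes of padding follow `flags`
 * so that `rtt` starts on a 4-byte boundary. */
struct toy_before {
	uint16_t flags;
	/* 2-byte hole here */
	uint32_t rtt;
};

/* Toy layout after the change: the new member occupies the hole,
 * so neither the offset of `rtt` nor the struct size changes. */
struct toy_after {
	uint16_t flags;
	uint16_t new_counter; /* hypothetical field placed in the hole */
	uint32_t rtt;
};
```

Comparing sizeof(struct toy_before) with sizeof(struct toy_after) shows that both layouts have the same size under the stated alignment assumption.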

Comments

Neal Cardwell Sept. 14, 2019, 3:43 p.m. UTC | #1
On Fri, Sep 13, 2019 at 7:23 PM Thomas Higdon <tph@fb.com> wrote:
>
> For receive-heavy cases on the server-side, we want to track the
> connection quality for individual client IPs. This counter, similar to
> the existing system-wide TCPOFOQueue counter in /proc/net/netstat,
> tracks out-of-order packet reception. By providing this counter in
> TCP_INFO, it will allow understanding to what degree receive-heavy
> sockets are experiencing out-of-order delivery and packet drops
> indicating congestion.
>
> Please note that this is similar to the counter in NetBSD TCP_INFO, and
> has the same name.
>
> Also note that we avoid increasing the size of the tcp_sock struct by
> taking advantage of a hole.
>
> Signed-off-by: Thomas Higdon <tph@fb.com>
> ---
> changes since v4:
>  - optimize placement of rcv_ooopack to avoid increasing tcp_sock struct
>    size


Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks, Thomas, for adding this!

After this is merged, would you mind sending a patch to add support to
the "ss" command line tool to print these 2 new fields?

My favorite recent example of such a patch to ss is Eric's change:
  https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/misc/ss.c?id=5eead6270a19f00464052d4084f32182cfe027ff

thanks,
neal

David Miller Sept. 16, 2019, 2:39 p.m. UTC | #2
From: Thomas Higdon <tph@fb.com>
Date: Fri, 13 Sep 2019 23:23:34 +0000

> For receive-heavy cases on the server-side, we want to track the
> connection quality for individual client IPs. This counter, similar to
> the existing system-wide TCPOFOQueue counter in /proc/net/netstat,
> tracks out-of-order packet reception. By providing this counter in
> TCP_INFO, it will allow understanding to what degree receive-heavy
> sockets are experiencing out-of-order delivery and packet drops
> indicating congestion.
> 
> Please note that this is similar to the counter in NetBSD TCP_INFO, and
> has the same name.
> 
> Also note that we avoid increasing the size of the tcp_sock struct by
> taking advantage of a hole.
> 
> Signed-off-by: Thomas Higdon <tph@fb.com>

Applied.

Thomas Higdon Sept. 16, 2019, 5:42 p.m. UTC | #3
On Sat, Sep 14, 2019 at 11:43:25AM -0400, Neal Cardwell wrote:
> On Fri, Sep 13, 2019 at 7:23 PM Thomas Higdon <tph@fb.com> wrote:
> >
> > For receive-heavy cases on the server-side, we want to track the
> > connection quality for individual client IPs. This counter, similar to
> > the existing system-wide TCPOFOQueue counter in /proc/net/netstat,
> > tracks out-of-order packet reception. By providing this counter in
> > TCP_INFO, it will allow understanding to what degree receive-heavy
> > sockets are experiencing out-of-order delivery and packet drops
> > indicating congestion.
> >
> > Please note that this is similar to the counter in NetBSD TCP_INFO, and
> > has the same name.
> >
> > Also note that we avoid increasing the size of the tcp_sock struct by
> > taking advantage of a hole.
> >
> > Signed-off-by: Thomas Higdon <tph@fb.com>
> > ---
> > changes since v4:
> >  - optimize placement of rcv_ooopack to avoid increasing tcp_sock struct
> >    size
> 
> 
> Acked-by: Neal Cardwell <ncardwell@google.com>
> 
> Thanks, Thomas, for adding this!
> 
> After this is merged, would you mind sending a patch to add support to
> the "ss" command line tool to print these 2 new fields?
> 
> My favorite recent example of such a patch to ss is Eric's change:
>   https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/misc/ss.c?id=5eead6270a19f00464052d4084f32182cfe027ff

Yes, and thank you for the help in getting this into a good state!

From looking at that "ss" patch, it seems like we would need to wait
until iproute2-next's include/uapi/linux/tcp.h has received a merge from
kernel net-next before we'd be able to apply a patch for "ss" that uses
the new fields.

In the meantime, as you've asked, I will go ahead and send a patch for
iproute2-next's "ss" with the assumption that these tcpinfo changes have
already been merged.
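
As a rough illustration of the header dependency described above, here is a minimal userspace sketch that reads the new counter through getsockopt(TCP_INFO). It assumes the installed include/uapi/linux/tcp.h already declares tcpi_rcv_ooopack. The helper names parse_rcv_ooopack and read_rcv_ooopack are hypothetical and do not come from ss. A kernel that predates the field returns a shorter TCP_INFO, so the returned length is checked before the field is read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/tcp.h>

/* Hypothetical helper: extract tcpi_rcv_ooopack from a TCP_INFO reply
 * of `len` bytes. Returns 0 on success, or -1 if the reply is too
 * short to contain the field (for example, from an older kernel). */
int parse_rcv_ooopack(const void *buf, size_t len, uint32_t *out)
{
	const size_t off = offsetof(struct tcp_info, tcpi_rcv_ooopack);

	if (len < off + sizeof(uint32_t))
		return -1;
	memcpy(out, (const char *)buf + off, sizeof(*out));
	return 0;
}

/* Hypothetical helper: query TCP_INFO on a TCP socket and report the
 * out-of-order packet counter. Returns -1 if getsockopt() fails or
 * the kernel did not supply the field. */
int read_rcv_ooopack(int fd, uint32_t *out)
{
	struct tcp_info info;
	socklen_t len = sizeof(info);

	memset(&info, 0, sizeof(info));
	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0)
		return -1;
	/* The kernel updates len to the number of bytes it actually wrote. */
	return parse_rcv_ooopack(&info, len, out);
}
```

A caller would pass a connected TCP socket to read_rcv_ooopack() and treat a -1 return as "counter not available" rather than as zero.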

Patch

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index f3a85a7fb4b1..99617e528ea2 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -354,6 +354,8 @@  struct tcp_sock {
 #define BPF_SOCK_OPS_TEST_FLAG(TP, ARG) 0
 #endif
 
+	u32 rcv_ooopack; /* Received out-of-order packets, for tcpinfo */
+
 /* Receiver side RTT estimation */
 	u32 rcv_rtt_last_tsecr;
 	struct {
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index b3564f85a762..20237987ccc8 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -270,6 +270,8 @@  struct tcp_info {
 	__u64	tcpi_bytes_retrans;  /* RFC4898 tcpEStatsPerfOctetsRetrans */
 	__u32	tcpi_dsack_dups;     /* RFC4898 tcpEStatsStackDSACKDups */
 	__u32	tcpi_reord_seen;     /* reordering events seen */
+
+	__u32	tcpi_rcv_ooopack;    /* Out-of-order packets received */
 };
 
 /* netlink attributes types for SCM_TIMESTAMPING_OPT_STATS */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 94df48bcecc2..4cf58208270e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2653,6 +2653,7 @@  int tcp_disconnect(struct sock *sk, int flags)
 	tp->rx_opt.saw_tstamp = 0;
 	tp->rx_opt.dsack = 0;
 	tp->rx_opt.num_sacks = 0;
+	tp->rcv_ooopack = 0;
 
 
 	/* Clean up fastopen related fields */
@@ -3295,6 +3296,7 @@  void tcp_get_info(struct sock *sk, struct tcp_info *info)
 	info->tcpi_bytes_retrans = tp->bytes_retrans;
 	info->tcpi_dsack_dups = tp->dsack_dups;
 	info->tcpi_reord_seen = tp->reord_seen;
+	info->tcpi_rcv_ooopack = tp->rcv_ooopack;
 	unlock_sock_fast(sk, slow);
 }
 EXPORT_SYMBOL_GPL(tcp_get_info);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 706cbb3b2986..2ef333354026 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4555,6 +4555,7 @@  static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
 	tp->pred_flags = 0;
 	inet_csk_schedule_ack(sk);
 
+	tp->rcv_ooopack += max_t(u16, 1, skb_shinfo(skb)->gso_segs);
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOQUEUE);
 	seq = TCP_SKB_CB(skb)->seq;
 	end_seq = TCP_SKB_CB(skb)->end_seq;
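
The accounting in the tcp_input.c hunk can be sketched outside the kernel as follows. This is an illustrative model of the max_t(u16, 1, gso_segs) expression, not kernel code; the function names are hypothetical. It assumes, as the expression suggests, that gso_segs may be zero for an skb that carries a single segment, in which case the skb still counts as one packet.

```c
#include <stdint.h>

/* Hypothetical model of max_t(u16, 1, gso_segs): an skb counts as at
 * least one packet, or as gso_segs packets when it aggregates more. */
uint32_t ooo_packets_in_skb(uint16_t gso_segs)
{
	return gso_segs > 1 ? gso_segs : 1;
}

/* Hypothetical model of the counter update. The counter is a u32, so
 * it wraps around on overflow like any unsigned 32-bit value. */
uint32_t account_ooo(uint32_t rcv_ooopack, uint16_t gso_segs)
{
	return rcv_ooopack + ooo_packets_in_skb(gso_segs);
}
```

Because the counter wraps, a monitoring tool that samples it periodically would compute differences in unsigned 32-bit arithmetic rather than assume the value only grows.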