diff mbox

tcp: account SYN-ACK timeouts & retransmissions

Message ID 1262808658-29346-1-git-send-email-opurdila@ixiacom.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Octavian Purdila Jan. 6, 2010, 8:10 p.m. UTC
Signed-off-by: Maksim Pyatkovskiy <mpyatkovskiy@ixiacom.com>
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
 net/ipv4/inet_connection_sock.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

Comments

David Miller Jan. 8, 2010, 1:25 a.m. UTC | #1
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Wed,  6 Jan 2010 22:10:58 +0200

> Signed-off-by: Maksim Pyatkovskiy <mpyatkovskiy@ixiacom.com>
> Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>

RETRANSSEGS is meant to count data segments retransmits, not pure
control frames.

Similarly for TCPTIMEOUTS, it's mean for data retransmit timeouts.

If you overload these statistics with other similar events, they
become less meaningful.

Finally, bumping TCP specific statistics from the generic INET
connection oriented socket code is a huge no-no.  That code is
written to be protocol agnostic.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Octavian Purdila Jan. 11, 2010, 10:16 p.m. UTC | #2
On Friday 08 January 2010 03:25:07 you wrote:
> From: Octavian Purdila <opurdila@ixiacom.com>
> Date: Wed,  6 Jan 2010 22:10:58 +0200
> 
> > Signed-off-by: Maksim Pyatkovskiy <mpyatkovskiy@ixiacom.com>
> > Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
> 
> RETRANSSEGS is meant to count data segments retransmits, not pure
> control frames.
> 
> Similarly for TCPTIMEOUTS, it's mean for data retransmit timeouts.
> 
> If you overload these statistics with other similar events, they
> become less meaningful.
> 

But we do increment RETRANSSEGS & TCPTIMEOUTS for SYNs. This creates an 
imbalance in client/server stats which makes things harder to diagnose.

> Finally, bumping TCP specific statistics from the generic INET
> connection oriented socket code is a huge no-no.  That code is
> written to be protocol agnostic.
> 

Fair point, I'll fix that up in the next version.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Jan. 12, 2010, 12:15 a.m. UTC | #3
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Tue, 12 Jan 2010 00:16:33 +0200

> On Friday 08 January 2010 03:25:07 you wrote:
>> From: Octavian Purdila <opurdila@ixiacom.com>
>> Date: Wed,  6 Jan 2010 22:10:58 +0200
>> 
>> > Signed-off-by: Maksim Pyatkovskiy <mpyatkovskiy@ixiacom.com>
>> > Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
>> 
>> RETRANSSEGS is meant to count data segments retransmits, not pure
>> control frames.
>> 
>> Similarly for TCPTIMEOUTS, it's mean for data retransmit timeouts.
>> 
>> If you overload these statistics with other similar events, they
>> become less meaningful.
>> 
> 
> But we do increment RETRANSSEGS & TCPTIMEOUTS for SYNs. This creates an 
> imbalance in client/server stats which makes things harder to diagnose.

Interesting.

Can you do me a favor and look into the code history and see
if we used to account both sides?

I wonder if we accidently lost the statistic bumps when the
generic inet connection socket layer was added.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Octavian Purdila Jan. 12, 2010, 4:55 p.m. UTC | #4
On Tuesday 12 January 2010 02:15:37 you wrote:
> From: Octavian Purdila <opurdila@ixiacom.com>
> Date: Tue, 12 Jan 2010 00:16:33 +0200
> 
> > On Friday 08 January 2010 03:25:07 you wrote:
> >> From: Octavian Purdila <opurdila@ixiacom.com>
> >> Date: Wed,  6 Jan 2010 22:10:58 +0200
> >>
> >> > Signed-off-by: Maksim Pyatkovskiy <mpyatkovskiy@ixiacom.com>
> >> > Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
> >>
> >> RETRANSSEGS is meant to count data segments retransmits, not pure
> >> control frames.
> >>
> >> Similarly for TCPTIMEOUTS, it's mean for data retransmit timeouts.
> >>
> >> If you overload these statistics with other similar events, they
> >> become less meaningful.
> >
> > But we do increment RETRANSSEGS & TCPTIMEOUTS for SYNs. This creates an
> > imbalance in client/server stats which makes things harder to diagnose.
> 
> Interesting.
> 
> Can you do me a favor and look into the code history and see
> if we used to account both sides?
> 

Sure. 

> I wonder if we accidently lost the statistic bumps when the
> generic inet connection socket layer was added.
> 

I don't think so because we are seeing this issue in 2.6.7 as well. I'll 
download the CVS history and drill down further.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Octavian Purdila Jan. 13, 2010, 9:07 p.m. UTC | #5
On Tuesday 12 January 2010 18:55:31 you wrote:
> On Tuesday 12 January 2010 02:15:37 you wrote:
> > From: Octavian Purdila <opurdila@ixiacom.com>
> > Date: Tue, 12 Jan 2010 00:16:33 +0200
> >
> > > On Friday 08 January 2010 03:25:07 you wrote:
> > >> From: Octavian Purdila <opurdila@ixiacom.com>
> > >> Date: Wed,  6 Jan 2010 22:10:58 +0200
> > >>
> > >> > Signed-off-by: Maksim Pyatkovskiy <mpyatkovskiy@ixiacom.com>
> > >> > Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
> > >>
> > >> RETRANSSEGS is meant to count data segments retransmits, not pure
> > >> control frames.
> > >>
> > >> Similarly for TCPTIMEOUTS, it's mean for data retransmit timeouts.
> > >>
> > >> If you overload these statistics with other similar events, they
> > >> become less meaningful.
> > >
> > > But we do increment RETRANSSEGS & TCPTIMEOUTS for SYNs. This creates an
> > > imbalance in client/server stats which makes things harder to diagnose.
> >
> > Interesting.
> >
> > Can you do me a favor and look into the code history and see
> > if we used to account both sides?
> 
> Sure.

I think we lost the SynAck accounting with this commit from the netdev-vger-
cvs tree, where tcp_syn_recv_timer was introduced:

commit 2248761e5cfcb8687563b29c77064185e7604947
Author: davem <davem>
Date:   Sun Nov 10 21:25:00 1996 +0000

    Merge to 2.1.8, IPV6 is here.  Also fix
    a stupid bug in the kernel unaligned trap handler
    where st %g0, [foo] would fail miserably.


Before this, both Syns and SynAcks seems to be accounted for.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Jan. 14, 2010, 10:03 a.m. UTC | #6
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Wed, 13 Jan 2010 23:07:59 +0200

> I think we lost the SynAck accounting with this commit from the netdev-vger-
> cvs tree, where tcp_syn_recv_timer was introduced:
> 
> commit 2248761e5cfcb8687563b29c77064185e7604947
> Author: davem <davem>
> Date:   Sun Nov 10 21:25:00 1996 +0000
> 
>     Merge to 2.1.8, IPV6 is here.  Also fix
>     a stupid bug in the kernel unaligned trap handler
>     where st %g0, [foo] would fail miserably.
> 
> 
> Before this, both Syns and SynAcks seems to be accounted for.

I'm convinced.

Please spin your patch with the "doing TCP socket specific
stuff in generic inet connection socket code" part fixed.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index ee16475..6c20043 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -23,6 +23,7 @@ 
 #include <net/route.h>
 #include <net/tcp_states.h>
 #include <net/xfrm.h>
+#include <net/tcp.h>
 
 #ifdef INET_CSK_DEBUG
 const char inet_csk_timer_bug_msg[] = "inet_csk BUG: unknown timer value\n";
@@ -525,7 +526,9 @@  void inet_csk_reqsk_queue_prune(struct sock *parent,
 		while ((req = *reqp) != NULL) {
 			if (time_after_eq(now, req->expires)) {
 				int expire = 0, resend = 0;
+				struct net *net = sock_net(parent);
 
+				NET_INC_STATS_BH(net, LINUX_MIB_TCPTIMEOUTS);
 				syn_ack_recalc(req, thresh, max_retries,
 					       queue->rskq_defer_accept,
 					       &expire, &resend);
@@ -535,6 +538,7 @@  void inet_csk_reqsk_queue_prune(struct sock *parent,
 				     inet_rsk(req)->acked)) {
 					unsigned long timeo;
 
+					TCP_INC_STATS_BH(net, TCP_MIB_RETRANSSEGS);
 					if (req->retrans++ == 0)
 						lopt->qlen_young--;
 					timeo = min((timeout << req->retrans), max_rto);