Message ID | 1361984703.11403.43.camel@edumazet-glaptop |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 27 Feb 2013 09:05:03 -0800 > Processing pure ACK on behalf of the thread blocked in tcp_recvmsg() > is a waste of resources, as thread has to immediately sleep again > because it got no payload. More than one thread can be operating on the socket, the other one could be waiting for the window to open up in order to do a send. Are you absolutely sure that we won't have a problem in that situation? In fact I wonder if that does the right thing right now. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-02-27 at 13:04 -0500, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Wed, 27 Feb 2013 09:05:03 -0800 > > > Processing pure ACK on behalf of the thread blocked in tcp_recvmsg() > > is a waste of resources, as thread has to immediately sleep again > > because it got no payload. > > More than one thread can be operating on the socket, the other one > could be waiting for the window to open up in order to do a send. Are > you absolutely sure that we won't have a problem in that situation? Yes, more than one thread can be operating, but the prequeue wakeups the one blocked in tcp_recvmsg() only, because of : wake_up_interruptible_sync_poll(sk_sleep(sk), POLLIN | POLLRDNORM | POLLRDBAND); Then the ACK processing might/should wakeup the other thread blocked in tcp_sendmsg(). So this patch will also help this (not very usual) situation, as we will only wakeup the tcp_sendmsg() thread when ACK is processed from softirq handler, and let the thread blocked in tcp_recvmsg() sleeping. > In fact I wonder if that does the right thing right now. Right now it is working, because at least one thread will process the prequeue at the exit of tcp_recvmsg() -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 27 Feb 2013 10:17:39 -0800 > So this patch will also help this (not very usual) situation, as we will > only wakeup the tcp_sendmsg() thread when ACK is processed from softirq > handler, and let the thread blocked in tcp_recvmsg() sleeping. Ok, thanks for the analysis. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 27 Feb 2013 09:05:03 -0800 > TCP prequeue mechanism purpose is to let incoming packets > being processed by the thread currently blocked in tcp_recvmsg(), > instead of behalf of the softirq handler, to better adapt flow > control on receiver host capacity to schedule the consumer. > > But in typical request/answer workloads, we send request, then > block to receive the answer. And before the actual answer, TCP > stack receives the ACK packets acknowledging the request. > > Processing pure ACK on behalf of the thread blocked in tcp_recvmsg() > is a waste of resources, as thread has to immediately sleep again > because it got no payload. > > This patch avoids the extra context switches and scheduler overhead. ... > Signed-off-by: Eric Dumazet <edumazet@google.com> Applied. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/net/tcp.h b/include/net/tcp.h index 23f2e98..cf0694d 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1045,6 +1045,10 @@ static inline bool tcp_prequeue(struct sock *sk, struct sk_buff *skb) if (sysctl_tcp_low_latency || !tp->ucopy.task) return false; + if (skb->len <= tcp_hdrlen(skb) && + skb_queue_len(&tp->ucopy.prequeue) == 0) + return false; + __skb_queue_tail(&tp->ucopy.prequeue, skb); tp->ucopy.memory += skb->truesize; if (tp->ucopy.memory > sk->sk_rcvbuf) {