[net] tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero

Message ID 20120914170432.C8B8DC6405@unicorn.suse.cz
State Accepted, archived
Delegated to: David Miller

Commit Message

Michal Kubecek Sept. 14, 2012, 2:59 p.m. UTC
If recv() syscall is called for a TCP socket so that
  - IOAT DMA is used
  - MSG_WAITALL flag is used
  - requested length is bigger than sk_rcvbuf
  - enough data has already arrived to bring rcv_wnd to zero
then when tcp_recvmsg() gets to calling sk_wait_data(), the receive
window can still be zero while the data held on sk_async_wait_queue
takes up enough space to keep it zero. As this queue isn't cleaned until
the tcp_service_net_dma() call, sk_wait_data() cannot receive
any data and blocks forever.

If zero receive window and non-empty sk_async_wait_queue is
detected before calling sk_wait_data(), process the queue first.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Cc: <stable@vger.kernel.org>
---
 net/ipv4/tcp.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
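
For illustration, the user-space side of the conditions listed in the commit
message can be sketched as below. This is only a sketch, not a reproducer:
the helper names request_exceeds_rcvbuf() and recv_waitall() are hypothetical,
and whether IOAT DMA is in use and whether rcv_wnd has dropped to zero cannot
be observed or forced from this code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: report whether a request of `len` bytes is larger
 * than the socket's receive buffer as returned by SO_RCVBUF (on Linux this
 * is the value the kernel keeps in sk_rcvbuf).
 */
static bool request_exceeds_rcvbuf(int fd, size_t len)
{
	int rcvbuf = 0;
	socklen_t optlen = sizeof(rcvbuf);

	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &optlen) != 0)
		return false;	/* buffer size unknown: make no claim */

	return rcvbuf >= 0 && len > (size_t)rcvbuf;
}

/*
 * Hypothetical helper: a single blocking recv() with MSG_WAITALL, which asks
 * the kernel to return only once `len` bytes have been copied (or on EOF,
 * error, or signal). This is the call pattern the commit message describes.
 */
static ssize_t recv_waitall(int fd, void *buf, size_t len)
{
	return recv(fd, buf, len, MSG_WAITALL);
}
```

A caller for which request_exceeds_rcvbuf() returns true and which then calls
recv_waitall() satisfies the second and third conditions of the list; the
remaining two depend on the kernel configuration and on traffic timing.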

Comments

David Miller Sept. 19, 2012, 8:10 p.m. UTC | #1
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 14 Sep 2012 16:59:52 +0200

> If recv() syscall is called for a TCP socket so that
>   - IOAT DMA is used
>   - MSG_WAITALL flag is used
>   - requested length is bigger than sk_rcvbuf
>   - enough data has already arrived to bring rcv_wnd to zero
> then when tcp_recvmsg() gets to calling sk_wait_data(), the receive
> window can still be zero while the data held on sk_async_wait_queue
> takes up enough space to keep it zero. As this queue isn't cleaned until
> the tcp_service_net_dma() call, sk_wait_data() cannot receive
> any data and blocks forever.
> 
> If zero receive window and non-empty sk_async_wait_queue is
> detected before calling sk_wait_data(), process the queue first.
> 
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>

Applied and queued up for -stable, thanks.

> Cc: <stable@vger.kernel.org>

Please do not CC: stable on networking bug fixes like this.

Simply ask me to add it to the networking -stable queue instead.

I do not want fixes to propagate immediately to -stable right when they
hit Linus's tree.  Rather, I want them to soak upstream for a while
before they get submitted to -stable.

And that's why I maintain a special queue for networking fixes
which should be submitted to -stable at some point in the future
at:

	http://patchwork.ozlabs.org/user/bundle/2566/?state=*

Thanks.

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 2109ff4..bf9a8ab 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1762,8 +1762,14 @@  int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		}
 
 #ifdef CONFIG_NET_DMA
-		if (tp->ucopy.dma_chan)
-			dma_async_memcpy_issue_pending(tp->ucopy.dma_chan);
+		if (tp->ucopy.dma_chan) {
+			if (tp->rcv_wnd == 0 &&
+			    !skb_queue_empty(&sk->sk_async_wait_queue)) {
+				tcp_service_net_dma(sk, true);
+				tcp_cleanup_rbuf(sk, copied);
+			} else
+				dma_async_memcpy_issue_pending(tp->ucopy.dma_chan);
+		}
 #endif
 		if (copied >= target) {
 			/* Do not sleep, just process backlog. */
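
The branch added by this hunk can also be modelled in isolation. The following
is a user-space sketch, not kernel code: the enum, the function name and its
scalar parameters are hypothetical stand-ins for tp->ucopy.dma_chan,
tp->rcv_wnd and the number of skbs on sk->sk_async_wait_queue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for what the patched code does before it may sleep. */
enum pre_wait_action {
	PRE_WAIT_NONE,		/* no DMA channel: nothing to do here */
	PRE_WAIT_ISSUE_PENDING,	/* dma_async_memcpy_issue_pending() */
	PRE_WAIT_SERVICE_QUEUE,	/* tcp_service_net_dma() + tcp_cleanup_rbuf() */
};

/*
 * Model of the patched branch. The queue is serviced first only when the
 * receive window is zero AND skbs are still parked on the async wait queue;
 * in every other case with a DMA channel the pending copies are just issued,
 * as before the patch.
 */
static enum pre_wait_action
choose_pre_wait_action(bool has_dma_chan, unsigned int rcv_wnd,
		       size_t async_queue_len)
{
	if (!has_dma_chan)
		return PRE_WAIT_NONE;

	if (rcv_wnd == 0 && async_queue_len > 0)
		return PRE_WAIT_SERVICE_QUEUE;

	return PRE_WAIT_ISSUE_PENDING;
}
```

Both conditions are required: a zero window with an empty queue, or a non-zero
window with queued skbs, falls through to the pre-patch behaviour.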