From patchwork Wed Sep 30 04:54:25 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 34511 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.176.167]) by ozlabs.org (Postfix) with ESMTP id DB49BB7BBE for ; Wed, 30 Sep 2009 15:02:31 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751156AbZI3EzB (ORCPT ); Wed, 30 Sep 2009 00:55:01 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751137AbZI3EzA (ORCPT ); Wed, 30 Sep 2009 00:55:00 -0400 Received: from gw1.cosmosbay.com ([212.99.114.194]:57450 "EHLO gw1.cosmosbay.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750887AbZI3EzA (ORCPT ); Wed, 30 Sep 2009 00:55:00 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) by gw1.cosmosbay.com (8.13.7/8.13.7) with ESMTP id n8U4sRga019859; Wed, 30 Sep 2009 06:54:27 +0200 Message-ID: <4AC2E481.5060509@gmail.com> Date: Wed, 30 Sep 2009 06:54:25 +0200 From: Eric Dumazet User-Agent: Thunderbird 2.0.0.23 (Windows/20090812) MIME-Version: 1.0 To: Jason Gunthorpe CC: netdev@vger.kernel.org, "David S. Miller" , Volker Lendecke Subject: Re: Splice on blocking TCP sockets again.. References: <20090930004820.GC19540@obsidianresearch.com> In-Reply-To: <20090930004820.GC19540@obsidianresearch.com> X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.6 (gw1.cosmosbay.com [0.0.0.0]); Wed, 30 Sep 2009 06:54:28 +0200 (CEST) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Jason Gunthorpe a écrit : > Eric, > > I saw your patch from January regarding splicing on blocking sockets, > and I wondered what ever happened to it? > > http://lkml.org/lkml/2009/1/13/507 > > It doesn't look like it has been applied.. I see the patch thread died > at davem's comments? > > I have run into exactly the same problem as Samba, where I'd like the > TCP socket to be blocking, and the pipe to be non blocking ... > > As it stands, > splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE); > causes a random endless block and > splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE | SPLICE_F_NONBLOCK); > will return 0 immediately if the TCP buffer is empty. > > FWIW, it looks like samba has a splice code now, but doesn't enable it > due to this issue? > > http://git.samba.org/?p=samba.git;a=history;f=source3/lib/recvfile.c;h=ea0159642137390a0f7e57a123684e6e63e47581;hb=HEAD > > Thanks, > Jason Hi Jason, thanks for this reminding Hmm, most probably I did not replied correctly do David objection which was : Date Wed, 14 Jan 2009 20:58:39 -0800 (PST) Subject Re: maximum buffer size for splice(2) tcp->pipe? From David Miller <> > From: Eric Dumazet > Date: Wed, 14 Jan 2009 00:38:32 +0100 > [PATCH] net: splice() from tcp to socket should take into account O_NONBLOCK > > Instead of using SPLICE_F_NONBLOCK to select a non blocking mode both on > source tcp socket and pipe destination, we use the underlying file flag (O_NONBLOCK) > for selecting a non blocking socket. > > Signed-off-by: Eric Dumazet This needs at least some more thought. It seems, for one thing, that this change will interfere with the intentions of the code in splice_dirt_to_actor which goes: /* * Don't block on output, we have to drain the direct pipe. */ sd->flags &= ~SPLICE_F_NONBLOCK; ------------------------------------------------------------------------------ But splice_dist_to_actor() handles a REG/BLK file as input and a pipe as output, so I believe my patch wont change splice_dist_to_actor() behavior. My patch title was wrong : net: splice() from tcp to socket should take into account O_NONBLOCK So maybe David was mistaken by the title :) [PATCH] net: splice() from tcp to pipe should take into account O_NONBLOCK Before this patch : splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE); causes a random endless block (if pipe is full) and splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE | SPLICE_F_NONBLOCK); will return 0 immediately if the TCP buffer is empty. User application has no way to instruct splice() that socket should be in blocking mode but pipe in nonblock more. http://git.samba.org/?p=samba.git;a=history;f=source3/lib/recvfile.c;h=ea0159642137390a0f7e57a123684e6e63e47581;hb=HEAD One way to handle this is to switch tcp_read() to use the underlying file O_NONBLOCK flag, as other socket operations do. And let SPLICE_F_NONBLOCK control the pipe output only. Users will then call : splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE | SPLICE_F_NONBLOCK ); to block on data coming from socket (if file is in blocking mode), and not block on pipe output (to avoid deadlock) Reported-by: Volker Lendecke Reported-by: Jason Gunthorpe Signed-off-by: Eric Dumazet Signed-off-by: Linus Torvalds --- -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 21387eb..8cdfab6 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -580,7 +580,7 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos, lock_sock(sk); - timeo = sock_rcvtimeo(sk, flags & SPLICE_F_NONBLOCK); + timeo = sock_rcvtimeo(sk, sock->file->f_flags & O_NONBLOCK); while (tss.len) { ret = __tcp_splice_read(sk, &tss); if (ret < 0)