From patchwork Tue Apr 19 02:27:04 2016
From: Martin KaFai Lau <kafai@fb.com>
Date: Mon, 18 Apr 2016 19:27:04 -0700
To: Eric Dumazet
Cc: netdev@vger.kernel.org, Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh, Willem de Bruijn, Yuchung Cheng, Kernel Team
Subject: Re: [RFC PATCH v2 net-next 4/7] tcp: Make use of MSG_EOR flag in tcp_sendmsg
Message-ID: <20160419022704.GB35817@dreloong-mbp.local.DHCP.thefacebook.com>
In-Reply-To: <1461024417.10638.141.camel@edumazet-glaptop3.roam.corp.google.com>

On Mon, Apr 18, 2016 at 05:06:57PM -0700, Eric Dumazet wrote:
> It should only be a request from user space to ask TCP to not aggregate
> stuff on future sendpage()/sendmsg() on the skb carrying this new flag.
How about something like this? Please advise if tcp_sendmsg_noappend can
be simpler.

---
diff --git a/include/net/tcp.h b/include/net/tcp.h
index c0ef054..ac31798 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -762,7 +762,8 @@ struct tcp_skb_cb {
 	__u8		ip_dsfield;	/* IPv4 tos or IPv6 dsfield	*/
 	__u8		txstamp_ack:1,	/* Record TX timestamp for ack? */
-			unused:7;
+			eor:1,		/* Is skb MSG_EOR marked */
+			unused:6;
 	__u32		ack_seq;	/* Sequence number ACK'd	*/
 	union {
 		struct inet_skb_parm	h4;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4d73858..12772be 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -874,6 +874,13 @@ static int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
 	return mss_now;
 }
 
+static bool tcp_sendmsg_noappend(const struct sock *sk)
+{
+	const struct sk_buff *skb = tcp_write_queue_tail(sk);
+
+	return (!skb || TCP_SKB_CB(skb)->eor);
+}
+
 static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 				size_t size, int flags)
 {
@@ -903,6 +910,9 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 	if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
 		goto out_err;
 
+	if (tcp_sendmsg_noappend(sk))
+		goto new_segment;
+
 	while (size > 0) {
 		struct sk_buff *skb = tcp_write_queue_tail(sk);
 		int copy, i;
@@ -960,6 +970,7 @@ new_segment:
 		size -= copy;
 		if (!size) {
 			tcp_tx_timestamp(sk, sk->sk_tsflags, skb);
+			TCP_SKB_CB(skb)->eor = !!(flags & MSG_EOR);
 			goto out;
 		}
@@ -1145,6 +1156,9 @@ int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 	sg = !!(sk->sk_route_caps & NETIF_F_SG);
 
+	if (tcp_sendmsg_noappend(sk))
+		goto new_segment;
+
 	while (msg_data_left(msg)) {
 		int copy = 0;
 		int max = size_goal;
@@ -1250,6 +1264,7 @@ new_segment:
 		copied += copy;
 		if (!msg_data_left(msg)) {
 			tcp_tx_timestamp(sk, sockc.tsflags, skb);
+			TCP_SKB_CB(skb)->eor = !!(flags & MSG_EOR);
 			goto out;
 		}
--
2.5.1