From patchwork Thu May 11 22:24:41 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Dumazet X-Patchwork-Id: 761368 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3wP72T1yt3z9s7r for ; Fri, 12 May 2017 08:24:49 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="sNEN1XlT"; dkim-atps=neutral Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756064AbdEKWYp (ORCPT ); Thu, 11 May 2017 18:24:45 -0400 Received: from mail-pg0-f65.google.com ([74.125.83.65]:35022 "EHLO mail-pg0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752472AbdEKWYo (ORCPT ); Thu, 11 May 2017 18:24:44 -0400 Received: by mail-pg0-f65.google.com with SMTP id i63so5098000pgd.2 for ; Thu, 11 May 2017 15:24:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:subject:from:to:cc:date:in-reply-to:references :mime-version:content-transfer-encoding; bh=TaYBi17QsHmOlUcFuockMa6nwCmTOR+YUsRNMXQf1MM=; b=sNEN1XlTXPsLkVMnFgPlBVaZER6rCtOoxzjJ/Sv3ospLdhy++yV2Zy1s8eMT3iAgoR jvJo4pi6zSe3PTX+WKlyvKsCVhsPs3oxBBRNYd0wET2SXZ/T7wz3n8+otFnHJ8/kz6MN ePYmNfYfOqWeR0SWQoyHmpPXAl3udHjsEupCgJDmQxAe8idBBMMCiQ6FiKhmpOptpx/x a0S8ii2BQOqjXtOZqWCETMOUTRbl248cC/4Sz02mUAf2c8vYk+8vZwh8UMwLiS759VG6 Dve2FLcsuJFg+whmdbFT4rvkNyI5eVxzobYAmE2xKZTA0jEJD8vyZZxXbuvHax9OCCQ5 sW5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:mime-version:content-transfer-encoding; bh=TaYBi17QsHmOlUcFuockMa6nwCmTOR+YUsRNMXQf1MM=; b=kdUZRnLYg4dScrjAuB+yRDDSl+sH1fwETUw8oPMTs+/9m3RHrXKeeClWBUZ0MjKcNs ZLXCaqIrj8WcSQJOhQ4tEJ0TUhWZ/xsxuO67WkaZ6qoK9GrmByfVO8NkGq+p2zNoxMYN 2fK/AS81UEpfQsCS1XxizV6UnIFKuXBoo3VPZzuLRIRyPixjoptzrW/cCz+X35W4NQqB Ekaaay9tyLN74N436EutF9nJW9ndxVVGHqyblt8zv23P0EJQbHRQBqHTC1yrtp6Ix5YK NfywTZUbwgJw84pqqPltvnV+Y12eBpEKJSAdWF1jjFY/TR35lMQPAnail3aQDWtHRCES ZZcw== X-Gm-Message-State: AODbwcAWGfykiTM/H90NF6a6Oz/LLEJ1uXt8syJd6sa2FewJcRSckJpa JIs8ZuYHahdYZQ== X-Received: by 10.84.178.131 with SMTP id z3mr1000342plb.175.1494541483287; Thu, 11 May 2017 15:24:43 -0700 (PDT) Received: from ?IPv6:2620:15c:2c1:100:3891:fc4c:c549:5b? ([2620:15c:2c1:100:3891:fc4c:c549:5b]) by smtp.googlemail.com with ESMTPSA id j17sm2194995pfk.23.2017.05.11.15.24.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 11 May 2017 15:24:42 -0700 (PDT) Message-ID: <1494541481.7796.124.camel@edumazet-glaptop3.roam.corp.google.com> Subject: [PATCH net] netem: fix skb_orphan_partial() From: Eric Dumazet To: Michael Madsen , David Miller Cc: Stephen Hemminger , Eric Dumazet , netdev@vger.kernel.org Date: Thu, 11 May 2017 15:24:41 -0700 In-Reply-To: <1494531770.7796.115.camel@edumazet-glaptop3.roam.corp.google.com> References: <20170511094758.2112b00f@xeon-e3> <1494522379.7796.112.camel@edumazet-glaptop3.roam.corp.google.com> <1494531770.7796.115.camel@edumazet-glaptop3.roam.corp.google.com> X-Mailer: Evolution 3.10.4-0ubuntu2 Mime-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Eric Dumazet I should have known that lowering skb->truesize was dangerous :/ In case packets are not leaving the host via a standard Ethernet device, but looped back to local sockets, bad things can happen, as reported by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 ) So instead of tweaking skb->truesize, lets change skb->destructor and keep a reference on the owner socket via its sk_refcnt. Fixes: f2f872f9272a ("netem: Introduce skb_orphan_partial() helper") Signed-off-by: Eric Dumazet Reported-by: Michael Madsen --- net/core/sock.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/net/core/sock.c b/net/core/sock.c index 79c6aee6af9b817bd7086f04ae8f46342a3bf4b6..e43e71d7856b385111cd4c4b1bd835a78c670c60 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1803,28 +1803,24 @@ EXPORT_SYMBOL(skb_set_owner_w); * delay queue. We want to allow the owner socket to send more * packets, as if they were already TX completed by a typical driver. * But we also want to keep skb->sk set because some packet schedulers - * rely on it (sch_fq for example). So we set skb->truesize to a small - * amount (1) and decrease sk_wmem_alloc accordingly. + * rely on it (sch_fq for example). */ void skb_orphan_partial(struct sk_buff *skb) { - /* If this skb is a TCP pure ACK or already went here, - * we have nothing to do. 2 is already a very small truesize. - */ - if (skb->truesize <= 2) + if (skb_is_tcp_pure_ack(skb)) return; - /* TCP stack sets skb->ooo_okay based on sk_wmem_alloc, - * so we do not completely orphan skb, but transfert all - * accounted bytes but one, to avoid unexpected reorders. - */ if (skb->destructor == sock_wfree #ifdef CONFIG_INET || skb->destructor == tcp_wfree #endif ) { - atomic_sub(skb->truesize - 1, &skb->sk->sk_wmem_alloc); - skb->truesize = 1; + struct sock *sk = skb->sk; + + if (atomic_inc_not_zero(&sk->sk_refcnt)) { + atomic_sub(skb->truesize, &sk->sk_wmem_alloc); + skb->destructor = sock_efree; + } } else { skb_orphan(skb); }