From patchwork Fri Apr 20 15:55:39 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 902001
X-Patchwork-Delegate: davem@davemloft.net
From: Eric Dumazet
To: "David S . Miller"
Cc: netdev, linux-kernel, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet
Subject: [PATCH net-next 1/4] mm: provide a mmap_hook infrastructure
Date: Fri, 20 Apr 2018 08:55:39 -0700
Message-Id: <20180420155542.122183-2-edumazet@google.com>
X-Mailer: git-send-email 2.17.0.484.g0c8726318c-goog
In-Reply-To: <20180420155542.122183-1-edumazet@google.com>
References: <20180420155542.122183-1-edumazet@google.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

When adding the tcp mmap() implementation, I forgot that the socket lock
had to be taken before current->mm->mmap_sem. syzbot eventually caught
the bug.

This patch provides a new mmap_hook() method in struct file_operations
that a filesystem may provide to get finer control over what is done
before and after do_mmap_pgoff() and/or the mm->mmap_sem
acquire/release.

This is used in the following patches by the networking and TCP stacks
to solve the lockdep issue. It also allows some preparation and cleanup
work to be done before and after mmap_sem is held, which gives better
scalability in multi-threaded programs.

Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
---
 include/linux/fs.h |  6 ++++++
 mm/util.c          | 19 ++++++++++++++++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 92efaf1f89775f7b017477617dd983c10e0dc4d2..ef3526f84686585678861fc585efea974a69ca55 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1698,6 +1698,11 @@ struct block_device_operations;
 #define NOMMU_VMFLAGS \
 	(NOMMU_MAP_READ | NOMMU_MAP_WRITE | NOMMU_MAP_EXEC)
 
+enum mmap_hook {
+	MMAP_HOOK_PREPARE,
+	MMAP_HOOK_ROLLBACK,
+	MMAP_HOOK_COMMIT,
+};
 
 struct iov_iter;
 
@@ -1714,6 +1719,7 @@ struct file_operations {
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
+	int (*mmap_hook) (struct file *, enum mmap_hook);
 	unsigned long mmap_supported_flags;
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
diff --git a/mm/util.c b/mm/util.c
index 1fc4fa7576f762bbbf341f056ca6d0be803a423f..3ddb18ab367f069d5884083e992e999546ccd995 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -350,11 +350,28 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 
 	ret = security_mmap_file(file, prot, flag);
 	if (!ret) {
-		if (down_write_killable(&mm->mmap_sem))
+		int (*mmap_hook)(struct file *, enum mmap_hook) = NULL;
+
+		if (file) {
+			mmap_hook = file->f_op->mmap_hook;
+
+			if (mmap_hook) {
+				ret = mmap_hook(file, MMAP_HOOK_PREPARE);
+				if (ret)
+					return ret;
+			}
+		}
+		if (down_write_killable(&mm->mmap_sem)) {
+			if (mmap_hook)
+				mmap_hook(file, MMAP_HOOK_ROLLBACK);
 			return -EINTR;
+		}
 		ret = do_mmap_pgoff(file, addr, len, prot, flag, pgoff,
 				    &populate, &uf);
 		up_write(&mm->mmap_sem);
+		if (mmap_hook)
+			mmap_hook(file, IS_ERR_VALUE(ret) ? MMAP_HOOK_ROLLBACK :
+							    MMAP_HOOK_COMMIT);
 		userfaultfd_unmap_complete(mm, &uf);
 		if (populate)
 			mm_populate(ret, populate);

From patchwork Fri Apr 20 15:55:40 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 902002
X-Patchwork-Delegate: davem@davemloft.net
From: Eric Dumazet
To: "David S . Miller"
Cc: netdev, linux-kernel, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet
Subject: [PATCH net-next 2/4] net: implement sock_mmap_hook()
Date: Fri, 20 Apr 2018 08:55:40 -0700
Message-Id: <20180420155542.122183-3-edumazet@google.com>
X-Mailer: git-send-email 2.17.0.484.g0c8726318c-goog
In-Reply-To: <20180420155542.122183-1-edumazet@google.com>
References: <20180420155542.122183-1-edumazet@google.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

sock_mmap_hook() is the mmap_hook handler provided for socket_file_ops.
It forwards each phase to the protocol's proto_ops->mmap_hook when one
is set, and returns 0 otherwise.

The following patch provides tcp_mmap_hook() for the TCP protocol.

Signed-off-by: Eric Dumazet
---
 include/linux/net.h | 1 +
 net/socket.c        | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/net.h b/include/linux/net.h
index 6554d3ba4396b3df49acac934ad16eeb71a695f4..5192bf502b11e42c3d9eb342ce67361916149bfa 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -181,6 +181,7 @@ struct proto_ops {
 				      size_t total_len, int flags);
 	int		(*mmap)	     (struct file *file, struct socket *sock,
 				      struct vm_area_struct * vma);
+	int		(*mmap_hook) (struct socket *sock, enum mmap_hook);
 	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
 				      int offset, size_t size, int flags);
 	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
diff --git a/net/socket.c b/net/socket.c
index f10f1d947c78c193b49379b0ec641d81367fb4cf..75a5c2ebe57e0621dae17c6c9e1a796ee818b107 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -131,6 +131,14 @@ static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
 				struct pipe_inode_info *pipe, size_t len,
 				unsigned int flags);
 
+static int sock_mmap_hook(struct file *file, enum mmap_hook mode)
+{
+	struct socket *sock = file->private_data;
+
+	if (!sock->ops->mmap_hook)
+		return 0;
+	return sock->ops->mmap_hook(sock, mode);
+}
 /*
  *	Socket files have a set of 'special' operations as well as the generic file ones. These don't appear
  *	in the operation structures but are done directly via the socketcall() multiplexor.
@@ -147,6 +155,7 @@ static const struct file_operations socket_file_ops = {
 	.compat_ioctl = compat_sock_ioctl,
 #endif
 	.mmap =		sock_mmap,
+	.mmap_hook =	sock_mmap_hook,
 	.release =	sock_close,
 	.fasync =	sock_fasync,
 	.sendpage =	sock_sendpage,

From patchwork Fri Apr 20 15:55:41 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 901999
X-Patchwork-Delegate: davem@davemloft.net
From: Eric Dumazet
To: "David S . Miller"
Cc: netdev, linux-kernel, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet
Subject: [PATCH net-next 3/4] tcp: provide tcp_mmap_hook()
Date: Fri, 20 Apr 2018 08:55:41 -0700
Message-Id: <20180420155542.122183-4-edumazet@google.com>
X-Mailer: git-send-email 2.17.0.484.g0c8726318c-goog
In-Reply-To: <20180420155542.122183-1-edumazet@google.com>
References: <20180420155542.122183-1-edumazet@google.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Many socket operations can copy data between user and kernel space
while the socket lock is held. This means mm->mmap_sem can be taken
after the socket lock.

When implementing tcp mmap(), I forgot this, and syzbot was kind enough
to bring it to my attention.

This patch adds tcp_mmap_hook(), allowing us to grab the socket lock
before vm_mmap_pgoff() grabs mm->mmap_sem.

The same hook is responsible for releasing the socket lock once
vm_mmap_pgoff() has released mm->mmap_sem (or failed to acquire it).

Note that follow-up patches can move code from tcp_mmap() to
tcp_mmap_hook() to shorten tcp_mmap() execution time and thus increase
mmap() performance in multi-threaded programs.

Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
---
 include/net/tcp.h   |  1 +
 net/ipv4/af_inet.c  |  1 +
 net/ipv4/tcp.c      | 25 ++++++++++++++++++++++---
 net/ipv6/af_inet6.c |  1 +
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 833154e3df173ea41aa16dd1ec739a175c679c5c..f68c8e8957840cacdbdd3d02bd149fce33ae324f 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -404,6 +404,7 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
 		int flags, int *addr_len);
 int tcp_set_rcvlowat(struct sock *sk, int val);
 void tcp_data_ready(struct sock *sk);
+int tcp_mmap_hook(struct socket *sock, enum mmap_hook mode);
 int tcp_mmap(struct file *file, struct socket *sock,
 	     struct vm_area_struct *vma);
 void tcp_parse_options(const struct net *net, const struct sk_buff *skb,
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 3ebf599cebaea4926decc1aad7274b12ec7e1566..af597440ff59c049b7fd02f7d7f79c23b9e195bb 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -995,6 +995,7 @@ const struct proto_ops inet_stream_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = tcp_mmap,
+	.mmap_hook	   = tcp_mmap_hook,
 	.sendpage	   = inet_sendpage,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4022073b0aeea9d07af0fa825b640a00512908a3..e913b2dd5df321f2789e8d5f233ede9c2f1d5624 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1726,6 +1726,28 @@ int tcp_set_rcvlowat(struct sock *sk, int val)
 }
 EXPORT_SYMBOL(tcp_set_rcvlowat);
 
+/* mmap() on TCP needs to grab socket lock before current->mm->mmap_sem
+ * is taken in vm_mmap_pgoff() to avoid possible dead locks.
+ */
+int tcp_mmap_hook(struct socket *sock, enum mmap_hook mode)
+{
+	struct sock *sk = sock->sk;
+
+	if (mode == MMAP_HOOK_PREPARE) {
+		lock_sock(sk);
+		/* TODO: Move here all the preparation work that can be done
+		 * before having to grab current->mm->mmap_sem.
+		 */
+		return 0;
+	}
+	/* TODO: Move here the stuff that can been done after
+	 * current->mm->mmap_sem has been released.
+	 */
+	release_sock(sk);
+	return 0;
+}
+EXPORT_SYMBOL(tcp_mmap_hook);
+
 /* When user wants to mmap X pages, we first need to perform the mapping
  * before freeing any skbs in receive queue, otherwise user would be unable
  * to fallback to standard recvmsg(). This happens if some data in the
@@ -1756,8 +1778,6 @@ int tcp_mmap(struct file *file, struct socket *sock,
 	/* TODO: Maybe the following is not needed if pages are COW */
 	vma->vm_flags &= ~VM_MAYWRITE;
 
-	lock_sock(sk);
-
 	ret = -ENOTCONN;
 	if (sk->sk_state == TCP_LISTEN)
 		goto out;
@@ -1833,7 +1853,6 @@ int tcp_mmap(struct file *file, struct socket *sock,
 
 	ret = 0;
 out:
-	release_sock(sk);
 	kvfree(pages_array);
 	return ret;
 }
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 36d622c477b1ed3c5d2b753938444526344a6109..31ce68c001c223d3351f73453273ae517a051816 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -579,6 +579,7 @@ const struct proto_ops inet6_stream_ops = {
 	.sendmsg	   = inet_sendmsg,		/* ok		*/
 	.recvmsg	   = inet_recvmsg,		/* ok		*/
 	.mmap		   = tcp_mmap,
+	.mmap_hook	   = tcp_mmap_hook,
 	.sendpage	   = inet_sendpage,
 	.sendmsg_locked    = tcp_sendmsg_locked,
 	.sendpage_locked   = tcp_sendpage_locked,

From patchwork Fri Apr 20 15:55:42 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 902000
X-Patchwork-Delegate: davem@davemloft.net
From: Eric Dumazet
To: "David S . Miller"
Cc: netdev, linux-kernel, Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet
Subject: [PATCH net-next 4/4] tcp: mmap: move the skb cleanup to tcp_mmap_hook()
Date: Fri, 20 Apr 2018 08:55:42 -0700
Message-Id: <20180420155542.122183-5-edumazet@google.com>
X-Mailer: git-send-email 2.17.0.484.g0c8726318c-goog
In-Reply-To: <20180420155542.122183-1-edumazet@google.com>
References: <20180420155542.122183-1-edumazet@google.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Freeing all skbs and sending ACKs is time consuming.

This is currently done in tcp_mmap(), while both current->mm->mmap_sem
and the socket lock are held.

Thanks to the mmap_hook infrastructure, we can perform the cleanup after
current->mm->mmap_sem has been released, thus allowing other threads to
perform mm operations without delay.

Note that the preparation work (building the array of page pointers)
can also be done from tcp_mmap_hook() before mmap_sem is taken, but
that is a separate, independent change.

Signed-off-by: Eric Dumazet
---
 net/ipv4/tcp.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e913b2dd5df321f2789e8d5f233ede9c2f1d5624..82f7c3e47253cecac6ea1819fbb7a0712058ec55 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1740,9 +1740,16 @@ int tcp_mmap_hook(struct socket *sock, enum mmap_hook mode)
 		 */
 		return 0;
 	}
-	/* TODO: Move here the stuff that can been done after
-	 * current->mm->mmap_sem has been released.
-	 */
+	if (mode == MMAP_HOOK_COMMIT) {
+		u32 offset;
+
+		tcp_rcv_space_adjust(sk);
+
+		/* Clean up data we have read: This will do ACK frames. */
+		tcp_recv_skb(sk, tcp_sk(sk)->copied_seq, &offset);
+
+		tcp_cleanup_rbuf(sk, PAGE_SIZE);
+	}
 	release_sock(sk);
 	return 0;
 }
@@ -1843,13 +1850,8 @@ int tcp_mmap(struct file *file, struct socket *sock,
 		if (ret)
 			goto out;
 	}
-	/* operation is complete, we can 'consume' all skbs */
+	/* operation is complete, skbs will be freed from tcp_mmap_hook() */
 	tp->copied_seq = seq;
-	tcp_rcv_space_adjust(sk);
-
-	/* Clean up data we have read: This will do ACK frames. */
-	tcp_recv_skb(sk, seq, &offset);
-	tcp_cleanup_rbuf(sk, size);
 
 	ret = 0;
 out:
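
Illustration only, not part of the series: the lock ordering that the
four patches aim for can be modelled in user space. The sketch below is
a rough approximation under stated assumptions. pthread mutexes stand in
for the socket lock and for mm->mmap_sem, and every model_* and map_*
name is hypothetical; only the enum mirrors the one added in patch 1/4.
It shows the PREPARE phase taking the socket lock before mmap_sem, and
the COMMIT or ROLLBACK phase running only after mmap_sem has been
dropped, which is where patch 4/4 places the skb cleanup.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Same three phases as the enum added in patch 1/4. */
enum mmap_hook { MMAP_HOOK_PREPARE, MMAP_HOOK_ROLLBACK, MMAP_HOOK_COMMIT };

/* Hypothetical stand-in for a socket-backed struct file. */
struct model_file {
	pthread_mutex_t sock_lock;	/* models lock_sock()/release_sock() */
	int sock_locked;		/* 1 while sock_lock is held */
	int commits;			/* COMMIT phases seen */
	int rollbacks;			/* ROLLBACK phases seen */
};

/* Models mm->mmap_sem; the flag lets the hook check lock ordering. */
static pthread_mutex_t model_mmap_sem = PTHREAD_MUTEX_INITIALIZER;
static int model_mmap_sem_held;

static void model_file_init(struct model_file *f)
{
	pthread_mutex_init(&f->sock_lock, NULL);
	f->sock_locked = 0;
	f->commits = 0;
	f->rollbacks = 0;
}

/* Models tcp_mmap_hook(): lock on PREPARE, unlock on COMMIT or ROLLBACK. */
static int model_hook(struct model_file *f, enum mmap_hook mode)
{
	/* Every phase runs outside mmap_sem, so the socket lock nests outside it. */
	assert(!model_mmap_sem_held);

	if (mode == MMAP_HOOK_PREPARE) {
		pthread_mutex_lock(&f->sock_lock);
		f->sock_locked = 1;
		return 0;
	}
	if (mode == MMAP_HOOK_COMMIT)
		f->commits++;		/* deferred cleanup would go here */
	else
		f->rollbacks++;
	f->sock_locked = 0;
	pthread_mutex_unlock(&f->sock_lock);
	return 0;
}

/* Models vm_mmap_pgoff(); do_map returns a negative value on failure. */
static long model_vm_mmap(struct model_file *f,
			  long (*do_map)(struct model_file *))
{
	long ret;

	if (model_hook(f, MMAP_HOOK_PREPARE))
		return -1;
	pthread_mutex_lock(&model_mmap_sem);
	model_mmap_sem_held = 1;
	ret = do_map(f);		/* runs with both locks held */
	model_mmap_sem_held = 0;
	pthread_mutex_unlock(&model_mmap_sem);
	model_hook(f, ret < 0 ? MMAP_HOOK_ROLLBACK : MMAP_HOOK_COMMIT);
	return ret;
}

/* Two hypothetical mapping callbacks: one succeeds, one fails. */
static long map_ok(struct model_file *f)
{
	return f->sock_locked ? 4096 : -1;
}

static long map_fail(struct model_file *f)
{
	(void)f;
	return -12;			/* mimics -ENOMEM */
}
```

Calling model_vm_mmap(&f, map_ok) on an initialised model_file ends in
the COMMIT phase, and model_vm_mmap(&f, map_fail) ends in ROLLBACK. In
both cases the modelled socket lock is released only after the modelled
mmap_sem, so no path acquires the two locks in the opposite order.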