From patchwork Mon Nov 13 12:36:59 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Tonghao Zhang
X-Patchwork-Id: 837429
X-Patchwork-Delegate: davem@davemloft.net
From: Tonghao Zhang
To: netdev@vger.kernel.org
Cc: Tonghao Zhang, Martin Zhang
Subject: [PATCH v2 net-next] socket: Move the socket inuse to namespace.
Date: Mon, 13 Nov 2017 04:36:59 -0800
Message-Id: <1510576619-6110-1-git-send-email-zhangtonghao@didichuxing.com>
X-Mailer: git-send-email 1.8.3.1
X-Mailing-List: netdev@vger.kernel.org

From: Tonghao Zhang

This patch adds a member to struct netns_core: a counter of the sockets
in use (socket_inuse) in that _net_ namespace. The counter is
incremented in sock_alloc() and decremented in sock_release().

In sock_alloc(), the private data of the inode saves the socket's
_net_. When the socket is released, that pointer is used to reach the
same _net_ and decrement its socket counter.

Note that sock_release() does not use 'current->nsproxy->net_ns'.
When a task exits, do_exit() may set current->nsproxy to NULL and only
then call sock_release(), so the namespace must not be looked up
through current at that point. Using the private data of the inode for
this also saves a few bytes.

Signed-off-by: Tonghao Zhang
Signed-off-by: Martin Zhang
---
v2: fix typo and comment.
---
 include/net/netns/core.h |  1 +
 net/socket.c             | 79 ++++++++++++++++++++++++++++++++++++------------
 2 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 78eb1ff75475..bef1bc8a1721 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -11,6 +11,7 @@ struct netns_core {
 	int sysctl_somaxconn;
 
 	struct prot_inuse __percpu *inuse;
+	int __percpu *socket_inuse;
 };
 
 #endif
diff --git a/net/socket.c b/net/socket.c
index c729625eb5d3..8a873d3fb2ab 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -163,12 +163,6 @@ static DEFINE_SPINLOCK(net_family_lock);
 static const struct net_proto_family __rcu *net_families[NPROTO] __read_mostly;
 
 /*
- *	Statistics counters of the socket lists
- */
-
-static DEFINE_PER_CPU(int, sockets_in_use);
-
-/*
  * Support routines.
  * Move socket addresses back and forth across the kernel/user
  * divide and look after the messy bits.
@@ -549,6 +543,50 @@ static const struct inode_operations sockfs_inode_ops = {
 	.setattr = sockfs_setattr,
 };
 
+#ifdef CONFIG_PROC_FS
+static void socket_inuse_add(struct net *net, int val)
+{
+	__this_cpu_add(*net->core.socket_inuse, val);
+}
+
+static int socket_inuse_get(struct net *net)
+{
+	int cpu, res = 0;
+
+	for_each_possible_cpu(cpu)
+		res += *per_cpu_ptr(net->core.socket_inuse, cpu);
+
+	return res >= 0 ? res : 0;
+}
+
+static int __net_init socket_inuse_init_net(struct net *net)
+{
+	net->core.socket_inuse = alloc_percpu(int);
+	return net->core.socket_inuse ? 0 : -ENOMEM;
+}
+
+static void __net_exit socket_inuse_exit_net(struct net *net)
+{
+	free_percpu(net->core.socket_inuse);
+}
+
+static struct pernet_operations socket_inuse_ops = {
+	.init = socket_inuse_init_net,
+	.exit = socket_inuse_exit_net,
+};
+
+static __init int socket_inuse_init(void)
+{
+	if (register_pernet_subsys(&socket_inuse_ops))
+		panic("Cannot initialize socket inuse counters");
+
+	return 0;
+}
+
+core_initcall(socket_inuse_init);
+#endif /* CONFIG_PROC_FS */
+
+
 /**
  *	sock_alloc - allocate a socket
  *
@@ -561,6 +599,7 @@ struct socket *sock_alloc(void)
 {
 	struct inode *inode;
 	struct socket *sock;
+	struct net *net;
 
 	inode = new_inode_pseudo(sock_mnt->mnt_sb);
 	if (!inode)
@@ -575,7 +614,15 @@ struct socket *sock_alloc(void)
 	inode->i_gid = current_fsgid();
 	inode->i_op = &sockfs_inode_ops;
 
-	this_cpu_add(sockets_in_use, 1);
+	net = current->nsproxy->net_ns;
+	/*
+	 * Save the _net_ to private data of inode. When we destroy the
+	 * socket, we can use it to access the _net_ and dec socket_inuse
+	 * counter.
+	 */
+	inode->i_private = get_net(net);
+	socket_inuse_add(net, 1);
+
 	return sock;
 }
 EXPORT_SYMBOL(sock_alloc);
@@ -602,7 +649,10 @@ void sock_release(struct socket *sock)
 	if (rcu_dereference_protected(sock->wq, 1)->fasync_list)
 		pr_err("%s: fasync list not empty!\n", __func__);
 
-	this_cpu_sub(sockets_in_use, 1);
+	/* inode->i_private saves the _net_ address. */
+	socket_inuse_add(SOCK_INODE(sock)->i_private, -1);
+	put_net(SOCK_INODE(sock)->i_private);
+
 	if (!sock->file) {
 		iput(SOCK_INODE(sock));
 		return;
@@ -2645,17 +2695,8 @@ core_initcall(sock_init);	/* early initcall */
 #ifdef CONFIG_PROC_FS
 void socket_seq_show(struct seq_file *seq)
 {
-	int cpu;
-	int counter = 0;
-
-	for_each_possible_cpu(cpu)
-		counter += per_cpu(sockets_in_use, cpu);
-
-	/* It can be negative, by the way. 8) */
-	if (counter < 0)
-		counter = 0;
-
-	seq_printf(seq, "sockets: used %d\n", counter);
+	seq_printf(seq, "sockets: used %d\n",
+		   socket_inuse_get(seq->private));
 }
 #endif				/* CONFIG_PROC_FS */
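
Illustration only, not part of the patch: the accounting scheme described in the commit message can be modelled in plain userspace C. This is a minimal sketch under stated assumptions. Every sketch_* name and SKETCH_NR_CPUS is hypothetical, a fixed array stands in for the kernel's per-CPU allocation, and the CPU is passed explicitly instead of being implied by the running context. It shows two things: the namespace is remembered in the object at allocation time, so release never has to consult the releasing task, and a single CPU's slot may go negative while the summed total stays correct.

```c
/* Userspace sketch only: none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4

/* Stand-in for struct net with its per-CPU socket_inuse counter. */
struct sketch_net {
	int socket_inuse[SKETCH_NR_CPUS];
};

/* Stand-in for a socket inode; i_private remembers the owning namespace. */
struct sketch_sock {
	struct sketch_net *i_private;
};

/* Adjust the counter slot of the CPU the caller is "running" on. */
static void sketch_inuse_add(struct sketch_net *net, int cpu, int val)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	net->socket_inuse[cpu] += val;
}

/* Sum every CPU's slot; one slot may be negative, the total is clamped at 0. */
static int sketch_inuse_get(const struct sketch_net *net)
{
	int cpu, res = 0;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		res += net->socket_inuse[cpu];

	return res >= 0 ? res : 0;
}

/* Allocation: record the namespace in the object and count the socket. */
static void sketch_sock_alloc(struct sketch_sock *sock,
			      struct sketch_net *net, int cpu)
{
	sock->i_private = net;
	sketch_inuse_add(net, cpu, 1);
}

/* Release: use the saved pointer, never the releasing task's namespace. */
static void sketch_sock_release(struct sketch_sock *sock, int cpu)
{
	sketch_inuse_add(sock->i_private, cpu, -1);
	sock->i_private = NULL;
}
```

For example, a socket allocated on CPU 0 and released on CPU 3 leaves slot 0 at +1 and slot 3 at -1, and the sum over all slots is still 0. That is why the read side has to walk every possible CPU rather than look at one slot.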