From patchwork Thu Feb 6 03:08:50 2020
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 1234108
From: Leonardo Bras
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
    Andrew Morton, "Aneesh Kumar K.V", Nicholas Piggin, Christophe Leroy,
    Steven Price, Robin Murphy, Leonardo Bras, Mahesh Salgaonkar, Balbir Singh,
    Reza Arbab, Thomas Gleixner, Allison Randal, Greg Kroah-Hartman,
    Mike Rapoport, Michal Suchanek
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v6 01/11] asm-generic/pgtable: Adds generic functions to track lockless pgtable walks
Date: Thu, 6 Feb 2020 00:08:50 -0300
Message-Id: <20200206030900.147032-2-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>
References: <20200206030900.147032-1-leonardo@linux.ibm.com>

Lockless pagetable walks need to be tracked so that THP splitting/collapsing
is not done while a walk is in progress. The current solution is to disable
irqs before a lockless pagetable walk and re-enable them once it finishes.
In the code this shows up as local_irq_disable() and local_irq_enable()
wrapped around some pieces of code, usually without a comment on why they
are needed.

This patch proposes a set of generic functions to be called before starting
and after finishing a lockless pagetable walk. They make it clear that a
lockless pagetable walk happens there, and also document why the irq
disable/enable is needed.

begin_lockless_pgtbl_walk()
	Insert before starting any lockless pgtable walk
end_lockless_pgtbl_walk()
	Insert after the end of any lockless pgtable walk
	(mostly after the ptep is last used)

A memory barrier was also added to make sure there is no speculative read
outside the interrupt-disabled area. Other than that, no change of behavior
from the current code is intended.

Arch-specific versions are planned, so that additional steps can be added
while keeping the code clean.

Signed-off-by: Leonardo Bras
Reported-by: kbuild test robot
---
 include/asm-generic/pgtable.h | 51 +++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index e2e2bef07dd2..8d368d3c0974 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1222,6 +1222,57 @@ static inline bool arch_has_pfn_modify_check(void)
 #endif
 #endif
 
+#ifndef __HAVE_ARCH_LOCKLESS_PGTBL_WALK_CONTROL
+/*
+ * begin_lockless_pgtbl_walk: Must be inserted before a function call that does
+ * lockless pagetable walks, such as __find_linux_pte()
+ */
+static inline
+unsigned long begin_lockless_pgtbl_walk(void)
+{
+	unsigned long irq_mask;
+
+	/*
+	 * Interrupts must be disabled during the lockless page table walk.
+	 * That's because the deleting or splitting involves flushing TLBs,
+	 * which in turn issues interrupts, that will block when disabled.
+	 */
+	local_irq_save(irq_mask);
+
+	/*
+	 * This memory barrier pairs with any code that is either trying to
+	 * delete page tables, or split huge pages. Without this barrier,
+	 * the page tables could be read speculatively outside of interrupt
+	 * disabling.
+	 */
+	smp_mb();
+
+	return irq_mask;
+}
+
+/*
+ * end_lockless_pgtbl_walk: Must be inserted after the last use of a pointer
+ * returned by a lockless pagetable walk, such as __find_linux_pte()
+ */
+static inline void end_lockless_pgtbl_walk(unsigned long irq_mask)
+{
+	/*
+	 * This memory barrier pairs with any code that is either trying to
+	 * delete page tables, or split huge pages. Without this barrier,
+	 * the page tables could be read speculatively outside of interrupt
+	 * disabling.
+	 */
+	smp_mb();
+
+	/*
+	 * Interrupts must be disabled during the lockless page table walk.
+	 * That's because the deleting or splitting involves flushing TLBs,
+	 * which in turn issues interrupts, that will block when disabled.
+	 */
+	local_irq_restore(irq_mask);
+}
+#endif
+
 /*
  * On some architectures it depends on the mm if the p4d/pud or pmd
  * layer of the page table hierarchy is folded or not.

From patchwork Thu Feb 6 03:08:51 2020
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 1234109
From: Leonardo Bras
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
    Andrew Morton, "Aneesh Kumar K.V", Nicholas Piggin, Christophe Leroy,
    Steven Price, Robin Murphy, Leonardo Bras, Mahesh Salgaonkar, Balbir Singh,
    Reza Arbab, Thomas Gleixner, Allison Randal, Greg Kroah-Hartman,
    Mike Rapoport, Michal Suchanek
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v6 02/11] mm/gup: Use functions to track lockless pgtbl walks on gup_pgd_range
Date: Thu, 6 Feb 2020 00:08:51 -0300
Message-Id: <20200206030900.147032-3-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>
References: <20200206030900.147032-1-leonardo@linux.ibm.com>

As described, gup_pgd_range() is a lockless pagetable walk, so it disables and
re-enables irqs around the walk to protect against THP split/collapse. To make
use of the new tracking functions, replace the irq disable/enable with
{begin,end}_lockless_pgtbl_walk(). As local_irq_{save,restore} is called inside
{begin,end}_lockless_pgtbl_walk, there should be no change in behavior.

Signed-off-by: Leonardo Bras
Reported-by: kbuild test robot
---
 mm/gup.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 1b521e0ac1de..04e6f46993b6 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2369,7 +2369,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 			  struct page **pages)
 {
 	unsigned long len, end;
-	unsigned long flags;
+	unsigned long irq_mask;
 	int nr = 0;
 
 	start = untagged_addr(start) & PAGE_MASK;
@@ -2395,9 +2395,9 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 
 	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP) &&
 	    gup_fast_permitted(start, end)) {
-		local_irq_save(flags);
+		irq_mask = begin_lockless_pgtbl_walk();
 		gup_pgd_range(start, end, write ? FOLL_WRITE : 0, pages, &nr);
-		local_irq_restore(flags);
+		end_lockless_pgtbl_walk(irq_mask);
 	}
 
 	return nr;
@@ -2450,9 +2450,9 @@ static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
 
 	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP) &&
 	    gup_fast_permitted(start, end)) {
-		local_irq_disable();
+		begin_lockless_pgtbl_walk();
 		gup_pgd_range(addr, end, gup_flags, pages, &nr);
-		local_irq_enable();
+		end_lockless_pgtbl_walk(IRQS_ENABLED);
 		ret = nr;
 	}

From patchwork Thu Feb 6 03:08:52 2020
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 1234110
From: Leonardo Bras
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
    Andrew Morton, "Aneesh Kumar K.V", Nicholas Piggin, Christophe Leroy,
    Steven Price, Robin Murphy, Leonardo Bras, Mahesh Salgaonkar, Balbir Singh,
    Reza Arbab, Thomas Gleixner, Allison Randal, Greg Kroah-Hartman,
    Mike Rapoport, Michal Suchanek
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v6 03/11] powerpc/mm: Adds arch-specific functions to track lockless pgtable walks
Date: Thu, 6 Feb 2020 00:08:52 -0300
Message-Id: <20200206030900.147032-4-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>
References: <20200206030900.147032-1-leonardo@linux.ibm.com>

On powerpc, we need to do some lockless pagetable walks from functions that
already have interrupts disabled, especially from real mode with MSR[EE]=0.
In these contexts, disabling/enabling interrupts can be very troublesome.

So, this arch-specific implementation provides functions with an extra
argument that allows the interrupt enable/disable to be skipped:
__begin_lockless_pgtbl_walk() and __end_lockless_pgtbl_walk().

Functions matching the generic ones are also exported; they simply call the
above functions with {en,dis}able_irq = true.

Signed-off-by: Leonardo Bras
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |  6 ++
 arch/powerpc/mm/book3s64/pgtable.c           | 86 +++++++++++++++++++-
 2 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 201a69e6a355..78f6ffb1bb3e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1375,5 +1375,11 @@ static inline bool pgd_is_leaf(pgd_t pgd)
 	return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
 }
 
+#define __HAVE_ARCH_LOCKLESS_PGTBL_WALK_CONTROL
+unsigned long begin_lockless_pgtbl_walk(void);
+unsigned long __begin_lockless_pgtbl_walk(bool disable_irq);
+void end_lockless_pgtbl_walk(unsigned long irq_mask);
+void __end_lockless_pgtbl_walk(unsigned long irq_mask, bool enable_irq);
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 2bf7e1b4fd82..535613030363 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -82,6 +82,7 @@ static void do_nothing(void *unused)
 {
 
 }
+
 /*
  * Serialize against find_current_mm_pte which does lock-less
  * lookup in page tables with local interrupts disabled. For huge pages
@@ -98,6 +99,89 @@ void serialize_against_pte_lookup(struct mm_struct *mm)
 	smp_call_function_many(mm_cpumask(mm), do_nothing, NULL, 1);
 }
 
+/* __begin_lockless_pgtbl_walk: Must be inserted before a function call that does
+ * lockless pagetable walks, such as __find_linux_pte().
+ * This version allows setting disable_irq=false, so irqs are not touched, which
+ * is quite useful for running when ints are already disabled (like real-mode)
+ */
+inline
+unsigned long __begin_lockless_pgtbl_walk(bool disable_irq)
+{
+	unsigned long irq_mask = 0;
+
+	/*
+	 * Interrupts must be disabled during the lockless page table walk.
+	 * That's because the deleting or splitting involves flushing TLBs,
+	 * which in turn issues interrupts, that will block when disabled.
+	 *
+	 * When this function is called from realmode with MSR[EE]=0,
+	 * it's not needed to touch irq, since it's already disabled.
+	 */
+	if (disable_irq)
+		local_irq_save(irq_mask);
+
+	/*
+	 * This memory barrier pairs with any code that is either trying to
+	 * delete page tables, or split huge pages. Without this barrier,
+	 * the page tables could be read speculatively outside of interrupt
+	 * disabling or reference counting.
+	 */
+	smp_mb();
+
+	return irq_mask;
+}
+EXPORT_SYMBOL(__begin_lockless_pgtbl_walk);
+
+/* begin_lockless_pgtbl_walk: Must be inserted before a function call that does
+ * lockless pagetable walks, such as __find_linux_pte().
+ * This version is used by generic code, and always assumes irqs will be disabled
+ */
+unsigned long begin_lockless_pgtbl_walk(void)
+{
+	return __begin_lockless_pgtbl_walk(true);
+}
+EXPORT_SYMBOL(begin_lockless_pgtbl_walk);
+
+/*
+ * __end_lockless_pgtbl_walk: Must be inserted after the last use of a pointer
+ * returned by a lockless pagetable walk, such as __find_linux_pte()
+ * This version allows setting enable_irq=false, so irqs are not touched, which
+ * is quite useful for running when ints are already disabled (like real-mode)
+ */
+inline void __end_lockless_pgtbl_walk(unsigned long irq_mask, bool enable_irq)
+{
+	/*
+	 * This memory barrier pairs with any code that is either trying to
+	 * delete page tables, or split huge pages. Without this barrier,
+	 * the page tables could be read speculatively outside of interrupt
+	 * disabling or reference counting.
+	 */
+	smp_mb();
+
+	/*
+	 * Interrupts must be disabled during the lockless page table walk.
+	 * That's because the deleting or splitting involves flushing TLBs,
+	 * which in turn issues interrupts, that will block when disabled.
+	 *
+	 * When this function is called from realmode with MSR[EE]=0,
+	 * it's not needed to touch irq, since it's already disabled.
+	 */
+	if (enable_irq)
+		local_irq_restore(irq_mask);
+}
+EXPORT_SYMBOL(__end_lockless_pgtbl_walk);
+
+/*
+ * end_lockless_pgtbl_walk: Must be inserted after the last use of a pointer
+ * returned by a lockless pagetable walk, such as __find_linux_pte()
+ * This version is used by generic code, and always assumes irqs will be enabled
+ */
+void end_lockless_pgtbl_walk(unsigned long irq_mask)
+{
+	__end_lockless_pgtbl_walk(irq_mask, true);
+}
+EXPORT_SYMBOL(end_lockless_pgtbl_walk);
+
 /*
  * We use this to invalidate a pmdp entry before switching from a
  * hugepte to regular pmd entry.
@@ -487,7 +571,7 @@ static int __init setup_disable_tlbie(char *str)
 	tlbie_capable = false;
 	tlbie_enabled = false;
 
-        return 1;
+	return 1;
 }
 __setup("disable_tlbie", setup_disable_tlbie);

From patchwork Thu Feb 6 03:08:53 2020
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 1234111
From: Leonardo Bras
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
    Andrew Morton, "Aneesh Kumar K.V", Nicholas Piggin, Christophe Leroy,
    Steven Price, Robin Murphy, Leonardo Bras, Mahesh Salgaonkar, Balbir Singh,
    Reza Arbab,
    Thomas Gleixner, Allison Randal, Greg Kroah-Hartman, Mike Rapoport, Michal Suchanek
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v6 04/11] powerpc/mce_power: Use functions to track lockless pgtbl walks
Date: Thu, 6 Feb 2020 00:08:53 -0300
Message-Id: <20200206030900.147032-5-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>
References: <20200206030900.147032-1-leonardo@linux.ibm.com>

Applies the new lockless pagetable walk tracking functions to addr_to_pfn().

local_irq_{save,restore} is already called inside
{begin,end}_lockless_pgtbl_walk, so there is no need to repeat it here.

Signed-off-by: Leonardo Bras
---
 arch/powerpc/kernel/mce_power.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index 1cbf7f1a4e3d..a9e38ef4e437 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -29,7 +29,7 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
 {
 	pte_t *ptep;
 	unsigned int shift;
-	unsigned long pfn, flags;
+	unsigned long pfn, irq_mask;
 	struct mm_struct *mm;
 
 	if (user_mode(regs))
@@ -37,7 +37,7 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
 	else
 		mm = &init_mm;
 
-	local_irq_save(flags);
+	irq_mask = begin_lockless_pgtbl_walk();
 	ptep = __find_linux_pte(mm->pgd, addr, NULL, &shift);
 
 	if (!ptep || pte_special(*ptep)) {
@@ -53,7 +53,7 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
 	}
 
 out:
-	local_irq_restore(flags);
+	end_lockless_pgtbl_walk(irq_mask);
 	return pfn;
 }

From patchwork Thu Feb 6 03:08:54 2020
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 1234113
From: Leonardo Bras
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
    Andrew Morton, "Aneesh Kumar K.V", Nicholas Piggin, Christophe Leroy,
    Steven Price, Robin Murphy, Leonardo Bras, Mahesh Salgaonkar, Balbir Singh,
    Reza Arbab, Thomas Gleixner, Allison Randal, Greg Kroah-Hartman,
    Mike Rapoport, Michal Suchanek
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v6 05/11] powerpc/perf: Use functions to track lockless pgtbl walks
Date: Thu, 6 Feb 2020 00:08:54 -0300
Message-Id: <20200206030900.147032-6-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>
References: <20200206030900.147032-1-leonardo@linux.ibm.com>

Applies the new lockless pagetable walk tracking functions to
read_user_stack_slow().
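
The conversion has the same shape at every site touched by this series. A
rough sketch of the pattern (illustrative only, not the actual
read_user_stack_slow() hunk below; walk_example() and its arguments are made
up for the example):

static unsigned long walk_example(struct mm_struct *mm, unsigned long addr)
{
	unsigned long irq_mask;
	unsigned long pfn = 0;
	pte_t *ptep;

	/* Replaces local_irq_save(flags): disables irqs and issues the barrier */
	irq_mask = begin_lockless_pgtbl_walk();

	ptep = __find_linux_pte(mm->pgd, addr, NULL, NULL);
	if (ptep)
		pfn = pte_pfn(READ_ONCE(*ptep));

	/* Replaces local_irq_restore(flags); must come after the last use of ptep */
	end_lockless_pgtbl_walk(irq_mask);

	return pfn;
}
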
local_irq_{save,restore} is already called inside
{begin,end}_lockless_pgtbl_walk, so there is no need to repeat it here.

The variable that saves the irq mask was renamed from flags to irq_mask so it
doesn't lose meaning, now that it's not directly passed to the local_irq_*
functions.

Signed-off-by: Leonardo Bras
---
 arch/powerpc/perf/callchain.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c
index cbc251981209..fd9979e69f36 100644
--- a/arch/powerpc/perf/callchain.c
+++ b/arch/powerpc/perf/callchain.c
@@ -116,14 +116,14 @@ static int read_user_stack_slow(void __user *ptr, void *buf, int nb)
 	unsigned shift;
 	unsigned long addr = (unsigned long) ptr;
 	unsigned long offset;
-	unsigned long pfn, flags;
+	unsigned long pfn, irq_mask;
 	void *kaddr;
 
 	pgdir = current->mm->pgd;
 	if (!pgdir)
 		return -EFAULT;
 
-	local_irq_save(flags);
+	irq_mask = begin_lockless_pgtbl_walk();
 	ptep = find_current_mm_pte(pgdir, addr, NULL, &shift);
 	if (!ptep)
 		goto err_out;
@@ -145,7 +145,7 @@ static int read_user_stack_slow(void __user *ptr, void *buf, int nb)
 	memcpy(buf, kaddr + offset, nb);
 	ret = 0;
 err_out:
-	local_irq_restore(flags);
+	end_lockless_pgtbl_walk(irq_mask);
 	return ret;
 }

From patchwork Thu Feb 6 03:08:55 2020
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 1234114
From: Leonardo Bras
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
    Andrew Morton, "Aneesh Kumar K.V", Nicholas Piggin, Christophe Leroy,
    Steven Price, Robin Murphy, Leonardo Bras, Mahesh Salgaonkar, Balbir Singh,
    Reza Arbab, Thomas Gleixner, Allison Randal, Greg Kroah-Hartman,
    Mike Rapoport, Michal Suchanek
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v6 06/11] powerpc/mm/book3s64/hash: Use functions to track lockless pgtbl walks
Date: Thu, 6 Feb 2020 00:08:55 -0300
Message-Id: <20200206030900.147032-7-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>
References: <20200206030900.147032-1-leonardo@linux.ibm.com>

Applies the new tracking functions to all hash-related functions that do
lockless pagetable walks.

hash_page_mm: adds a comment explaining that there is no need for
local_irq_{disable,save} here, given that it is only called from the
DataAccess interrupt, so interrupts are already disabled.

local_irq_{save,restore} is already called inside
{begin,end}_lockless_pgtbl_walk, so there is no need to repeat it here.

The variable that saves the irq mask was renamed from flags to irq_mask so it
doesn't lose meaning, now that it's not directly passed to the local_irq_*
functions.
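
For call sites that already run with interrupts off, the series uses the
arch-specific variants from patch 03 instead of the generic ones. A rough
sketch of that pattern (illustrative only; walk_example_irqs_off() and its
arguments are made up, the real hunks are below):

static unsigned long walk_example_irqs_off(pgd_t *pgdir, unsigned long ea)
{
	pte_t *ptep;
	unsigned long pfn = 0;

	/* Interrupts are already disabled (e.g. MSR[EE]=0), so skip irq handling */
	__begin_lockless_pgtbl_walk(false);

	ptep = __find_linux_pte(pgdir, ea, NULL, NULL);
	if (ptep && pte_present(*ptep))
		pfn = pte_pfn(READ_ONCE(*ptep));

	/* No saved irq mask to restore: pass 0 and enable_irq = false */
	__end_lockless_pgtbl_walk(0, false);

	return pfn;
}

This mirrors how hash_page_mm below calls the helpers with disable_irq=false
from the DataAccess interrupt path.
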
Signed-off-by: Leonardo Bras --- arch/powerpc/mm/book3s64/hash_tlb.c | 6 +++--- arch/powerpc/mm/book3s64/hash_utils.c | 27 +++++++++++++++++---------- 2 files changed, 20 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/mm/book3s64/hash_tlb.c b/arch/powerpc/mm/book3s64/hash_tlb.c index 4a70d8dd39cd..86547c4151f6 100644 --- a/arch/powerpc/mm/book3s64/hash_tlb.c +++ b/arch/powerpc/mm/book3s64/hash_tlb.c @@ -194,7 +194,7 @@ void __flush_hash_table_range(struct mm_struct *mm, unsigned long start, { bool is_thp; int hugepage_shift; - unsigned long flags; + unsigned long irq_mask; start = _ALIGN_DOWN(start, PAGE_SIZE); end = _ALIGN_UP(end, PAGE_SIZE); @@ -209,7 +209,7 @@ void __flush_hash_table_range(struct mm_struct *mm, unsigned long start, * to being hashed). This is not the most performance oriented * way to do things but is fine for our needs here. */ - local_irq_save(flags); + irq_mask = begin_lockless_pgtbl_walk(); arch_enter_lazy_mmu_mode(); for (; start < end; start += PAGE_SIZE) { pte_t *ptep = find_current_mm_pte(mm->pgd, start, &is_thp, @@ -229,7 +229,7 @@ void __flush_hash_table_range(struct mm_struct *mm, unsigned long start, hpte_need_flush(mm, start, ptep, pte, hugepage_shift); } arch_leave_lazy_mmu_mode(); - local_irq_restore(flags); + end_lockless_pgtbl_walk(irq_mask); } void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd, unsigned long addr) diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c index 523d4d39d11e..e6d4ab42173b 100644 --- a/arch/powerpc/mm/book3s64/hash_utils.c +++ b/arch/powerpc/mm/book3s64/hash_utils.c @@ -1341,12 +1341,16 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea, ea &= ~((1ul << mmu_psize_defs[psize].shift) - 1); #endif /* CONFIG_PPC_64K_PAGES */ - /* Get PTE and page size from page tables */ + /* Get PTE and page size from page tables : + * Called in from DataAccess interrupt (data_access_common: 0x300), + * interrupts are disabled here. + */ + __begin_lockless_pgtbl_walk(false); ptep = find_linux_pte(pgdir, ea, &is_thp, &hugeshift); if (ptep == NULL || !pte_present(*ptep)) { DBG_LOW(" no PTE !\n"); rc = 1; - goto bail; + goto bail_pgtbl_walk; } /* Add _PAGE_PRESENT to the required access perm */ @@ -1359,7 +1363,7 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea, if (!check_pte_access(access, pte_val(*ptep))) { DBG_LOW(" no access !\n"); rc = 1; - goto bail; + goto bail_pgtbl_walk; } if (hugeshift) { @@ -1383,7 +1387,7 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea, if (current->mm == mm) check_paca_psize(ea, mm, psize, user_region); - goto bail; + goto bail_pgtbl_walk; } #ifndef CONFIG_PPC_64K_PAGES @@ -1457,6 +1461,8 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea, #endif DBG_LOW(" -> rc=%d\n", rc); +bail_pgtbl_walk: + __end_lockless_pgtbl_walk(0, false); bail: exception_exit(prev_state); return rc; @@ -1545,7 +1551,7 @@ static void hash_preload(struct mm_struct *mm, unsigned long ea, unsigned long vsid; pgd_t *pgdir; pte_t *ptep; - unsigned long flags; + unsigned long irq_mask; int rc, ssize, update_flags = 0; unsigned long access = _PAGE_PRESENT | _PAGE_READ | (is_exec ? _PAGE_EXEC : 0); @@ -1567,11 +1573,12 @@ static void hash_preload(struct mm_struct *mm, unsigned long ea, vsid = get_user_vsid(&mm->context, ea, ssize); if (!vsid) return; + /* * Hash doesn't like irqs. Walking linux page table with irq disabled * saves us from holding multiple locks. 
 	 */
-	local_irq_save(flags);
+	irq_mask = begin_lockless_pgtbl_walk();
 
 	/*
 	 * THP pages use update_mmu_cache_pmd. We don't do
@@ -1616,7 +1623,7 @@ static void hash_preload(struct mm_struct *mm, unsigned long ea,
 			    mm_ctx_user_psize(&mm->context),
 			    pte_val(*ptep));
 out_exit:
-	local_irq_restore(flags);
+	end_lockless_pgtbl_walk(irq_mask);
 }
 
 /*
@@ -1679,16 +1686,16 @@ u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address)
 {
 	pte_t *ptep;
 	u16 pkey = 0;
-	unsigned long flags;
+	unsigned long irq_mask;
 
 	if (!mm || !mm->pgd)
 		return 0;
 
-	local_irq_save(flags);
+	irq_mask = begin_lockless_pgtbl_walk();
 	ptep = find_linux_pte(mm->pgd, address, NULL, NULL);
 	if (ptep)
 		pkey = pte_to_pkey_bits(pte_val(READ_ONCE(*ptep)));
-	local_irq_restore(flags);
+	end_lockless_pgtbl_walk(irq_mask);
 
 	return pkey;
 }

From patchwork Thu Feb 6 03:08:56 2020
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 1234115
From: Leonardo Bras
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
    Andrew Morton, "Aneesh Kumar K.V", Nicholas Piggin, Christophe Leroy,
    Steven Price, Robin Murphy, Leonardo Bras, Mahesh Salgaonkar, Balbir Singh,
    Reza Arbab, Thomas Gleixner, Allison Randal, Greg Kroah-Hartman,
    Mike Rapoport, Michal Suchanek
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v6 07/11] powerpc/kvm/e500: Use functions to track lockless pgtbl walks
Date: Thu, 6 Feb 2020 00:08:56 -0300
Message-Id: <20200206030900.147032-8-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>
References: <20200206030900.147032-1-leonardo@linux.ibm.com>

Applies the new lockless pagetable walk tracking functions to
kvmppc_e500_shadow_map().

Also fixes where local_irq_restore() is called: previously, if ptep was NULL,
local_irq_restore() would never be called.

local_irq_{save,restore} is already called inside
{begin,end}_lockless_pgtbl_walk, so there is no need to repeat it here.

The variable that saves the irq mask was renamed from flags to irq_mask so it
doesn't lose meaning, now that it's not directly passed to the local_irq_*
functions.

Signed-off-by: Leonardo Bras
---
 arch/powerpc/kvm/e500_mmu_host.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 425d13806645..3dcf11f77256 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -336,7 +336,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	pte_t *ptep;
 	unsigned int wimg = 0;
 	pgd_t *pgdir;
-	unsigned long flags;
+	unsigned long irq_mask;
 
 	/* used to check for invalidations in progress */
 	mmu_seq = kvm->mmu_notifier_seq;
@@ -473,7 +473,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 	 * We are holding kvm->mmu_lock so a notifier invalidate
 	 * can't run hence pfn won't change.
 	 */
-	local_irq_save(flags);
+	irq_mask = begin_lockless_pgtbl_walk();
 	ptep = find_linux_pte(pgdir, hva, NULL, NULL);
 	if (ptep) {
 		pte_t pte = READ_ONCE(*ptep);
@@ -481,15 +481,16 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 
 		if (pte_present(pte)) {
 			wimg = (pte_val(pte) >> PTE_WIMGE_SHIFT) &
 				MAS2_WIMGE_MASK;
-			local_irq_restore(flags);
 		} else {
-			local_irq_restore(flags);
+			end_lockless_pgtbl_walk(irq_mask);
 			pr_err_ratelimited("%s: pte not present: gfn %lx,pfn %lx\n",
 					   __func__, (long)gfn, pfn);
 			ret = -EINVAL;
 			goto out;
 		}
 	}
+	end_lockless_pgtbl_walk(irq_mask);
+
 	kvmppc_e500_ref_setup(ref, gtlbe, pfn, wimg);
 
 	kvmppc_e500_setup_stlbe(&vcpu_e500->vcpu, gtlbe, tsize,

From patchwork Thu Feb 6 03:08:57 2020
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 1234116
From: Leonardo Bras
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
    Andrew Morton, "Aneesh Kumar K.V", Nicholas Piggin, Christophe Leroy,
    Steven Price, Robin Murphy, Leonardo Bras, Mahesh Salgaonkar, Balbir Singh,
    Reza Arbab, Thomas Gleixner, Allison Randal, Greg Kroah-Hartman,
    Mike Rapoport, Michal Suchanek
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v6 08/11] powerpc/kvm/book3s_hv: Use functions to track lockless pgtbl walks
Date: Thu, 6 Feb 2020 00:08:57 -0300
Message-Id: <20200206030900.147032-9-leonardo@linux.ibm.com>
In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com>
References: <20200206030900.147032-1-leonardo@linux.ibm.com>

Applies the new tracking functions to all book3s_hv related functions that do
lockless pagetable walks.

Adds comments explaining that some lockless pagetable walks don't need
protection, either because the guest pgd is not a target of THP
collapse/split, or because they are called from real mode with MSR[EE]=0.

kvmppc_do_h_enter: fixes where local_irq_restore() must be placed (after the
last usage of ptep).

Given that some of these functions can be called in real mode, and others
always are, we use __{begin,end}_lockless_pgtbl_walk so we can decide when to
disable interrupts.

Signed-off-by: Leonardo Bras
---
 arch/powerpc/kvm/book3s_hv_nested.c | 22 ++++++++++++++++++--
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 28 ++++++++++++++++----------
 2 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index dc97e5be76f6..a398061d5778 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -803,7 +803,11 @@ static void kvmhv_update_nest_rmap_rc(struct kvm *kvm, u64 n_rmap,
 	if (!gp)
 		return;
 
-	/* Find the pte */
+	/* Find the pte:
+	 * We are walking the nested guest (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+	 */
 	ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift);
 	/*
 	 * If the pte is present and the pfn is still the same, update the pte.
@@ -853,7 +857,11 @@ static void kvmhv_remove_nest_rmap(struct kvm *kvm, u64 n_rmap,
 	if (!gp)
 		return;
 
-	/* Find and invalidate the pte */
+	/* Find and invalidate the pte:
+	 * We are walking the nested guest (partition-scoped) page table here.
+	 * We can do this without disabling irq because the Linux MM
+	 * subsystem doesn't do THP splits and collapses on this tree.
+ */ ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift); /* Don't spuriously invalidate ptes if the pfn has changed */ if (ptep && pte_present(*ptep) && ((pte_val(*ptep) & mask) == hpa)) @@ -921,6 +929,11 @@ static bool kvmhv_invalidate_shadow_pte(struct kvm_vcpu *vcpu, int shift; spin_lock(&kvm->mmu_lock); + /* + * We are walking the nested guest (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift); if (!shift) shift = PAGE_SHIFT; @@ -1362,6 +1375,11 @@ static long int __kvmhv_nested_page_fault(struct kvm_run *run, /* See if can find translation in our partition scoped tables for L1 */ pte = __pte(0); spin_lock(&kvm->mmu_lock); + /* + * We are walking the secondary (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ pte_p = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift); if (!shift) shift = PAGE_SHIFT; diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 220305454c23..fd4d8f174f09 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -210,7 +210,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, pte_t *ptep; unsigned int writing; unsigned long mmu_seq; - unsigned long rcbits, irq_flags = 0; + unsigned long rcbits, irq_mask = 0; if (kvm_is_radix(kvm)) return H_FUNCTION; @@ -252,8 +252,8 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, * If we had a page table table change after lookup, we would * retry via mmu_notifier_retry. */ - if (!realmode) - local_irq_save(irq_flags); + irq_mask = __begin_lockless_pgtbl_walk(!realmode); + /* * If called in real mode we have MSR_EE = 0. Otherwise * we disable irq above. @@ -272,8 +272,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, * to <= host page size, if host is using hugepage */ if (host_pte_size < psize) { - if (!realmode) - local_irq_restore(flags); + __end_lockless_pgtbl_walk(irq_mask, !realmode); return H_PARAMETER; } pte = kvmppc_read_update_linux_pte(ptep, writing); @@ -287,8 +286,6 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, pa |= gpa & ~PAGE_MASK; } } - if (!realmode) - local_irq_restore(irq_flags); ptel &= HPTE_R_KEY | HPTE_R_PP0 | (psize-1); ptel |= pa; @@ -302,8 +299,10 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, /*If we had host pte mapping then Check WIMG */ if (ptep && !hpte_cache_flags_ok(ptel, is_ci)) { - if (is_ci) + if (is_ci) { + __end_lockless_pgtbl_walk(irq_mask, !realmode); return H_PARAMETER; + } /* * Allow guest to map emulated device memory as * uncacheable, but actually make it cacheable. @@ -311,6 +310,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, ptel &= ~(HPTE_R_W|HPTE_R_I|HPTE_R_G); ptel |= HPTE_R_M; } + __end_lockless_pgtbl_walk(irq_mask, !realmode); /* Find and lock the HPTEG slot to use */ do_insert: @@ -907,11 +907,19 @@ static int kvmppc_get_hpa(struct kvm_vcpu *vcpu, unsigned long gpa, /* Translate to host virtual address */ hva = __gfn_to_hva_memslot(memslot, gfn); - /* Try to find the host pte for that virtual address */ + /* Try to find the host pte for that virtual address : + * Called by hcall_real_table (real mode + MSR_EE=0) + * Interrupts are disabled here. 
+ */ + __begin_lockless_pgtbl_walk(false); ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift); - if (!ptep) + if (!ptep) { + __end_lockless_pgtbl_walk(0, false); return H_TOO_HARD; + } pte = kvmppc_read_update_linux_pte(ptep, writing); + __end_lockless_pgtbl_walk(0, false); + if (!pte_present(pte)) return H_TOO_HARD; From patchwork Thu Feb 6 03:08:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 1234117 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48Ck5g63CCz9sS9 for ; Thu, 6 Feb 2020 14:13:11 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727773AbgBFDNL (ORCPT ); Wed, 5 Feb 2020 22:13:11 -0500 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:20864 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727307AbgBFDNK (ORCPT ); Wed, 5 Feb 2020 22:13:10 -0500 Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 01639iQP024107; Wed, 5 Feb 2020 22:12:35 -0500 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 2xyhn3gxrc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 05 Feb 2020 22:12:35 -0500 Received: from m0098421.ppops.net (m0098421.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 0163BB82027665; Wed, 5 Feb 2020 22:12:35 -0500 Received: from ppma04wdc.us.ibm.com (1a.90.2fa9.ip4.static.sl-reverse.com [169.47.144.26]) by mx0a-001b2d01.pphosted.com with ESMTP id 2xyhn3gxr5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 05 Feb 2020 22:12:35 -0500 Received: from pps.filterd (ppma04wdc.us.ibm.com [127.0.0.1]) by ppma04wdc.us.ibm.com (8.16.0.27/8.16.0.27) with SMTP id 0163BmbC007628; Thu, 6 Feb 2020 03:12:34 GMT Received: from b03cxnp08027.gho.boulder.ibm.com (b03cxnp08027.gho.boulder.ibm.com [9.17.130.19]) by ppma04wdc.us.ibm.com with ESMTP id 2xykc9htv1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Feb 2020 03:12:34 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 0163CXS547448330 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 6 Feb 2020 03:12:33 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E98A2BE04F; Thu, 6 Feb 2020 03:12:32 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2FD1ABE051; Thu, 6 Feb 2020 03:12:23 +0000 (GMT) Received: from LeoBras.aus.stglabs.ibm.com (unknown [9.85.163.250]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 6 Feb 2020 03:12:22 +0000 (GMT) From: Leonardo Bras To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Arnd Bergmann , Andrew Morton , "Aneesh Kumar K.V" , 
Nicholas Piggin , Christophe Leroy , Steven Price , Robin Murphy , Leonardo Bras , Mahesh Salgaonkar , Balbir Singh , Reza Arbab , Thomas Gleixner , Allison Randal , Greg Kroah-Hartman , Mike Rapoport , Michal Suchanek Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v6 09/11] powerpc/kvm/book3s_64: Use functions to track lockless pgtbl walks Date: Thu, 6 Feb 2020 00:08:58 -0300 Message-Id: <20200206030900.147032-10-leonardo@linux.ibm.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com> References: <20200206030900.147032-1-leonardo@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.572 definitions=2020-02-05_06:2020-02-04,2020-02-05 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 impostorscore=0 malwarescore=0 suspectscore=0 phishscore=0 clxscore=1015 adultscore=0 mlxlogscore=588 bulkscore=0 spamscore=0 lowpriorityscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2001150001 definitions=main-2002060022 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Applies the new tracking functions to all book3s_64 related functions that do lockless pagetable walks. Adds comments explaining that some lockless pagetable walks don't need protection due to guest pgd not being a target of THP collapse/split, or due to being called from Realmode + MSR_EE = 0. Given that some of these functions are always called in realmode, we use __{begin,end}_lockless_pgtbl_walk so we can decide when to disable interrupts. local_irq_{save,restore} is already inside {begin,end}_lockless_pgtbl_walk, so there is no need to repeat it here. The variable that saves the irq mask was renamed from flags to irq_mask so it doesn't lose its meaning now that it's not directly passed to local_irq_* functions. There is also a function that uses local_irq_{en,dis}able, so the return value of begin_lockless_pgtbl_walk() is ignored and we pass IRQS_ENABLED to end_lockless_pgtbl_walk() to mimic the effect of local_irq_enable(). Signed-off-by: Leonardo Bras --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 6 ++--- arch/powerpc/kvm/book3s_64_mmu_radix.c | 34 +++++++++++++++++++++++--- arch/powerpc/kvm/book3s_64_vio_hv.c | 6 ++++- 3 files changed, 39 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 6c372f5c61b6..e7ce29a5df60 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -605,19 +605,19 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu, /* if the guest wants write access, see if that is OK */ if (!writing && hpte_is_writable(r)) { pte_t *ptep, pte; - unsigned long flags; + unsigned long irq_mask; /* * We need to protect against page table destruction * hugepage split and collapse.
*/ - local_irq_save(flags); + irq_mask = begin_lockless_pgtbl_walk(); ptep = find_current_mm_pte(mm->pgd, hva, NULL, NULL); if (ptep) { pte = kvmppc_read_update_linux_pte(ptep, 1); if (__pte_write(pte)) write_ok = 1; } - local_irq_restore(flags); + end_lockless_pgtbl_walk(irq_mask); } } diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index 803940d79b73..cda2e455baf2 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -813,20 +813,20 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu, * Read the PTE from the process' radix tree and use that * so we get the shift and attribute bits. */ - local_irq_disable(); + begin_lockless_pgtbl_walk(); ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift); /* * If the PTE disappeared temporarily due to a THP * collapse, just return and let the guest try again. */ if (!ptep) { - local_irq_enable(); + end_lockless_pgtbl_walk(IRQS_ENABLED); if (page) put_page(page); return RESUME_GUEST; } pte = *ptep; - local_irq_enable(); + end_lockless_pgtbl_walk(IRQS_ENABLED); /* If we're logging dirty pages, always map single pages */ large_enable = !(memslot->flags & KVM_MEM_LOG_DIRTY_PAGES); @@ -980,10 +980,16 @@ int kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, return 0; } + /* + * We are walking the secondary (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ ptep = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift); if (ptep && pte_present(*ptep)) kvmppc_unmap_pte(kvm, ptep, gpa, shift, memslot, kvm->arch.lpid); + return 0; } @@ -1000,6 +1006,11 @@ int kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE) return ref; + /* + * We are walking the secondary (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ ptep = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift); if (ptep && pte_present(*ptep) && pte_young(*ptep)) { old = kvmppc_radix_update_pte(kvm, ptep, _PAGE_ACCESSED, 0, @@ -1027,6 +1038,11 @@ int kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE) return ref; + /* + * We are walking the secondary (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ ptep = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift); if (ptep && pte_present(*ptep) && pte_young(*ptep)) ref = 1; @@ -1047,6 +1063,11 @@ static int kvm_radix_test_clear_dirty(struct kvm *kvm, if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE) return ret; + /* + * We are walking the secondary (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. 
+ */ ptep = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift); if (ptep && pte_present(*ptep) && pte_dirty(*ptep)) { ret = 1; @@ -1063,6 +1084,7 @@ static int kvm_radix_test_clear_dirty(struct kvm *kvm, 1UL << shift); spin_unlock(&kvm->mmu_lock); } + return ret; } @@ -1108,6 +1130,12 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm, gpa = memslot->base_gfn << PAGE_SHIFT; spin_lock(&kvm->mmu_lock); for (n = memslot->npages; n; --n) { + /* + * We are walking the secondary (partition-scoped) page table + * here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ ptep = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift); if (ptep && pte_present(*ptep)) kvmppc_unmap_pte(kvm, ptep, gpa, shift, memslot, diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c index ab6eeb8e753e..83c70c1557e4 100644 --- a/arch/powerpc/kvm/book3s_64_vio_hv.c +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c @@ -453,10 +453,14 @@ static long kvmppc_rm_ua_to_hpa(struct kvm_vcpu *vcpu, * to exit which will agains result in the below page table walk * to finish. */ + __begin_lockless_pgtbl_walk(false); ptep = __find_linux_pte(vcpu->arch.pgdir, ua, NULL, &shift); - if (!ptep || !pte_present(*ptep)) + if (!ptep || !pte_present(*ptep)) { + __end_lockless_pgtbl_walk(0, false); return -ENXIO; + } pte = *ptep; + __end_lockless_pgtbl_walk(0, false); if (!shift) shift = PAGE_SHIFT; From patchwork Thu Feb 6 03:08:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 1234118 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48Ck5y2Q1Mz9sRt for ; Thu, 6 Feb 2020 14:13:26 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727659AbgBFDNZ (ORCPT ); Wed, 5 Feb 2020 22:13:25 -0500 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:37358 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727577AbgBFDNZ (ORCPT ); Wed, 5 Feb 2020 22:13:25 -0500 Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 01639EMI096736; Wed, 5 Feb 2020 22:12:53 -0500 Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 2xyhn60kte-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 05 Feb 2020 22:12:53 -0500 Received: from m0098420.ppops.net (m0098420.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 0163Bgpt101390; Wed, 5 Feb 2020 22:12:52 -0500 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0b-001b2d01.pphosted.com with ESMTP id 2xyhn60kt2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 05 Feb 2020 22:12:52 -0500 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.0.27/8.16.0.27) with SMTP id 0163BmQl013924; Thu, 6 Feb 2020 03:12:51 GMT 
Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by ppma01dal.us.ibm.com with ESMTP id 2xykc9489r-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Feb 2020 03:12:51 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08025.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 0163Cnb937028238 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 6 Feb 2020 03:12:50 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DE62EBE053; Thu, 6 Feb 2020 03:12:49 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A1A82BE051; Thu, 6 Feb 2020 03:12:33 +0000 (GMT) Received: from LeoBras.aus.stglabs.ibm.com (unknown [9.85.163.250]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 6 Feb 2020 03:12:33 +0000 (GMT) From: Leonardo Bras To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Arnd Bergmann , Andrew Morton , "Aneesh Kumar K.V" , Nicholas Piggin , Christophe Leroy , Steven Price , Robin Murphy , Leonardo Bras , Mahesh Salgaonkar , Balbir Singh , Reza Arbab , Thomas Gleixner , Allison Randal , Greg Kroah-Hartman , Mike Rapoport , Michal Suchanek Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v6 10/11] powerpc/mm: Adds counting method to track lockless pagetable walks Date: Thu, 6 Feb 2020 00:08:59 -0300 Message-Id: <20200206030900.147032-11-leonardo@linux.ibm.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com> References: <20200206030900.147032-1-leonardo@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.572 definitions=2020-02-05_06:2020-02-04,2020-02-05 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 priorityscore=1501 adultscore=0 mlxscore=0 mlxlogscore=999 malwarescore=0 spamscore=0 lowpriorityscore=0 bulkscore=0 clxscore=1015 phishscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2001150001 definitions=main-2002060022 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Implements an additional feature to track lockless pagetable walks, using a per-cpu counter: lockless_pgtbl_walk_counter. Before a lockless pagetable walk, preemption is disabled and the current cpu's counter is increased. When the lockless pagetable walk finishes, the current cpu's counter is decreased and preemption is enabled. With that, it's possible to know on which cpus lockless pagetable walks are happening, and to optimize serialize_against_pte_lookup(). Implementation notes: - Every counter can be changed only by its own CPU - It makes use of the original memory barrier in the functions - Any counter can be read by any CPU Due to not locking nor using atomic variables, the impact on the lockless pagetable walk is intended to be minimal.
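As a rough, illustrative sketch (not part of the patch): the per-cpu bookkeeping described above is what __begin_lockless_pgtbl_walk()/__end_lockless_pgtbl_walk() do around the irq handling and memory barrier; the standalone helper names below are invented only to show the pattern in isolation.

#include <linux/percpu.h>
#include <linux/preempt.h>

static DEFINE_PER_CPU(int, lockless_pgtbl_walk_counter);

/* Invented name for illustration; in the patch this bookkeeping runs
 * inside __begin_lockless_pgtbl_walk().
 */
static inline void note_lockless_walk_begin(void)
{
        /* No cpu migration between begin and end, so the same cpu's
         * counter is incremented here and later decremented.
         */
        preempt_disable();
        this_cpu_inc(lockless_pgtbl_walk_counter);
}

/* Invented name for illustration; in the patch this bookkeeping runs
 * inside __end_lockless_pgtbl_walk().
 */
static inline void note_lockless_walk_end(void)
{
        this_cpu_dec(lockless_pgtbl_walk_counter);
        preempt_enable();
}

/* Any cpu may read another cpu's count, e.g.
 * per_cpu(lockless_pgtbl_walk_counter, cpu) > 0
 */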
Signed-off-by: Leonardo Bras --- arch/powerpc/mm/book3s64/pgtable.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index 535613030363..bb138b628f86 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -83,6 +83,7 @@ static void do_nothing(void *unused) } +static DEFINE_PER_CPU(int, lockless_pgtbl_walk_counter); /* * Serialize against find_current_mm_pte which does lock-less * lookup in page tables with local interrupts disabled. For huge pages @@ -120,6 +121,15 @@ unsigned long __begin_lockless_pgtbl_walk(bool disable_irq) if (disable_irq) local_irq_save(irq_mask); + /* + * Counts this instance of lockless pagetable walk for this cpu. + * Disables preempt to make sure there is no cpu change between + * begin/end lockless pagetable walk, so that percpu counting + * works fine. + */ + preempt_disable(); + (*this_cpu_ptr(&lockless_pgtbl_walk_counter))++; + /* * This memory barrier pairs with any code that is either trying to * delete page tables, or split huge pages. Without this barrier, @@ -158,6 +168,14 @@ inline void __end_lockless_pgtbl_walk(unsigned long irq_mask, bool enable_irq) */ smp_mb(); + /* + * Removes this instance of lockless pagetable walk for this cpu. + * Enables preempt only after end lockless pagetable walk, + * so that percpu counting works fine. + */ + (*this_cpu_ptr(&lockless_pgtbl_walk_counter))--; + preempt_enable(); + /* * Interrupts must be disabled during the lockless page table walk. * That's because the deleting or splitting involves flushing TLBs, From patchwork Thu Feb 6 03:09:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 1234119 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48Ck6F1sJzz9sRX for ; Thu, 6 Feb 2020 14:13:41 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727577AbgBFDNl (ORCPT ); Wed, 5 Feb 2020 22:13:41 -0500 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:33360 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727762AbgBFDNk (ORCPT ); Wed, 5 Feb 2020 22:13:40 -0500 Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 01639iJP087061; Wed, 5 Feb 2020 22:13:08 -0500 Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 2xyhmhvkhn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 05 Feb 2020 22:13:08 -0500 Received: from m0098413.ppops.net (m0098413.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 0163A3Ax088408; Wed, 5 Feb 2020 22:13:07 -0500 Received: from ppma02wdc.us.ibm.com (aa.5b.37a9.ip4.static.sl-reverse.com [169.55.91.170]) by mx0b-001b2d01.pphosted.com with ESMTP id 2xyhmhvkhc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 05 Feb 2020 22:13:07 -0500 Received: 
from pps.filterd (ppma02wdc.us.ibm.com [127.0.0.1]) by ppma02wdc.us.ibm.com (8.16.0.27/8.16.0.27) with SMTP id 0163D2Hi019950; Thu, 6 Feb 2020 03:13:07 GMT Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by ppma02wdc.us.ibm.com with ESMTP id 2xykc9hu9p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Feb 2020 03:13:07 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 0163D5Ii34275744 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 6 Feb 2020 03:13:05 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A620FBE056; Thu, 6 Feb 2020 03:13:05 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 57003BE04F; Thu, 6 Feb 2020 03:12:52 +0000 (GMT) Received: from LeoBras.aus.stglabs.ibm.com (unknown [9.85.163.250]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 6 Feb 2020 03:12:51 +0000 (GMT) From: Leonardo Bras To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Arnd Bergmann , Andrew Morton , "Aneesh Kumar K.V" , Nicholas Piggin , Christophe Leroy , Steven Price , Robin Murphy , Leonardo Bras , Mahesh Salgaonkar , Balbir Singh , Reza Arbab , Thomas Gleixner , Allison Randal , Greg Kroah-Hartman , Mike Rapoport , Michal Suchanek Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v6 11/11] powerpc/mm/book3s64/pgtable: Uses counting method to skip serializing Date: Thu, 6 Feb 2020 00:09:00 -0300 Message-Id: <20200206030900.147032-12-leonardo@linux.ibm.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com> References: <20200206030900.147032-1-leonardo@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138, 18.0.572 definitions=2020-02-05_06:2020-02-04,2020-02-05 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 clxscore=1015 impostorscore=0 malwarescore=0 priorityscore=1501 adultscore=0 mlxlogscore=980 suspectscore=0 bulkscore=0 lowpriorityscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2001150001 definitions=main-2002060022 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org For each cpu in the cpumask, checks whether it is currently running a lockless pagetable walk, so that serialize_against_pte_lookup() only targets those cpus. serialize_against_pte_lookup() can take a long time when there are many cpus in the cpumask. This method is intended to reduce that waiting without significantly impacting the lockless pagetable walk itself.
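As a rough usage sketch (not part of the patch), this is how a walker and the serializer are meant to interact, using the helper names introduced earlier in the series; example_walk() and its arguments are invented for illustration, while find_current_mm_pte() is the existing powerpc helper already used elsewhere in the series.

#include <linux/mm.h>
#include <asm/pte-walk.h>

static bool example_walk(struct mm_struct *mm, unsigned long addr)
{
        unsigned long irq_mask;
        pte_t *ptep;
        bool present = false;

        /* Disables irqs and raises this cpu's lockless_pgtbl_walk_counter,
         * so serialize_against_pte_lookup() will include this cpu.
         */
        irq_mask = begin_lockless_pgtbl_walk();

        ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
        if (ptep)
                present = pte_present(*ptep);

        /* Drops the counter and restores irqs; cpus whose counter is zero
         * are skipped by the serializer in the hunk below.
         */
        end_lockless_pgtbl_walk(irq_mask);

        return present;
}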
Signed-off-by: Leonardo Bras --- arch/powerpc/mm/book3s64/pgtable.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index bb138b628f86..4822ff1aac4b 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -96,8 +96,22 @@ static DEFINE_PER_CPU(int, lockless_pgtbl_walk_counter); */ void serialize_against_pte_lookup(struct mm_struct *mm) { + int cpu; + struct cpumask cm; + smp_mb(); - smp_call_function_many(mm_cpumask(mm), do_nothing, NULL, 1); + + /* + * Fills a new cpumask only with cpus that are currently doing a + * lockless pagetable walk. This reduces time spent in this function. + */ + cpumask_clear(&cm); + for_each_cpu(cpu, mm_cpumask((mm))) { + if (per_cpu(lockless_pgtbl_walk_counter, cpu) > 0) + cpumask_set_cpu(cpu, &cm); + } + + smp_call_function_many(&cm, do_nothing, NULL, 1); } /* begin_lockless_pgtbl_walk: Must be inserted before a function call that does