{"id":809344,"url":"http://patchwork.ozlabs.org/api/1.2/patches/809344/?format=json","web_url":"http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20170903181513.29635-2-fbarrat@linux.vnet.ibm.com/","project":{"id":2,"url":"http://patchwork.ozlabs.org/api/1.2/projects/2/?format=json","name":"Linux PPC development","link_name":"linuxppc-dev","list_id":"linuxppc-dev.lists.ozlabs.org","list_email":"linuxppc-dev@lists.ozlabs.org","web_url":"https://github.com/linuxppc/wiki/wiki","scm_url":"https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git","webscm_url":"https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/","list_archive_url":"https://lore.kernel.org/linuxppc-dev/","list_archive_url_format":"https://lore.kernel.org/linuxppc-dev/{}/","commit_url_format":"https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id={}"},"msgid":"<20170903181513.29635-2-fbarrat@linux.vnet.ibm.com>","list_archive_url":"https://lore.kernel.org/linuxppc-dev/20170903181513.29635-2-fbarrat@linux.vnet.ibm.com/","date":"2017-09-03T18:15:13","name":"[v3,2/2] cxl: Enable global TLBIs for cxl contexts","commit_ref":"03b8abedf4f4965e7e9e0d4f92877c42c07ce19f","pull_url":null,"state":"accepted","archived":false,"hash":"699c61551174ae82a2e1164dfa1bd265f3fbd398","submitter":{"id":67555,"url":"http://patchwork.ozlabs.org/api/1.2/people/67555/?format=json","name":"Frederic Barrat","email":"fbarrat@linux.vnet.ibm.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20170903181513.29635-2-fbarrat@linux.vnet.ibm.com/mbox/","series":[{"id":1267,"url":"http://patchwork.ozlabs.org/api/1.2/series/1267/?format=json","web_url":"http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=1267","date":"2017-09-03T18:15:12","name":"[v3,1/2] powerpc/mm: Export flush_all_mm()","version":3,"mbox":"http://patchwork.ozlabs.org/series/1267/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/809344/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/809344/checks/","tags":{},"related":[],"headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xlh8M3Ld3z9s7h\n\tfor <patchwork-incoming@ozlabs.org>;\n\tMon,  4 Sep 2017 04:19:31 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3xlh8M240lzDrnJ\n\tfor <patchwork-incoming@ozlabs.org>;\n\tMon,  4 Sep 2017 04:19:31 +1000 (AEST)","from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com\n\t[148.163.156.1])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3xlh3g4LPKzDqjs\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tMon,  4 Sep 2017 04:15:27 +1000 (AEST)","from pps.filterd (m0098404.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id\n\tv83IDUKk005348\n\tfor <linuxppc-dev@lists.ozlabs.org>; Sun, 3 Sep 2017 14:15:25 -0400","from e06smtp11.uk.ibm.com (e06smtp11.uk.ibm.com [195.75.94.107])\n\tby mx0a-001b2d01.pphosted.com with ESMTP id 2cqrqy2faj-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <linuxppc-dev@lists.ozlabs.org>; Sun, 03 Sep 2017 14:15:24 -0400","from localhost\n\tby e06smtp11.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! Violators will be prosecuted\n\tfor <linuxppc-dev@lists.ozlabs.org> from <fbarrat@linux.vnet.ibm.com>;\n\tSun, 3 Sep 2017 19:15:22 +0100","from b06cxnps3075.portsmouth.uk.ibm.com (9.149.109.195)\n\tby e06smtp11.uk.ibm.com (192.168.101.141) with IBM ESMTP SMTP\n\tGateway: Authorized Use Only! Violators will be prosecuted; \n\tSun, 3 Sep 2017 19:15:20 +0100","from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com\n\t[9.149.105.232])\n\tby b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with\n\tESMTP id v83IFKGx14876776; Sun, 3 Sep 2017 18:15:20 GMT","from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1])\n\tby IMSVA (Postfix) with ESMTP id 2BF1C52043;\n\tSun,  3 Sep 2017 18:10:41 +0100 (BST)","from localhost.localdomain (unknown [9.167.235.194])\n\tby d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 4C89F5203F; \n\tSun,  3 Sep 2017 18:10:40 +0100 (BST)"],"From":"Frederic Barrat <fbarrat@linux.vnet.ibm.com>","To":"mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org,\n\tbenh@kernel.crashing.org, andrew.donnellan@au1.ibm.com,\n\tclombard@linux.vnet.ibm.com, vaibhav@linux.vnet.ibm.com","Subject":"[PATCH v3 2/2] cxl: Enable global TLBIs for cxl contexts","Date":"Sun,  3 Sep 2017 20:15:13 +0200","X-Mailer":"git-send-email 2.11.0","In-Reply-To":"<20170903181513.29635-1-fbarrat@linux.vnet.ibm.com>","References":"<20170903181513.29635-1-fbarrat@linux.vnet.ibm.com>","X-TM-AS-GCONF":"00","x-cbid":"17090318-0040-0000-0000-000003F54853","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17090318-0041-0000-0000-00002095B941","Message-Id":"<20170903181513.29635-2-fbarrat@linux.vnet.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-09-03_05:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000\n\tdefinitions=main-1709030303","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"alistair@popple.id.au","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"},"content":"The PSL and nMMU need to see all TLB invalidations for the memory\ncontexts used on the adapter. For the hash memory model, it is done by\nmaking all TLBIs global as soon as the cxl driver is in use. For\nradix, we need something similar, but we can refine and only convert\nto global the invalidations for contexts actually used by the device.\n\nThe new mm_context_add_copro() API increments the 'active_cpus' count\nfor the contexts attached to the cxl adapter. As soon as there's more\nthan 1 active cpu, the TLBIs for the context become global. Active cpu\ncount must be decremented when detaching to restore locality if\npossible and to avoid overflowing the counter.\n\nThe hash memory model support is somewhat limited, as we can't\ndecrement the active cpus count when mm_context_remove_copro() is\ncalled, because we can't flush the TLB for a mm on hash. So TLBIs\nremain global on hash.\n\nSigned-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>\nFixes: f24be42aab37 (\"cxl: Add psl9 specific code\")\n---\nChangelog:\nv3: don't decrement active cpus count with hash, as we don't know how to flush\nv2: Replace flush_tlb_mm() by the new flush_all_mm() to flush the TLBs\nand PWCs (thanks to Ben)\n\n arch/powerpc/include/asm/mmu_context.h | 46 ++++++++++++++++++++++++++++++++++\n arch/powerpc/mm/mmu_context.c          |  9 -------\n drivers/misc/cxl/api.c                 | 22 +++++++++++++---\n drivers/misc/cxl/context.c             |  3 +++\n drivers/misc/cxl/file.c                | 19 ++++++++++++--\n 5 files changed, 85 insertions(+), 14 deletions(-)","diff":"diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h\nindex 309592589e30..a0d7145d6cd2 100644\n--- a/arch/powerpc/include/asm/mmu_context.h\n+++ b/arch/powerpc/include/asm/mmu_context.h\n@@ -77,6 +77,52 @@ extern void switch_cop(struct mm_struct *next);\n extern int use_cop(unsigned long acop, struct mm_struct *mm);\n extern void drop_cop(unsigned long acop, struct mm_struct *mm);\n \n+#ifdef CONFIG_PPC_BOOK3S_64\n+static inline void inc_mm_active_cpus(struct mm_struct *mm)\n+{\n+\tatomic_inc(&mm->context.active_cpus);\n+}\n+\n+static inline void dec_mm_active_cpus(struct mm_struct *mm)\n+{\n+\tatomic_dec(&mm->context.active_cpus);\n+}\n+\n+static inline void mm_context_add_copro(struct mm_struct *mm)\n+{\n+\t/*\n+\t * On hash, should only be called once over the lifetime of\n+\t * the context, as we can't decrement the active cpus count\n+\t * and flush properly for the time being.\n+\t */\n+\tinc_mm_active_cpus(mm);\n+}\n+\n+static inline void mm_context_remove_copro(struct mm_struct *mm)\n+{\n+\t/*\n+\t * Need to broadcast a global flush of the full mm before\n+\t * decrementing active_cpus count, as the next TLBI may be\n+\t * local and the nMMU and/or PSL need to be cleaned up.\n+\t * Should be rare enough so that it's acceptable.\n+\t *\n+\t * Skip on hash, as we don't know how to do the proper flush\n+\t * for the time being. Invalidations will remain global if\n+\t * used on hash.\n+\t */\n+\tif (radix_enabled()) {\n+\t\tflush_all_mm(mm);\n+\t\tdec_mm_active_cpus(mm);\n+\t}\n+}\n+#else\n+static inline void inc_mm_active_cpus(struct mm_struct *mm) { }\n+static inline void dec_mm_active_cpus(struct mm_struct *mm) { }\n+static inline void mm_context_add_copro(struct mm_struct *mm) { }\n+static inline void mm_context_remove_copro(struct mm_struct *mm) { }\n+#endif\n+\n+\n extern void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,\n \t\t\t       struct task_struct *tsk);\n \ndiff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c\nindex 0f613bc63c50..d60a62bf4fc7 100644\n--- a/arch/powerpc/mm/mmu_context.c\n+++ b/arch/powerpc/mm/mmu_context.c\n@@ -34,15 +34,6 @@ static inline void switch_mm_pgdir(struct task_struct *tsk,\n \t\t\t\t   struct mm_struct *mm) { }\n #endif\n \n-#ifdef CONFIG_PPC_BOOK3S_64\n-static inline void inc_mm_active_cpus(struct mm_struct *mm)\n-{\n-\tatomic_inc(&mm->context.active_cpus);\n-}\n-#else\n-static inline void inc_mm_active_cpus(struct mm_struct *mm) { }\n-#endif\n-\n void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,\n \t\t\tstruct task_struct *tsk)\n {\ndiff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c\nindex a0c44d16bf30..1137a2cc1d3e 100644\n--- a/drivers/misc/cxl/api.c\n+++ b/drivers/misc/cxl/api.c\n@@ -15,6 +15,7 @@\n #include <linux/module.h>\n #include <linux/mount.h>\n #include <linux/sched/mm.h>\n+#include <linux/mmu_context.h>\n \n #include \"cxl.h\"\n \n@@ -331,9 +332,12 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed,\n \t\t/* ensure this mm_struct can't be freed */\n \t\tcxl_context_mm_count_get(ctx);\n \n-\t\t/* decrement the use count */\n-\t\tif (ctx->mm)\n+\t\tif (ctx->mm) {\n+\t\t\t/* decrement the use count from above */\n \t\t\tmmput(ctx->mm);\n+\t\t\t/* make TLBIs for this context global */\n+\t\t\tmm_context_add_copro(ctx->mm);\n+\t\t}\n \t}\n \n \t/*\n@@ -342,13 +346,25 @@ int cxl_start_context(struct cxl_context *ctx, u64 wed,\n \t */\n \tcxl_ctx_get();\n \n+\t/*\n+\t * Barrier is needed to make sure all TLBIs are global before\n+\t * we attach and the context starts being used by the adapter.\n+\t *\n+\t * Needed after mm_context_add_copro() for radix and\n+\t * cxl_ctx_get() for hash/p8\n+\t */\n+\tsmp_mb();\n+\n \tif ((rc = cxl_ops->attach_process(ctx, kernel, wed, 0))) {\n \t\tput_pid(ctx->pid);\n \t\tctx->pid = NULL;\n \t\tcxl_adapter_context_put(ctx->afu->adapter);\n \t\tcxl_ctx_put();\n-\t\tif (task)\n+\t\tif (task) {\n \t\t\tcxl_context_mm_count_put(ctx);\n+\t\t\tif (ctx->mm)\n+\t\t\t\tmm_context_remove_copro(ctx->mm);\n+\t\t}\n \t\tgoto out;\n \t}\n \ndiff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c\nindex 8c32040b9c09..12a41b2753f0 100644\n--- a/drivers/misc/cxl/context.c\n+++ b/drivers/misc/cxl/context.c\n@@ -18,6 +18,7 @@\n #include <linux/slab.h>\n #include <linux/idr.h>\n #include <linux/sched/mm.h>\n+#include <linux/mmu_context.h>\n #include <asm/cputable.h>\n #include <asm/current.h>\n #include <asm/copro.h>\n@@ -267,6 +268,8 @@ int __detach_context(struct cxl_context *ctx)\n \n \t/* Decrease the mm count on the context */\n \tcxl_context_mm_count_put(ctx);\n+\tif (ctx->mm)\n+\t\tmm_context_remove_copro(ctx->mm);\n \tctx->mm = NULL;\n \n \treturn 0;\ndiff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c\nindex 4bfad9f6dc9f..84b801b5d0e5 100644\n--- a/drivers/misc/cxl/file.c\n+++ b/drivers/misc/cxl/file.c\n@@ -19,6 +19,7 @@\n #include <linux/mm.h>\n #include <linux/slab.h>\n #include <linux/sched/mm.h>\n+#include <linux/mmu_context.h>\n #include <asm/cputable.h>\n #include <asm/current.h>\n #include <asm/copro.h>\n@@ -220,9 +221,12 @@ static long afu_ioctl_start_work(struct cxl_context *ctx,\n \t/* ensure this mm_struct can't be freed */\n \tcxl_context_mm_count_get(ctx);\n \n-\t/* decrement the use count */\n-\tif (ctx->mm)\n+\tif (ctx->mm) {\n+\t\t/* decrement the use count from above */\n \t\tmmput(ctx->mm);\n+\t\t/* make TLBIs for this context global */\n+\t\tmm_context_add_copro(ctx->mm);\n+\t}\n \n \t/*\n \t * Increment driver use count. Enables global TLBIs for hash\n@@ -230,6 +234,15 @@ static long afu_ioctl_start_work(struct cxl_context *ctx,\n \t */\n \tcxl_ctx_get();\n \n+\t/*\n+\t * Barrier is needed to make sure all TLBIs are global before\n+\t * we attach and the context starts being used by the adapter.\n+\t *\n+\t * Needed after mm_context_add_copro() for radix and\n+\t * cxl_ctx_get() for hash/p8\n+\t */\n+\tsmp_mb();\n+\n \ttrace_cxl_attach(ctx, work.work_element_descriptor, work.num_interrupts, amr);\n \n \tif ((rc = cxl_ops->attach_process(ctx, false, work.work_element_descriptor,\n@@ -240,6 +253,8 @@ static long afu_ioctl_start_work(struct cxl_context *ctx,\n \t\tctx->pid = NULL;\n \t\tcxl_ctx_put();\n \t\tcxl_context_mm_count_put(ctx);\n+\t\tif (ctx->mm)\n+\t\t\tmm_context_remove_copro(ctx->mm);\n \t\tgoto out;\n \t}\n \n","prefixes":["v3","2/2"]}