[{"id":1779318,"web_url":"http://patchwork.ozlabs.org/comment/1779318/","msgid":"<20171003234215.GA5231@redhat.com>","date":"2017-10-03T23:42:15","subject":"Re: [PATCH] mm/mmu_notifier: avoid double notification when it is\n\tuseless","submitter":{"id":1245,"url":"http://patchwork.ozlabs.org/api/people/1245/","name":"Andrea Arcangeli","email":"aarcange@redhat.com"},"content":"Hello Jerome,\n\nOn Fri, Sep 01, 2017 at 01:30:11PM -0400, Jerome Glisse wrote:\n> +Case A is obvious you do not want to take the risk for the device to write to\n> +a page that might now be use by some completely different task.\n\nused\n\n> +is true ven if the thread doing the page table update is preempted right after\n\neven\n\n> diff --git a/mm/huge_memory.c b/mm/huge_memory.c\n> index 90731e3b7e58..5706252b828a 100644\n> --- a/mm/huge_memory.c\n> +++ b/mm/huge_memory.c\n> @@ -1167,8 +1167,15 @@ static int do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, pmd_t orig_pmd,\n>  \t\tgoto out_free_pages;\n>  \tVM_BUG_ON_PAGE(!PageHead(page), page);\n>  \n> +\t/*\n> +\t * Leave pmd empty until pte is filled note we must notify here as\n> +\t * concurrent CPU thread might write to new page before the call to\n> +\t * mmu_notifier_invalidate_range_end() happen which can lead to a\n\nhappens\n\n> +\t * device seeing memory write in different order than CPU.\n> +\t *\n> +\t * See Documentation/vm/mmu_notifier.txt\n> +\t */\n>  \tpmdp_huge_clear_flush_notify(vma, haddr, vmf->pmd);\n> -\t/* leave pmd empty until pte is filled */\n>  \n\nHere we can change the following mmu_notifier_invalidate_range_end to\nskip calling ->invalidate_range. It could be called\nmmu_notifier_invalidate_range_only_end, or other suggestions\nwelcome. Otherwise we'll repeat the call for nothing.\n\nWe need it inside the PT lock for the ordering issue, but we don't\nneed to run it twice.\n\nSame in do_huge_pmd_wp_page, wp_page_copy and\nmigrate_vma_insert_page. Every time *clear_flush_notify is used\nmmu_notifier_invalidate_range_only_end should be called after it,\ninstead of mmu_notifier_invalidate_range_end.\n\nI think optimizing that part too, fits in the context of this patchset\n(if not in the same patch), because the objective is still the same:\nto remove unnecessary ->invalidate_range calls.\n\n> +\t\t\t\t * No need to notify as we downgrading page\n\nare\n\n> +\t\t\t\t * table protection not changing it to point\n> +\t\t\t\t * to a new page.\n> +\t \t\t\t *\n\n> +\t\t * No need to notify as we downgrading page table to read only\n\nare\n\n> +\t * No need to notify as we replacing a read only page with another\n\nare\n\n> @@ -1510,13 +1515,43 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,\n>  \t\t\tif (pte_soft_dirty(pteval))\n>  \t\t\t\tswp_pte = pte_swp_mksoft_dirty(swp_pte);\n>  \t\t\tset_pte_at(mm, address, pvmw.pte, swp_pte);\n> -\t\t} else\n> +\t\t} else {\n> +\t\t\t/*\n> +\t\t\t * We should not need to notify here as we reach this\n> +\t\t\t * case only from freeze_page() itself only call from\n> +\t\t\t * split_huge_page_to_list() so everything below must\n> +\t\t\t * be true:\n> +\t\t\t *   - page is not anonymous\n> +\t\t\t *   - page is locked\n> +\t\t\t *\n> +\t\t\t * So as it is a shared page and it is locked, it can\n> +\t\t\t * not be remove from the page cache and replace by\n> +\t\t\t * a new page before mmu_notifier_invalidate_range_end\n> +\t\t\t * so no concurrent thread might update its page table\n> +\t\t\t * to point at new page while a device still is using\n> +\t\t\t * this page.\n> +\t\t\t *\n> +\t\t\t * But we can not assume that new user of try_to_unmap\n> +\t\t\t * will have that in mind so just to be safe here call\n> +\t\t\t * mmu_notifier_invalidate_range()\n> +\t\t\t *\n> +\t\t\t * See Documentation/vm/mmu_notifier.txt\n> +\t\t\t */\n>  \t\t\tdec_mm_counter(mm, mm_counter_file(page));\n> +\t\t\tmmu_notifier_invalidate_range(mm, address,\n> +\t\t\t\t\t\t      address + PAGE_SIZE);\n> +\t\t}\n>  discard:\n> +\t\t/*\n> +\t\t * No need to call mmu_notifier_invalidate_range() as we are\n> +\t\t * either replacing a present pte with non present one (either\n> +\t\t * a swap or special one). We handling the clearing pte case\n> +\t\t * above.\n> +\t\t *\n> +\t\t * See Documentation/vm/mmu_notifier.txt\n> +\t\t */\n>  \t\tpage_remove_rmap(subpage, PageHuge(page));\n>  \t\tput_page(page);\n> -\t\tmmu_notifier_invalidate_range(mm, address,\n> -\t\t\t\t\t      address + PAGE_SIZE);\n>  \t}\n>  \n>  \tmmu_notifier_invalidate_range_end(vma->vm_mm, start, end);\n\nThat is the path that unmaps filebacked pages (btw, not necessarily\nshared unlike comment says, they can be private but still filebacked).\n\nI'd like some more explanation about the inner working of \"that new\nuser\" as per comment above.\n\nIt would be enough to drop mmu_notifier_invalidate_range from above\nwithout adding it to the filebacked case. The above gives higher prio\nto the hypothetical and uncertain future case, than to the current\nreal filebacked case that doesn't need ->invalidate_range inside the\nPT lock, or do you see something that might already need such\n->invalidate_range?\n\nI certainly like the patch. I expect ->invalidate_range users will\nlike the slight higher complexity in order to eliminate unnecessary\ninvalidates that are slowing them down unnecessarily. At the same time\nthis is zero risk (because a noop) for all other MMU notifier users\n(those that don't share the primary MMU pagetables, like KVM).\n\nThanks!\nAndrea","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y6FwH33jwz9t16\n\tfor <patchwork-incoming@ozlabs.org>;\n\tWed,  4 Oct 2017 10:43:27 +1100 (AEDT)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3y6FwG6x4DzDr0K\n\tfor <patchwork-incoming@ozlabs.org>;\n\tWed,  4 Oct 2017 10:43:26 +1100 (AEDT)","from mx1.redhat.com (mx1.redhat.com [209.132.183.28])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3y6Fv10XgMzDqlP\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tWed,  4 Oct 2017 10:42:20 +1100 (AEDT)","from smtp.corp.redhat.com\n\t(int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id A0D2081DF9;\n\tTue,  3 Oct 2017 23:42:17 +0000 (UTC)","from mail.random (ovpn-116-49.ams2.redhat.com [10.36.116.49])\n\tby smtp.corp.redhat.com (Postfix) with ESMTPS id D24205C542;\n\tTue,  3 Oct 2017 23:42:16 +0000 (UTC)"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=redhat.com\n\t(client-ip=209.132.183.28; helo=mx1.redhat.com;\n\tenvelope-from=aarcange@redhat.com; receiver=<UNKNOWN>)","ext-mx01.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx01.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=aarcange@redhat.com"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com A0D2081DF9","Date":"Wed, 4 Oct 2017 01:42:15 +0200","From":"Andrea Arcangeli <aarcange@redhat.com>","To":"jglisse@redhat.com","Subject":"Re: [PATCH] mm/mmu_notifier: avoid double notification when it is\n\tuseless","Message-ID":"<20171003234215.GA5231@redhat.com>","References":"<20170901173011.10745-1-jglisse@redhat.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170901173011.10745-1-jglisse@redhat.com>","User-Agent":"Mutt/1.9.1 (2017-09-22)","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.16","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.25]); Tue, 03 Oct 2017 23:42:18 +0000 (UTC)","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"Stephen Rothwell <sfr@canb.auug.org.au>, Joerg Roedel <jroedel@suse.de>, \n\tNadav Amit <nadav.amit@gmail.com>, linuxppc-dev@lists.ozlabs.org,\n\tlinux-kernel@vger.kernel.org, linux-mm@kvack.org,\n\tiommu@lists.linux-foundation.org, linux-next@vger.kernel.org,\n\tSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>,\n\tAlistair Popple <alistair@popple.id.au>,\n\tAndrew Morton <akpm@linux-foundation.org>,\n\tLinus Torvalds <torvalds@linux-foundation.org>,\n\tDavid Woodhouse <dwmw2@infradead.org>","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1779328,"web_url":"http://patchwork.ozlabs.org/comment/1779328/","msgid":"<20171004001559.GD20644@redhat.com>","date":"2017-10-04T00:15:59","subject":"Re: [PATCH] mm/mmu_notifier: avoid double notification when it is\n\tuseless","submitter":{"id":71451,"url":"http://patchwork.ozlabs.org/api/people/71451/","name":"Jerome Glisse","email":"jglisse@redhat.com"},"content":"On Wed, Oct 04, 2017 at 01:42:15AM +0200, Andrea Arcangeli wrote:\n> Hello Jerome,\n> \n> On Fri, Sep 01, 2017 at 01:30:11PM -0400, Jerome Glisse wrote:\n> > +Case A is obvious you do not want to take the risk for the device to write to\n> > +a page that might now be use by some completely different task.\n> \n> used\n> \n> > +is true ven if the thread doing the page table update is preempted right after\n> \n> even\n> \n> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c\n> > index 90731e3b7e58..5706252b828a 100644\n> > --- a/mm/huge_memory.c\n> > +++ b/mm/huge_memory.c\n> > @@ -1167,8 +1167,15 @@ static int do_huge_pmd_wp_page_fallback(struct vm_fault *vmf, pmd_t orig_pmd,\n> >  \t\tgoto out_free_pages;\n> >  \tVM_BUG_ON_PAGE(!PageHead(page), page);\n> >  \n> > +\t/*\n> > +\t * Leave pmd empty until pte is filled note we must notify here as\n> > +\t * concurrent CPU thread might write to new page before the call to\n> > +\t * mmu_notifier_invalidate_range_end() happen which can lead to a\n> \n> happens\n> \n> > +\t * device seeing memory write in different order than CPU.\n> > +\t *\n> > +\t * See Documentation/vm/mmu_notifier.txt\n> > +\t */\n> >  \tpmdp_huge_clear_flush_notify(vma, haddr, vmf->pmd);\n> > -\t/* leave pmd empty until pte is filled */\n> >  \n> \n> Here we can change the following mmu_notifier_invalidate_range_end to\n> skip calling ->invalidate_range. It could be called\n> mmu_notifier_invalidate_range_only_end, or other suggestions\n> welcome. Otherwise we'll repeat the call for nothing.\n> \n> We need it inside the PT lock for the ordering issue, but we don't\n> need to run it twice.\n> \n> Same in do_huge_pmd_wp_page, wp_page_copy and\n> migrate_vma_insert_page. Every time *clear_flush_notify is used\n> mmu_notifier_invalidate_range_only_end should be called after it,\n> instead of mmu_notifier_invalidate_range_end.\n> \n> I think optimizing that part too, fits in the context of this patchset\n> (if not in the same patch), because the objective is still the same:\n> to remove unnecessary ->invalidate_range calls.\n\nYes you are right, good idea, i will respin with that too (and with the\nvarious typo you noted thank you for that). I can do 2 patch or 1, i don't\nmind either way. I will probably do 2 as first and they can be folded into\n1 if people prefer just one.\n\n\n> \n> > +\t\t\t\t * No need to notify as we downgrading page\n> \n> are\n> \n> > +\t\t\t\t * table protection not changing it to point\n> > +\t\t\t\t * to a new page.\n> > +\t \t\t\t *\n> \n> > +\t\t * No need to notify as we downgrading page table to read only\n> \n> are\n> \n> > +\t * No need to notify as we replacing a read only page with another\n> \n> are\n> \n> > @@ -1510,13 +1515,43 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,\n> >  \t\t\tif (pte_soft_dirty(pteval))\n> >  \t\t\t\tswp_pte = pte_swp_mksoft_dirty(swp_pte);\n> >  \t\t\tset_pte_at(mm, address, pvmw.pte, swp_pte);\n> > -\t\t} else\n> > +\t\t} else {\n> > +\t\t\t/*\n> > +\t\t\t * We should not need to notify here as we reach this\n> > +\t\t\t * case only from freeze_page() itself only call from\n> > +\t\t\t * split_huge_page_to_list() so everything below must\n> > +\t\t\t * be true:\n> > +\t\t\t *   - page is not anonymous\n> > +\t\t\t *   - page is locked\n> > +\t\t\t *\n> > +\t\t\t * So as it is a shared page and it is locked, it can\n> > +\t\t\t * not be remove from the page cache and replace by\n> > +\t\t\t * a new page before mmu_notifier_invalidate_range_end\n> > +\t\t\t * so no concurrent thread might update its page table\n> > +\t\t\t * to point at new page while a device still is using\n> > +\t\t\t * this page.\n> > +\t\t\t *\n> > +\t\t\t * But we can not assume that new user of try_to_unmap\n> > +\t\t\t * will have that in mind so just to be safe here call\n> > +\t\t\t * mmu_notifier_invalidate_range()\n> > +\t\t\t *\n> > +\t\t\t * See Documentation/vm/mmu_notifier.txt\n> > +\t\t\t */\n> >  \t\t\tdec_mm_counter(mm, mm_counter_file(page));\n> > +\t\t\tmmu_notifier_invalidate_range(mm, address,\n> > +\t\t\t\t\t\t      address + PAGE_SIZE);\n> > +\t\t}\n> >  discard:\n> > +\t\t/*\n> > +\t\t * No need to call mmu_notifier_invalidate_range() as we are\n> > +\t\t * either replacing a present pte with non present one (either\n> > +\t\t * a swap or special one). We handling the clearing pte case\n> > +\t\t * above.\n> > +\t\t *\n> > +\t\t * See Documentation/vm/mmu_notifier.txt\n> > +\t\t */\n> >  \t\tpage_remove_rmap(subpage, PageHuge(page));\n> >  \t\tput_page(page);\n> > -\t\tmmu_notifier_invalidate_range(mm, address,\n> > -\t\t\t\t\t      address + PAGE_SIZE);\n> >  \t}\n> >  \n> >  \tmmu_notifier_invalidate_range_end(vma->vm_mm, start, end);\n> \n> That is the path that unmaps filebacked pages (btw, not necessarily\n> shared unlike comment says, they can be private but still filebacked).\n\nI was more refering to the fact that they are in page cache and thus\ngiven current condition they can not be migrated to a new page in our\nback. But yes it can be a private mapping, i will fix the comment.\n\n\n> I'd like some more explanation about the inner working of \"that new\n> user\" as per comment above.\n> \n> It would be enough to drop mmu_notifier_invalidate_range from above\n> without adding it to the filebacked case. The above gives higher prio\n> to the hypothetical and uncertain future case, than to the current\n> real filebacked case that doesn't need ->invalidate_range inside the\n> PT lock, or do you see something that might already need such\n> ->invalidate_range?\n\nNo i don't see any new user today that might need such invalidate but\ni was trying to be extra cautious as i have a tendency to assume that\nsomeone might do a patch that use try_to_unmap() without going through\nall the comments in the function and thus possibly using it in a an\nunexpected way from mmu_notifier callback point of view. I am fine\nwith putting the burden on new user to get it right and adding an\nextra warning in the function description to try to warn people in a\nsensible way.\n\n\n> I certainly like the patch. I expect ->invalidate_range users will\n> like the slight higher complexity in order to eliminate unnecessary\n> invalidates that are slowing them down unnecessarily. At the same time\n> this is zero risk (because a noop) for all other MMU notifier users\n> (those that don't share the primary MMU pagetables, like KVM).\n\nI have another patchset to restore the change_pte optimization for kvm\ni want to post too. I will need to do some benchmarking first to make\nsure it actualy helps even in a small way.\n\nThank you for looking into this patch, i will repost with your suggestions\nsoon.\n\nJérôme","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y6GgT3sPFz9sPt\n\tfor <patchwork-incoming@ozlabs.org>;\n\tWed,  4 Oct 2017 11:17:25 +1100 (AEDT)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3y6GgT31rwzDqpD\n\tfor <patchwork-incoming@ozlabs.org>;\n\tWed,  4 Oct 2017 11:17:25 +1100 (AEDT)","from mx1.redhat.com (mx1.redhat.com [209.132.183.28])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3y6Gdy0bzqzDqln\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tWed,  4 Oct 2017 11:16:06 +1100 (AEDT)","from smtp.corp.redhat.com\n\t(int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id BFBFA883BF;\n\tWed,  4 Oct 2017 00:16:03 +0000 (UTC)","from redhat.com (ovpn-125-137.rdu2.redhat.com [10.10.125.137])\n\tby smtp.corp.redhat.com (Postfix) with ESMTPS id 8DF615C542;\n\tWed,  4 Oct 2017 00:16:01 +0000 (UTC)"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=redhat.com\n\t(client-ip=209.132.183.28; helo=mx1.redhat.com;\n\tenvelope-from=jglisse@redhat.com; receiver=<UNKNOWN>)","ext-mx02.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx02.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=jglisse@redhat.com"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com BFBFA883BF","Date":"Tue, 3 Oct 2017 20:15:59 -0400","From":"Jerome Glisse <jglisse@redhat.com>","To":"Andrea Arcangeli <aarcange@redhat.com>","Subject":"Re: [PATCH] mm/mmu_notifier: avoid double notification when it is\n\tuseless","Message-ID":"<20171004001559.GD20644@redhat.com>","References":"<20170901173011.10745-1-jglisse@redhat.com>\n\t<20171003234215.GA5231@redhat.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=iso-8859-1","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<20171003234215.GA5231@redhat.com>","User-Agent":"Mutt/1.9.0 (2017-09-02)","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.16","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.26]); Wed, 04 Oct 2017 00:16:04 +0000 (UTC)","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"Stephen Rothwell <sfr@canb.auug.org.au>, Joerg Roedel <jroedel@suse.de>, \n\tNadav Amit <nadav.amit@gmail.com>, linuxppc-dev@lists.ozlabs.org,\n\tlinux-kernel@vger.kernel.org, linux-mm@kvack.org,\n\tiommu@lists.linux-foundation.org, linux-next@vger.kernel.org,\n\tSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>,\n\tAlistair Popple <alistair@popple.id.au>,\n\tAndrew Morton <akpm@linux-foundation.org>,\n\tLinus Torvalds <torvalds@linux-foundation.org>,\n\tDavid Woodhouse <dwmw2@infradead.org>","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}},{"id":1779348,"web_url":"http://patchwork.ozlabs.org/comment/1779348/","msgid":"<20171004012016.GE20644@redhat.com>","date":"2017-10-04T01:20:16","subject":"Re: [PATCH] mm/mmu_notifier: avoid double notification when it is\n\tuseless","submitter":{"id":71451,"url":"http://patchwork.ozlabs.org/api/people/71451/","name":"Jerome Glisse","email":"jglisse@redhat.com"},"content":"On Tue, Oct 03, 2017 at 05:43:47PM -0700, Nadav Amit wrote:\n> Jerome Glisse <jglisse@redhat.com> wrote:\n> \n> > On Wed, Oct 04, 2017 at 01:42:15AM +0200, Andrea Arcangeli wrote:\n> > \n> >> I'd like some more explanation about the inner working of \"that new\n> >> user\" as per comment above.\n> >> \n> >> It would be enough to drop mmu_notifier_invalidate_range from above\n> >> without adding it to the filebacked case. The above gives higher prio\n> >> to the hypothetical and uncertain future case, than to the current\n> >> real filebacked case that doesn't need ->invalidate_range inside the\n> >> PT lock, or do you see something that might already need such\n> >> ->invalidate_range?\n> > \n> > No i don't see any new user today that might need such invalidate but\n> > i was trying to be extra cautious as i have a tendency to assume that\n> > someone might do a patch that use try_to_unmap() without going through\n> > all the comments in the function and thus possibly using it in a an\n> > unexpected way from mmu_notifier callback point of view. I am fine\n> > with putting the burden on new user to get it right and adding an\n> > extra warning in the function description to try to warn people in a\n> > sensible way.\n> \n> I must be missing something. After the PTE is changed, but before the\n> secondary TLB notification/invalidation - What prevents another thread from\n> changing the mappings (e.g., using munmap/mmap), and setting a new page\n> at that PTE?\n> \n> Wouldn’t it end with the page being mapped without a secondary TLB flush in\n> between?\n\nmunmap would call mmu_notifier to invalidate the range too so secondary\nTLB would be properly flush before any new pte can be setup in for that\nparticular virtual address range. Unlike CPU TLB flush, secondary TLB\nflush are un-conditional and thus current pte value does not play any\nrole.\n\nCheers,\nJérôme","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y6J5f2JNwz9t2V\n\tfor <patchwork-incoming@ozlabs.org>;\n\tWed,  4 Oct 2017 12:21:42 +1100 (AEDT)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3y6J5f1RSwzDqqP\n\tfor <patchwork-incoming@ozlabs.org>;\n\tWed,  4 Oct 2017 12:21:42 +1100 (AEDT)","from mx1.redhat.com (mx1.redhat.com [209.132.183.28])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3y6J482rZJzDqlC\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tWed,  4 Oct 2017 12:20:24 +1100 (AEDT)","from smtp.corp.redhat.com\n\t(int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby mx1.redhat.com (Postfix) with ESMTPS id 601B37E43A;\n\tWed,  4 Oct 2017 01:20:21 +0000 (UTC)","from redhat.com (ovpn-125-137.rdu2.redhat.com [10.10.125.137])\n\tby smtp.corp.redhat.com (Postfix) with ESMTPS id DA1BC600C8;\n\tWed,  4 Oct 2017 01:20:18 +0000 (UTC)"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=redhat.com\n\t(client-ip=209.132.183.28; helo=mx1.redhat.com;\n\tenvelope-from=jglisse@redhat.com; receiver=<UNKNOWN>)","ext-mx03.extmail.prod.ext.phx2.redhat.com;\n\tdmarc=none (p=none dis=none) header.from=redhat.com","ext-mx03.extmail.prod.ext.phx2.redhat.com;\n\tspf=fail smtp.mailfrom=jglisse@redhat.com"],"DMARC-Filter":"OpenDMARC Filter v1.3.2 mx1.redhat.com 601B37E43A","Date":"Tue, 3 Oct 2017 21:20:16 -0400","From":"Jerome Glisse <jglisse@redhat.com>","To":"Nadav Amit <nadav.amit@gmail.com>","Subject":"Re: [PATCH] mm/mmu_notifier: avoid double notification when it is\n\tuseless","Message-ID":"<20171004012016.GE20644@redhat.com>","References":"<20170901173011.10745-1-jglisse@redhat.com>\n\t<20171003234215.GA5231@redhat.com>\n\t<20171004001559.GD20644@redhat.com>\n\t<0D64494B-AB3D-4091-B75A-883EA37BE098@gmail.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<0D64494B-AB3D-4091-B75A-883EA37BE098@gmail.com>","User-Agent":"Mutt/1.9.0 (2017-09-02)","X-Scanned-By":"MIMEDefang 2.79 on 10.5.11.11","X-Greylist":"Sender IP whitelisted, not delayed by milter-greylist-4.5.16\n\t(mx1.redhat.com [10.5.110.27]); Wed, 04 Oct 2017 01:20:22 +0000 (UTC)","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.24","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"Andrea Arcangeli <aarcange@redhat.com>,\n\tStephen Rothwell <sfr@canb.auug.org.au>, Joerg Roedel <jroedel@suse.de>, \n\tDavid Woodhouse <dwmw2@infradead.org>, linuxppc-dev@lists.ozlabs.org, \n\tLKML <linux-kernel@vger.kernel.org>,\n\t\"open list:MEMORY MANAGEMENT\" <linux-mm@kvack.org>,\n\tiommu@lists.linux-foundation.org, linux-next@vger.kernel.org,\n\tSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>,\n\tAlistair Popple <alistair@popple.id.au>,\n\tAndrew Morton <akpm@linux-foundation.org>,\n\tLinus Torvalds <torvalds@linux-foundation.org>","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}}]