| Message ID | 20190205005115.2215-2-khalid.elmously@canonical.com |
|---|---|
| State | New |
| Series | Fix for LP #1799237 (mprotect() failure) |
On 05.02.19 01:51, Khalid Elmously wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
>
> BugLink: http://bugs.launchpad.net/bugs/1799237
>
> It turns out that we should *not* invert all not-present mappings,
> because the all zeroes case is obviously special.
>
> clear_page() does not undergo the XOR logic to invert the address bits,
> i.e. PTE, PMD and PUD entries that have not been individually written
> will have val=0 and so will trigger __pte_needs_invert(). As a result,
> {pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones
> (adjusted by the max PFN mask) instead of zero. A zeroed entry is ok
> because the page at physical address 0 is reserved early in boot
> specifically to mitigate L1TF, so explicitly exempt them from the
> inversion when reading the PFN.
>
> Manifested as an unexpected mprotect(..., PROT_NONE) failure when called
> on a VMA that has VM_PFNMAP and was mmap'd to as something other than
> PROT_NONE but never used. mprotect() sends the PROT_NONE request down
> prot_none_walk(), which walks the PTEs to check the PFNs.
> prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns
> -EACCES because it thinks mprotect() is trying to adjust a high MMIO
> address.
>
> [ This is a very modified version of Sean's original patch, but all
>   credit goes to Sean for doing this and also pointing out that
>   sometimes the __pte_needs_invert() function only gets the protection
>   bits, not the full eventual pte. But zero remains special even in
>   just protection bits, so that's ok.
>
>      - Linus ]
>
> Fixes: f22cc87f6c1f ("x86/speculation/l1tf: Invert all not present mappings")
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Acked-by: Andi Kleen <ak@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> (cherry picked from commit f19f5c49bbc3ffcc9126cc245fc1b24cc29f4a37)
> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

Acked-by: Stefan Bader <stefan.bader@canonical.com>

> ---

Is related to L1TF and from what I saw we have it in Xenial via stable,
so makes sense.

>  arch/x86/include/asm/pgtable-invert.h | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/pgtable-invert.h b/arch/x86/include/asm/pgtable-invert.h
> index 44b1203ece12..a0c1525f1b6f 100644
> --- a/arch/x86/include/asm/pgtable-invert.h
> +++ b/arch/x86/include/asm/pgtable-invert.h
> @@ -4,9 +4,18 @@
>
>  #ifndef __ASSEMBLY__
>
> +/*
> + * A clear pte value is special, and doesn't get inverted.
> + *
> + * Note that even users that only pass a pgprot_t (rather
> + * than a full pte) won't trigger the special zero case,
> + * because even PAGE_NONE has _PAGE_PROTNONE | _PAGE_ACCESSED
> + * set. So the all zero case really is limited to just the
> + * cleared page table entry case.
> + */
>  static inline bool __pte_needs_invert(u64 val)
>  {
> -	return !(val & _PAGE_PRESENT);
> +	return val && !(val & _PAGE_PRESENT);
>  }
>
>  /* Get a mask to xor with the page table entry to get the correct pfn. */
On 2/5/19 1:51 AM, Khalid Elmously wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
>
> BugLink: http://bugs.launchpad.net/bugs/1799237
>
> [...]
>
> (cherry picked from commit f19f5c49bbc3ffcc9126cc245fc1b24cc29f4a37)
> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

I was able to reproduce the issue with 4.15.0-44-generic and confirm
that this patch fixes the issue.

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
On 2/5/19 1:51 AM, Khalid Elmously wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
>
> BugLink: http://bugs.launchpad.net/bugs/1799237
>
> [...]
>
> (cherry picked from commit f19f5c49bbc3ffcc9126cc245fc1b24cc29f4a37)
> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

Applied to bionic/master-next branch.

Thanks,
Kleber
diff --git a/arch/x86/include/asm/pgtable-invert.h b/arch/x86/include/asm/pgtable-invert.h
index 44b1203ece12..a0c1525f1b6f 100644
--- a/arch/x86/include/asm/pgtable-invert.h
+++ b/arch/x86/include/asm/pgtable-invert.h
@@ -4,9 +4,18 @@

 #ifndef __ASSEMBLY__

+/*
+ * A clear pte value is special, and doesn't get inverted.
+ *
+ * Note that even users that only pass a pgprot_t (rather
+ * than a full pte) won't trigger the special zero case,
+ * because even PAGE_NONE has _PAGE_PROTNONE | _PAGE_ACCESSED
+ * set. So the all zero case really is limited to just the
+ * cleared page table entry case.
+ */
 static inline bool __pte_needs_invert(u64 val)
 {
-	return !(val & _PAGE_PRESENT);
+	return val && !(val & _PAGE_PRESENT);
 }

 /* Get a mask to xor with the page table entry to get the correct pfn. */