From patchwork Thu Dec 1 16:04:17 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Stefan Bader
X-Patchwork-Id: 128721
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from chlorine.canonical.com (chlorine.canonical.com [91.189.94.204])
 by ozlabs.org (Postfix) with ESMTP id F29E11007D1
 for ; Fri, 2 Dec 2011 03:04:29 +1100 (EST)
Received: from localhost ([127.0.0.1] helo=chlorine.canonical.com)
 by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from )
 id 1RW97W-0000Ri-Ib; Thu, 01 Dec 2011 16:04:22 +0000
Received: from youngberry.canonical.com ([91.189.89.112])
 by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from )
 id 1RW97T-0000RW-MF
 for kernel-team@lists.ubuntu.com; Thu, 01 Dec 2011 16:04:19 +0000
Received: from p5b2e33b7.dip.t-dialin.net ([91.46.51.183] helo=canonical.com)
 by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16)
 (Exim 4.71) (envelope-from ) id 1RW97T-00062H-Cu
 for kernel-team@lists.ubuntu.com; Thu, 01 Dec 2011 16:04:19 +0000
From: Stefan Bader
To: kernel-team@lists.ubuntu.com
Subject: [Precise pre-up] Pick k(un)map_atomic fix
Date: Thu, 1 Dec 2011 17:04:17 +0100
Message-Id: <1322755457-6024-1-git-send-email-stefan.bader@canonical.com>
X-Mailer: git-send-email 1.7.5.4
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: kernel-team-bounces@lists.ubuntu.com
Errors-To: kernel-team-bounces@lists.ubuntu.com

This has been verified in older releases and we carry it in Natty
and in Oneiric. Unfortunately its upstreaming into 3.2 is not certain
for rather procedural reasons. Quoting Andrew (not amused) Morton:

"I sent this patch to the x86 maintainers two weeks ago. It was
ignored, as were the other 11 patches I sent.
Later I will resend them all. If they are again ignored I will later
send them yet again, and so on."

But since it has been verified to cure rather nasty failures in the
cloud (Xen) we should put it into Precise right now. If it lands
upstream in time it can simply be rebased out of existence. Maybe it
needs "UBUNTU SAUCE:" tagging...

-Stefan

From b39e4363068122a5d36a26cc656c365d2341c1d8 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk
Date: Wed, 30 Nov 2011 15:03:08 +1100
Subject: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be
 synchronous, regardless of lazy_mmu mode

Fix an outstanding issue that has been reported since 2.6.37. On a
heavily loaded machine, processing "fork()" calls could crash with:

BUG: unable to handle kernel paging request at f573fc8c
IP: [] swap_count_continued+0x104/0x180
*pdpt = 000000002a3b9027 *pde = 0000000001bed067 *pte = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in:
Pid: 1638, comm: apache2 Not tainted 3.0.4-linode37 #1
EIP: 0061:[] EFLAGS: 00210246 CPU: 3
EIP is at swap_count_continued+0x104/0x180
.. snip..
Call Trace:
 [] ? __swap_duplicate+0xc2/0x160
 [] ? pte_mfn_to_pfn+0x87/0xe0
 [] ? swap_duplicate+0x14/0x40
 [] ? copy_pte_range+0x45b/0x500
 [] ? copy_page_range+0x195/0x200
 [] ? dup_mmap+0x1c6/0x2c0
 [] ? dup_mm+0xa8/0x130
 [] ? copy_process+0x98a/0xb30
 [] ? do_fork+0x4f/0x280
 [] ? getnstimeofday+0x43/0x100
 [] ? sys_clone+0x30/0x40
 [] ? ptregs_clone+0x15/0x48
 [] ? syscall_call+0x7/0xb

The problem is that in copy_page_range() we turn lazy mode on, and then
in swap_entry_free() we call swap_count_continued() which ends up in:

         map = kmap_atomic(page, KM_USER0) + offset;

and then later we touch *map.

Since we are running in batched (lazy) mode, the PTE update is queued
rather than applied, so the kmap_atomic is not done synchronously and we
end up dereferencing an address whose mapping has not been set up yet.
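The ordering problem described above can be pictured with a small
user-space model. This is only a sketch: every sim_* name is
hypothetical and invented for illustration, nothing here is a kernel or
Xen API, and a value of 0 merely stands in for a not-present PTE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define SIM_SLOTS 4
#define SIM_QUEUE 8

struct sim_update {
    size_t slot;
    unsigned long val;
};

static unsigned long sim_pte[SIM_SLOTS];        /* the "real" table */
static struct sim_update sim_queue[SIM_QUEUE];  /* batched writes */
static size_t sim_queued;
static bool sim_lazy;

/* Apply every queued update to the table, in submission order. */
static void sim_flush(void)
{
    for (size_t i = 0; i < sim_queued; i++)
        sim_pte[sim_queue[i].slot] = sim_queue[i].val;
    sim_queued = 0;
}

static void sim_enter_lazy(void) { sim_lazy = true; }
static void sim_leave_lazy(void) { sim_flush(); sim_lazy = false; }

/* In lazy mode the write is only queued; otherwise it lands at once. */
static void sim_set_pte(size_t slot, unsigned long val)
{
    assert(slot < SIM_SLOTS);
    if (sim_lazy && sim_queued < SIM_QUEUE) {
        sim_queue[sim_queued++] = (struct sim_update){ slot, val };
        return;
    }
    sim_flush();            /* preserve ordering before a direct write */
    sim_pte[slot] = val;
}

/* Map a value, optionally flush, then read the slot back immediately. */
static unsigned long sim_map_and_read(size_t slot, unsigned long val,
                                      bool flush)
{
    sim_set_pte(slot, val);
    if (flush)
        sim_flush();
    return sim_pte[slot];   /* 0 models a mapping that is not there yet */
}

/* Returns 0 when the model behaves as the description above predicts. */
int sim_demo(void)
{
    sim_enter_lazy();
    unsigned long stale = sim_map_and_read(0, 0x1000, false);
    unsigned long fresh = sim_map_and_read(1, 0x2000, true);
    sim_leave_lazy();

    return (stale == 0 && fresh == 0x2000 && sim_pte[0] == 0x1000) ? 0 : 1;
}
```

In this model the first read sees an empty slot because the update is
still sitting in the queue, while the read that follows an explicit
flush sees the new value. The real batching is done by the paravirt
backend and is considerably more involved than this.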

Looking at kmap_atomic_prot_pfn(), it uses 'arch_flush_lazy_mmu_mode' and
doing the same in kmap_atomic_prot() and __kunmap_atomic() makes the problem
go away.

Interestingly, commit b8bcfe997e4615 ("x86/paravirt: remove lazy mode in
interrupts") removed part of this to fix an interrupt issue - but it went
too far and did not consider this scenario.

Signed-off-by: Konrad Rzeszutek Wilk
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Peter Zijlstra
Cc: Jeremy Fitzhardinge
Cc:
Signed-off-by: Andrew Morton

BugLink: http://bugs.launchpad.net/bugs/854050

(cherry-picked from b39e4363068122a5d36a26cc656c365d2341c1d8 linux-next)
Signed-off-by: Stefan Bader
---
 arch/x86/mm/highmem_32.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index b499626..f4f29b1 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -45,6 +45,7 @@ void *kmap_atomic_prot(struct page *page, pgprot_t prot)
 	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
 	BUG_ON(!pte_none(*(kmap_pte-idx)));
 	set_pte(kmap_pte-idx, mk_pte(page, prot));
+	arch_flush_lazy_mmu_mode();
 
 	return (void *)vaddr;
 }
@@ -88,6 +89,7 @@ void __kunmap_atomic(void *kvaddr)
 		 */
 		kpte_clear_flush(kmap_pte-idx, vaddr);
 		kmap_atomic_idx_pop();
+		arch_flush_lazy_mmu_mode();
 	}
 #ifdef CONFIG_DEBUG_HIGHMEM
 	else {