diff mbox

sparc64 WARNING: at mm/mmap.c:2757 exit_mmap+0x13c/0x160()

Message ID 20140729.162635.638786990787878495.davem@davemloft.net
State RFC
Delegated to: David Miller
Headers show

Commit Message

David Miller July 29, 2014, 11:26 p.m. UTC
From: mroos@linux.ee
Date: Thu, 17 Apr 2014 01:22:17 +0300 (EEST)

>> > Just for the archives, I got one of these again with 3.14:
>> 
>> Meelis and Aaro, thanks again for all of your reports.
>> 
>> After pouring over a lot of the data and auditing some code I'm
>> suspecting it's a problem with transparent huge pages.
>> 
>> One thing you two can do to help me further confirm this is to run
>> with THP disabled for a while and see if you still get the log
>> messages.
> 
> I have snice turned off CONFIG_TRANSPARENT_HUGEPAGE on 3 of 4 servers 
> that had this problem (actually most of my sparc64 machines) and the 4th 
> has
> 
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> CONFIG_TRANSPARENT_HUGEPAGE=y
> # CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
> CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
> # CONFIG_HUGETLBFS is not set
> # CONFIG_HUGETLB_PAGE is not se
> 
> and also has not had this problem since then. All 4 machines have been 
> running through most -rc's of every kernel.

Here is something I'd like you guys to test.

Yesterday, Christopher (CC:'d), posted some fixes yesterday and one of
them is very interesting.

Basically the update_mmu_cache() methods on sparc64 can insert an
invalid PTE into the TSB hash tables, causing livelocks and other
annoying issues.

The path where this can happen is via remove_migration_pte().

I had a discussion with Johannes Weiner about this and we determined
that it would make sense to mis-diagnose THP as being the root cause
in the RSS counter et al. problems if this bug here is the real
reason those things are happening.

That's because if you're not using THP there is less compaction going
on.  Less compaction means less migration, and therefore a lower
likelyhood of this code path triggering like this.

Could you guys please try this patch below?  Thanks.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Meelis Roos July 30, 2014, 10:02 p.m. UTC | #1
> Here is something I'd like you guys to test.

Very interesting.

[...]
> Could you guys please try this patch below?  Thanks.

  CC      arch/sparc/mm/init_64.o
arch/sparc/mm/init_64.c: In function 'update_mmu_cache_pmd':
arch/sparc/mm/init_64.c:2625:6: error: 'pte' may be used uninitialized in this function [-Werror=uninitialized]

gcc 4.6.4.
Meelis Roos Aug. 19, 2014, 8:22 a.m. UTC | #2
Meanwhile, a Ultra 1 with overnight looping git clone got exit_mmap 
warning again with 3.17.0-rc1. Otherwise it is working good.

[11052.686935] ------------[ cut here ]------------
[11052.740486] WARNING: CPU: 0 PID: 2541 at mm/mmap.c:2766 exit_mmap+0x138/0x160()
[11052.827934] Modules linked in: osst snd_sun_cs4231 snd_pcm snd_timer snd soundcore parport_sunbpp parport st ch qlogicpti sunhme ipv6 sr_mod cdrom sg evdev
[11052.994500] CPU: 0 PID: 2541 Comm: git Not tainted 3.17.0-rc1 #49
[11053.067464] Call Trace:
[11053.096647]  [00000000004d1758] exit_mmap+0x138/0x160
[11053.157091]  [000000000044c1ec] mmput+0x2c/0xc0
[11053.211256]  [000000000044e168] exit_mm+0x108/0x180
[11053.269597]  [000000000044f908] do_exit+0x228/0x320
[11053.327935]  [000000000044fae4] do_group_exit+0x24/0xc0
[11053.390440]  [000000000044fb94] SyS_exit_group+0x14/0x20
[11053.454004]  [0000000000406074] linux_sparc_syscall32+0x34/0x60
[11053.524809] ---[ end trace 7b6188ceaeca01dd ]---
[11053.580132] BUG: Bad rss-counter state mm:ffffff0032c778c0 idx:0 val:7
diff mbox

Patch

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 16b58ff..8e894e0 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -351,6 +351,10 @@  void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *
 
 	mm = vma->vm_mm;
 
+	/* Don't insert a non-valid PTE into the TSB, we'll deadlock.  */
+	if (!pte_accessible(mm, pte))
+		return;
+
 	spin_lock_irqsave(&mm->context.lock, flags);
 
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
@@ -2617,6 +2621,10 @@  void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
 	if (!pmd_large(entry) || !pmd_young(entry))
 		return;
 
+	/* Don't insert a non-valid PMD into the TSB, we'll deadlock.  */
+	if (!(pte & _PAGE_VALID))
+		return;
+
 	pte = pmd_val(entry);
 
 	/* We are fabricating 8MB pages using 4MB real hw pages.  */