slub: use irqsafe_cpu_cmpxchg for put_cpu_partial

Message ID alpine.DEB.2.00.1111230907330.16139@router.home
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Christoph Lameter (Ampere) Nov. 23, 2011, 3:14 p.m. UTC
On Wed, 23 Nov 2011, Pekka Enberg wrote:

> 2011/11/23 Christian Kujau <lists@nerdbynature.de>:
> > OK, with Christoph's patch applied, 3.2.0-rc2-00274-g6fe4c6d-dirty survives
> > on this machine, with the disk & cpu workload that caused the machine to
> > panic w/o the patch. Load was at 4-5 this time, which is expected for this
> > box. I'll run a few more tests later on, but it seems ok for now.
> >
> > I couldn't resist and ran "slabinfo" anyway (after the workload!) - the
> > box survived, nothing was printed in syslog either. Output attached.
>
> Christoph, Eric, would you mind sending me the final patches that
> Christian tested? Maybe CC David too for extra pair of eyes.

I think he only tested the patch that he showed us. Here is the patch
cleaned up. Do you want me to feed you the debug fixes patch by patch as
well?

Subject: slub: use irqsafe_cpu_cmpxchg for put_cpu_partial

The cmpxchg must be irq safe. The fallback for this_cpu_cmpxchg only
disables preemption which results in per cpu partial page operation
potentially failing on non x86 platforms.

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 mm/slub.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Nov. 23, 2011, 4:04 p.m. UTC | #1
On Wednesday, 23 November 2011 at 09:14 -0600, Christoph Lameter wrote:

> I think he only tested the patch that he showed us. Here is the patch
> cleaned up. Do you want me to feed you the debug fixes patch by patch as
> well?
> 
> Subject: slub: use irqsafe_cpu_cmpxchg for put_cpu_partial
> 
> The cmpxchg must be irq safe. The fallback for this_cpu_cmpxchg only
> disables preemption which results in per cpu partial page operation
> potentially failing on non x86 platforms.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>
> 
> ---
>  mm/slub.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c	2011-11-23 09:10:48.000000000 -0600
> +++ linux-2.6/mm/slub.c	2011-11-23 09:10:57.000000000 -0600
> @@ -1969,7 +1969,7 @@ int put_cpu_partial(struct kmem_cache *s
>  		page->pobjects = pobjects;
>  		page->next = oldpage;
> 
> -	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
> +	} while (irqsafe_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
>  	stat(s, CPU_PARTIAL_FREE);
>  	return pobjects;
>  }

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Thanks !


David Rientjes Nov. 23, 2011, 11:15 p.m. UTC | #2
On Wed, 23 Nov 2011, Christoph Lameter wrote:

> Subject: slub: use irqsafe_cpu_cmpxchg for put_cpu_partial
> 
> The cmpxchg must be irq safe. The fallback for this_cpu_cmpxchg only
> disables preemption which results in per cpu partial page operation
> potentially failing on non x86 platforms.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

Acked-by: David Rientjes <rientjes@google.com>
Pekka Enberg Nov. 24, 2011, 6:45 a.m. UTC | #3
> On Wed, 23 Nov 2011 at 09:14, Christoph Lameter wrote:
>> I think he only tested the patch that he showed us.

On Wed, Nov 23, 2011 at 8:33 PM, Christian Kujau <lists@nerdbynature.de> wrote:
> Yes, that's the (only) one I tested so far. I did some overnight testing
> (rsync'ing to the external disk again) for 6hrs and ran "slabinfo" every
> 30s during the run: http://nerdbynature.de/bits/3.2.0-rc1/oops/slabinfo-1.txt.xz
>
> The machine is still up & running. So for me, your patch fixes it!
>
>  Tested-by: Christian Kujau <lists@nerdbynature.de>

Applied, thanks!
Patch

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2011-11-23 09:10:48.000000000 -0600
+++ linux-2.6/mm/slub.c	2011-11-23 09:10:57.000000000 -0600
@@ -1969,7 +1969,7 @@  int put_cpu_partial(struct kmem_cache *s
 		page->pobjects = pobjects;
 		page->next = oldpage;

-	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
+	} while (irqsafe_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
 	stat(s, CPU_PARTIAL_FREE);
 	return pobjects;
 }