From patchwork Mon Jul 20 14:30:23 2015
X-Patchwork-Submitter: Florian Westphal
X-Patchwork-Id: 497752
X-Patchwork-Delegate: davem@davemloft.net
Date: Mon, 20 Jul 2015 16:30:23 +0200
From: Florian Westphal
To: Frank Schreuder
Cc: Nikolay Aleksandrov, Johan Schuijt, Eric Dumazet, "nikolay@redhat.com",
 "davem@davemloft.net", "fw@strlen.de", "chutzpah@gentoo.org", Robin Geuze,
 netdev
Subject: Re: reproducable panic eviction work queue
Message-ID: <20150720143023.GC11985@breakpoint.cc>
References: <1437209795.1026.31.camel@edumazet-glaptop2.roam.corp.google.com>
 <5FD5C17E-B321-404E-80A2-EE46BB8AA746@transip.nl>
 <55AA243D.5020306@cumulusnetworks.com>
 <22C5EB62-8974-432D-9C3B-45F4E4067A45@transip.nl>
 <55AA717D.8080800@cumulusnetworks.com>
 <55ACEDE9.3090205@transip.nl>
In-Reply-To: <55ACEDE9.3090205@transip.nl>
X-Mailing-List: netdev@vger.kernel.org

Frank Schreuder wrote:
> 
> On 7/18/2015 05:32 PM, Nikolay Aleksandrov wrote:
> >On 07/18/2015 05:28 PM, Johan Schuijt wrote:
> >>Thanks for looking into this!
> >>
> >>>Thank you for the report, I will try to reproduce this locally.
> >>>Could you please post the full crash log?
> >>Of course, please see attached file.
> >>
> >>>Also could you test with a clean current kernel from Linus' tree or
> >>>Dave's -net?
> >>Will do.
> >>
> >>>These are available at:
> >>>git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> >>>git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
> >>>respectively.
> >>>
> >>>One last question: how many IRQs do you pin, i.e. how many cores do
> >>>you actively use for receive?
> >>This varies a bit across our systems, but we’ve managed to reproduce
> >>this with IRQs pinned on as many as 2, 4, 8 or 20 cores.
> >>
> >>I won’t have access to our test setup till Monday again, so I’ll be
> >>testing 3 scenarios then:
> >>- Your patch
> >-----
> >>- Linux tree
> >>- Dave’s -net tree
> >Just one of these two would be enough. I couldn't reproduce it here but
> >I don't have as many machines to test right now and had to improvise
> >with VMs. :-)
> >
> >>I’ll make sure to keep you posted on all the results then. We have a
> >>kernel dump of the panic, so if you need me to extract any data from
> >>there just let me know! (Some instructions might be needed)
> >>
> >>- Johan
> >>
> >Great, thank you!
> 
> I'm able to reproduce this panic on the following kernel builds:
> - 3.18.7
> - 3.18.18
> - 3.18.18 + patch from Nikolay Aleksandrov
> - 4.1.0
> 
> Would you happen to have any more suggestions we can try?

Yes, although I admit it's clutching at straws.
The problem is that I don't see how we can race with the timer, but OTOH I
don't see why this needs to play refcnt tricks when we can just skip the
entry completely ...

The other issue is parallel completion on another cpu, but I don't see how
we could trip there either.

Do you always get this one crash backtrace from the evictor wq?

I'll set up a bigger test machine soon and will also try to reproduce this.

Thanks for reporting!

---
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -131,24 +131,14 @@ inet_evict_bucket(struct inet_frags *f, struct inet_frag_bucket *hb)
 	unsigned int evicted = 0;
 	HLIST_HEAD(expired);
 
-evict_again:
 	spin_lock(&hb->chain_lock);
 
 	hlist_for_each_entry_safe(fq, n, &hb->chain, list) {
 		if (!inet_fragq_should_evict(fq))
 			continue;
 
-		if (!del_timer(&fq->timer)) {
-			/* q expiring right now thus increment its refcount so
-			 * it won't be freed under us and wait until the timer
-			 * has finished executing then destroy it
-			 */
-			atomic_inc(&fq->refcnt);
-			spin_unlock(&hb->chain_lock);
-			del_timer_sync(&fq->timer);
-			inet_frag_put(fq, f);
-			goto evict_again;
-		}
+		if (!del_timer(&fq->timer))
+			continue;
 
 		fq->flags |= INET_FRAG_EVICTED;
 		hlist_del(&fq->list);
@@ -240,18 +230,20 @@ void inet_frags_exit_net(struct netns_frags *nf, struct inet_frags *f)
 	int i;
 
 	nf->low_thresh = 0;
-	local_bh_disable();
 
 evict_again:
+	local_bh_disable();
 	seq = read_seqbegin(&f->rnd_seqlock);
 
 	for (i = 0; i < INETFRAGS_HASHSZ ; i++)
 		inet_evict_bucket(f, &f->hash[i]);
 
-	if (read_seqretry(&f->rnd_seqlock, seq))
-		goto evict_again;
-
 	local_bh_enable();
+	cond_resched();
+
+	if (read_seqretry(&f->rnd_seqlock, seq) ||
+	    percpu_counter_sum(&nf->mem))
+		goto evict_again;
 
 	percpu_counter_destroy(&nf->mem);
 }
@@ -286,6 +278,8 @@ static inline void fq_unlink(struct inet_frag_queue *fq, struct inet_frags *f)
 	hb = get_frag_bucket_locked(fq, f);
 	if (!(fq->flags & INET_FRAG_EVICTED))
 		hlist_del(&fq->list);
+
+	fq->flags |= INET_FRAG_COMPLETE;
 	spin_unlock(&hb->chain_lock);
 }
 
@@ -297,7 +291,6 @@ void inet_frag_kill(struct inet_frag_queue *fq, struct inet_frags *f)
 	if (!(fq->flags & INET_FRAG_COMPLETE)) {
 		fq_unlink(fq, f);
 		atomic_dec(&fq->refcnt);
-		fq->flags |= INET_FRAG_COMPLETE;
 	}
 }
 EXPORT_SYMBOL(inet_frag_kill);
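
To make the intent of the diff easier to follow, here is a rough sketch of how the
two main paths read with it applied. It is reconstructed from the hunks above, with
declarations and unchanged code elided, and the comments are annotations describing
one reading of the change, not text from the patch itself:

	/* Sketch only: net/ipv4/inet_fragment.c with the patch applied. */

	static unsigned int
	inet_evict_bucket(struct inet_frags *f, struct inet_frag_bucket *hb)
	{
		struct inet_frag_queue *fq;
		struct hlist_node *n;
		unsigned int evicted = 0;
		HLIST_HEAD(expired);

		spin_lock(&hb->chain_lock);

		hlist_for_each_entry_safe(fq, n, &hb->chain, list) {
			if (!inet_fragq_should_evict(fq))
				continue;

			/* If del_timer() fails, the timer has already fired and
			 * its handler will tear down this queue; instead of the
			 * old refcnt + del_timer_sync() + goto evict_again dance,
			 * simply skip the entry.
			 */
			if (!del_timer(&fq->timer))
				continue;

			fq->flags |= INET_FRAG_EVICTED;
			hlist_del(&fq->list);
			/* ... collect the queue on the local 'expired' list,
			 * as before ...
			 */
		}

		spin_unlock(&hb->chain_lock);

		/* ... expire the collected queues outside the lock, as before ... */
		return evicted;
	}

	void inet_frags_exit_net(struct netns_frags *nf, struct inet_frags *f)
	{
		unsigned int seq;
		int i;

		nf->low_thresh = 0;

	evict_again:
		local_bh_disable();
		seq = read_seqbegin(&f->rnd_seqlock);

		for (i = 0; i < INETFRAGS_HASHSZ ; i++)
			inet_evict_bucket(f, &f->hash[i]);

		local_bh_enable();
		cond_resched();

		/* Queues skipped above are freed by their timer handlers; retry
		 * until a full walk was done under a stable hash seed and the
		 * per-netns memory counter has drained to zero, so nothing is
		 * left behind when the counter is destroyed.
		 */
		if (read_seqretry(&f->rnd_seqlock, seq) ||
		    percpu_counter_sum(&nf->mem))
			goto evict_again;

		percpu_counter_destroy(&nf->mem);
	}

The fq_unlink()/inet_frag_kill() hunks complement this: INET_FRAG_COMPLETE is now
set inside fq_unlink() while hb->chain_lock is still held, rather than afterwards
in inet_frag_kill(), which appears intended to close the window where a queue has
been unlinked but not yet marked complete.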