From patchwork Wed Dec 16 08:45:54 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Herbert Xu
X-Patchwork-Id: 557354
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 54E9F1402C9
	for ; Wed, 16 Dec 2015 19:46:09 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932986AbbLPIqD (ORCPT ); Wed, 16 Dec 2015 03:46:03 -0500
Received: from helcar.hengli.com.au ([209.40.204.226]:60864 "EHLO
	helcar.hengli.com.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754466AbbLPIqC (ORCPT ); Wed, 16 Dec 2015 03:46:02 -0500
Received: from gondolin.me.apana.org.au ([192.168.0.6])
	by norbury.hengli.com.au with esmtp (Exim 4.80 #3 (Debian))
	id 1a97ie-0006vv-Jb; Wed, 16 Dec 2015 19:45:56 +1100
Received: from herbert by gondolin.me.apana.org.au with local (Exim 4.80)
	(envelope-from ) id 1a97ic-0006MN-P5; Wed, 16 Dec 2015 16:45:54 +0800
Date: Wed, 16 Dec 2015 16:45:54 +0800
From: Herbert Xu
To: Colin Ian King
Cc: "David S. Miller" , netdev@vger.kernel.org
Subject: rhashtable: Fix walker list corruption
Message-ID: <20151216084554.GA24395@gondor.apana.org.au>
References: <561797B7.3090807@canonical.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <561797B7.3090807@canonical.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

On Fri, Oct 09, 2015 at 11:32:23AM +0100, Colin Ian King wrote:
>
> I'm hitting a null ptr deference bug when running 2 or more instances of
> the attached reproducer program.
> I've bisected this down to the
> following commit:
>
> commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c
> Author: Herbert Xu
> Date:   Tue Mar 24 09:53:17 2015 +1100
>
>     rhashtable: Fix sleeping inside RCU critical section in walk_stop
>
>
> Without this commit, the attached reproducer runs fine for hours.  With
> the commit, I can oops a 4 core (8 thread) Intel i7-6700 Sharkbay SDP in
> a few seconds.

Thanks Colin.  This commit was indeed bogus, as we end up using two
different locks for the one list.

---8<---
The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable:
Fix sleeping inside RCU critical section in walk_stop") introduced
a new spin lock for the walker list.  However, it did not convert
all existing users of the list over to the new spin lock.  Some
continued to use the old mutex for this purpose.  Because the mutex
and the spin lock do not exclude each other, this led to corruption
of the list.

The fix is to use the spin lock everywhere we touch the list.

This also allows us to do rcu_read_lock before we take the lock in
rhashtable_walk_start.  With the old mutex this would've deadlocked
but it's safe with the new spin lock.
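To illustrate the rule above (one list, one lock), here is a minimal
userspace sketch.  It is not kernel code and not part of the patch:
pthread_mutex_t merely stands in for ht->lock, and the names walkers,
list_lock, node_add, node_del, walker_thread and run_walkers are made
up for this example.  Every add and every delete takes the same lock,
which is the property the walker list lost when some callers kept
using the mutex.

```c
/* Userspace sketch only; pthreads stand in for kernel locking. */
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a list_head-style circular list node. */
struct node {
	struct node *prev, *next;
};

/* Empty circular list: the head points at itself. */
static struct node walkers = { &walkers, &walkers };

/* The single lock that guards every access to 'walkers'. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert n right after the list head.  Caller must hold list_lock. */
static void node_add(struct node *n)
{
	n->next = walkers.next;
	n->prev = &walkers;
	walkers.next->prev = n;
	walkers.next = n;
}

/* Unlink n from the list.  Caller must hold list_lock. */
static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Each thread repeatedly adds and removes its own node. */
static void *walker_thread(void *arg)
{
	struct node self;
	int rounds = *(int *)arg;
	int i;

	for (i = 0; i < rounds; i++) {
		pthread_mutex_lock(&list_lock);
		node_add(&self);
		pthread_mutex_unlock(&list_lock);

		/* Same lock on the delete path, never a different one. */
		pthread_mutex_lock(&list_lock);
		node_del(&self);
		pthread_mutex_unlock(&list_lock);
	}
	return NULL;
}

/*
 * Run up to 16 threads; return 1 if the list is empty again once
 * all started threads have finished, 0 otherwise.
 */
int run_walkers(int nthreads, int rounds)
{
	pthread_t tids[16];
	int started = 0;
	int i;

	if (nthreads > 16)
		nthreads = 16;

	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[started], NULL, walker_thread, &rounds))
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	return walkers.next == &walkers && walkers.prev == &walkers;
}
```

Calling run_walkers(4, 10000) from a test program built with -pthread
should leave the list empty.  If one of the two paths took a second,
unrelated lock instead, node_add and node_del could interleave and
leave dangling prev/next pointers.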
Fixes: ba7c95ea3870 ("rhashtable: Fix sleeping inside RCU...")
Reported-by: Colin Ian King
Signed-off-by: Herbert Xu

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 1c624db..ed7ba47 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -519,10 +519,10 @@ int rhashtable_walk_init(struct rhashtable *ht, struct rhashtable_iter *iter)
 	if (!iter->walker)
 		return -ENOMEM;
 
-	mutex_lock(&ht->mutex);
+	spin_lock(&ht->lock);
 	iter->walker->tbl = rht_dereference(ht->tbl, ht);
 	list_add(&iter->walker->list, &iter->walker->tbl->walkers);
-	mutex_unlock(&ht->mutex);
+	spin_unlock(&ht->lock);
 
 	return 0;
 }
@@ -536,10 +536,10 @@ EXPORT_SYMBOL_GPL(rhashtable_walk_init);
  */
 void rhashtable_walk_exit(struct rhashtable_iter *iter)
 {
-	mutex_lock(&iter->ht->mutex);
+	spin_lock(&iter->ht->lock);
 	if (iter->walker->tbl)
 		list_del(&iter->walker->list);
-	mutex_unlock(&iter->ht->mutex);
+	spin_unlock(&iter->ht->lock);
 	kfree(iter->walker);
 }
 EXPORT_SYMBOL_GPL(rhashtable_walk_exit);
@@ -563,14 +563,12 @@ int rhashtable_walk_start(struct rhashtable_iter *iter)
 {
 	struct rhashtable *ht = iter->ht;
 
-	mutex_lock(&ht->mutex);
+	rcu_read_lock();
 
+	spin_lock(&ht->lock);
 	if (iter->walker->tbl)
 		list_del(&iter->walker->list);
-
-	rcu_read_lock();
-
-	mutex_unlock(&ht->mutex);
+	spin_unlock(&ht->lock);
 
 	if (!iter->walker->tbl) {
 		iter->walker->tbl = rht_dereference_rcu(ht->tbl, ht);