| Message ID | 20150112235821.GB16617@casper.infradead.org |
|---|---|
| State | Accepted, archived |
| Delegated to | David Miller |
From: Thomas Graf <tgraf@suug.ch>
Date: Mon, 12 Jan 2015 23:58:21 +0000

> Each per bucket lock covers a configurable number of buckets. While
> shrinking, two buckets in the old table contain entries for a single
> bucket in the new table. We need to lock down both while linking.
> Check if they are protected by different locks to avoid a recursive
> lock.
>
> Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking")
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Signed-off-by: Thomas Graf <tgraf@suug.ch>

Applied.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Thomas Graf
> Each per bucket lock covers a configurable number of buckets. While
> shrinking, two buckets in the old table contain entries for a single
> bucket in the new table. We need to lock down both while linking.
> Check if they are protected by different locks to avoid a recursive
> lock.

Thought, could the shrunk table use the same locks as the lower half
of the old table?

I also wonder whether shrinking hash tables is ever actually worth
the effort. Most likely they'll need to grow again very quickly.

> 		spin_lock_bh(old_bucket_lock1);
> -		spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> -		spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> +
> +		/* Depending on the lock per buckets mapping, the bucket in
> +		 * the lower and upper region may map to the same lock.
> +		 */
> +		if (old_bucket_lock1 != old_bucket_lock2) {
> +			spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> +			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> +		} else {
> +			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
> +		}

Acquiring 3 locks of much the same type looks like a locking hierarchy
violation just waiting to happen.

	David
On 01/13/15 at 09:49am, David Laight wrote:
> From: Thomas Graf
> > Each per bucket lock covers a configurable number of buckets. While
> > shrinking, two buckets in the old table contain entries for a single
> > bucket in the new table. We need to lock down both while linking.
> > Check if they are protected by different locks to avoid a recursive
> > lock.
>
> Thought, could the shrunk table use the same locks as the lower half
> of the old table?

No. A new bucket table, and thus a new set of locks, is allocated when
the table is shrunk or grown. We only have to check for overlapping
locks when holding multiple locks for the same table at the same time.

> I also wonder whether shrinking hash tables is ever actually worth
> the effort. Most likely they'll need to grow again very quickly.

Specifying a .shrink_decision function is optional, so every rhashtable
user can decide whether it wants shrinking or not. The need for it was
expressed in past threads. Also, the case of multiple buckets mapping to
the same lock is present in the expanding logic as well, so removing the
shrinking logic would not remove the need for these checks.

> >  		spin_lock_bh(old_bucket_lock1);
> > -		spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> > -		spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> > +
> > +		/* Depending on the lock per buckets mapping, the bucket in
> > +		 * the lower and upper region may map to the same lock.
> > +		 */
> > +		if (old_bucket_lock1 != old_bucket_lock2) {
> > +			spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> > +			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> > +		} else {
> > +			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
> > +		}
>
> Acquiring 3 locks of much the same type looks like a locking hierarchy
> violation just waiting to happen.

I'm not claiming it's extremely pretty; lockless lookup with deferred
resizing doesn't come for free ;-) If you have a suggestion on how to
implement this differently, I'm all ears.

That said, it's well isolated, and the user of rhashtable does not have
to deal with it. All code paths which take multiple locks are mutually
exclusive to each other (ht->mutex).
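The overlap Thomas describes can be illustrated with a small standalone sketch. The helper below is a hypothetical userspace stand-in for the kernel's bucket-to-lock mapping (a power-of-two lock array indexed by `hash & (nlocks - 1)`, in the spirit of `bucket_lock()` masking the hash with the table's lock mask); it is not the kernel code itself:

```c
#include <assert.h>

/* Hypothetical stand-in for the per-bucket lock mapping: nlocks is a
 * power of two, and bucket `hash` uses lock number hash & (nlocks - 1). */
static unsigned int lock_index(unsigned int hash, unsigned int nlocks)
{
	return hash & (nlocks - 1);
}

/* During a shrink to half the size, old buckets h and h + new_size are
 * merged into new bucket h.  When new_size is a multiple of nlocks, the
 * two old buckets always map to the same lock -- exactly the case the
 * patch guards against before taking the lock a second time. */
int old_buckets_share_lock(unsigned int h, unsigned int new_size,
			   unsigned int nlocks)
{
	return lock_index(h, nlocks) == lock_index(h + new_size, nlocks);
}
```

With 64 locks, shrinking to 128 buckets makes every merged pair collide on one lock, while shrinking to 32 buckets never does.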
From: Thomas Graf
...
> > Thought, could the shrunk table use the same locks as the lower half
> > of the old table?
>
> No. A new bucket table, and thus a new set of locks, is allocated when
> the table is shrunk or grown. We only have to check for overlapping
> locks when holding multiple locks for the same table at the same time.

I was guessing that when locks are shared, buckets k and 2^n+k use the
same lock.
Under those conditions, if the 'grow' decided not to allocate extra
locks, then it could save work by using exactly the same locks as the
old table. Similarly, 'shrink' could do the reverse.

It was only a thought.

	David
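David's reuse idea can be checked with a little arithmetic. Assuming a hypothetical power-of-two lock mapping (lock number = bucket & (nlocks - 1); these names are illustrative, not rhashtable API), buckets k and k + 2^n do land on the same lock whenever the lock count does not exceed the old table size:

```c
#include <assert.h>

/* Assumed power-of-two lock mapping, as in the discussion above. */
static unsigned int lock_index(unsigned int bucket, unsigned int nlocks)
{
	return bucket & (nlocks - 1);
}

/* After doubling a table of old_size buckets, new buckets k and
 * k + old_size map to the lock that bucket k used before, provided the
 * lock count stays the same and is at most old_size.  A grow that
 * allocates no extra locks could therefore reuse the old lock array. */
int grow_could_reuse_locks(unsigned int k, unsigned int old_size,
			   unsigned int nlocks)
{
	return lock_index(k + old_size, nlocks) == lock_index(k, nlocks);
}
```

For example, doubling a 256-bucket table protected by 64 locks preserves every bucket's lock number; with only 32 buckets and 64 locks it does not.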
From: Thomas Graf
...
> > > 		spin_lock_bh(old_bucket_lock1);
> > > -		spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> > > -		spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> > > +
> > > +		/* Depending on the lock per buckets mapping, the bucket in
> > > +		 * the lower and upper region may map to the same lock.
> > > +		 */
> > > +		if (old_bucket_lock1 != old_bucket_lock2) {
> > > +			spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> > > +			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> > > +		} else {
> > > +			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
> > > +		}
> >
> > Acquiring 3 locks of much the same type looks like a locking hierarchy
> > violation just waiting to happen.
>
> I'm not claiming it's extremely pretty; lockless lookup with deferred
> resizing doesn't come for free ;-) If you have a suggestion on how to
> implement this differently, I'm all ears.

runs away....

> That said, it's well isolated, and the user of rhashtable does not have
> to deal with it. All code paths which take multiple locks are mutually
> exclusive to each other (ht->mutex).

OK, ht->mutex saves the day.
Might be worth a comment to save people looking at the code in isolation
from worrying and doing a bit of searching.
OTOH it might be obvious from a slightly larger fragment than the diff.

	David
On 01/13/15 at 03:06pm, David Laight wrote:
> OK, ht->mutex saves the day.
> Might be worth a comment to save people looking at the code in isolation
> from worrying and doing a bit of searching.
> OTOH it might be obvious from a slightly larger fragment than the diff.

Good idea. Will do this. Also, thanks for the suggestion in the other
email. I now understand what you meant.
```diff
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 8023b55..45477f7 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -443,8 +443,16 @@ int rhashtable_shrink(struct rhashtable *ht)
 		new_bucket_lock = bucket_lock(new_tbl, new_hash);
 
 		spin_lock_bh(old_bucket_lock1);
-		spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
-		spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
+
+		/* Depending on the lock per buckets mapping, the bucket in
+		 * the lower and upper region may map to the same lock.
+		 */
+		if (old_bucket_lock1 != old_bucket_lock2) {
+			spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
+			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
+		} else {
+			spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
+		}
 
 		rcu_assign_pointer(*bucket_tail(new_tbl, new_hash),
 				   tbl->buckets[new_hash]);
@@ -452,7 +460,8 @@ int rhashtable_shrink(struct rhashtable *ht)
 				   tbl->buckets[new_hash + new_tbl->size]);
 
 		spin_unlock_bh(new_bucket_lock);
-		spin_unlock_bh(old_bucket_lock2);
+		if (old_bucket_lock1 != old_bucket_lock2)
+			spin_unlock_bh(old_bucket_lock2);
 		spin_unlock_bh(old_bucket_lock1);
 	}
```
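For readers outside the kernel, the conditional nesting in the diff can be mirrored with a plain pthread sketch. The names `lock_pair` and `unlock_pair` are illustrative, not rhashtable API:

```c
#include <assert.h>
#include <pthread.h>

/* Userspace mirror of the pattern in the diff: take the first lock
 * unconditionally, take the second only when it is a distinct lock,
 * and release with the same guard, in reverse order. */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_lock(a);
	if (a != b)
		pthread_mutex_lock(b);
}

static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	if (a != b)
		pthread_mutex_unlock(b);
	pthread_mutex_unlock(a);
}

/* Exercise both paths once; with default (non-recursive) mutexes the
 * aliased call would deadlock without the pointer check.  Returns 1. */
int demo_lock_pair(void)
{
	pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
	pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

	lock_pair(&a, &b);	/* distinct locks: both taken */
	unlock_pair(&a, &b);
	lock_pair(&a, &a);	/* same lock: taken only once */
	unlock_pair(&a, &a);
	return 1;
}
```

The unlock guard matters as much as the lock guard: unlocking the aliased lock twice would be just as wrong as locking it twice.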
Each per bucket lock covers a configurable number of buckets. While
shrinking, two buckets in the old table contain entries for a single
bucket in the new table. We need to lock down both while linking.
Check if they are protected by different locks to avoid a recursive
lock.

Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
---
 lib/rhashtable.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)