
[RFC,3/3] locking,netfilter: Fix nf_conntrack_lock()

Message ID 20160524143649.673861121@infradead.org
State RFC
Delegated to: Pablo Neira

Commit Message

Peter Zijlstra May 24, 2016, 2:27 p.m. UTC
nf_conntrack_lock{,_all}() is broken as it misses a number of memory
barriers needed to order the whole global vs. local locks scheme.

Even x86 (and other TSO archs) are affected.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 net/netfilter/nf_conntrack_core.c |   30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)




Comments

Peter Zijlstra May 24, 2016, 2:42 p.m. UTC | #1
On Tue, May 24, 2016 at 04:27:26PM +0200, Peter Zijlstra wrote:
> nf_conntrack_lock{,_all}() is broken as it misses a number of memory
> barriers needed to order the whole global vs. local locks scheme.
> 
> Even x86 (and other TSO archs) are affected.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  net/netfilter/nf_conntrack_core.c |   30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -74,7 +74,18 @@ void nf_conntrack_lock(spinlock_t *lock)
>  	spin_lock(lock);
>  	while (unlikely(nf_conntrack_locks_all)) {

And note that we can replace nf_conntrack_locks_all with
spin_is_locked(&nf_conntrack_locks_all_lock), since that is the exact
same state.

But I didn't want to do too much in one go.
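
As a rough sketch (hypothetical follow-up, not part of this series;
barrier placement simply kept as in this patch for illustration), that
replacement would read something like:

	void nf_conntrack_lock(spinlock_t *lock)
	{
		spin_lock(lock);
		while (unlikely(spin_is_locked(&nf_conntrack_locks_all_lock))) {
			spin_unlock(lock);
			/* the same ordering concerns as in this patch apply */
			smp_rmb();
			spin_unlock_wait(&nf_conntrack_locks_all_lock);
			spin_lock(lock);
		}
	}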

>  		spin_unlock(lock);
> +		/*
> +		 * Order the nf_conntrack_locks_all load vs the spin_unlock_wait()
> +		 * loads below, to ensure locks_all is indeed held.
> +		 */
> +		smp_rmb(); /* spin_lock(locks_all) */
>  		spin_unlock_wait(&nf_conntrack_locks_all_lock);
> +		/*
> +		 * The control dependency's LOAD->STORE order is enough to
> +		 * guarantee the spin_lock() is ordered after the above
> +		 * unlock_wait(). And the ACQUIRE of the lock ensures we are
> +		 * fully ordered against the spin_unlock() of locks_all.
> +		 */
>  		spin_lock(lock);
>  	}
>  }

Peter Zijlstra May 26, 2016, 1:54 p.m. UTC | #2
On Tue, May 24, 2016 at 10:41:43PM +0200, Manfred Spraul wrote:
> >--- a/net/netfilter/nf_conntrack_core.c
> >+++ b/net/netfilter/nf_conntrack_core.c
> >@@ -74,7 +74,18 @@ void nf_conntrack_lock(spinlock_t *lock)
> >  	spin_lock(lock);
> >  	while (unlikely(nf_conntrack_locks_all)) {
> >  		spin_unlock(lock);
> >+		/*
> >+		 * Order the nf_conntrack_locks_all load vs the spin_unlock_wait()
> >+		 * loads below, to ensure locks_all is indeed held.
> >+		 */
> >+		smp_rmb(); /* spin_lock(locks_all) */
> >  		spin_unlock_wait(&nf_conntrack_locks_all_lock);

> I don't understand the comment, and I don't understand why the smp_rmb() is
> required: What do you want to protect against?

The smp_rmb() guards against the spin_unlock_wait() load of
nf_conntrack_locks_all_lock being done _before_ the
nf_conntrack_locks_all load.

That reordering is possible because the spin_unlock() in between stops
the nf_conntrack_locks_all load from moving down past it, but it does
not stop the later load of &nf_conntrack_locks_all_lock (done by the
spin_unlock_wait()) from being done early.

Inverting these two loads could result in not waiting when we should:


	nf_conntrack_all_lock()		nf_conntrack_lock()

					  [L] all_locks == unlocked
	  [S] spin_lock(&all_lock);
	  [S] all = true;
					  [L] all == true

In that case we will not wait for the current all_lock section to
finish, resulting in an all_lock critical section and a per-bucket lock
critical section running at the same time.
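
With the smp_rmb() in place the two loads can no longer be inverted:
observing all == true implies the subsequent spin_unlock_wait() load
observes all_lock held, and so we spin until the global section is
finished. Roughly, in the same notation (sketch only):

	nf_conntrack_all_lock()		nf_conntrack_lock()

	  [S] spin_lock(&all_lock);
	  [S] all = true;
					  [L] all == true
					  smp_rmb();
					  [L] all_lock == locked
					  ... spin_unlock_wait() spins until
					      nf_conntrack_all_unlock() ...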

Patch

--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -74,7 +74,18 @@  void nf_conntrack_lock(spinlock_t *lock)
 	spin_lock(lock);
 	while (unlikely(nf_conntrack_locks_all)) {
 		spin_unlock(lock);
+		/*
+		 * Order the nf_conntrack_locks_all load vs the spin_unlock_wait()
+		 * loads below, to ensure locks_all is indeed held.
+		 */
+		smp_rmb(); /* spin_lock(locks_all) */
 		spin_unlock_wait(&nf_conntrack_locks_all_lock);
+		/*
+		 * The control dependency's LOAD->STORE order is enough to
+		 * guarantee the spin_lock() is ordered after the above
+		 * unlock_wait(). And the ACQUIRE of the lock ensures we are
+		 * fully ordered against the spin_unlock() of locks_all.
+		 */
 		spin_lock(lock);
 	}
 }
@@ -119,14 +130,31 @@  static void nf_conntrack_all_lock(void)
 	spin_lock(&nf_conntrack_locks_all_lock);
 	nf_conntrack_locks_all = true;
 
+	/*
+	 * Order the above store against the spin_unlock_wait() loads
+	 * below, such that if nf_conntrack_lock() observes locks_all
+	 * we must observe lock[] held.
+	 */
+	smp_mb(); /* spin_lock(locks_all) */
+
 	for (i = 0; i < CONNTRACK_LOCKS; i++) {
 		spin_unlock_wait(&nf_conntrack_locks[i]);
 	}
+	/*
+	 * Ensure we observe all state prior to the spin_unlock()s
+	 * observed above.
+	 */
+	smp_acquire__after_ctrl_dep();
 }
 
 static void nf_conntrack_all_unlock(void)
 {
-	nf_conntrack_locks_all = false;
+	/*
+	 * All prior stores must be complete before we clear locks_all.
+	 * Otherwise nf_conntrack_lock() might observe the false value but
+	 * not observe the entire critical section.
+	 */
+	smp_store_release(&nf_conntrack_locks_all, false);
 	spin_unlock(&nf_conntrack_locks_all_lock);
 }
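
For reference, roughly how the two sides of the scheme read with this
patch applied, assembled from the hunks above (the loop counter
declaration is not visible in the hunks and is assumed here; the
comments are condensed):

	void nf_conntrack_lock(spinlock_t *lock)
	{
		spin_lock(lock);
		while (unlikely(nf_conntrack_locks_all)) {
			spin_unlock(lock);
			/* order the locks_all load vs the unlock_wait() load */
			smp_rmb();
			/* wait for the global (all_lock) section to finish */
			spin_unlock_wait(&nf_conntrack_locks_all_lock);
			spin_lock(lock);
		}
	}

	static void nf_conntrack_all_lock(void)
	{
		int i;	/* assumed, not visible in the hunk */

		spin_lock(&nf_conntrack_locks_all_lock);
		nf_conntrack_locks_all = true;

		/* order the store above vs the unlock_wait() loads below */
		smp_mb();

		/* wait for every per-bucket lock holder to drop its lock */
		for (i = 0; i < CONNTRACK_LOCKS; i++) {
			spin_unlock_wait(&nf_conntrack_locks[i]);
		}
		/* upgrade the ctrl-dependency ordering of the waits to ACQUIRE */
		smp_acquire__after_ctrl_dep();
	}

	static void nf_conntrack_all_unlock(void)
	{
		/* RELEASE: complete all prior stores before clearing locks_all */
		smp_store_release(&nf_conntrack_locks_all, false);
		spin_unlock(&nf_conntrack_locks_all_lock);
	}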