
[8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier

Message ID 1472743673-15585-1-git-send-email-manfred@colorfullife.com
State Changes Requested
Delegated to: Pablo Neira

Commit Message

Manfred Spraul Sept. 1, 2016, 3:27 p.m. UTC
Since spin_unlock_wait() is defined as equivalent to spin_lock();
spin_unlock(), the memory barrier before spin_unlock_wait() is
also not required.

Not for stable!

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org
---
 net/netfilter/nf_conntrack_core.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

Comments

Will Deacon Sept. 1, 2016, 3:30 p.m. UTC | #1
On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> Since spin_unlock_wait() is defined as equivalent to spin_lock();
> spin_unlock(), the memory barrier before spin_unlock_wait() is
> also not required.
> 
> Not for stable!
> 
> Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: netfilter-devel@vger.kernel.org
> ---
>  net/netfilter/nf_conntrack_core.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 7a3b5e6..0591a25 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -139,13 +139,7 @@ static void nf_conntrack_all_lock(void)
>  
>  	spin_lock(&nf_conntrack_locks_all_lock);
>  
> -	/*
> -	 * Order the store of 'nf_conntrack_locks_all' against
> -	 * the spin_unlock_wait() loads below, such that if
> -	 * nf_conntrack_lock() observes 'nf_conntrack_locks_all'
> -	 * we must observe nf_conntrack_locks[] held:
> -	 */
> -	smp_store_mb(nf_conntrack_locks_all, true);
> +	nf_conntrack_locks_all = true;

Don't you at least need WRITE_ONCE if you're going to do this?

Will
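
For reference, the change Will is asking about would look roughly like the
sketch below (illustrative only; the posted patch uses a plain assignment):

	spin_lock(&nf_conntrack_locks_all_lock);

	/* Annotate the lockless store so it cannot be torn or elided */
	WRITE_ONCE(nf_conntrack_locks_all, true);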
Peter Zijlstra Sept. 1, 2016, 4:41 p.m. UTC | #2
On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> > Since spin_unlock_wait() is defined as equivalent to spin_lock();
> > spin_unlock(), the memory barrier before spin_unlock_wait() is
> > also not required.

Note that ACQUIRE+RELEASE isn't a barrier.

Both are semi-permeable and things can cross in the middle, like:


	x = 1;
	LOCK
	UNLOCK
	r = y;

can (validly) get re-ordered like:

	LOCK
	r = y;
	x = 1;
	UNLOCK

So if you want things ordered, as I think you do, I think the smp_mb()
is still needed.

RELEASE + ACQUIRE otoh, that is a load-store barrier (but not
transitive).
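
Concretely, the ordering Peter describes is what the removed smp_store_mb()
provided (a store followed by a full smp_mb()); keeping it explicit in
nf_conntrack_all_lock() would look roughly like this sketch:

	nf_conntrack_locks_all = true;
	smp_mb();	/* order the store before the spin_unlock_wait() loads below */

	for (i = 0; i < CONNTRACK_LOCKS; i++)
		spin_unlock_wait(&nf_conntrack_locks[i]);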
Boqun Feng Sept. 2, 2016, 6:17 a.m. UTC | #3
Hi Manfred,

On Thu, Sep 01, 2016 at 06:41:26PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> > On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> > > Since spin_unlock_wait() is defined as equivalent to spin_lock();
> > > spin_unlock(), the memory barrier before spin_unlock_wait() is
> > > also not required.
> 

As Peter said below, ACQUIRE+RELEASE is not a barrier.

What we rely on here is that spin_unlock_wait() can pair with another
LOCK or UNLOCK (since spin_unlock_wait() acts as spin_lock();
spin_unlock()). Once paired, we get the necessary ordering
guarantee between the code preceding or following spin_unlock_wait() and
the code in the lock's critical sections.

Regards,
Boqun

> Note that ACQUIRE+RELEASE isn't a barrier.
> 
> Both are semi-permeable and things can cross in the middle, like:
> 
> 
> 	x = 1;
> 	LOCK
> 	UNLOCK
> 	r = y;
> 
> can (validly) get re-ordered like:
> 
> 	LOCK
> 	r = y;
> 	x = 1;
> 	UNLOCK
> 
> So if you want things ordered, as I think you do, I think the smp_mb()
> is still needed.
> 
> RELEASE + ACQUIRE otoh, that is a load-store barrier (but not
> transitive).
Manfred Spraul Sept. 2, 2016, 6:35 a.m. UTC | #4
On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
> On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
>> On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
>>> Since spin_unlock_wait() is defined as equivalent to spin_lock();
>>> spin_unlock(), the memory barrier before spin_unlock_wait() is
>>> also not required.
> Note that ACQUIRE+RELEASE isn't a barrier.
>
> Both are semi-permeable and things can cross in the middle, like:
>
>
> 	x = 1;
> 	LOCK
> 	UNLOCK
> 	r = y;
>
> can (validly) get re-ordered like:
>
> 	LOCK
> 	r = y;
> 	x = 1;
> 	UNLOCK
>
> So if you want things ordered, as I think you do, I think the smp_mb()
> is still needed.
CPU1:
x=1; /* without WRITE_ONCE */
LOCK(l);
UNLOCK(l);
<do_semop>
smp_store_release(x,0)


CPU2;
LOCK(l)
if (smp_load_acquire(x)==1) goto slow_path
<do_semop>
UNLOCK(l)

Ordering is enforced because both CPUs access the same lock.

x=1 can't be reordered past the UNLOCK(l), so I don't see that further
guarantees are necessary.

Correct?

--
     Manfred
Peter Zijlstra Sept. 2, 2016, 7:22 p.m. UTC | #5
On Fri, Sep 02, 2016 at 08:35:55AM +0200, Manfred Spraul wrote:
> On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
> >On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> >>On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> >>>Since spin_unlock_wait() is defined as equivalent to spin_lock();
> >>>spin_unlock(), the memory barrier before spin_unlock_wait() is
> >>>also not required.
> >Note that ACQUIRE+RELEASE isn't a barrier.
> >
> >Both are semi-permeable and things can cross in the middle, like:
> >
> >
> >	x = 1;
> >	LOCK
> >	UNLOCK
> >	r = y;
> >
> >can (validly) get re-ordered like:
> >
> >	LOCK
> >	r = y;
> >	x = 1;
> >	UNLOCK
> >
> >So if you want things ordered, as I think you do, I think the smp_mb()
> >is still needed.
> CPU1:
> x=1; /* without WRITE_ONCE */
> LOCK(l);
> UNLOCK(l);
> <do_semop>
> smp_store_release(x,0)
> 
> 
> CPU2;
> LOCK(l)
> if (smp_load_acquire(x)==1) goto slow_path
> <do_semop>
> UNLOCK(l)
> 
> Ordering is enforced because both CPUs access the same lock.
> 
> x=1 can't be reordered past the UNLOCK(l), so I don't see that further
> guarantees are necessary.
> 
> Correct?

Correct, sadly implementations do not comply :/ In fact, even x86 is
broken here.

I spoke to Will earlier today and he suggests either making
spin_unlock_wait() stronger, to avoid any and all such surprises, or just
getting rid of the thing.

I'm not sure which way we should go, but please hold off on these two
patches until I've had a chance to audit all of those implementations
again.

I'll try and have a look at your other patches before that.
Manfred Spraul Sept. 3, 2016, 5:33 a.m. UTC | #6
On 09/02/2016 09:22 PM, Peter Zijlstra wrote:
> On Fri, Sep 02, 2016 at 08:35:55AM +0200, Manfred Spraul wrote:
>> On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
>>> On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
>>>> On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
>>>>> Since spin_unlock_wait() is defined as equivalent to spin_lock();
>>>>> spin_unlock(), the memory barrier before spin_unlock_wait() is
>>>>> also not required.
>>> Note that ACQUIRE+RELEASE isn't a barrier.
>>>
>>> Both are semi-permeable and things can cross in the middle, like:
>>>
>>>
>>> 	x = 1;
>>> 	LOCK
>>> 	UNLOCK
>>> 	r = y;
>>>
>>> can (validly) get re-ordered like:
>>>
>>> 	LOCK
>>> 	r = y;
>>> 	x = 1;
>>> 	UNLOCK
>>>
>>> So if you want things ordered, as I think you do, I think the smp_mb()
>>> is still needed.
>> CPU1:
>> x=1; /* without WRITE_ONCE */
>> LOCK(l);
>> UNLOCK(l);
>> <do_semop>
>> smp_store_release(x,0)
>>
>>
>> CPU2;
>> LOCK(l)
>> if (smp_load_acquire(x)==1) goto slow_path
>> <do_semop>
>> UNLOCK(l)
>>
>> Ordering is enforced because both CPUs access the same lock.
>>
>> x=1 can't be reordered past the UNLOCK(l), so I don't see that further
>> guarantees are necessary.
>>
>> Correct?
> Correct, sadly implementations do not comply :/ In fact, even x86 is
> broken here.
>
> I spoke to Will earlier today and he suggests either making
> spin_unlock_wait() stronger, to avoid any and all such surprises, or just
> getting rid of the thing.
>
> I'm not sure which way we should go, but please hold off on these two
> patches until I've had a chance to audit all of those implementations
> again.
For me, it doesn't really matter.
spin_unlock_wait() as "R", as "RAcq" or as "spin_lock(); spin_lock();" - 
I just want a usable definition for ipc/sem.c

So (just to keep Andrew updated):
Ready for merging (bugfixes, safe even with spin_unlock_wait() as just "R"):

- 45a449340cd1 ("ipc/sem.c: fix complex_count vs. simple op race")
   Cc stable, back to 3.10 ...
- 7fd5653d9986 ("net/netfilter/nf_conntrack_core: Fix memory barriers.")
   Cc stable, back to ~4.5

--
     Manfred

Patch

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 7a3b5e6..0591a25 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -139,13 +139,7 @@  static void nf_conntrack_all_lock(void)
 
 	spin_lock(&nf_conntrack_locks_all_lock);
 
-	/*
-	 * Order the store of 'nf_conntrack_locks_all' against
-	 * the spin_unlock_wait() loads below, such that if
-	 * nf_conntrack_lock() observes 'nf_conntrack_locks_all'
-	 * we must observe nf_conntrack_locks[] held:
-	 */
-	smp_store_mb(nf_conntrack_locks_all, true);
+	nf_conntrack_locks_all = true;
 
 	for (i = 0; i < CONNTRACK_LOCKS; i++) {
 		spin_unlock_wait(&nf_conntrack_locks[i]);
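
For context, the reader side that the removed comment refers to is
nf_conntrack_lock(); at the time it looked roughly like the sketch below
(paraphrased, not part of this patch). A reader takes its per-bucket lock
and, if it observes nf_conntrack_locks_all set, backs off and waits for the
global lock holder:

	void nf_conntrack_lock(spinlock_t *lock)
	{
		spin_lock(lock);
		while (unlikely(nf_conntrack_locks_all)) {
			/* a global locker is active: drop our bucket lock and wait */
			spin_unlock(lock);
			spin_unlock_wait(&nf_conntrack_locks_all_lock);
			spin_lock(lock);
		}
	}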