diff mbox

[RFC,1/3] locking: Introduce smp_acquire__after_ctrl_dep

Message ID 20160606172824.GA10383@linux.vnet.ibm.com
State Awaiting Upstream
Delegated to: Pablo Neira
Headers show

Commit Message

Paul E. McKenney June 6, 2016, 5:28 p.m. UTC
On Sat, Jun 04, 2016 at 08:29:29AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 03, 2016 at 02:45:53PM +0100, Will Deacon wrote:
> > On Fri, Jun 03, 2016 at 06:32:38AM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 03, 2016 at 02:23:10PM +0200, Peter Zijlstra wrote:
> > > > On Fri, Jun 03, 2016 at 05:08:27AM -0700, Paul E. McKenney wrote:
> > > > > On Fri, Jun 03, 2016 at 11:38:34AM +0200, Peter Zijlstra wrote:
> > > > > > On Fri, Jun 03, 2016 at 02:48:38PM +0530, Vineet Gupta wrote:
> > > > > > > On Wednesday 25 May 2016 09:27 PM, Paul E. McKenney wrote:
> > > > > > > > For your example, but keeping the compiler in check:
> > > > > > > > 
> > > > > > > > 	if (READ_ONCE(a))
> > > > > > > > 		WRITE_ONCE(b, 1);
> > > > > > > > 	smp_rmb();
> > > > > > > > 	WRITE_ONCE(c, 2);
> > > > > > 
> > > > > > So I think it example is broken. The store to @c is not in fact
> > > > > > dependent on the condition of @a.
> > > > > 
> > > > > At first glance, the compiler could pull the write to "c" above the
> > > > > conditional, but the "memory" constraint in smp_rmb() prevents this.
> > > > > From a hardware viewpoint, the write to "c" does depend on the "if",
> > > > > as the conditional branch does precede that write in execution order.
> > > > > 
> > > > > But yes, this is using smp_rmb() in a very strange way, if that is
> > > > > what you are getting at.
> > > > 
> > > > Well, the CPU could decide that the store to C happens either way around
> > > > the branch. I'm not sure I'd rely on CPUs not being _that_ clever.
> > > 
> > > If I remember correctly, both Power and ARM guarantee that the CPU won't
> > > be that clever.  Not sure about Itanium.
> > 
> > I wouldn't be so sure about ARM. On 32-bit, at least, we have conditional
> > store instructions so if the compiler could somehow use one of those for
> > the first WRITE_ONCE then there's very obviously no control dependency
> > on the second WRITE_ONCE and they could be observed out of order.
> 
> OK, good to know...
> 
> > I note that smp_rmb() on ARM and arm64 actually orders against subsequent
> > (in program order) writes, so this is still pretty theoretical for us.
> 
> So the combined control-dependency/smp_rmb() still works, but I should
> re-examine the straight control dependency stuff.

And how about the patch below?

							Thanx, Paul

------------------------------------------------------------------------

commit 43672d15aeb69b1a196c06cbc071cbade8d247fd
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Jun 6 10:19:42 2016 -0700

    documentation: Clarify limited control-dependency scope
    
    Nothing in the control-dependencies section of memory-barriers.txt
    says that control dependencies don't extend beyond the end of the
    if-statement containing the control dependency.  Worse yet, in many
    situations, they do extend beyond that if-statement.  In particular,
    the compiler cannot destroy the control dependency given proper use of
    READ_ONCE() and WRITE_ONCE().  However, a weakly ordered system having
    a conditional-move instruction provides the control-dependency guarantee
    only to code within the scope of the if-statement itself.
    
    This commit therefore adds words and an example demonstrating this
    limitation of control dependencies.
    
    Reported-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Peter Zijlstra June 7, 2016, 7:15 a.m. UTC | #1
On Mon, Jun 06, 2016 at 10:28:24AM -0700, Paul E. McKenney wrote:
> commit 43672d15aeb69b1a196c06cbc071cbade8d247fd
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Jun 6 10:19:42 2016 -0700
> 
>     documentation: Clarify limited control-dependency scope
>     
>     Nothing in the control-dependencies section of memory-barriers.txt
>     says that control dependencies don't extend beyond the end of the
>     if-statement containing the control dependency.  Worse yet, in many
>     situations, they do extend beyond that if-statement.  In particular,
>     the compiler cannot destroy the control dependency given proper use of
>     READ_ONCE() and WRITE_ONCE().  However, a weakly ordered system having
>     a conditional-move instruction provides the control-dependency guarantee
>     only to code within the scope of the if-statement itself.
>     
>     This commit therefore adds words and an example demonstrating this
>     limitation of control dependencies.
>     
>     Reported-by: Will Deacon <will.deacon@arm.com>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> 
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index 147ae8ec836f..a4d0a99de04d 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -806,6 +806,41 @@ out-guess your code.  More generally, although READ_ONCE() does force
>  the compiler to actually emit code for a given load, it does not force
>  the compiler to use the results.
>  
> +In addition, control dependencies apply only to the then-clause and
> +else-clause of the if-statement in question.  In particular, it does
> +not necessarily apply to code following the if-statement:
> +
> +	q = READ_ONCE(a);
> +	if (q) {
> +		WRITE_ONCE(b, p);
> +	} else {
> +		WRITE_ONCE(b, r);
> +	}
> +	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
> +
> +It is tempting to argue that there in fact is ordering because the
> +compiler cannot reorder volatile accesses and also cannot reorder
> +the writes to "b" with the condition.  Unfortunately for this line
> +of reasoning, the compiler might compile the two writes to "b" as
> +conditional-move instructions, as in this fanciful pseudo-assembly
> +language:
> +
> +	ld r1,a
> +	ld r2,p
> +	ld r3,r
> +	cmp r1,$0
> +	cmov,ne r4,r2
> +	cmov,eq r4,r3
> +	st r4,b
> +	st $1,c
> +
> +A weakly ordered CPU would have no dependency of any sort between the load
> +from "a" and the store to "c".  The control dependencies would extend
> +only to the pair of cmov instructions and the store depending on them.
> +In short, control dependencies apply only to the stores in the then-clause
> +and else-clause of the if-statement in question (including functions
> +invoked by those two clauses), not to code following that if-statement.
> +
>  Finally, control dependencies do -not- provide transitivity.  This is
>  demonstrated by two related examples, with the initial values of
>  x and y both being zero:
> @@ -869,6 +904,12 @@ In summary:
>        atomic{,64}_read() can help to preserve your control dependency.
>        Please see the COMPILER BARRIER section for more information.
>  
> +  (*) Control dependencies apply only to the then-clause and else-clause
> +      of the if-statement containing the control dependency, including
> +      any functions that these two clauses call.  Control dependencies
> +      do -not- apply to code following the if-statement containing the
> +      control dependency.
> +
>    (*) Control dependencies pair normally with other types of barriers.
>  
>    (*) Control dependencies do -not- provide transitivity.  If you
> 
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa June 7, 2016, 12:41 p.m. UTC | #2
On 07.06.2016 09:15, Peter Zijlstra wrote:
>>
>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>> index 147ae8ec836f..a4d0a99de04d 100644
>> --- a/Documentation/memory-barriers.txt
>> +++ b/Documentation/memory-barriers.txt
>> @@ -806,6 +806,41 @@ out-guess your code.  More generally, although READ_ONCE() does force
>>  the compiler to actually emit code for a given load, it does not force
>>  the compiler to use the results.
>>  
>> +In addition, control dependencies apply only to the then-clause and
>> +else-clause of the if-statement in question.  In particular, it does
>> +not necessarily apply to code following the if-statement:
>> +
>> +	q = READ_ONCE(a);
>> +	if (q) {
>> +		WRITE_ONCE(b, p);
>> +	} else {
>> +		WRITE_ONCE(b, r);
>> +	}
>> +	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
>> +
>> +It is tempting to argue that there in fact is ordering because the
>> +compiler cannot reorder volatile accesses and also cannot reorder
>> +the writes to "b" with the condition.  Unfortunately for this line
>> +of reasoning, the compiler might compile the two writes to "b" as
>> +conditional-move instructions, as in this fanciful pseudo-assembly
>> +language:

I wonder if we already guarantee by kernel compiler settings that this
behavior is not allowed by at least gcc.

We unconditionally set --param allow-store-data-races=0 which should
actually prevent gcc from generating such conditional stores.

Am I seeing this correct here?

Thanks,
Hannes

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney June 7, 2016, 1:06 p.m. UTC | #3
On Tue, Jun 07, 2016 at 02:41:44PM +0200, Hannes Frederic Sowa wrote:
> On 07.06.2016 09:15, Peter Zijlstra wrote:
> >>
> >> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> >> index 147ae8ec836f..a4d0a99de04d 100644
> >> --- a/Documentation/memory-barriers.txt
> >> +++ b/Documentation/memory-barriers.txt
> >> @@ -806,6 +806,41 @@ out-guess your code.  More generally, although READ_ONCE() does force
> >>  the compiler to actually emit code for a given load, it does not force
> >>  the compiler to use the results.
> >>  
> >> +In addition, control dependencies apply only to the then-clause and
> >> +else-clause of the if-statement in question.  In particular, it does
> >> +not necessarily apply to code following the if-statement:
> >> +
> >> +	q = READ_ONCE(a);
> >> +	if (q) {
> >> +		WRITE_ONCE(b, p);
> >> +	} else {
> >> +		WRITE_ONCE(b, r);
> >> +	}
> >> +	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
> >> +
> >> +It is tempting to argue that there in fact is ordering because the
> >> +compiler cannot reorder volatile accesses and also cannot reorder
> >> +the writes to "b" with the condition.  Unfortunately for this line
> >> +of reasoning, the compiler might compile the two writes to "b" as
> >> +conditional-move instructions, as in this fanciful pseudo-assembly
> >> +language:
> 
> I wonder if we already guarantee by kernel compiler settings that this
> behavior is not allowed by at least gcc.
> 
> We unconditionally set --param allow-store-data-races=0 which should
> actually prevent gcc from generating such conditional stores.
> 
> Am I seeing this correct here?

In this case, the store to "c" is unconditional, so pulling it forward
would not generate a data race.  However, the compiler is still prohibited
from pulling it forward because it is not allowed to reorder volatile
references.  So, yes, the compiler cannot reorder, but for a different
reason.

Some CPUs, on the other hand, can do this reordering, as Will Deacon
pointed out earlier in this thread.

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa June 7, 2016, 2:59 p.m. UTC | #4
On 07.06.2016 15:06, Paul E. McKenney wrote:
> On Tue, Jun 07, 2016 at 02:41:44PM +0200, Hannes Frederic Sowa wrote:
>> On 07.06.2016 09:15, Peter Zijlstra wrote:
>>>>
>>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>>>> index 147ae8ec836f..a4d0a99de04d 100644
>>>> --- a/Documentation/memory-barriers.txt
>>>> +++ b/Documentation/memory-barriers.txt
>>>> @@ -806,6 +806,41 @@ out-guess your code.  More generally, although READ_ONCE() does force
>>>>  the compiler to actually emit code for a given load, it does not force
>>>>  the compiler to use the results.
>>>>  
>>>> +In addition, control dependencies apply only to the then-clause and
>>>> +else-clause of the if-statement in question.  In particular, it does
>>>> +not necessarily apply to code following the if-statement:
>>>> +
>>>> +	q = READ_ONCE(a);
>>>> +	if (q) {
>>>> +		WRITE_ONCE(b, p);
>>>> +	} else {
>>>> +		WRITE_ONCE(b, r);
>>>> +	}
>>>> +	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
>>>> +
>>>> +It is tempting to argue that there in fact is ordering because the
>>>> +compiler cannot reorder volatile accesses and also cannot reorder
>>>> +the writes to "b" with the condition.  Unfortunately for this line
>>>> +of reasoning, the compiler might compile the two writes to "b" as
>>>> +conditional-move instructions, as in this fanciful pseudo-assembly
>>>> +language:
>>
>> I wonder if we already guarantee by kernel compiler settings that this
>> behavior is not allowed by at least gcc.
>>
>> We unconditionally set --param allow-store-data-races=0 which should
>> actually prevent gcc from generating such conditional stores.
>>
>> Am I seeing this correct here?
> 
> In this case, the store to "c" is unconditional, so pulling it forward
> would not generate a data race.  However, the compiler is still prohibited
> from pulling it forward because it is not allowed to reorder volatile
> references.  So, yes, the compiler cannot reorder, but for a different
> reason.
> 
> Some CPUs, on the other hand, can do this reordering, as Will Deacon
> pointed out earlier in this thread.

Sorry, to follow-up again on this. Will Deacon's comments were about
conditional-move instructions, which this compiler-option would prevent,
as far as I can see it. Thus I couldn't follow your answer completely:

The writes to b would be non-conditional-moves with a control dependency
from a and and edge down to the write to c, which obviously is
non-conditional. As such in terms of dependency ordering, we would have
the control dependency always, thus couldn't we assume that in a current
kernel we always have a load(a)->store(c) requirement?

Is there something else than conditional move instructions that could
come to play here? Obviously a much smarter CPU could evaluate all the
jumps and come to the conclusion that the write to c is never depending
on the load from a, but is this implemented somewhere in hardware?

Thank you,
Hannes

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney June 7, 2016, 3:23 p.m. UTC | #5
On Tue, Jun 07, 2016 at 04:59:02PM +0200, Hannes Frederic Sowa wrote:
> On 07.06.2016 15:06, Paul E. McKenney wrote:
> > On Tue, Jun 07, 2016 at 02:41:44PM +0200, Hannes Frederic Sowa wrote:
> >> On 07.06.2016 09:15, Peter Zijlstra wrote:
> >>>>
> >>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> >>>> index 147ae8ec836f..a4d0a99de04d 100644
> >>>> --- a/Documentation/memory-barriers.txt
> >>>> +++ b/Documentation/memory-barriers.txt
> >>>> @@ -806,6 +806,41 @@ out-guess your code.  More generally, although READ_ONCE() does force
> >>>>  the compiler to actually emit code for a given load, it does not force
> >>>>  the compiler to use the results.
> >>>>  
> >>>> +In addition, control dependencies apply only to the then-clause and
> >>>> +else-clause of the if-statement in question.  In particular, it does
> >>>> +not necessarily apply to code following the if-statement:
> >>>> +
> >>>> +	q = READ_ONCE(a);
> >>>> +	if (q) {
> >>>> +		WRITE_ONCE(b, p);
> >>>> +	} else {
> >>>> +		WRITE_ONCE(b, r);
> >>>> +	}
> >>>> +	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
> >>>> +
> >>>> +It is tempting to argue that there in fact is ordering because the
> >>>> +compiler cannot reorder volatile accesses and also cannot reorder
> >>>> +the writes to "b" with the condition.  Unfortunately for this line
> >>>> +of reasoning, the compiler might compile the two writes to "b" as
> >>>> +conditional-move instructions, as in this fanciful pseudo-assembly
> >>>> +language:
> >>
> >> I wonder if we already guarantee by kernel compiler settings that this
> >> behavior is not allowed by at least gcc.
> >>
> >> We unconditionally set --param allow-store-data-races=0 which should
> >> actually prevent gcc from generating such conditional stores.
> >>
> >> Am I seeing this correct here?
> > 
> > In this case, the store to "c" is unconditional, so pulling it forward
> > would not generate a data race.  However, the compiler is still prohibited
> > from pulling it forward because it is not allowed to reorder volatile
> > references.  So, yes, the compiler cannot reorder, but for a different
> > reason.
> > 
> > Some CPUs, on the other hand, can do this reordering, as Will Deacon
> > pointed out earlier in this thread.
> 
> Sorry, to follow-up again on this. Will Deacon's comments were about
> conditional-move instructions, which this compiler-option would prevent,
> as far as I can see it.

According to this email thread, I believe that this works the other
way around:

http://thread.gmane.org/gmane.linux.kernel/1721993

That parameter prevents the compiler from converting a conditional
store into an unconditional store, which would be really problematic.
Give the current kernel build, I believe that the compiler really is
within its rights to use conditional-move instructions as shown above.
But I again must defer to Will Deacon on the details.

Or am I misinterpreting that email thread?

>                         Thus I couldn't follow your answer completely:
> 
> The writes to b would be non-conditional-moves with a control dependency
> from a and and edge down to the write to c, which obviously is
> non-conditional. As such in terms of dependency ordering, we would have
> the control dependency always, thus couldn't we assume that in a current
> kernel we always have a load(a)->store(c) requirement?

I agree that if the compiler uses the normal comparisons and conditional
branches, and if the hardware is not excessively clever (bad bet, by the
way, long term), then the load from "a" should not be reordered with
the store to "c".

> Is there something else than conditional move instructions that could
> come to play here? Obviously a much smarter CPU could evaluate all the
> jumps and come to the conclusion that the write to c is never depending
> on the load from a, but is this implemented somewhere in hardware?

I don't know of any hardware that does that, but given that conditional
moves are supported by some weakly ordered hardware, it looks to me
that we are stuck with the possibility of "a"-"c" reordering.

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra June 7, 2016, 5:48 p.m. UTC | #6
On Tue, Jun 07, 2016 at 08:23:15AM -0700, Paul E. McKenney wrote:
> and if the hardware is not excessively clever (bad bet, by the
> way, long term),

This ^

> > Is there something else than conditional move instructions that could
> > come to play here? Obviously a much smarter CPU could evaluate all the
> > jumps and come to the conclusion that the write to c is never depending
> > on the load from a, but is this implemented somewhere in hardware?
> 
> I don't know of any hardware that does that, but given that conditional
> moves are supported by some weakly ordered hardware, it looks to me
> that we are stuck with the possibility of "a"-"c" reordering.

Is why I'm scared of relying on the non-condition.

The if and else branches are obviously dependent on the completion of
the load; anything after that, not so much.

You could construct an argument against this program order speculation
based on interrupts, which should not observe the stores out of order
etc.. but if the hardware is that clever, it can also abort the entire
speculation on interrupt (much like hardware transactions already can).

So even if today no hardware is this clever (and that isn't proven)
there's nothing saying it will not ever be.

This is why I really do not want to advertise and or rely on this
behaviour.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Will Deacon June 7, 2016, 6:01 p.m. UTC | #7
On Tue, Jun 07, 2016 at 08:23:15AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 07, 2016 at 04:59:02PM +0200, Hannes Frederic Sowa wrote:
> > Sorry, to follow-up again on this. Will Deacon's comments were about
> > conditional-move instructions, which this compiler-option would prevent,
> > as far as I can see it.
> 
> According to this email thread, I believe that this works the other
> way around:
> 
> http://thread.gmane.org/gmane.linux.kernel/1721993
> 
> That parameter prevents the compiler from converting a conditional
> store into an unconditional store, which would be really problematic.
> Give the current kernel build, I believe that the compiler really is
> within its rights to use conditional-move instructions as shown above.
> But I again must defer to Will Deacon on the details.

A multi_v7_defconfig build of mainline certainly spits out conditional
store instructions, but I have no idea whether these correspond to
WRITE_ONCE or not:

$ objdump -d vmlinux | grep 'str\(eq\|ne\)' | wc -l
7326

At the end of the day, the ARM architecture says you can't rely on this
being ordered and I can see it happening in practice in the face of
conditional stores.

Will
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Frederic Sowa June 7, 2016, 6:37 p.m. UTC | #8
On 07.06.2016 17:23, Paul E. McKenney wrote:
> On Tue, Jun 07, 2016 at 04:59:02PM +0200, Hannes Frederic Sowa wrote:
>> On 07.06.2016 15:06, Paul E. McKenney wrote:
>>> On Tue, Jun 07, 2016 at 02:41:44PM +0200, Hannes Frederic Sowa wrote:
>>>> On 07.06.2016 09:15, Peter Zijlstra wrote:
>>>>>>
>>>>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>>>>>> index 147ae8ec836f..a4d0a99de04d 100644
>>>>>> --- a/Documentation/memory-barriers.txt
>>>>>> +++ b/Documentation/memory-barriers.txt
>>>>>> @@ -806,6 +806,41 @@ out-guess your code.  More generally, although READ_ONCE() does force
>>>>>>  the compiler to actually emit code for a given load, it does not force
>>>>>>  the compiler to use the results.
>>>>>>  
>>>>>> +In addition, control dependencies apply only to the then-clause and
>>>>>> +else-clause of the if-statement in question.  In particular, it does
>>>>>> +not necessarily apply to code following the if-statement:
>>>>>> +
>>>>>> +	q = READ_ONCE(a);
>>>>>> +	if (q) {
>>>>>> +		WRITE_ONCE(b, p);
>>>>>> +	} else {
>>>>>> +		WRITE_ONCE(b, r);
>>>>>> +	}
>>>>>> +	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
>>>>>> +
>>>>>> +It is tempting to argue that there in fact is ordering because the
>>>>>> +compiler cannot reorder volatile accesses and also cannot reorder
>>>>>> +the writes to "b" with the condition.  Unfortunately for this line
>>>>>> +of reasoning, the compiler might compile the two writes to "b" as
>>>>>> +conditional-move instructions, as in this fanciful pseudo-assembly
>>>>>> +language:
>>>>
>>>> I wonder if we already guarantee by kernel compiler settings that this
>>>> behavior is not allowed by at least gcc.
>>>>
>>>> We unconditionally set --param allow-store-data-races=0 which should
>>>> actually prevent gcc from generating such conditional stores.
>>>>
>>>> Am I seeing this correct here?
>>>
>>> In this case, the store to "c" is unconditional, so pulling it forward
>>> would not generate a data race.  However, the compiler is still prohibited
>>> from pulling it forward because it is not allowed to reorder volatile
>>> references.  So, yes, the compiler cannot reorder, but for a different
>>> reason.
>>>
>>> Some CPUs, on the other hand, can do this reordering, as Will Deacon
>>> pointed out earlier in this thread.
>>
>> Sorry, to follow-up again on this. Will Deacon's comments were about
>> conditional-move instructions, which this compiler-option would prevent,
>> as far as I can see it.
> 
> According to this email thread, I believe that this works the other
> way around:
> 
> http://thread.gmane.org/gmane.linux.kernel/1721993
> 
> That parameter prevents the compiler from converting a conditional
> store into an unconditional store, which would be really problematic.
> Give the current kernel build, I believe that the compiler really is
> within its rights to use conditional-move instructions as shown above.
> But I again must defer to Will Deacon on the details.
> 
> Or am I misinterpreting that email thread?

Thanks, Paul!

Based on the description in the thread above, it makes perfectly sense
what you wrote. Sorry for the noise.

>>                         Thus I couldn't follow your answer completely:
>>
>> The writes to b would be non-conditional-moves with a control dependency
>> from a and and edge down to the write to c, which obviously is
>> non-conditional. As such in terms of dependency ordering, we would have
>> the control dependency always, thus couldn't we assume that in a current
>> kernel we always have a load(a)->store(c) requirement?
> 
> I agree that if the compiler uses the normal comparisons and conditional
> branches, and if the hardware is not excessively clever (bad bet, by the
> way, long term), then the load from "a" should not be reordered with
> the store to "c".
> 
>> Is there something else than conditional move instructions that could
>> come to play here? Obviously a much smarter CPU could evaluate all the
>> jumps and come to the conclusion that the write to c is never depending
>> on the load from a, but is this implemented somewhere in hardware?
> 
> I don't know of any hardware that does that, but given that conditional
> moves are supported by some weakly ordered hardware, it looks to me
> that we are stuck with the possibility of "a"-"c" reordering.

I totally agree.

Thanks,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney June 7, 2016, 6:44 p.m. UTC | #9
On Tue, Jun 07, 2016 at 07:48:53PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 07, 2016 at 08:23:15AM -0700, Paul E. McKenney wrote:
> > and if the hardware is not excessively clever (bad bet, by the
> > way, long term),
> 
> This ^
> 
> > > Is there something else than conditional move instructions that could
> > > come to play here? Obviously a much smarter CPU could evaluate all the
> > > jumps and come to the conclusion that the write to c is never depending
> > > on the load from a, but is this implemented somewhere in hardware?
> > 
> > I don't know of any hardware that does that, but given that conditional
> > moves are supported by some weakly ordered hardware, it looks to me
> > that we are stuck with the possibility of "a"-"c" reordering.
> 
> Is why I'm scared of relying on the non-condition.
> 
> The if and else branches are obviously dependent on the completion of
> the load; anything after that, not so much.
> 
> You could construct an argument against this program order speculation
> based on interrupts, which should not observe the stores out of order
> etc.. but if the hardware is that clever, it can also abort the entire
> speculation on interrupt (much like hardware transactions already can).
> 
> So even if today no hardware is this clever (and that isn't proven)
> there's nothing saying it will not ever be.
> 
> This is why I really do not want to advertise and or rely on this
> behaviour.

What Peter said!  ;-)

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney June 7, 2016, 6:44 p.m. UTC | #10
On Tue, Jun 07, 2016 at 07:01:07PM +0100, Will Deacon wrote:
> On Tue, Jun 07, 2016 at 08:23:15AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 07, 2016 at 04:59:02PM +0200, Hannes Frederic Sowa wrote:
> > > Sorry, to follow-up again on this. Will Deacon's comments were about
> > > conditional-move instructions, which this compiler-option would prevent,
> > > as far as I can see it.
> > 
> > According to this email thread, I believe that this works the other
> > way around:
> > 
> > http://thread.gmane.org/gmane.linux.kernel/1721993
> > 
> > That parameter prevents the compiler from converting a conditional
> > store into an unconditional store, which would be really problematic.
> > Give the current kernel build, I believe that the compiler really is
> > within its rights to use conditional-move instructions as shown above.
> > But I again must defer to Will Deacon on the details.
> 
> A multi_v7_defconfig build of mainline certainly spits out conditional
> store instructions, but I have no idea whether these correspond to
> WRITE_ONCE or not:
> 
> $ objdump -d vmlinux | grep 'str\(eq\|ne\)' | wc -l
> 7326
> 
> At the end of the day, the ARM architecture says you can't rely on this
> being ordered and I can see it happening in practice in the face of
> conditional stores.

Thank you for the info, Will!

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney June 7, 2016, 6:54 p.m. UTC | #11
On Tue, Jun 07, 2016 at 07:01:07PM +0100, Will Deacon wrote:
> On Tue, Jun 07, 2016 at 08:23:15AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 07, 2016 at 04:59:02PM +0200, Hannes Frederic Sowa wrote:
> > > Sorry, to follow-up again on this. Will Deacon's comments were about
> > > conditional-move instructions, which this compiler-option would prevent,
> > > as far as I can see it.
> > 
> > According to this email thread, I believe that this works the other
> > way around:
> > 
> > http://thread.gmane.org/gmane.linux.kernel/1721993
> > 
> > That parameter prevents the compiler from converting a conditional
> > store into an unconditional store, which would be really problematic.
> > Give the current kernel build, I believe that the compiler really is
> > within its rights to use conditional-move instructions as shown above.
> > But I again must defer to Will Deacon on the details.
> 
> A multi_v7_defconfig build of mainline certainly spits out conditional
> store instructions, but I have no idea whether these correspond to
> WRITE_ONCE or not:
> 
> $ objdump -d vmlinux | grep 'str\(eq\|ne\)' | wc -l
> 7326
> 
> At the end of the day, the ARM architecture says you can't rely on this
> being ordered and I can see it happening in practice in the face of
> conditional stores.

Thank you for the information, Will!

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 147ae8ec836f..a4d0a99de04d 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -806,6 +806,41 @@  out-guess your code.  More generally, although READ_ONCE() does force
 the compiler to actually emit code for a given load, it does not force
 the compiler to use the results.
 
+In addition, control dependencies apply only to the then-clause and
+else-clause of the if-statement in question.  In particular, it does
+not necessarily apply to code following the if-statement:
+
+	q = READ_ONCE(a);
+	if (q) {
+		WRITE_ONCE(b, p);
+	} else {
+		WRITE_ONCE(b, r);
+	}
+	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
+
+It is tempting to argue that there in fact is ordering because the
+compiler cannot reorder volatile accesses and also cannot reorder
+the writes to "b" with the condition.  Unfortunately for this line
+of reasoning, the compiler might compile the two writes to "b" as
+conditional-move instructions, as in this fanciful pseudo-assembly
+language:
+
+	ld r1,a
+	ld r2,p
+	ld r3,r
+	cmp r1,$0
+	cmov,ne r4,r2
+	cmov,eq r4,r3
+	st r4,b
+	st $1,c
+
+A weakly ordered CPU would have no dependency of any sort between the load
+from "a" and the store to "c".  The control dependencies would extend
+only to the pair of cmov instructions and the store depending on them.
+In short, control dependencies apply only to the stores in the then-clause
+and else-clause of the if-statement in question (including functions
+invoked by those two clauses), not to code following that if-statement.
+
 Finally, control dependencies do -not- provide transitivity.  This is
 demonstrated by two related examples, with the initial values of
 x and y both being zero:
@@ -869,6 +904,12 @@  In summary:
       atomic{,64}_read() can help to preserve your control dependency.
       Please see the COMPILER BARRIER section for more information.
 
+  (*) Control dependencies apply only to the then-clause and else-clause
+      of the if-statement containing the control dependency, including
+      any functions that these two clauses call.  Control dependencies
+      do -not- apply to code following the if-statement containing the
+      control dependency.
+
   (*) Control dependencies pair normally with other types of barriers.
 
   (*) Control dependencies do -not- provide transitivity.  If you