diff mbox

[gimple] assignments to volatile

Message ID 4C209B90.6090703@codesourcery.com
State New
Headers show

Commit Message

Nathan Sidwell June 22, 2010, 11:16 a.m. UTC
On 06/21/10 13:07, Richard Guenther wrote:

> Hm, TREE_THIS_VOLATILE (TREE_TYPE (*to_p)) should
> certainly be TREE_THIS_VOLATILE (*to_p), and all this
> should be done after gimplifying *to_p.  I think even just
> before we build the gimple assign/call do
...

Thanks!  here's a tested patch that implements it the way you suggest.

Now, as for the semantics we want.  I disagree with Michael's opinion.

Firstly, C and C++ differ in their std's description of the semantics of an 
expression cast to void (implicitly or explicitly).  The C std essentially says 
the value is discarded.  The C++ std specifies that no lvalue to rvalue 
transformation occurs, and it is this transformation that causes an lvalue 
object to be read.  There was a long discussion about this several years ago 
when I patched G++ to have somewhat more useful semantics (and warnings) than it 
did have.  That series of patches started when I noticed that in C++ 
'(void)*vol_ptr' did not cause a read of the pointed-to object, but it did in C. 
  The conclusion at that time were that the implemented C semantics were what 
was wanted.  These were:
* after an assignment to a volatile, no re-read occurred.
* a volatile was always read in a void context.

These rules are
* simple
* composable

It means that:

   '*vol_ptr1 = *vol_ptr2 = 0'
doesn't read *vol_ptr2 and therefore assign an unknown value to *vol_ptr1.  It 
also doesn't force any ordering on the two assignments to *vol_ptr1 and 
*vol_ptr2 -- just as would be true if things were not volatile.

   '*vol_ptr1'
where this is not the lhs of an assignment, will read the volatile object being 
pointed to, regardless of whether we use the value or not.

Notice in those examples that I have statement fragments.  The semantics are the 
same regardless of the context in which those fragments occur.

IIUC, Michael's desired semantics are
*) after an assignment, the value is re-read iff the assignment's value is used.
*) a volatile object is not read in a void context.
*) a volatile sub-expression is read or re-read if the value is used by the 
containing expression [I am unsure if this is Michael's semantics or not, please 
correct me if I'm wrong]

I am unsure whether the behaviour Michael would like is dependent on 
optimization level.  I think that would be a very bad thing, btw.

I think these lead to confusing code sequences, and are not composable. 
Fundamentally the ambiguity comes from specifying when an assignment's value is 
needed.  Without even considering data-flow optimizations that could determine a 
value is unused, we have the following:

   'cond ? (*vol_ptr = v) : other_v;'

The conditional expression has a value, even though it is in a void context. 
Does that mean that the volatile assignment's value is used or not?  Similarly for:
   'something, (*vol_ptr = v);'
or a more complicated variants:
   'something, ((vol_ptr = v), something);'
   '(something, (vol_ptr = v)), something;'

the problem here is that the behaviour of a sub-expression now depends on the 
context in which it is embedded (and that could be quite deeply embedded).  If 
we decide that the above examples do not re-read, because the value is 
ultimately not used, we have the ability for the programmer to really get 
confused during debugging.  For instance, they may rewrite one of these as
    'tmp = cond ? (*vol_ptr = v) : other_v;'
in order to determine the value that was written.  But now we've changed the 
pattern of accesses to the volatile object.  (And by in my recent dive into 
gimplify, we'll have a harder patch to write as we need to track want_result 
through more structures than we do at the moment.)

Should whatever semantics we decide upon be implemented by the gimplifier or by 
the front-end?  I'm not sure, but I do think the gimplifier should specify the 
semantics of volatile accesses, and those semantics should be consistent. 
Currently they are not.

nathan

Comments

Richard Biener June 22, 2010, 12:01 p.m. UTC | #1
On Tue, Jun 22, 2010 at 1:16 PM, Nathan Sidwell <nathan@codesourcery.com> wrote:
> On 06/21/10 13:07, Richard Guenther wrote:
>
>> Hm, TREE_THIS_VOLATILE (TREE_TYPE (*to_p)) should
>> certainly be TREE_THIS_VOLATILE (*to_p), and all this
>> should be done after gimplifying *to_p.  I think even just
>> before we build the gimple assign/call do
>
> ...
>
> Thanks!  here's a tested patch that implements it the way you suggest.
>
> Now, as for the semantics we want.  I disagree with Michael's opinion.
>
> Firstly, C and C++ differ in their std's description of the semantics of an
> expression cast to void (implicitly or explicitly).  The C std essentially
> says the value is discarded.  The C++ std specifies that no lvalue to rvalue
> transformation occurs, and it is this transformation that causes an lvalue
> object to be read.  There was a long discussion about this several years ago
> when I patched G++ to have somewhat more useful semantics (and warnings)
> than it did have.  That series of patches started when I noticed that in C++
> '(void)*vol_ptr' did not cause a read of the pointed-to object, but it did
> in C.  The conclusion at that time were that the implemented C semantics
> were what was wanted.  These were:
> * after an assignment to a volatile, no re-read occurred.
> * a volatile was always read in a void context.
>
> These rules are
> * simple
> * composable
>
> It means that:
>
>  '*vol_ptr1 = *vol_ptr2 = 0'
> doesn't read *vol_ptr2 and therefore assign an unknown value to *vol_ptr1.
>  It also doesn't force any ordering on the two assignments to *vol_ptr1 and
> *vol_ptr2 -- just as would be true if things were not volatile.
>
>  '*vol_ptr1'
> where this is not the lhs of an assignment, will read the volatile object
> being pointed to, regardless of whether we use the value or not.
>
> Notice in those examples that I have statement fragments.  The semantics are
> the same regardless of the context in which those fragments occur.
>
> IIUC, Michael's desired semantics are
> *) after an assignment, the value is re-read iff the assignment's value is
> used.
> *) a volatile object is not read in a void context.
> *) a volatile sub-expression is read or re-read if the value is used by the
> containing expression [I am unsure if this is Michael's semantics or not,
> please correct me if I'm wrong]
>
> I am unsure whether the behaviour Michael would like is dependent on
> optimization level.  I think that would be a very bad thing, btw.
>
> I think these lead to confusing code sequences, and are not composable.
> Fundamentally the ambiguity comes from specifying when an assignment's value
> is needed.  Without even considering data-flow optimizations that could
> determine a value is unused, we have the following:
>
>  'cond ? (*vol_ptr = v) : other_v;'
>
> The conditional expression has a value, even though it is in a void context.
> Does that mean that the volatile assignment's value is used or not?
>  Similarly for:
>  'something, (*vol_ptr = v);'
> or a more complicated variants:
>  'something, ((vol_ptr = v), something);'
>  '(something, (vol_ptr = v)), something;'
>
> the problem here is that the behaviour of a sub-expression now depends on
> the context in which it is embedded (and that could be quite deeply
> embedded).  If we decide that the above examples do not re-read, because the
> value is ultimately not used, we have the ability for the programmer to
> really get confused during debugging.  For instance, they may rewrite one of
> these as
>   'tmp = cond ? (*vol_ptr = v) : other_v;'
> in order to determine the value that was written.  But now we've changed the
> pattern of accesses to the volatile object.  (And by in my recent dive into
> gimplify, we'll have a harder patch to write as we need to track want_result
> through more structures than we do at the moment.)
>
> Should whatever semantics we decide upon be implemented by the gimplifier or
> by the front-end?  I'm not sure, but I do think the gimplifier should
> specify the semantics of volatile accesses, and those semantics should be
> consistent. Currently they are not.

Thanks for the updated patch.  I will review it once we settled
on the desired semantics.

And yes - the front-ends should make the semantic explicit.
I'd like to get rid of MODIFY_EXPR having a value to avoid these
problems and have the frontends lower MODIFY_EXPRs with
a value by using COMPOUND_EXPRs.

Richard.

> nathan
>
> --
> Nathan Sidwell    ::   http://www.codesourcery.com   ::         CodeSourcery
>
>
Michael Matz June 22, 2010, 12:37 p.m. UTC | #2
Hi,

On Tue, 22 Jun 2010, Nathan Sidwell wrote:

> IIUC, Michael's desired semantics are
> *) after an assignment, the value is re-read iff the assignment's value is
> used.
> *) a volatile object is not read in a void context.
> *) a volatile sub-expression is read or re-read if the value is used by the
> containing expression [I am unsure if this is Michael's semantics or not,
> please correct me if I'm wrong]

I think you're correct.

> I am unsure whether the behaviour Michael would like is dependent on
> optimization level.  I think that would be a very bad thing, btw.

It wouldn't be dependend on the optimization level, that would be 
terrible.  I.e. the "used-by" property would be purely syntactical.

> I think these lead to confusing code sequences, and are not composable.
> Fundamentally the ambiguity comes from specifying when an assignment's value
> is needed.

Indeed.  Both semantics (re-read or no re-read) will generate some 
surprises, mine has the problem of context dependency, yours for instance 
breaks this identity:
  x = vol = y;
<==>
  vol = y;
  x = vol;

I read the standard as requiring this identity to hold (in particular the 
language about values of assignments not being lvalues is only to disallow 
constructs like "(a = b) = c;").  That is my fundamental problem with your 
approach, that it effectively creates this identity:
  x = y    // as fragment
<==>
  (t = y, x = t, t)  // t a new temporary of the unqualified type of x

Note that I don't think this would be bad semantics, to the contrary (I'm 
not particularly glad about the context dependency that my reading 
implies).  I just don't see it justified in the standard.

> pattern of accesses to the volatile object.  (And by in my recent dive into
> gimplify, we'll have a harder patch to write as we need to track want_result
> through more structures than we do at the moment.)

I'd rather think that it's not gimplify's job at all to implement volatile 
semantics.  It's rather the frontend which should make sure all reads and 
writes to volatile objects are emitted as specified in the language 
standard.  gimplify then merely shouldn't change the number and kind of 
accesses to volatile objects (neither should any further transformation).

> Should whatever semantics we decide upon be implemented by the 
> gimplifier or by the front-end?  I'm not sure, but I do think the 
> gimplifier should specify the semantics of volatile accesses, and those 
> semantics should be consistent. Currently they are not.

I have no problem if we specify that we gimplify assignments as per the 
above second identity.  (I just also think that the C frontend needs 
fixing to emit explicit re-reads).


Ciao,
Michael.
Nathan Sidwell June 22, 2010, 1:28 p.m. UTC | #3
On 06/22/10 13:37, Michael Matz wrote:

>> I think these lead to confusing code sequences, and are not composable.
>> Fundamentally the ambiguity comes from specifying when an assignment's value
>> is needed.
>
> Indeed.  Both semantics (re-read or no re-read) will generate some
> surprises, mine has the problem of context dependency,

As this is the case then I think we're looking for the solution with the least 
surprise, and whatever solution we have will be outside the bounds of the std to 
some greater or lesser extent.

> yours for instance
> breaks this identity:
>    x = vol = y;
> <==>
>    vol = y;
>    x = vol;
>
> I read the standard as requiring this identity to hold (in particular the
> language about values of assignments not being lvalues is only to disallow
> constructs like "(a = b) = c;").

Thanks for identifying the crux of the difference.  I don't think that 
transformation is an identity when volatiles are in play.  Volatile objects 
break all sorts of transformations that would otherwise be fine.  For instance 
your semantics make the following transform not an identity:
    tmp = vol;
    // no further use of tmp
<==>
    vol;

I don't see what's special about the transformation you're highlighting as not 
being an identity under one set of semantics.

I'm still having a problem with sequence points.  Your email of yesterday:

 > Now, onto 6.5 #2:
 >
 >    Between the previous and next sequence point an object shall have its
 >    stored value modified at most once by the evaluation of an expression.
 >    Furthermore, the prior value shall be read only to determine the value
 >    to be stored.70)
 >
 > This talks only about the side-effect of changing the stored value (which
 > we do only once) and about the prior value, which we don't use at all.

I had misremembered that piece of the std as saying 'the object's value shall be 
read only ...'.  I.e. no 'prior' in there.  Indeed, the explanation given in the 
C-faq http://linuxdude.com/Steve_Sumit/C-faq/q3.8.html is 'any and all accesses 
to it within the same expression must be for the purposes of computing the value 
to be written'.  I did think the mental model I have of assignments came from 
the C-faq, but I can't find it there.  Here it is, perhaps someone will 
recognize the author?

1) for each assignment, the new value is written on a ball, and thrown at the 
lvalue target.

2) it is guaranteed all balls will arrive at their targets by the next sequence 
point.

3) if more than one ball is aimed at the same target, chaos occurs.

4) if you try and read a target when a ball is in flight for that target, chaos 
occurs.

Your read-after-write interpretation breaks that model (of course the model 
could be wrong)

nathan
Michael Matz June 22, 2010, 2:35 p.m. UTC | #4
Hi,

On Tue, 22 Jun 2010, Nathan Sidwell wrote:

> > Indeed.  Both semantics (re-read or no re-read) will generate some 
> > surprises, mine has the problem of context dependency,
> 
> As this is the case then I think we're looking for the solution with the 
> least surprise, and whatever solution we have will be outside the bounds 
> of the std to some greater or lesser extent.

Indeed (if we determine that the solutions don't violate the standard of 
course).

> > yours for instance
> > breaks this identity:
> >    x = vol = y;
> > <==>
> >    vol = y;
> >    x = vol;
> 
> Thanks for identifying the crux of the difference.  I don't think that 
> transformation is an identity when volatiles are in play.  Volatile 
> objects break all sorts of transformations that would otherwise be fine.  

Note that I'm not seeing the above identity as transformation (that could 
or could not be allowed).  Rather I read the standard as describing the 
semantics of "x = vol = y" in terms of the second sequence (by defining 
how to get to the value of 'vol = y'); as in the second sequence _being_ 
the semantics of the first one.  That reading could be wrong, but it is 
the reason why I think a re-read needs to occur.

> For instance your semantics make the following transform not an 
> identity:

>    tmp = vol;
>    // no further use of tmp
> <==>
>    vol;
> 
> I don't see what's special about the transformation you're highlighting 
> as not being an identity under one set of semantics.

The difference is that I don't see any language in the standard that would 
require these two sequences to be equivalent (much less to define the 
first one in terms of the latter one).

> I'm still having a problem with sequence points.  Your email of yesterday:
> 
> > Now, onto 6.5 #2:
> >
> >    Between the previous and next sequence point an object shall have its
> >    stored value modified at most once by the evaluation of an expression.
> >    Furthermore, the prior value shall be read only to determine the value
> >    to be stored.70)
> >
> > This talks only about the side-effect of changing the stored value (which
> > we do only once) and about the prior value, which we don't use at all.
> 
> I had misremembered that piece of the std as saying 'the object's value shall
> be read only ...'.  I.e. no 'prior' in there.  Indeed, the explanation given
> in the C-faq http://linuxdude.com/Steve_Sumit/C-faq/q3.8.html is 'any and all
> accesses to it within the same expression must be for the purposes of
> computing the value to be written'.

I think that's just a sloppy reformulation of the standard together with 
the fact that accesses to the 'new value' usually don't make much 
difference, except in the presence of volatile.

> 1) for each assignment, the new value is written on a ball, and thrown 
>    at the lvalue target.
> 
> 2) it is guaranteed all balls will arrive at their targets by the next 
>    sequence point.
> 
> 3) if more than one ball is aimed at the same target, chaos occurs.
> 
> 4) if you try and read a target when a ball is in flight for that 
>    target, chaos occurs.
> 
> Your read-after-write interpretation breaks that model (of course the model
> could be wrong)

(I'm of course saying the model is wrong ;-), though perhaps it actually 
isn't and my read-after-write interpretation doesn't break it, read on)

While reading this mental model a different aspect of this whole topic 
came into my mind, namely that there's a implicit ordering of 
subexpressions when values of assignments are involved.  Not quite 
sequence points, but certainly effects that must occur necessarily before 
other effects.  For instance in "x = y = z" it's clear that the assignment 
to y has to happen before the assignment to x (that is because the value 
of y = z is that of y, so if the assignment to y would happen after the 
assignment to x it would access the old value).

Sequence points are there to be able to say something about allowed orders 
of side-effects.  The implicit ordering above when assignment values are 
used also help that goal (on a smaller scale, it for instance doesn't 
impose an ordering on subexpressions not related to the assignment).  
Maybe this explains why I think there's no problem with sequence points.  
Your model would need amendment in point 2), not just mentioning sequence 
points but also these other ordering constraints, for which I don't have a 
good name :)


Ciao,
Michael.
Joseph Myers June 22, 2010, 3:47 p.m. UTC | #5
On Tue, 22 Jun 2010, Michael Matz wrote:

> While reading this mental model a different aspect of this whole topic 
> came into my mind, namely that there's a implicit ordering of 
> subexpressions when values of assignments are involved.  Not quite 

If you look in C1X drafts you will see more of an *explicit* ordering.  
"The side effect of updating the stored value of the left operand is 
sequenced after the value computations of the left and right operands. The 
evaluations of the operands are unsequenced." (N1425 6.5.16#3).

Note that there is nothing about sequencing for any reading of the left 
operand after the store.  There is also nothing about sequencing for the 
conversion of the right operand to the type of the left operand (which 
could cause floating-point exceptions, for example), though I'd interpret 
that as being part of the value computation of the right operand.

As I read it, in "*p = *q = *r = *s;", there is no sequencing among the 
assignments to *p, *q and *r, although they must all occur by the sequence 
point at the end of the statement.  Note what N1252, the paper introducing 
this sequencing model, says: "It is extremely important to note here that 
the assignment is not sequenced after any side effects of the operands, 
but only after their value computations.".

My view is that the value computation of an assignment consists of the 
value computation of both operands including the conversion of the right 
operand to the type of the left operand, and the value should be 
considered to be the result of that conversion, without volatile lvalues 
being reread.  In "A = B = C = D;", the side effects of assignment to A, B 
and C are unsequenced with respect to each other and with respect to any 
other side effects of A, B, C and D except where other sequence points are 
involved (if C involves a call to a function, for example, then all side 
effects in that function are sequenced before the assignments to any of A, 
B and C).
diff mbox

Patch

2010-06-21  Nathan Sidwell  <nathan@codesourcery.com>
	    Richard Guenther  <richard.guenther@gmail.com>

	gcc/
	* gimplify.c (gimplify_modify_expr): When assigning to volatiles,
	copy the src value and return a copy.

	gcc/testsuite/
	* gcc.target/i386/volatile-2.c: New.

Index: gimplify.c
===================================================================
--- gimplify.c	(revision 160876)
+++ gimplify.c	(working copy)
@@ -4572,6 +4572,9 @@ 
       SET_DECL_DEBUG_EXPR (*from_p, *to_p);
    }
 
+  if (want_value && TREE_THIS_VOLATILE (*to_p))
+    *from_p = get_initialized_tmp_var (*from_p, pre_p, post_p);
+
   if (TREE_CODE (*from_p) == CALL_EXPR)
     {
       /* Since the RHS is a CALL_EXPR, we need to create a GIMPLE_CALL
@@ -4599,7 +4602,7 @@ 
 
   if (want_value)
     {
-      *expr_p = unshare_expr (*to_p);
+      *expr_p = TREE_THIS_VOLATILE (*to_p) ? *from_p : unshare_expr (*to_p);
       return GS_OK;
     }
   else
Index: testsuite/gcc.target/i386/volatile-2.c
===================================================================
--- testsuite/gcc.target/i386/volatile-2.c	(revision 0)
+++ testsuite/gcc.target/i386/volatile-2.c	(revision 0)
@@ -0,0 +1,92 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check volatiles are written, read or not re-read consistently */
+
+
+/* simple assignments */
+
+extern int volatile obj_0;
+void test_0 (int data)
+{
+  /* should not reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_0" } } */
+  /* { dg-final { scan-assembler-not "movl\[ \t\]obj_0," } } */
+  obj_0 = data;
+}
+
+extern int volatile obj_1;
+int test_1 (int data)
+{
+  /* should not reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_1" } } */
+  /* { dg-final { scan-assembler-not "movl\[ \t\]obj_1," } } */
+  return obj_1 = data;
+}
+
+extern int volatile obj_2;
+int test_2 (void)
+{
+  /* should not reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_2" } } */
+  /* { dg-final { scan-assembler-not "movl\[ \t\]obj_2," } } */
+  return obj_2 = 0;
+}
+
+
+/* Assignments in compound exprs */
+
+extern int volatile obj_3;
+int test_3 (int data)
+{
+  /* should not reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_3" } } */
+  /* { dg-final { scan-assembler-not "movl\[ \t\]obj_3," } } */
+  return (obj_3 = data, 0);
+}
+
+extern int volatile obj_4;
+int test_4 (void)
+{
+  /* should not reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_4" } } */
+  /* { dg-final { scan-assembler-not "movl\[ \t\]obj_4," } } */
+  return (obj_4 = 0, 0);
+}
+extern int volatile obj_5;
+int test_5 (void)
+{
+  /* should reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_5" } } */
+  /* { dg-final { scan-assembler "movl\[ \t\]obj_5," } } */
+  return (obj_5 = 0, obj_5);
+}
+
+/* Assignments in conditional exprs */
+
+extern int volatile obj_6;
+void test_6 (int data, int cond)
+{
+  /* should not reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_6" } } */
+  /* { dg-final { scan-assembler-not "movl\[ \t\]obj_6," } } */
+  cond ? obj_6 = data : 0;
+}
+
+extern int volatile obj_7;
+int test_7 (int data, int cond)
+{
+  /* should not reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_7" } } */
+  /* { dg-final { scan-assembler-not "movl\[ \t\]obj_7," } } */
+  return cond ? obj_7 = data : 0;
+}
+
+extern int volatile obj_8;
+int test_8 (int cond)
+{
+  /* should not reread obj */
+  /* { dg-final { scan-assembler "movl\[ \t\]\[^,\]+, obj_8" } } */
+  /* { dg-final { scan-assembler-not "movl\[ \t\]obj_8," } } */
+  return cond ? obj_8 = 0 : 0;
+}