diff mbox

Disallow side-effects in inline asm operands unless (a new) _ constraint modifier is used (PR middle-end/44492)

Message ID 20100611180013.GC7811@tyan-ft48-01.lab.bos.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek June 11, 2010, 6 p.m. UTC
Hi!

As discussed in the PR, currently auto-inc-dec.c happily propagates
{PRE,POST}_{INC,DEC_MODIFY} side-effects into inline asm memory operands
when constraints like "g" or "m" are used.
Unfortunately that means a lot of inline asms out there work by purse
accident - if a mem operand can have a side-effect, special case
needs to be done on the inline asm pattern side.  In particular, it has
to be used exactly once in an instruction that is capable of handling
side-effects and on many targets needs to be accompanied by special
modifier too (e.g. on ppc/ppc64 it needs to be used in insn like
lwz%U1 %0,%1, if the %U1 part is omitted, code is silently miscompiled).
A lot of inline asms I've grepped for don't use "m" or "=m" operands
anywhere, many look e.g. like
asm ("... %2 ... " : "=m" (*p) : "m" (*p), "r" (p));
and the "m" and "=m" operands are there just to tell GCC that the insn
(might) reads or writes that memory.  If the memory operand with
side-effects isn't used at all, the side-effects don't happen, if it is used
more than once, they happen multiple times.

The following patch disallows side-effects in inline asm operands unless
specifically allowed in source code ( _ constraint modifier says this
operand obeys all the rules that it can contain side-effects).

Bootstrapped/regtested on x86_64-linux and i686-linux (yeah, I know, they
aren't AUTO_INC_DEC) and tested on powerpc64-linux cross.

Ok for trunk?

2010-06-10  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/44492
	* recog.h (struct recog_data): Add is_asm field.
	* recog.c (asm_operand_ok, constrain_operands): Handle _ constraint
	modifier.  If it is not present and operand contains
	{PRE,POST}_{INC,DEC,MODIFY}, return 0.
	(extract_insn): Initialize recog_data.is_asm.
	(preprocess_constraints): Handle _ constraint modifier.
	* ira-costs.c (record_reg_classes): Likewise.
	* ira-lives.c (single_reg_class): Likewise.
	* stmt.c (parse_output_constraint, parse_input_constraint): Likewise.
	* postreload.c (reload_cse_simplify_operands): Likewise.
	* reload.c (find_reloads): Likewise.
	* reload1.c (maybe_fix_stack_asms): Likewise.
	* doc/md.texi (Modifiers): Document _ constraint modifier.

	* g++.dg/torture/pr44492.C: New test.


	Jakub

Comments

Hans-Peter Nilsson June 20, 2010, 4:29 a.m. UTC | #1
On Fri, 11 Jun 2010, Jakub Jelinek wrote:
> The following patch disallows side-effects in inline asm operands unless
> specifically allowed in source code ( _ constraint modifier says this
> operand obeys all the rules that it can contain side-effects).

'<' and '>' do too, so...

> --- gcc/recog.c.jj	2010-06-10 19:32:00.000000000 +0200
> +++ gcc/recog.c	2010-06-11 10:14:55.000000000 +0200
> @@ -1629,6 +1632,12 @@ asm_operand_ok (rtx op, const char *cons
>  	case '?':
>  	  break;
>
> +	case '_':
> +#ifdef AUTO_INC_DEC
> +	  incdec_ok = true;

...please, also set incdec_ok = 1 for '<' and '>'.

(FWIW, I don't mind if you drop that new '_', it just looks
weird and wasting one good letter.)

brgds, H-P
Richard Sandiford June 20, 2010, 11:08 a.m. UTC | #2
Jakub Jelinek <jakub@redhat.com> writes:
> As discussed in the PR, currently auto-inc-dec.c happily propagates
> {PRE,POST}_{INC,DEC_MODIFY} side-effects into inline asm memory operands
> when constraints like "g" or "m" are used.
> Unfortunately that means a lot of inline asms out there work by purse
> accident - if a mem operand can have a side-effect, special case
> needs to be done on the inline asm pattern side.  In particular, it has
> to be used exactly once in an instruction that is capable of handling
> side-effects and on many targets needs to be accompanied by special
> modifier too (e.g. on ppc/ppc64 it needs to be used in insn like
> lwz%U1 %0,%1, if the %U1 part is omitted, code is silently miscompiled).
> A lot of inline asms I've grepped for don't use "m" or "=m" operands
> anywhere, many look e.g. like
> asm ("... %2 ... " : "=m" (*p) : "m" (*p), "r" (p));
> and the "m" and "=m" operands are there just to tell GCC that the insn
> (might) reads or writes that memory.  If the memory operand with
> side-effects isn't used at all, the side-effects don't happen, if it is used
> more than once, they happen multiple times.

For the record, here's some previous discussion about this:

    http://gcc.gnu.org/ml/gcc-patches/2009-07/msg00567.html

The patch in that link isn't interesting, because I agree your
approach is better.  It was more the ensuing thread.

Richard
Jakub Jelinek June 20, 2010, 11:50 a.m. UTC | #3
On Sun, Jun 20, 2010 at 12:29:30AM -0400, Hans-Peter Nilsson wrote:
> On Fri, 11 Jun 2010, Jakub Jelinek wrote:
> > --- gcc/recog.c.jj	2010-06-10 19:32:00.000000000 +0200
> > +++ gcc/recog.c	2010-06-11 10:14:55.000000000 +0200
> > @@ -1629,6 +1632,12 @@ asm_operand_ok (rtx op, const char *cons
> >  	case '?':
> >  	  break;
> >
> > +	case '_':
> > +#ifdef AUTO_INC_DEC
> > +	  incdec_ok = true;
> 
> ...please, also set incdec_ok = 1 for '<' and '>'.

Yeah, I can do that.

> (FWIW, I don't mind if you drop that new '_', it just looks
> weird and wasting one good letter.)

_ isn't redundant, < and > require PRE/POST_DEC/INC,
while _ allows it but doesn't require (and, allows also PRE/POST_MODIFY).

	Jakub
Hans-Peter Nilsson June 20, 2010, 1:20 p.m. UTC | #4
On Sun, 20 Jun 2010, Jakub Jelinek wrote:
> On Sun, Jun 20, 2010 at 12:29:30AM -0400, Hans-Peter Nilsson wrote:
> > ...please, also set incdec_ok = 1 for '<' and '>'.
>
> Yeah, I can do that.
Thanks.

> _ isn't redundant, < and > require PRE/POST_DEC/INC,

The doc patch could maybe say so explicitly?
(It *does* says side-effect but mentions just INC/DEC.)

> while _ allows it but doesn't require (and, allows also PRE/POST_MODIFY).

So "_" would be less restrictive than "m<>".
BTW, is it "_m" or "_"?

brgds, H-P
Jakub Jelinek June 20, 2010, 2 p.m. UTC | #5
On Sun, Jun 20, 2010 at 09:20:15AM -0400, Hans-Peter Nilsson wrote:
> > while _ allows it but doesn't require (and, allows also PRE/POST_MODIFY).
> 
> So "_" would be less restrictive than "m<>".
> BTW, is it "_m" or "_"?

It is "_m", or "_g", or "_o" etc.  The address mode needs to pass the given
constraint test and missing _ just means that in addition to that test
side-effects are rejected.

For PRE_MODIFY/POST_MODIFY, targets often don't allow arbitrary adjustment
constants and have other restrictions, so we just can't allow those two in
< (or >) constraint (and which one?) without having some target constrain
routine for that.

With < and > in inline-asm, what worries me is that before reload
asm_operand_ok doesn't require anything for the mem, so reload often has to
fix it up (could result in worse code).

Anyway, I guess I can live with not adding _ and saying that side-effects in
inline-asm memory is not allowed unless the constraint contains < or >.
Then PRE_MODIFY resp. POST_MODIFY would be allowed in
"m<>" or even just "m<" or "m>", assuming PRE_MODIFY resp. POST_MODIFY
is allowed in memory_address.

The implementation would be roughly the same as the patch I've posted, just
</> would set incdec_ok instead of _.

Advantage of doing it that way is backwards compatibility, "m<>" can be used
even with older gccs.

If you prefer to do it that way, I can write a patch.

	Jakub
Hans-Peter Nilsson June 20, 2010, 3:16 p.m. UTC | #6
On Sun, 20 Jun 2010, Jakub Jelinek wrote:
> Anyway, I guess I can live with not adding _ and saying that side-effects in
> inline-asm memory is not allowed unless the constraint contains < or >.
> Then PRE_MODIFY resp. POST_MODIFY would be allowed in
> "m<>" or even just "m<" or "m>", assuming PRE_MODIFY resp. POST_MODIFY
> is allowed in memory_address.
>
> The implementation would be roughly the same as the patch I've posted, just
> </> would set incdec_ok instead of _.

And a much shorter documentation patch! :)
(I guess mentioning that asm operands will not have side-effects
unless < or > are explicitly mentioned would be right.)

> Advantage of doing it that way is backwards compatibility, "m<>" can be used
> even with older gccs.
>
> If you prefer to do it that way, I can write a patch.

IMHO that's preferable, unless there's a known specific need for
what "_m" brings in addition to "m<>" as suggested above.

brgds, H-P
diff mbox

Patch

--- gcc/reload1.c.jj	2010-06-10 19:32:01.000000000 +0200
+++ gcc/reload1.c	2010-06-11 10:00:12.000000000 +0200
@@ -1414,6 +1414,7 @@  maybe_fix_stack_asms (void)
 		case '>': case 'V': case 'o': case '&': case 'E': case 'F':
 		case 's': case 'i': case 'n': case 'X': case 'I': case 'J':
 		case 'K': case 'L': case 'M': case 'N': case 'O': case 'P':
+		case '_':
 		case TARGET_MEM_CONSTRAINT:
 		  break;
 
--- gcc/recog.h.jj	2010-05-13 12:21:21.000000000 +0200
+++ gcc/recog.h	2010-06-11 10:05:09.000000000 +0200
@@ -230,6 +230,9 @@  struct recog_data
   /* The number of alternatives in the constraints for the insn.  */
   char n_alternatives;
 
+  /* True if insn is ASM_OPERANDS.  */
+  bool is_asm;
+
   /* Specifies whether an insn alternative is enabled using the
      `enabled' attribute in the insn pattern definition.  For back
      ends not using the `enabled' attribute the array fields are
--- gcc/postreload.c.jj	2010-06-10 19:32:00.000000000 +0200
+++ gcc/postreload.c	2010-06-11 10:00:11.000000000 +0200
@@ -537,7 +537,7 @@  reload_cse_simplify_operands (rtx insn, 
 		{
 		case '=':  case '+':  case '?':
 		case '#':  case '&':  case '!':
-		case '*':  case '%':
+		case '*':  case '%':  case '_':
 		case '0':  case '1':  case '2':  case '3':  case '4':
 		case '5':  case '6':  case '7':  case '8':  case '9':
 		case '<':  case '>':  case 'V':  case 'o':
--- gcc/recog.c.jj	2010-06-10 19:32:00.000000000 +0200
+++ gcc/recog.c	2010-06-11 10:14:55.000000000 +0200
@@ -1601,6 +1601,9 @@  int
 asm_operand_ok (rtx op, const char *constraint, const char **constraints)
 {
   int result = 0;
+#ifdef AUTO_INC_DEC
+  bool incdec_ok = false;
+#endif
 
   /* Use constrain_operands after reload.  */
   gcc_assert (!reload_completed);
@@ -1608,7 +1611,7 @@  asm_operand_ok (rtx op, const char *cons
   /* Empty constraint string is the same as "X,...,X", i.e. X for as
      many alternatives as required to match the other operands.  */
   if (*constraint == '\0')
-    return 1;
+    result = 1;
 
   while (*constraint)
     {
@@ -1629,6 +1632,12 @@  asm_operand_ok (rtx op, const char *cons
 	case '?':
 	  break;
 
+	case '_':
+#ifdef AUTO_INC_DEC
+	  incdec_ok = true;
+#endif
+	  break;
+
 	case '0': case '1': case '2': case '3': case '4':
 	case '5': case '6': case '7': case '8': case '9':
 	  /* If caller provided constraints pointer, look up
@@ -1814,6 +1823,23 @@  asm_operand_ok (rtx op, const char *cons
 	return 0;
     }
 
+#ifdef AUTO_INC_DEC
+  /* For operands without _ constraint modifier reject side-effects.  */
+  if (!incdec_ok && result && MEM_P (op))
+    switch (GET_CODE (XEXP (op, 0)))
+      {
+      case PRE_INC:
+      case POST_INC:
+      case PRE_DEC:
+      case POST_DEC:
+      case PRE_MODIFY:
+      case POST_MODIFY:
+	return 0;
+      default:
+	break;
+      }
+#endif
+
   return result;
 }
 
@@ -2039,6 +2065,7 @@  extract_insn (rtx insn)
   recog_data.n_operands = 0;
   recog_data.n_alternatives = 0;
   recog_data.n_dups = 0;
+  recog_data.is_asm = false;
 
   switch (GET_CODE (body))
     {
@@ -2085,6 +2112,7 @@  extract_insn (rtx insn)
 	      while (*p)
 		recog_data.n_alternatives += (*p++ == ',');
 	    }
+	  recog_data.is_asm = true;
 	  break;
 	}
       fatal_insn_not_found (insn);
@@ -2198,6 +2226,7 @@  preprocess_constraints (void)
 		case 's': case 'i': case 'n':
 		case 'I': case 'J': case 'K': case 'L':
 		case 'M': case 'N': case 'O': case 'P':
+		case '_':
 		  /* These don't say anything we care about.  */
 		  break;
 
@@ -2396,7 +2425,7 @@  constrain_operands (int strict)
 		break;
 
 	      case '?':  case '!': case '*':  case '%':
-	      case '=':  case '+':
+	      case '=':  case '+': case '_':
 		break;
 
 	      case '#':
@@ -2699,6 +2728,29 @@  constrain_operands (int strict)
 		    = recog_data.operand[funny_match[funny_match_index].this_op];
 		}
 
+#ifdef AUTO_INC_DEC
+	      /* For operands without _ constraint modifier reject
+		 side-effects.  */
+	      if (recog_data.is_asm)
+		{
+		  for (opno = 0; opno < recog_data.n_operands; opno++)
+		    if (MEM_P (recog_data.operand[opno]))
+		      switch (GET_CODE (XEXP (recog_data.operand[opno], 0)))
+			{
+			case PRE_INC:
+			case POST_INC:
+			case PRE_DEC:
+			case POST_DEC:
+			case PRE_MODIFY:
+			case POST_MODIFY:
+			  if (strchr (recog_data.constraints[opno], '_') == 0)
+			    return 0;
+			  break;
+			default:
+			  break;
+			}
+		}
+#endif
 	      return 1;
 	    }
 	}
--- gcc/stmt.c.jj	2010-06-10 19:32:01.000000000 +0200
+++ gcc/stmt.c	2010-06-11 10:00:11.000000000 +0200
@@ -369,7 +369,7 @@  parse_output_constraint (const char **co
       case 'E':  case 'F':  case 'G':  case 'H':
       case 's':  case 'i':  case 'n':
       case 'I':  case 'J':  case 'K':  case 'L':  case 'M':
-      case 'N':  case 'O':  case 'P':  case ',':
+      case 'N':  case 'O':  case 'P':  case ',':  case '_':
 	break;
 
       case '0':  case '1':  case '2':  case '3':  case '4':
@@ -465,7 +465,7 @@  parse_input_constraint (const char **con
 	break;
 
       case '<':  case '>':
-      case '?':  case '!':  case '*':  case '#':
+      case '?':  case '!':  case '*':  case '#':  case '_':
       case 'E':  case 'F':  case 'G':  case 'H':
       case 's':  case 'i':  case 'n':
       case 'I':  case 'J':  case 'K':  case 'L':  case 'M':
--- gcc/ira-costs.c.jj	2010-06-10 19:32:01.000000000 +0200
+++ gcc/ira-costs.c	2010-06-11 10:00:11.000000000 +0200
@@ -398,7 +398,7 @@  record_reg_classes (int n_alts, int n_op
 
 		case '?':
 		  alt_cost += 2;
-		case '!':  case '#':  case '&':
+		case '!':  case '#':  case '&':  case '_':
 		case '0':  case '1':  case '2':  case '3':  case '4':
 		case '5':  case '6':  case '7':  case '8':  case '9':
 		  break;
--- gcc/reload.c.jj	2010-06-10 19:32:00.000000000 +0200
+++ gcc/reload.c	2010-06-11 10:00:11.000000000 +0200
@@ -3106,7 +3106,7 @@  find_reloads (rtx insn, int replace, int
 		c = '\0';
 		break;
 
-	      case '=':  case '+':  case '*':
+	      case '=':  case '+':  case '*':  case '_':
 		break;
 
 	      case '%':
--- gcc/doc/md.texi.jj	2010-06-11 09:38:04.000000000 +0200
+++ gcc/doc/md.texi	2010-06-11 14:37:13.000000000 +0200
@@ -1595,6 +1595,21 @@  Says that the following character should
 register preferences.  @samp{*} has no effect on the meaning of the
 constraint as a constraint, and no effect on reloading.
 
+@cindex @samp{_} in constraint
+@item _
+Means that this operand in inline asm can contain side-effects.
+By default side-effects in operands of inline asm are disallowed,
+because there is no guarantee that the operands are used exactly once
+for the inline asm in an instruction that can update the side-effects
+(pre- or post-increment or decrement of a register used in memory
+operand addressing).
+When @samp{_} is added, the operand must be used exactly once in an
+instruction that will do the update and any target specific means
+of ensuring that the side-effects happen in an instruction are present.
+E.g. on PowerPC that means using the operand with @samp{_} constraint
+modifier only in load or store instructions together with @samp{%U}@var{N}
+suffix for the instruction, etc.
+
 @ifset INTERNALS
 Here is an example: the 68000 has an instruction to sign-extend a
 halfword in a data register, and can also sign-extend a value by
--- gcc/ira-lives.c.jj	2010-06-10 19:32:00.000000000 +0200
+++ gcc/ira-lives.c	2010-06-11 10:00:11.000000000 +0200
@@ -640,6 +640,7 @@  single_reg_class (const char *constraint
 	case '%':
 	case '!':
 	case '?':
+	case '_':
 	  break;
 	case 'i':
 	  if (CONSTANT_P (op)
--- gcc/testsuite/g++.dg/torture/pr44492.C.jj	2010-06-11 10:00:12.000000000 +0200
+++ gcc/testsuite/g++.dg/torture/pr44492.C	2010-06-11 10:00:12.000000000 +0200
@@ -0,0 +1,31 @@ 
+// PR middle-end/44492
+// { dg-do run }
+
+struct T { unsigned long p; };
+struct S { T a, b, c; unsigned d; };
+
+__attribute__((noinline))
+void
+bar (const T &x, const T &y)
+{
+  if (x.p != 0x2348 || y.p != 0x2346)
+    __builtin_abort ();
+}
+
+__attribute__((noinline))
+void
+foo (S &s, T e)
+{
+  unsigned long a = e.p;
+  unsigned long b = s.b.p;
+  __asm__ volatile ("" : : "rm" (a), "rm" (b));
+  bar (e, s.b);
+}
+
+int
+main ()
+{
+  S s = { { 0x2345 }, { 0x2346 }, { 0x2347 }, 6 };
+  T t = { 0x2348 };
+  foo (s, t);
+}