diff mbox

Wonly-top-basic-asm

Message ID 56A54EF9.8060006@LimeGreenSocks.com
State New
Headers show

Commit Message

David Wohlferd Jan. 24, 2016, 10:23 p.m. UTC
I'm not sure which 'subsystem maintainer' to include on this as it 
affects parsers for both C and c++.  I've also cc'ed people from the 
discussion thread.

While that ~100 message thread on the gcc list about pr24414 didn't come 
to any final conclusions about clobbering registers for basic asm, there 
were a few things people agreed we could do to help users right now:

- Update the basic asm docs to describe basic asm's current (and 
historical) semantics (ie clobber nothing).
- Emphasize how that might be different from users' expectations or the 
behavior of other compilers.
- Warn that this could change in future versions of gcc.  To avoid 
impacts from this change, use extended asm.
- Implement and document -Wonly-top-basic-asm (disabled by default) as a 
way to locate affected statements.

This patch does these things.  You can review the new doc text at 
http://www.LimeGreenSocks.com/gcc/Basic-Asm.html

ChangeLog:
2016-01-24  David Wohlferd <dw@LimeGreenSocks.com>

     * doc/extend.texi: Doc basic asm behavior and new
     -Wonly-top-basic-asm option.
     * doc/invoke.texi: Doc new -Wonly-top-basic-asm option.
     * c-family/c.opt: Define -Wonly-top-basic-asm.
     * c/c-parser.c: Implement -Wonly-top-basic-asm for C.
     * cp/parser.c: Implement -Wonly-top-basic-asm for c++.
     * testsuite/c-c++-common/Wonly-top-basic-asm.c: New tests for
     -Wonly-top-basic-asm.
     * testsuite/c-c++-common/Wonly-top-basic-asm-2.c: Ditto.

Note that while I have a release on file with FSF, I don't have write 
access to SVN.

dw

Comments

Bernd Schmidt Jan. 25, 2016, 12:25 p.m. UTC | #1
On 01/24/2016 11:23 PM, David Wohlferd wrote:
> +Wonly-top-basic-asm
> +C ObjC ObjC++ C++ Var(warn_only_top_basic_asm) Warning
> +Warn on unsafe uses of basic asm.

Maybe just -Wbasic-asm?

> +    /* Warn on basic asm used inside of functions,
> +       EXCEPT when in naked functions. Also allow asm(""). */

Two spaces after a sentence.

> +    if (warn_only_top_basic_asm && (TREE_STRING_LENGTH (str) != 1) )

Unnecessary parens, and extra space before closing paren.

> +	      if (warn_only_top_basic_asm &&
> +	          (TREE_STRING_LENGTH (string) != 1))

Extra parens, and && goes first on the next line.

> +	          warning_at(asm_loc, OPT_Wonly_top_basic_asm,

Space before "(".

> +	            "asm statement in function does not use extended syntax");

Could break that into ".." "..." over two lines so as to keep indentation.

> -asm ("");
> +asm ("":::);

Is that necessary? As far as I can tell we're treating these equally.

> @@ -7487,6 +7490,8 @@
>   consecutive in the output, put them in a single multi-instruction @code{asm}
>   statement. Note that GCC's optimizers can move @code{asm} statements
>   relative to other code, including across jumps.
> +Using inputs and outputs with extended @code{asm} can help correctly position
> +your asm.

Not sure this is needed either. Sounds a bit like advertising :) In 
general the doc changes seem much too verbose to me.

> +Extended @code{asm}'s @samp{%=} may help resolve this.

Same here. I think the block that recommends extended asm is good 
enough. I think the next part could be shrunk significantly too.

> -Here is an example of basic @code{asm} for i386:
> +Basic @code{asm} statements within functions do not perform an implicit
> +"memory" clobber (@pxref{Clobbers}).  Also, there is no implicit clobbering
> +of @emph{any} registers, so (other than "naked" functions which follow the

"other than in"? Also @code{naked} maybe. I'd place a note about 
clobbering after the existing "To access C data, it is better to use 
extended asm".

> +ABI rules) changed registers must be restored to their original value before
> +exiting the @code{asm}.  While this behavior has not always been documented,
> +GCC has worked this way since at least v2.95.3.  Also, lacking inputs and
> +outputs means that GCC's optimizers may have difficulties consistently
> +positioning the basic @code{asm} in the generated code.

The existing text already mentions ordering issues. Lose this block.

> +The concept of ``clobbering'' does not apply to basic @code{asm} statements
> +outside of functions (aka top-level asm).

Stating the obvious?

> +@strong{Warning!} This "clobber nothing" behavior may be different than how

Ok there is precedent for this, but it's spelt "@strong{Warning:}" in 
all other cases. Still, I'd probably also shrink this paragraph and put 
a note about lack of C semantics and possibly different behaviour from 
other compilers near the top, where we say that extended asm is better 
in most cases.

> +other compilers treat basic @code{asm}, since the C standards for the
> +@code{asm} statement provide no guidance regarding these semantics.  As a
> +result, @code{asm} statements that work correctly on other compilers may not
> +work correctly with GCC (and vice versa), even though they both compile
> +without error.  Also, there is discussion underway about changing GCC to
> +have basic @code{asm} clobber at least memory and perhaps some (or all)
> +registers.  If implemented, this change may fix subtle problems with
> +existing @code{asm} statements.  However it may break or slow down ones that
> +were working correctly.

How would such a change break anything? I'd also not mention discussion 
underway, just say "Future versions of GCC may treat basic @code{asm} as 
clobbering memory".

> +If your existing code needs clobbers that GCC's basic @code{asm} is not
> +providing, or if you want to 'future-proof' your asm against possible
> +changes to basic @code{asm}'s semantics, use extended @code{asm}.

Recommending it too often. Lose this.

> +Extended @code{asm} allows you to specify what (if anything) needs to be
> +clobbered for your code to work correctly.

And again.

> You can use @ref{Warning
> +Options, @option{-Wonly-top-basic-asm}} to locate basic @code{asm}

I think just plain @option is usual.

> +statements that may need changes, and refer to
> +@uref{https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended, How to convert
> +from basic asm to extended asm} for information about how to perform the
> +conversion.

A link is probably good if we have such a page.

> +Here is an example of top-level basic @code{asm} for i386 that defines an
> +asm macro.  That macro is then invoked from within a function using
> +extended @code{asm}:

The updated example also looks good.

I think I'm fine with the concept but I'd like to see an updated patch 
with better docs.


Bernd
Segher Boessenkool Jan. 26, 2016, 12:29 a.m. UTC | #2
Hi David,

On Sun, Jan 24, 2016 at 02:23:53PM -0800, David Wohlferd wrote:
> - Warn that this could change in future versions of gcc.  To avoid 
> impacts from this change, use extended asm.
> - Implement and document -Wonly-top-basic-asm (disabled by default) as a 
> way to locate affected statements.

In my opinion we should not warn for any asm that means the same both
as basic and as extended asm.  The problem then becomes, what *is* the
meaning of a basic asm, what does it clobber.

Currently the only differences are:

- asms that have a % in the string, or {|} on targets with ASSEMBLER_DIALECT;
- ia64 (for stop bits);
- mep, and this one is easily fixed.
- basic asms do not get TARGET_MD_ASM_ADJUST.


Segher
Bernd Schmidt Jan. 26, 2016, 12:11 p.m. UTC | #3
On 01/26/2016 01:29 AM, Segher Boessenkool wrote:

> In my opinion we should not warn for any asm that means the same both
> as basic and as extended asm.  The problem then becomes, what *is* the
> meaning of a basic asm, what does it clobber.

I think this may be too hard to figure out in general without parsing 
the asm string, which we don't really want to do.


Bernd
Segher Boessenkool Jan. 26, 2016, 4:11 p.m. UTC | #4
On Tue, Jan 26, 2016 at 01:11:36PM +0100, Bernd Schmidt wrote:
> On 01/26/2016 01:29 AM, Segher Boessenkool wrote:
> 
> >In my opinion we should not warn for any asm that means the same both
> >as basic and as extended asm.  The problem then becomes, what *is* the
> >meaning of a basic asm, what does it clobber.
> 
> I think this may be too hard to figure out in general without parsing 
> the asm string, which we don't really want to do.

That depends on the semantics of basic asm.  With the currently implemented
semantics, it is trivial.


Segher
David Wohlferd Jan. 26, 2016, 11:37 p.m. UTC | #5
On 1/26/2016 8:11 AM, Segher Boessenkool wrote:
> On Tue, Jan 26, 2016 at 01:11:36PM +0100, Bernd Schmidt wrote:
>> On 01/26/2016 01:29 AM, Segher Boessenkool wrote:
>>
>>> In my opinion we should not warn for any asm that means the same both
>>> as basic and as extended asm.  The problem then becomes, what *is* the
>>> meaning of a basic asm, what does it clobber.
>> I think this may be too hard to figure out in general without parsing
>> the asm string, which we don't really want to do.
> That depends on the semantics of basic asm.  With the currently implemented
> semantics, it is trivial.

Oh?

asm("cmp al, '#' # if (c == '#') {");

There's a '{', so it might look like it needs to be escaped, but it 
doesn't.  The '{' is just part of a comment.  And how do you know it's a 
comment?  Because of the comment marker (#).  And how do you know that 
it's a comment marker and not a literal?  Only by doing more assembler 
parsing that I'm prepared to write.

But more importantly, consider this MIPS statement:

    asm ("sync");

That is the same for both basic and extended.  But I absolutely want the 
warning to flag this statement.  This is exactly the kind of "broken" 
statement that people need to be able to find/fix right now.

And when Jeff makes the changes to basic asm for v7, people may want to 
be able to find the statements that are affected by that in order to 
*stop* clobbering so many registers.

I'm not clear what people would use this warning for if we made the 
change you are suggesting.

dw
Jan Hubicka Feb. 16, 2016, 2:03 p.m. UTC | #6
>  @example
> -/* Note that this code will not compile with -masm=intel */
> -#define DebugBreak() asm("int $3")
> +/* Define macro at file scope with basic asm. */
> +/* Add macro parameter p to eax. */
> +asm(".macro test p\n\t"
> +    "addl $\\p, %eax\n\t"
> +    ".endm");
> +
> +/* Use macro in function using extended asm.  It needs */
> +/* the "cc" clobber since the flags are changed and uses */
> +/* the "a" constraint since it modifies eax. */
> +int DoAdd(int value)
> +@{
> +   asm("test 5" : "+a" (value) : : "cc");
> +   return value;
> +@}

To make this work you need the .macro appear before all uses in the asm files.  I do not think
we really promise that wihtout -fno-toplevel-reorder -fno-lto

Honza
Bernd Edlinger Feb. 16, 2016, 8:02 p.m. UTC | #7
On 16.02.2016 15:03, Jan Hubicka wrote:
>>   @example
>> -/* Note that this code will not compile with -masm=intel */
>> -#define DebugBreak() asm("int $3")
>> +/* Define macro at file scope with basic asm. */
>> +/* Add macro parameter p to eax. */
>> +asm(".macro test p\n\t"
>> +    "addl $\\p, %eax\n\t"
>> +    ".endm");
>> +
>> +/* Use macro in function using extended asm.  It needs */
>> +/* the "cc" clobber since the flags are changed and uses */
>> +/* the "a" constraint since it modifies eax. */
>> +int DoAdd(int value)
>> +@{
>> +   asm("test 5" : "+a" (value) : : "cc");
>> +   return value;
>> +@}
>
> To make this work you need the .macro appear before all uses in the asm files.  I do not think
> we really promise that wihtout -fno-toplevel-reorder -fno-lto
>
> Honza
>

And, isn't "test" an assembler instruction name?
What if the compiler emits something like test %eax,1
somewhere in the same translation unit?

Bernd.
diff mbox

Patch

Index: gcc/c-family/c.opt
===================================================================
--- gcc/c-family/c.opt	(revision 232773)
+++ gcc/c-family/c.opt	(working copy)
@@ -585,6 +585,10 @@ 
 C++ ObjC++ Var(warn_namespaces) Warning
 Warn on namespace definition.
 
+Wonly-top-basic-asm
+C ObjC ObjC++ C++ Var(warn_only_top_basic_asm) Warning
+Warn on unsafe uses of basic asm.
+
 Wsized-deallocation
 C++ ObjC++ Var(warn_sized_deallocation) Warning EnabledBy(Wextra)
 Warn about missing sized deallocation functions.
Index: gcc/c/c-parser.c
===================================================================
--- gcc/c/c-parser.c	(revision 232773)
+++ gcc/c/c-parser.c	(working copy)
@@ -5973,7 +5973,18 @@ 
   labels = NULL_TREE;
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN) && !is_goto)
+  {
+    /* Warn on basic asm used inside of functions, 
+       EXCEPT when in naked functions. Also allow asm(""). */
+    if (warn_only_top_basic_asm && (TREE_STRING_LENGTH (str) != 1) )
+      if (lookup_attribute ("naked", 
+                            DECL_ATTRIBUTES (current_function_decl)) 
+          == NULL_TREE)
+        warning_at(asm_loc, OPT_Wonly_top_basic_asm, 
+                   "asm statement in function does not use extended syntax");
+
     goto done_asm;
+  }
 
   /* Parse each colon-delimited section of operands.  */
   nsections = 3 + is_goto;
Index: gcc/cp/parser.c
===================================================================
--- gcc/cp/parser.c	(revision 232773)
+++ gcc/cp/parser.c	(working copy)
@@ -18003,6 +18003,8 @@ 
   bool goto_p = false;
   required_token missing = RT_NONE;
 
+  location_t asm_loc = cp_lexer_peek_token (parser->lexer)->location;
+
   /* Look for the `asm' keyword.  */
   cp_parser_require_keyword (parser, RID_ASM, RT_ASM);
 
@@ -18161,6 +18163,16 @@ 
 	  /* If the extended syntax was not used, mark the ASM_EXPR.  */
 	  if (!extended_p)
 	    {
+	      /* Warn on basic asm used inside of functions,
+	         EXCEPT when in naked functions. Also allow asm(""). */
+	      if (warn_only_top_basic_asm && 
+	          (TREE_STRING_LENGTH (string) != 1))
+	        if (lookup_attribute("naked",
+	                             DECL_ATTRIBUTES (current_function_decl))
+	            == NULL_TREE)
+	          warning_at(asm_loc, OPT_Wonly_top_basic_asm,
+	            "asm statement in function does not use extended syntax");
+
 	      tree temp = asm_stmt;
 	      if (TREE_CODE (temp) == CLEANUP_POINT_EXPR)
 		temp = TREE_OPERAND (temp, 0);
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 232773)
+++ gcc/doc/extend.texi	(working copy)
@@ -2903,12 +2903,14 @@ 
 although the function call is live.  To keep such calls from being
 optimized away, put
 @smallexample
-asm ("");
+asm ("":::);
 @end smallexample
 
 @noindent
 (@pxref{Extended Asm}) in the called function, to serve as a special
 side-effect.
+Older code used @code{asm("")}, but newer code should use the extended
+@code{asm} format.
 
 @item nonnull (@var{arg-index}, @dots{})
 @cindex @code{nonnull} function attribute
@@ -7458,7 +7460,8 @@ 
 @end table
 
 @subsubheading Remarks
-Using extended @code{asm} typically produces smaller, safer, and more
+Using extended @code{asm} (@pxref{Extended Asm}) typically produces smaller,
+safer, and more
 efficient code, and in most cases it is a better solution than basic
 @code{asm}.  However, there are two situations where only basic @code{asm}
 can be used:
@@ -7487,6 +7490,8 @@ 
 consecutive in the output, put them in a single multi-instruction @code{asm}
 statement. Note that GCC's optimizers can move @code{asm} statements 
 relative to other code, including across jumps.
+Using inputs and outputs with extended @code{asm} can help correctly position
+your asm.
 
 @code{asm} statements may not perform jumps into other @code{asm} statements. 
 GCC does not know about these jumps, and therefore cannot take 
@@ -7497,6 +7502,7 @@ 
 assembly code when optimizing. This can lead to unexpected duplicate 
 symbol errors during compilation if your assembly code defines symbols or 
 labels.
+Extended @code{asm}'s @samp{%=} may help resolve this.
 
 Since GCC does not parse the @var{AssemblerInstructions}, it has no 
 visibility of any symbols it references. This may result in GCC discarding 
@@ -7516,11 +7522,59 @@ 
 Basic @code{asm} provides no
 mechanism to provide different assembler strings for different dialects.
 
-Here is an example of basic @code{asm} for i386:
+Basic @code{asm} statements within functions do not perform an implicit
+"memory" clobber (@pxref{Clobbers}).  Also, there is no implicit clobbering
+of @emph{any} registers, so (other than "naked" functions which follow the
+ABI rules) changed registers must be restored to their original value before
+exiting the @code{asm}.  While this behavior has not always been documented,
+GCC has worked this way since at least v2.95.3.  Also, lacking inputs and
+outputs means that GCC's optimizers may have difficulties consistently
+positioning the basic @code{asm} in the generated code.
 
+The concept of ``clobbering'' does not apply to basic @code{asm} statements
+outside of functions (aka top-level asm).
+
+@strong{Warning!} This "clobber nothing" behavior may be different than how
+other compilers treat basic @code{asm}, since the C standards for the
+@code{asm} statement provide no guidance regarding these semantics.  As a
+result, @code{asm} statements that work correctly on other compilers may not
+work correctly with GCC (and vice versa), even though they both compile
+without error.  Also, there is discussion underway about changing GCC to
+have basic @code{asm} clobber at least memory and perhaps some (or all)
+registers.  If implemented, this change may fix subtle problems with
+existing @code{asm} statements.  However it may break or slow down ones that
+were working correctly.
+
+If your existing code needs clobbers that GCC's basic @code{asm} is not 
+providing, or if you want to 'future-proof' your asm against possible 
+changes to basic @code{asm}'s semantics, use extended @code{asm}.
+Extended @code{asm} allows you to specify what (if anything) needs to be 
+clobbered for your code to work correctly.  You can use @ref{Warning 
+Options, @option{-Wonly-top-basic-asm}} to locate basic @code{asm} 
+statements that may need changes, and refer to 
+@uref{https://gcc.gnu.org/wiki/ConvertBasicAsmToExtended, How to convert 
+from basic asm to extended asm} for information about how to perform the 
+conversion.
+
+Here is an example of top-level basic @code{asm} for i386 that defines an
+asm macro.  That macro is then invoked from within a function using
+extended @code{asm}:
+
 @example
-/* Note that this code will not compile with -masm=intel */
-#define DebugBreak() asm("int $3")
+/* Define macro at file scope with basic asm. */
+/* Add macro parameter p to eax. */
+asm(".macro test p\n\t"
+    "addl $\\p, %eax\n\t"
+    ".endm");
+
+/* Use macro in function using extended asm.  It needs */
+/* the "cc" clobber since the flags are changed and uses */
+/* the "a" constraint since it modifies eax. */
+int DoAdd(int value)
+@{
+   asm("test 5" : "+a" (value) : : "cc");
+   return value;
+@}
 @end example
 
 @node Extended Asm
@@ -8047,7 +8101,7 @@ 
 for @code{d} by specifying both constraints.
 
 @anchor{FlagOutputOperands}
-@subsection Flag Output Operands
+@subsubsection Flag Output Operands
 @cindex @code{asm} flag output operands
 
 Some targets have a special register that holds the ``flags'' for the
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 232773)
+++ gcc/doc/invoke.texi	(working copy)
@@ -5693,6 +5693,21 @@ 
 a structure that has been marked with the @code{designated_init}
 attribute.
 
+@item -Wonly-top-basic-asm @r{(C and C++ only)}
+Warn if basic @code{asm} statements are used inside a function (i.e. not at
+top-level/file scope).
+
+When used inside of functions, basic @code{asm} can result in unexpected and
+unwanted variations in behavior between compilers due to how registers are
+handled when calling the asm (@pxref{Basic Asm}).  The lack of input and
+output constraints (@pxref{Extended Asm}) can also make it difficult for
+optimizers to correctly and consistently position the output relative to
+other code.
+
+Functions that are marked with the @option{naked} attribute (@pxref{Function
+Attributes}) and @code{asm} statements with an empty instruction string are
+excluded from this check.
+
 @item -Whsa
 Issue a warning when HSAIL cannot be emitted for the compiled function or
 OpenMP construct.