diff mbox series

[v2] generate EH info for volatile asm statements (PR93981)

Message ID 20200307171244.1959-1-jwjagersma@gmail.com
State New
Headers show
Series [v2] generate EH info for volatile asm statements (PR93981) | expand

Commit Message

J.W. Jagersma March 7, 2020, 5:12 p.m. UTC
The following patch extends the generation of exception handling
information to cover volatile asms too.  This was already mostly
implemented, and only minor changes are required in order to make it
work.

The new test case works for me on x86_64-linux-gnu, but will likely
fail on most other platforms.

gcc/
2020-03-07  Jan W. Jagersma  <jwjagersma@gmail.com>

	PR inline-asm/93981
	* tree-cfg.c (make_edges_bb): Make EH edges for GIMPLE_ASM.
	* tree-eh.c (lower_eh_constructs_2): Add case for GIMPLE_ASM.
	Assign register output operands to temporaries.
	* doc/extend.texi: Document that volatile asms can now throw.

gcc/testsuite/
2020-03-07  Jan W. Jagersma  <jwjagersma@gmail.com>

	PR inline-asm/93981
	* g++.target/i386/pr93981.C: New test.
---
 gcc/doc/extend.texi                     |  5 ++++
 gcc/testsuite/g++.target/i386/pr93981.C | 28 ++++++++++++++++++++++
 gcc/tree-cfg.c                          |  2 ++
 gcc/tree-eh.c                           | 32 +++++++++++++++++++++++++
 4 files changed, 67 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/pr93981.C

Comments

Segher Boessenkool March 7, 2020, 7:20 p.m. UTC | #1
Hi!

On Sat, Mar 07, 2020 at 06:12:45PM +0100, J.W. Jagersma wrote:
> The following patch extends the generation of exception handling
> information to cover volatile asms too.  This was already mostly
> implemented, and only minor changes are required in order to make it
> work.

This should wait for stage 1, IMO.  Looks pretty good to me, thanks!
Some comments:

> +When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
> +@code{volatile asm} statement is also allowed to throw exceptions.  If it
> +does, then the compiler assumes that its output operands have not been written
> +yet.

That reads as if the compiler assumes the outputs retain their original
value, but that isn't true (I hope!)  The compiler assumes the output
are clobbered, but it doesn't assume they are assigned any definite
value?

> +  try
> +    {
> +      asm ("ud2");
> +    }

Yeah that won't work on other than x86.  Maybe you want to use something
like gc.dg/nop.h?  You could use what __builtin_trap generates, but you
want asm of course.

> +++ b/gcc/tree-eh.c
> @@ -2077,6 +2077,8 @@ lower_eh_constructs_2 (struct leh_state *state, gimple_stmt_iterator *gsi)
> +    record_throwing_stmt:
>        /* Look for things that can throw exceptions, and record them.  */
>        if (state->cur_region && stmt_could_throw_p (cfun, stmt))

The indent of the label is incorrect (it should be flush left, but some
emacs users like one space, instead; same indent as the case labels
isn't correct).


Segher
J.W. Jagersma March 7, 2020, 8:43 p.m. UTC | #2
On 2020-03-07 20:20, Segher Boessenkool wrote:
> Hi!
> 
> On Sat, Mar 07, 2020 at 06:12:45PM +0100, J.W. Jagersma wrote:
>> The following patch extends the generation of exception handling
>> information to cover volatile asms too.  This was already mostly
>> implemented, and only minor changes are required in order to make it
>> work.
> 
> This should wait for stage 1, IMO.  Looks pretty good to me, thanks!

Hi, thanks for your response.
What does stage 1 refer to?  I'm sorry, this is my first gcc patch and
I'm still learning how this all works.

> Some comments:
> 
>> +When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
>> +@code{volatile asm} statement is also allowed to throw exceptions.  If it
>> +does, then the compiler assumes that its output operands have not been written
>> +yet.
> 
> That reads as if the compiler assumes the outputs retain their original
> value, but that isn't true (I hope!)  The compiler assumes the output
> are clobbered, but it doesn't assume they are assigned any definite
> value?

Register outputs are assumed to be clobbered, yes.  For memory outputs
this is not the case, if the asm writes it before throwing then the
memory operand retains this value.  It should be the user's
responsibility to ensure that an asm has no side-effects if it throws.

I didn't see any way to make this work either, since create_tmp_var
will not make a temporary that is TREE_ADDRESSABLE.  It's not practical
anyway since you could pass a giant struct as operand and have to copy
the entire thing.  Or the operand could refer to memory-mapped IO space
in which case write operations can have side-effects.

I do agree that more specific wording can be used here.  Maybe
something like this?

+When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
+@code{volatile asm} statement is also allowed to throw exceptions.  If it
+does, then all register output operands are assumed to be clobbered.

>> +  try
>> +    {
>> +      asm ("ud2");
>> +    }
> 
> Yeah that won't work on other than x86.  Maybe you want to use something
> like gc.dg/nop.h?  You could use what __builtin_trap generates, but you
> want asm of course.

The test case is in g++.target/i386/ for this reason.  The platform-
specific part is throwing from a signal handler, that needs to be
supported by the runtime environment.  It works on Linux but not on
mingw-w64 or msys, for example.  I am not sure if it's better to set
xfail for all platforms except *-linux-*, or if each platform where it
is expected to fail should be listed individually (and frankly, I'm not
sure how either of this is done in DejaGNU syntax).

>> +++ b/gcc/tree-eh.c
>> @@ -2077,6 +2077,8 @@ lower_eh_constructs_2 (struct leh_state *state, gimple_stmt_iterator *gsi)
>> +    record_throwing_stmt:
>>         /* Look for things that can throw exceptions, and record them.  */
>>         if (state->cur_region && stmt_could_throw_p (cfun, stmt))
> 
> The indent of the label is incorrect (it should be flush left, but some
> emacs users like one space, instead; same indent as the case labels
> isn't correct).

Will edit that, thanks.
Marek Polacek March 7, 2020, 8:48 p.m. UTC | #3
On Sat, Mar 07, 2020 at 09:43:59PM +0100, J.W. Jagersma wrote:
> What does stage 1 refer to?  I'm sorry, this is my first gcc patch and
> I'm still learning how this all works.

No worries, you'll get there.  You can read more about that here:
https://gcc.gnu.org/develop.html

Marek
Gerald Pfeifer March 7, 2020, 8:52 p.m. UTC | #4
Hi J.W.,

On Sat, 7 Mar 2020, J.W. Jagersma wrote:
>> This should wait for stage 1, IMO.  Looks pretty good to me, thanks!
> What does stage 1 refer to?  I'm sorry, this is my first gcc patch and
> I'm still learning how this all works.

we have tried to cover this at https://gcc.gnu.org/develop.html .

I hope this proves interesting reading (it's not too long), and
if you have any questions or suggestions, let us know!

Gerald
J.W. Jagersma March 7, 2020, 9:06 p.m. UTC | #5
On 2020-03-07 21:48, Marek Polacek wrote:
> On Sat, Mar 07, 2020 at 09:43:59PM +0100, J.W. Jagersma wrote:
>> What does stage 1 refer to?  I'm sorry, this is my first gcc patch and
>> I'm still learning how this all works.
> 
> No worries, you'll get there.  You can read more about that here:
> https://gcc.gnu.org/develop.html
> 
> Marek

On 2020-03-07 21:52, Gerald Pfeifer wrote:
> Hi J.W.,
> 
> On Sat, 7 Mar 2020, J.W. Jagersma wrote:
>>> This should wait for stage 1, IMO.  Looks pretty good to me, thanks!
>> What does stage 1 refer to?  I'm sorry, this is my first gcc patch and
>> I'm still learning how this all works.
> 
> we have tried to cover this at https://gcc.gnu.org/develop.html .
> 
> I hope this proves interesting reading (it's not too long), and
> if you have any questions or suggestions, let us know!
> 
> Gerald

Thanks Marek and Gerald, I hadn't read that page yet.  So do I resubmit
the final patch during stage 1, or will it end up in a queue to be
included when stage 1 for gcc 11 (or 10.2) begins?
Segher Boessenkool March 7, 2020, 11:03 p.m. UTC | #6
[ I didn't get any of the intervening replies, huh. ]

On Sat, Mar 07, 2020 at 10:06:46PM +0100, J.W. Jagersma wrote:
> On 2020-03-07 21:52, Gerald Pfeifer wrote:
> >On Sat, 7 Mar 2020, J.W. Jagersma wrote:
> >>>This should wait for stage 1, IMO.  Looks pretty good to me, thanks!
> >>What does stage 1 refer to?  I'm sorry, this is my first gcc patch and
> >>I'm still learning how this all works.
> >
> >we have tried to cover this at https://gcc.gnu.org/develop.html .
> >
> >I hope this proves interesting reading (it's not too long), and
> >if you have any questions or suggestions, let us know!
> 
> Thanks Marek and Gerald, I hadn't read that page yet.  So do I resubmit
> the final patch during stage 1, or will it end up in a queue to be
> included when stage 1 for gcc 11 (or 10.2) begins?

It will be handled for GCC 11.  Please ping the patch when GCC 11 has
opened (if necessary)?  We can of course keep discussing it now (if
there are any open questions left.


Segher
J.W. Jagersma March 8, 2020, 4:18 p.m. UTC | #7
On 2020-03-08 00:03, Segher Boessenkool wrote:
> On Sat, Mar 07, 2020 at 10:06:46PM +0100, J.W. Jagersma wrote:
>> Thanks Marek and Gerald, I hadn't read that page yet.  So do I resubmit
>> the final patch during stage 1, or will it end up in a queue to be
>> included when stage 1 for gcc 11 (or 10.2) begins?
> 
> It will be handled for GCC 11.  Please ping the patch when GCC 11 has
> opened (if necessary)?  We can of course keep discussing it now (if
> there are any open questions left.

Okay thanks, I'll keep an eye on it.

The only question I have left now is if my proposed change to the
documentation is acceptable or should be expanded/reworded.  Any other
feedback is welcome too, of course.
The test cases I think I've sorted out now.  I added a target-
independent test that scans the assembly for a catch block.  I think
that is a reliable method to confirm that EH information is being
generated.  The target-dependent test is set to xfail on anything
except *-linux-gnu, and only runs on x86/amd64 targets.

There is also still the question of whether non-volatile asms should be
allowed to throw or not.  I don't know if that should be discussed here
or on the PR.
Richard Sandiford March 9, 2020, 12:13 p.m. UTC | #8
Thanks for doing this.

"J.W. Jagersma" <jwjagersma@gmail.com> writes:
> On 2020-03-07 20:20, Segher Boessenkool wrote:
>> Hi!
>> 
>> On Sat, Mar 07, 2020 at 06:12:45PM +0100, J.W. Jagersma wrote:
>>> The following patch extends the generation of exception handling
>>> information to cover volatile asms too.  This was already mostly
>>> implemented, and only minor changes are required in order to make it
>>> work.
>> 
>> This should wait for stage 1, IMO.  Looks pretty good to me, thanks!
>
> Hi, thanks for your response.
> What does stage 1 refer to?  I'm sorry, this is my first gcc patch and
> I'm still learning how this all works.
>
>> Some comments:
>> 
>>> +When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
>>> +@code{volatile asm} statement is also allowed to throw exceptions.  If it
>>> +does, then the compiler assumes that its output operands have not been written
>>> +yet.
>> 
>> That reads as if the compiler assumes the outputs retain their original
>> value, but that isn't true (I hope!)  The compiler assumes the output
>> are clobbered, but it doesn't assume they are assigned any definite
>> value?
>
> Register outputs are assumed to be clobbered, yes.  For memory outputs
> this is not the case, if the asm writes it before throwing then the
> memory operand retains this value.  It should be the user's
> responsibility to ensure that an asm has no side-effects if it throws.

I guess one problem is that something like "=&m" explicitly says that
the operand is clobbered "early", and I think it would be fair for
"early" to include clobbers before the exception.  So IMO we should
allow at least early-clobbered memory outputs to be clobbered by the
exception.

And if we do that, then I'm not sure there's much benefit in trying to
treat the non-earlyclobber memory case specially.

It would be good to have testcases for the output cases.  E.g. for:

        int foo;
        int bar = 0;
        try
          {
            foo = 1;
            asm volatile ("..." : "=m" (foo));
          }
        catch (...whatever...)
          {
            bar = foo;
          }
        ...use bar...

What does "bar = foo" read?  Is it always undefined behaviour if executed?
Or does it always read "foo" from memory?  Can it be optimised to "bar = 1"?
Is "foo = 1" dead code?

Thanks,
Richard
Richard Biener March 9, 2020, 12:54 p.m. UTC | #9
On Mon, Mar 9, 2020 at 1:14 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Thanks for doing this.
>
> "J.W. Jagersma" <jwjagersma@gmail.com> writes:
> > On 2020-03-07 20:20, Segher Boessenkool wrote:
> >> Hi!
> >>
> >> On Sat, Mar 07, 2020 at 06:12:45PM +0100, J.W. Jagersma wrote:
> >>> The following patch extends the generation of exception handling
> >>> information to cover volatile asms too.  This was already mostly
> >>> implemented, and only minor changes are required in order to make it
> >>> work.
> >>
> >> This should wait for stage 1, IMO.  Looks pretty good to me, thanks!
> >
> > Hi, thanks for your response.
> > What does stage 1 refer to?  I'm sorry, this is my first gcc patch and
> > I'm still learning how this all works.
> >
> >> Some comments:
> >>
> >>> +When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
> >>> +@code{volatile asm} statement is also allowed to throw exceptions.  If it
> >>> +does, then the compiler assumes that its output operands have not been written
> >>> +yet.
> >>
> >> That reads as if the compiler assumes the outputs retain their original
> >> value, but that isn't true (I hope!)  The compiler assumes the output
> >> are clobbered, but it doesn't assume they are assigned any definite
> >> value?
> >
> > Register outputs are assumed to be clobbered, yes.  For memory outputs
> > this is not the case, if the asm writes it before throwing then the
> > memory operand retains this value.  It should be the user's
> > responsibility to ensure that an asm has no side-effects if it throws.
>
> I guess one problem is that something like "=&m" explicitly says that
> the operand is clobbered "early", and I think it would be fair for
> "early" to include clobbers before the exception.  So IMO we should
> allow at least early-clobbered memory outputs to be clobbered by the
> exception.

I think memory operands are fine - my original concern was about
register outputs and SSA form that should reflect the correct def
on the EH vs non-EH edge.  From a "miscompile" perspective
doing nothig which means pretending the asm actually set the output
could lead us to false DCE of the old value:

        int foo = 0
        try
          {
            asm volatile ("..." : "=r" (foo));
          }
        catch (...whatever...)
          {
            foo should be still zero, but SSA doesn't have the correct use here
          }

that means the compiler really assumes the asm will populate the outputs
even when it throws.

Test coverage for that would be nice.

> And if we do that, then I'm not sure there's much benefit in trying to
> treat the non-earlyclobber memory case specially.
>
> It would be good to have testcases for the output cases.  E.g. for:
>
>         int foo;
>         int bar = 0;
>         try
>           {
>             foo = 1;
>             asm volatile ("..." : "=m" (foo));
>           }
>         catch (...whatever...)
>           {
>             bar = foo;
>           }
>         ...use bar...
>
> What does "bar = foo" read?  Is it always undefined behaviour if executed?
> Or does it always read "foo" from memory?  Can it be optimised to "bar = 1"?
> Is "foo = 1" dead code?
>
> Thanks,
> Richard
J.W. Jagersma March 9, 2020, 3:49 p.m. UTC | #10
On 2020-03-09 13:13, Richard Sandiford wrote:
> Thanks for doing this.

Hi Richard, thanks for your response.

> "J.W. Jagersma" <jwjagersma@gmail.com> writes:
>> On 2020-03-07 20:20, Segher Boessenkool wrote:
>>> Some comments:
>>>
>>>> +When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
>>>> +@code{volatile asm} statement is also allowed to throw exceptions.  If it
>>>> +does, then the compiler assumes that its output operands have not been written
>>>> +yet.
>>>
>>> That reads as if the compiler assumes the outputs retain their original
>>> value, but that isn't true (I hope!)  The compiler assumes the output
>>> are clobbered, but it doesn't assume they are assigned any definite
>>> value?
>>
>> Register outputs are assumed to be clobbered, yes.  For memory outputs
>> this is not the case, if the asm writes it before throwing then the
>> memory operand retains this value.  It should be the user's
>> responsibility to ensure that an asm has no side-effects if it throws.
> 
> I guess one problem is that something like "=&m" explicitly says that
> the operand is clobbered "early", and I think it would be fair for
> "early" to include clobbers before the exception.  So IMO we should
> allow at least early-clobbered memory outputs to be clobbered by the
> exception.

Is "=&m" not equivalent to "=m" in every case?  As I understand it, the
earlyclobber modifier is only relevant for register outputs.  It does
not specify anything about clobbering the output itself, it only says
that an input could be clobbered if it is allocated in the same
register (or if a memory input uses this register as index).

> And if we do that, then I'm not sure there's much benefit in trying to
> treat the non-earlyclobber memory case specially.
> 
> It would be good to have testcases for the output cases.  E.g. for:
> 
>         int foo;
>         int bar = 0;
>         try
>           {
>             foo = 1;
>             asm volatile ("..." : "=m" (foo));
>           }
>         catch (...whatever...)
>           {
>             bar = foo;
>           }
>         ...use bar...
> 
> What does "bar = foo" read?  Is it always undefined behaviour if executed?
> Or does it always read "foo" from memory?  Can it be optimised to "bar = 1"?
> Is "foo = 1" dead code?

These are very good points.  But I am not sure how to test for all of
these.  My test case now looks as follows:

    // PR inline-asm/93981
    // { dg-do run }
    // { dg-options "-fnon-call-exceptions -O3" }
    // { dg-xfail-run-if "" { ! *-linux-gnu } }

    #include <csignal>

    struct illegal_opcode { };

    extern "C" void
    sigill (int)
    {
      throw illegal_opcode ( );
    }

    int
    test_mem ()
    {
      int i = 2;
      try
        {
          asm volatile ("mov%z0 $1, %0; ud2" : "=m" (i));
        }
      catch (const illegal_opcode&)
        {
          if (i == 1) return 0;
        }
      return i;
    }

    int
    test_reg ()
    {
      int i = 2;
      try
        {
          asm volatile ("mov%z0 $1, %0; ud2" : "=r" (i));
        }
      catch (const illegal_opcode&)
        {
          if (i == 2) return 0;
        }
      return i;
    }

    int
    main ()
    {
      std::signal (SIGILL, sigill);
      return test_reg () + test_mem ();
    }

I think that should cover most of it.  Am I missing anything?
Segher Boessenkool March 9, 2020, 6:01 p.m. UTC | #11
Hi!

On Mon, Mar 09, 2020 at 01:54:53PM +0100, Richard Biener wrote:
> I think memory operands are fine - my original concern was about
> register outputs and SSA form that should reflect the correct def
> on the EH vs non-EH edge.  From a "miscompile" perspective
> doing nothig which means pretending the asm actually set the output
> could lead us to false DCE of the old value:
> 
>         int foo = 0
>         try
>           {
>             asm volatile ("..." : "=r" (foo));
>           }
>         catch (...whatever...)
>           {
>             foo should be still zero, but SSA doesn't have the correct use here
>           }
> 
> that means the compiler really assumes the asm will populate the outputs
> even when it throws.

How is memory any different here?  In both cases you do not know if it
is the old value or some new value in foo, after it threw an exception.


Segher
J.W. Jagersma March 9, 2020, 6:42 p.m. UTC | #12
On 2020-03-09 19:01, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Mar 09, 2020 at 01:54:53PM +0100, Richard Biener wrote:
>> I think memory operands are fine - my original concern was about
>> register outputs and SSA form that should reflect the correct def
>> on the EH vs non-EH edge.  From a "miscompile" perspective
>> doing nothig which means pretending the asm actually set the output
>> could lead us to false DCE of the old value:
>>
>>         int foo = 0
>>         try
>>           {
>>             asm volatile ("..." : "=r" (foo));
>>           }
>>         catch (...whatever...)
>>           {
>>             foo should be still zero, but SSA doesn't have the correct use here
>>           }
>>
>> that means the compiler really assumes the asm will populate the outputs
>> even when it throws.
> 
> How is memory any different here?  In both cases you do not know if it
> is the old value or some new value in foo, after it threw an exception.
> 
> 
> Segher

If foo were a memory operand, the compiler makes no assumptions about
its value.  When you reference it in the catch block it is always read
back from memory.  Only register operands are clobbered and retain
their previous value.  If you compile such an example with -O3, you'll
see that the initial "int foo = 0;" is eliminated from the normal code
path.  It is only set to 0 in the catch block.
Segher Boessenkool March 9, 2020, 10:10 p.m. UTC | #13
Hi!

On Sun, Mar 08, 2020 at 05:18:21PM +0100, J.W. Jagersma wrote:
> The only question I have left now is if my proposed change to the
> documentation is acceptable or should be expanded/reworded.

I think it is fine.  It's easier to judge if you look fresh at it, in
my experience, so I might think differently at stage1 time ;-)

> There is also still the question of whether non-volatile asms should be
> allowed to throw or not.  I don't know if that should be discussed here
> or on the PR.

I think you should just allow it.  But almost always an asm that can
throw *should* be volatile, or else the compiler might optimise it away
in unexpected cases, etc.


Segher
Segher Boessenkool March 10, 2020, 12:25 a.m. UTC | #14
On Mon, Mar 09, 2020 at 07:42:20PM +0100, J.W. Jagersma wrote:
> On 2020-03-09 19:01, Segher Boessenkool wrote:
> > On Mon, Mar 09, 2020 at 01:54:53PM +0100, Richard Biener wrote:
> >>         int foo = 0
> >>         try
> >>           {
> >>             asm volatile ("..." : "=r" (foo));
> >>           }
> >>         catch (...whatever...)
> >>           {
> >>             foo should be still zero, but SSA doesn't have the correct use here
> >>           }
> >>
> >> that means the compiler really assumes the asm will populate the outputs
> >> even when it throws.
> > 
> > How is memory any different here?  In both cases you do not know if it
> > is the old value or some new value in foo, after it threw an exception.
> 
> If foo were a memory operand, the compiler makes no assumptions about
> its value.  When you reference it in the catch block it is always read
> back from memory.  Only register operands are clobbered and retain
> their previous value.  If you compile such an example with -O3, you'll
> see that the initial "int foo = 0;" is eliminated from the normal code
> path.  It is only set to 0 in the catch block.

My question is *why* that is, and/or what in our code makes that so :-)


Segher
J.W. Jagersma March 10, 2020, 1:10 a.m. UTC | #15
On 2020-03-09 23:10, Segher Boessenkool wrote:
> Hi!
> 
> On Sun, Mar 08, 2020 at 05:18:21PM +0100, J.W. Jagersma wrote:
>> There is also still the question of whether non-volatile asms should be
>> allowed to throw or not.  I don't know if that should be discussed here
>> or on the PR.
> 
> I think you should just allow it.  But almost always an asm that can
> throw *should* be volatile, or else the compiler might optimise it away
> in unexpected cases, etc.

I do think that allowing it for all asms is the best option.  My
initial idea was that the compiler could be made to expect exceptions
only from asms that take memory operands, but after giving that some
more thought, I don't think it's feasible.  So either allow all asms to
throw, or restrict this to volatile only.  Personally I'm in favor of
the former.  If an asm could throw, but does otherwise not have any
side effects or useful outputs, then it should be moved or optimized
away.

However I don't know if there is any performance cost associated with
this.  It is generally said that exceptions don't have any runtime
overhead if you don't throw them, I don't know if that is the full
story.  If certain optimizations are disabled for throwing statements,
then I expect there would be some resistance to this proposal.
J.W. Jagersma March 10, 2020, 1:12 a.m. UTC | #16
On 2020-03-10 01:25, Segher Boessenkool wrote:
> On Mon, Mar 09, 2020 at 07:42:20PM +0100, J.W. Jagersma wrote:
>> On 2020-03-09 19:01, Segher Boessenkool wrote:
>>> On Mon, Mar 09, 2020 at 01:54:53PM +0100, Richard Biener wrote:
>>>>         int foo = 0
>>>>         try
>>>>           {
>>>>             asm volatile ("..." : "=r" (foo));
>>>>           }
>>>>         catch (...whatever...)
>>>>           {
>>>>             foo should be still zero, but SSA doesn't have the correct use here
>>>>           }
>>>>
>>>> that means the compiler really assumes the asm will populate the outputs
>>>> even when it throws.
>>>
>>> How is memory any different here?  In both cases you do not know if it
>>> is the old value or some new value in foo, after it threw an exception.
>>
>> If foo were a memory operand, the compiler makes no assumptions about
>> its value.  When you reference it in the catch block it is always read
>> back from memory.  Only register operands are clobbered and retain
>> their previous value.  If you compile such an example with -O3, you'll
>> see that the initial "int foo = 0;" is eliminated from the normal code
>> path.  It is only set to 0 in the catch block.
> 
> My question is *why* that is, and/or what in our code makes that so :-)
> 
> 
> Segher

The code I added in tree-eh.c does that.  Register outputs are assigned
to a temporary variable, which is discarded when the asm throws.  As I
said it's not possible to do this for memory operands too.  So, if this
behavior should be consistent for both types of operands, then you'd
have to assume that register outputs are valid after throwing.
Richard Sandiford March 12, 2020, 9:59 a.m. UTC | #17
"J.W. Jagersma" <jwjagersma@gmail.com> writes:
> On 2020-03-09 13:13, Richard Sandiford wrote:
>> Thanks for doing this.
>
> Hi Richard, thanks for your response.
>
>> "J.W. Jagersma" <jwjagersma@gmail.com> writes:
>>> On 2020-03-07 20:20, Segher Boessenkool wrote:
>>>> Some comments:
>>>>
>>>>> +When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
>>>>> +@code{volatile asm} statement is also allowed to throw exceptions.  If it
>>>>> +does, then the compiler assumes that its output operands have not been written
>>>>> +yet.
>>>>
>>>> That reads as if the compiler assumes the outputs retain their original
>>>> value, but that isn't true (I hope!)  The compiler assumes the output
>>>> are clobbered, but it doesn't assume they are assigned any definite
>>>> value?
>>>
>>> Register outputs are assumed to be clobbered, yes.  For memory outputs
>>> this is not the case, if the asm writes it before throwing then the
>>> memory operand retains this value.  It should be the user's
>>> responsibility to ensure that an asm has no side-effects if it throws.
>> 
>> I guess one problem is that something like "=&m" explicitly says that
>> the operand is clobbered "early", and I think it would be fair for
>> "early" to include clobbers before the exception.  So IMO we should
>> allow at least early-clobbered memory outputs to be clobbered by the
>> exception.
>
> Is "=&m" not equivalent to "=m" in every case?  As I understand it, the
> earlyclobber modifier is only relevant for register outputs.  It does
> not specify anything about clobbering the output itself, it only says
> that an input could be clobbered if it is allocated in the same
> register (or if a memory input uses this register as index).
>
>> And if we do that, then I'm not sure there's much benefit in trying to
>> treat the non-earlyclobber memory case specially.
>> 
>> It would be good to have testcases for the output cases.  E.g. for:
>> 
>>         int foo;
>>         int bar = 0;
>>         try
>>           {
>>             foo = 1;
>>             asm volatile ("..." : "=m" (foo));
>>           }
>>         catch (...whatever...)
>>           {
>>             bar = foo;
>>           }
>>         ...use bar...
>> 
>> What does "bar = foo" read?  Is it always undefined behaviour if executed?
>> Or does it always read "foo" from memory?  Can it be optimised to "bar = 1"?
>> Is "foo = 1" dead code?
>
> These are very good points.  But I am not sure how to test for all of
> these.  My test case now looks as follows:
>
>     // PR inline-asm/93981
>     // { dg-do run }
>     // { dg-options "-fnon-call-exceptions -O3" }
>     // { dg-xfail-run-if "" { ! *-linux-gnu } }
>
>     #include <csignal>
>
>     struct illegal_opcode { };
>
>     extern "C" void
>     sigill (int)
>     {
>       throw illegal_opcode ( );
>     }
>
>     int
>     test_mem ()
>     {
>       int i = 2;
>       try
>         {
>           asm volatile ("mov%z0 $1, %0; ud2" : "=m" (i));
>         }
>       catch (const illegal_opcode&)
>         {
>           if (i == 1) return 0;
>         }
>       return i;
>     }
>
>     int
>     test_reg ()
>     {
>       int i = 2;
>       try
>         {
>           asm volatile ("mov%z0 $1, %0; ud2" : "=r" (i));
>         }
>       catch (const illegal_opcode&)
>         {
>           if (i == 2) return 0;
>         }
>       return i;
>     }
>
>     int
>     main ()
>     {
>       std::signal (SIGILL, sigill);
>       return test_reg () + test_mem ();
>     }
>
> I think that should cover most of it.  Am I missing anything?

The other case I mentioned was equivalent to:

     int
     test_mem2 ()
     {
       int i = 2;
       try
         {
           asm volatile ("ud2; mov%z0 $1, %0" : "=m" (i));
         }
       catch (const illegal_opcode&)
         {
           if (i == 2) return 0;
         }
       return i;
     }

Is this test expected to pass?

However, these "mem" tests are testing gimple register types, so they'll
still be SSA names outside of the asm.  It would also be good to test what
happens for non-register types, e.g. something like:

     int
     test_mem3 ()
     {
       int i[5] = { 1, 2, 3, 4, 5 };
       try
         {
           asm volatile ("ud2; <insert asm here>" : "=m" (i));
         }
       catch (const illegal_opcode&)
         {
           if (i[0] == 1 && ...) return 0;
         }
       return ...;
     }

and similarly for ud2 after the store.

It wasn't clear from my message above, but: I was mostly worried about
requiring the asm to treat memory operands in a certain way when the
exception is thrown.  IMO it would be better to say that the values of
memory operands are undefined on the exception edge.

Thanks,
Richard
Li, Pan2 via Gcc-patches March 12, 2020, 1:08 p.m. UTC | #18
On Thu, Mar 12, 2020 at 11:01 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> "J.W. Jagersma" <jwjagersma@gmail.com> writes:
> > On 2020-03-09 13:13, Richard Sandiford wrote:
> >> Thanks for doing this.
> >
> > Hi Richard, thanks for your response.
> >
> >> "J.W. Jagersma" <jwjagersma@gmail.com> writes:
> >>> On 2020-03-07 20:20, Segher Boessenkool wrote:
> >>>> Some comments:
> >>>>
> >>>>> +When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
> >>>>> +@code{volatile asm} statement is also allowed to throw exceptions.  If it
> >>>>> +does, then the compiler assumes that its output operands have not been written
> >>>>> +yet.
> >>>>
> >>>> That reads as if the compiler assumes the outputs retain their original
> >>>> value, but that isn't true (I hope!)  The compiler assumes the output
> >>>> are clobbered, but it doesn't assume they are assigned any definite
> >>>> value?
> >>>
> >>> Register outputs are assumed to be clobbered, yes.  For memory outputs
> >>> this is not the case, if the asm writes it before throwing then the
> >>> memory operand retains this value.  It should be the user's
> >>> responsibility to ensure that an asm has no side-effects if it throws.
> >>
> >> I guess one problem is that something like "=&m" explicitly says that
> >> the operand is clobbered "early", and I think it would be fair for
> >> "early" to include clobbers before the exception.  So IMO we should
> >> allow at least early-clobbered memory outputs to be clobbered by the
> >> exception.
> >
> > Is "=&m" not equivalent to "=m" in every case?  As I understand it, the
> > earlyclobber modifier is only relevant for register outputs.  It does
> > not specify anything about clobbering the output itself, it only says
> > that an input could be clobbered if it is allocated in the same
> > register (or if a memory input uses this register as index).
> >
> >> And if we do that, then I'm not sure there's much benefit in trying to
> >> treat the non-earlyclobber memory case specially.
> >>
> >> It would be good to have testcases for the output cases.  E.g. for:
> >>
> >>         int foo;
> >>         int bar = 0;
> >>         try
> >>           {
> >>             foo = 1;
> >>             asm volatile ("..." : "=m" (foo));
> >>           }
> >>         catch (...whatever...)
> >>           {
> >>             bar = foo;
> >>           }
> >>         ...use bar...
> >>
> >> What does "bar = foo" read?  Is it always undefined behaviour if executed?
> >> Or does it always read "foo" from memory?  Can it be optimised to "bar = 1"?
> >> Is "foo = 1" dead code?
> >
> > These are very good points.  But I am not sure how to test for all of
> > these.  My test case now looks as follows:
> >
> >     // PR inline-asm/93981
> >     // { dg-do run }
> >     // { dg-options "-fnon-call-exceptions -O3" }
> >     // { dg-xfail-run-if "" { ! *-linux-gnu } }
> >
> >     #include <csignal>
> >
> >     struct illegal_opcode { };
> >
> >     extern "C" void
> >     sigill (int)
> >     {
> >       throw illegal_opcode ( );
> >     }
> >
> >     int
> >     test_mem ()
> >     {
> >       int i = 2;
> >       try
> >         {
> >           asm volatile ("mov%z0 $1, %0; ud2" : "=m" (i));
> >         }
> >       catch (const illegal_opcode&)
> >         {
> >           if (i == 1) return 0;
> >         }
> >       return i;
> >     }
> >
> >     int
> >     test_reg ()
> >     {
> >       int i = 2;
> >       try
> >         {
> >           asm volatile ("mov%z0 $1, %0; ud2" : "=r" (i));
> >         }
> >       catch (const illegal_opcode&)
> >         {
> >           if (i == 2) return 0;
> >         }
> >       return i;
> >     }
> >
> >     int
> >     main ()
> >     {
> >       std::signal (SIGILL, sigill);
> >       return test_reg () + test_mem ();
> >     }
> >
> > I think that should cover most of it.  Am I missing anything?
>
> The other case I mentioned was equivalent to:
>
>      int
>      test_mem2 ()
>      {
>        int i = 2;
>        try
>          {
>            asm volatile ("ud2; mov%z0 $1, %0" : "=m" (i));
>          }
>        catch (const illegal_opcode&)
>          {
>            if (i == 2) return 0;
>          }
>        return i;
>      }
>
> Is this test expected to pass?
>
> However, these "mem" tests are testing gimple register types, so they'll
> still be SSA names outside of the asm.  It would also be good to test what
> happens for non-register types, e.g. something like:
>
>      int
>      test_mem3 ()
>      {
>        int i[5] = { 1, 2, 3, 4, 5 };
>        try
>          {
>            asm volatile ("ud2; <insert asm here>" : "=m" (i));
>          }
>        catch (const illegal_opcode&)
>          {
>            if (i[0] == 1 && ...) return 0;
>          }
>        return ...;
>      }
>
> and similarly for ud2 after the store.
>
> It wasn't clear from my message above, but: I was mostly worried about
> requiring the asm to treat memory operands in a certain way when the
> exception is thrown.  IMO it would be better to say that the values of
> memory operands are undefined on the exception edge.

Rather unspecified.  So IMHO on the exception edge the asm() should
still appear as a def for all outputs so the compiler cannot derive any
actual values for them.  That of course also means that they must not
appear to be must-defs since we may not DSE earlier stores for example.

If we manage to get the unspecified values correct in SSA then we don't
need to say whether the asm may clobber them before or after throwing.

Richard.

> Thanks,
> Richard
Segher Boessenkool March 12, 2020, 3:32 p.m. UTC | #19
On Thu, Mar 12, 2020 at 02:08:18PM +0100, Richard Biener wrote:
> > It wasn't clear from my message above, but: I was mostly worried about
> > requiring the asm to treat memory operands in a certain way when the
> > exception is thrown.  IMO it would be better to say that the values of
> > memory operands are undefined on the exception edge.
> 
> Rather unspecified.  So IMHO on the exception edge the asm() should
> still appear as a def for all outputs so the compiler cannot derive any
> actual values for them.  That of course also means that they must not
> appear to be must-defs since we may not DSE earlier stores for example.

So make all outputs of an asm that may throw (w/ -fnon-call-exceptions)
inout operands instead?  That works for registers exactly the same, too?

> If we manage to get the unspecified values correct in SSA then we don't
> need to say whether the asm may clobber them before or after throwing.

Yeah.  Which users will *never* get right, that is, it would be hard to
use any such interface correctly.


Segher
Li, Pan2 via Gcc-patches March 12, 2020, 6:42 p.m. UTC | #20
On 2020-03-12 16:32, Segher Boessenkool wrote:
> On Thu, Mar 12, 2020 at 02:08:18PM +0100, Richard Biener wrote:
>>> It wasn't clear from my message above, but: I was mostly worried about
>>> requiring the asm to treat memory operands in a certain way when the
>>> exception is thrown.  IMO it would be better to say that the values of
>>> memory operands are undefined on the exception edge.
>>
>> Rather unspecified.  So IMHO on the exception edge the asm() should
>> still appear as a def for all outputs so the compiler cannot derive any
>> actual values for them.  That of course also means that they must not
>> appear to be must-defs since we may not DSE earlier stores for example.
> 
> So make all outputs of an asm that may throw (w/ -fnon-call-exceptions)
> inout operands instead?  That works for registers exactly the same, too?

Should gcc do this or would it be the user's responsibility?  For
memory operands I don't think anything changes if you replace "=m" by
"+m".  However changing each "=r" to "+r" would certainly generate less
optimal code.

>> If we manage to get the unspecified values correct in SSA then we don't
>> need to say whether the asm may clobber them before or after throwing.
> 
> Yeah.  Which users will *never* get right, that is, it would be hard to
> use any such interface correctly.
> 
> 
> Segher
>
Segher Boessenkool March 12, 2020, 7:26 p.m. UTC | #21
On Thu, Mar 12, 2020 at 07:42:30PM +0100, J.W. Jagersma wrote:
> On 2020-03-12 16:32, Segher Boessenkool wrote:
> > On Thu, Mar 12, 2020 at 02:08:18PM +0100, Richard Biener wrote:
> >>> It wasn't clear from my message above, but: I was mostly worried about
> >>> requiring the asm to treat memory operands in a certain way when the
> >>> exception is thrown.  IMO it would be better to say that the values of
> >>> memory operands are undefined on the exception edge.
> >>
> >> Rather unspecified.  So IMHO on the exception edge the asm() should
> >> still appear as a def for all outputs so the compiler cannot derive any
> >> actual values for them.  That of course also means that they must not
> >> appear to be must-defs since we may not DSE earlier stores for example.
> > 
> > So make all outputs of an asm that may throw (w/ -fnon-call-exceptions)
> > inout operands instead?  That works for registers exactly the same, too?
> 
> Should gcc do this or would it be the user's responsibility?  For
> memory operands I don't think anything changes if you replace "=m" by
> "+m".  However changing each "=r" to "+r" would certainly generate less
> optimal code.

[  "+m"(x)  means exactly the same as  "=m"(x) : "m"(x)  ]

Yes, but this is required for all operands that could keep their old
value on exceptions.  Since this will only happen with
-fnon-call-exceptions, which already has a significant performance cost,
I don't think adding just a tiny bit more is so bad?  The benefit is
that all asm will work, the user cannot do this wrong anymore.


Segher
Li, Pan2 via Gcc-patches March 12, 2020, 10:06 p.m. UTC | #22
On 2020-03-12 10:59, Richard Sandiford wrote:
> The other case I mentioned was equivalent to:
> 
>      int
>      test_mem2 ()
>      {
>        int i = 2;
>        try
>          {
>            asm volatile ("ud2; mov%z0 $1, %0" : "=m" (i));
>          }
>        catch (const illegal_opcode&)
>          {
>            if (i == 2) return 0;
>          }
>        return i;
>      }
> 
> Is this test expected to pass?

Good point.  Yes, this should pass, and I thought it did, but it seems
I was mistaken.  To fix that would require transforming "=m" into "+m"
as Segher suggested.

> However, these "mem" tests are testing gimple register types, so they'll
> still be SSA names outside of the asm.  It would also be good to test what
> happens for non-register types, e.g. something like:
> 
>      int
>      test_mem3 ()
>      {
>        int i[5] = { 1, 2, 3, 4, 5 };
>        try
>          {
>            asm volatile ("ud2; <insert asm here>" : "=m" (i));
>          }
>        catch (const illegal_opcode&)
>          {
>            if (i[0] == 1 && ...) return 0;
>          }
>        return ...;
>      }
> 
> and similarly for ud2 after the store.

I think I see what you mean.  Would such a test not also cover what the
current test_mem() function does?  If so then that could be removed.

Also in my current patch I used: (tree-eh.c:2104)

    if (tree_could_throw_p (opval)
        || !is_gimple_reg_type (TREE_TYPE (opval))
        || !is_gimple_reg (get_base_address (opval)))

to determine the difference between a register and memory type.  Could
there potentially be a case where that identifies an operand as gimple
register type, but ends up compiled as a memory operand to an asm?

> It wasn't clear from my message above, but: I was mostly worried about
> requiring the asm to treat memory operands in a certain way when the
> exception is thrown.  IMO it would be better to say that the values of
> memory operands are undefined on the exception edge.

I'm not sure about the semantic difference between undefined and
unspecified.  But gcc should not write to any memory after a throw,
because that write operation itself may have side effects.  Likewise
asm memory operands should not be redirected to a temporary for the
same reason, and also because gcc can't possibly know which parts of
an (offsettable) memory operand are written to.
Li, Pan2 via Gcc-patches March 13, 2020, 1:06 a.m. UTC | #23
On 2020-03-12 20:26, Segher Boessenkool wrote:
> On Thu, Mar 12, 2020 at 07:42:30PM +0100, J.W. Jagersma wrote:
>> On 2020-03-12 16:32, Segher Boessenkool wrote:
>>> On Thu, Mar 12, 2020 at 02:08:18PM +0100, Richard Biener wrote:
>>>>> It wasn't clear from my message above, but: I was mostly worried about
>>>>> requiring the asm to treat memory operands in a certain way when the
>>>>> exception is thrown.  IMO it would be better to say that the values of
>>>>> memory operands are undefined on the exception edge.
>>>>
>>>> Rather unspecified.  So IMHO on the exception edge the asm() should
>>>> still appear as a def for all outputs so the compiler cannot derive any
>>>> actual values for them.  That of course also means that they must not
>>>> appear to be must-defs since we may not DSE earlier stores for example.
>>>
>>> So make all outputs of an asm that may throw (w/ -fnon-call-exceptions)
>>> inout operands instead?  That works for registers exactly the same, too?
>>
>> Should gcc do this or would it be the user's responsibility?  For
>> memory operands I don't think anything changes if you replace "=m" by
>> "+m".  However changing each "=r" to "+r" would certainly generate less
>> optimal code.
> 
> [  "+m"(x)  means exactly the same as  "=m"(x) : "m"(x)  ]
> 
> Yes, but this is required for all operands that could keep their old
> value on exceptions.  Since this will only happen with
> -fnon-call-exceptions, which already has a significant performance cost,
> I don't think adding just a tiny bit more is so bad?  The benefit is
> that all asm will work, the user cannot do this wrong anymore.

I don't want to unnecessarily pessimize register outputs if it can be
avoided.  And even if you do change register outputs to in+out, they
are still more likely to contain some intermediate value that you would
want to discard on throwing.

If you think of it in terms of a function call, the closest equivalent
of register vs memory outputs would be:
    reg = f ();
vs.
    f (&mem);
If f() throws, then in the first case, 'reg' is never assigned to. For
the second case, you can't expect 'mem' to remain unmodified.

There is also the case of "=rm" and similar which have not been
discussed so far, but are (in my experience) the most common operand
constraint.  I think these can be handled as "=r" if they end up in
memory since they are never offsettable, and are unlikely (impossible?)
to have side-effects if written to.  So I think the following behavior
is the most sensible:

If the output *may* reside in a register (this covers any constraint
combination that includes a register constraint, so also "=rm", etc),
then it is assigned via a temporary and its value discarded after a
throw.
Only pure memory operands, strictly "=m", "=o", etc, are converted to
in+out and assumed to be valid after throwing.

Does that sound reasonable?
Richard Sandiford March 13, 2020, 1:31 p.m. UTC | #24
"J.W. Jagersma" <jwjagersma@gmail.com> writes:
> On 2020-03-12 10:59, Richard Sandiford wrote:
>> The other case I mentioned was equivalent to:
>> 
>>      int
>>      test_mem2 ()
>>      {
>>        int i = 2;
>>        try
>>          {
>>            asm volatile ("ud2; mov%z0 $1, %0" : "=m" (i));
>>          }
>>        catch (const illegal_opcode&)
>>          {
>>            if (i == 2) return 0;
>>          }
>>        return i;
>>      }
>> 
>> Is this test expected to pass?
>
> Good point.  Yes, this should pass, and I thought it did, but it seems
> I was mistaken.  To fix that would require transforming "=m" into "+m"
> as Segher suggested.
>
>> However, these "mem" tests are testing gimple register types, so they'll
>> still be SSA names outside of the asm.  It would also be good to test what
>> happens for non-register types, e.g. something like:
>> 
>>      int
>>      test_mem3 ()
>>      {
>>        int i[5] = { 1, 2, 3, 4, 5 };
>>        try
>>          {
>>            asm volatile ("ud2; <insert asm here>" : "=m" (i));
>>          }
>>        catch (const illegal_opcode&)
>>          {
>>            if (i[0] == 1 && ...) return 0;
>>          }
>>        return ...;
>>      }
>> 
>> and similarly for ud2 after the store.
>
> I think I see what you mean.  Would such a test not also cover what the
> current test_mem() function does?  If so then that could be removed.

I think it's better to have tests for both the is_gimple_reg_type case
and the !is_gimple_reg_type case, so keeping test_mem sounds better.

> Also in my current patch I used: (tree-eh.c:2104)
>
>     if (tree_could_throw_p (opval)
>         || !is_gimple_reg_type (TREE_TYPE (opval))
>         || !is_gimple_reg (get_base_address (opval)))
>
> to determine the difference between a register and memory type.  Could
> there potentially be a case where that identifies an operand as gimple
> register type, but ends up compiled as a memory operand to an asm?

The constraints can support both registers and memory, e.g. via "rm"
or "g".  I'm not sure off-hand what we do with those for gimple.

>> It wasn't clear from my message above, but: I was mostly worried about
>> requiring the asm to treat memory operands in a certain way when the
>> exception is thrown.  IMO it would be better to say that the values of
>> memory operands are undefined on the exception edge.
>
> I'm not sure about the semantic difference between undefined and
> unspecified.  But gcc should not write to any memory after a throw,
> because that write operation itself may have side effects.  Likewise
> asm memory operands should not be redirected to a temporary for the
> same reason, and also because gcc can't possibly know which parts of
> an (offsettable) memory operand are written to.

This might not be what you mean, but for:

int
f1 (void)
{
  int x = 1;
  asm volatile ("": "=m" (x));
  return x;
}

struct s { int i[5]; };
struct s
f2 (void)
{
  struct s x = { 1, 2, 3, 4, 5 };
  asm volatile ("": "=m" (x));
  return x;
}

we do delete "x = 1" for f1.   I think that's the expected behaviour.
We don't yet delete the initialisation in f2, but I think in principle
we could.

So the kind of contract I was advocating was:

- the compiler can't make any assumptions about what the asm does
  or doesn't do to output operands when an exception is raised

- the source code can't make any assumption about the values bound
  to output operands when an exception is raised

Thanks,
Richard
Segher Boessenkool March 13, 2020, 2:58 p.m. UTC | #25
Hi!

On Fri, Mar 13, 2020 at 01:31:19PM +0000, Richard Sandiford wrote:
> This might not be what you mean, but for:
> 
> int
> f1 (void)
> {
>   int x = 1;
>   asm volatile ("": "=m" (x));
>   return x;
> }
> 
> struct s { int i[5]; };
> struct s
> f2 (void)
> {
>   struct s x = { 1, 2, 3, 4, 5 };
>   asm volatile ("": "=m" (x));
>   return x;
> }
> 
> we do delete "x = 1" for f1.   I think that's the expected behaviour.
> We don't yet delete the initialisation in f2, but I think in principle
> we could.

Right.  And this is incorrect if the asm may throw.

> So the kind of contract I was advocating was:
> 
> - the compiler can't make any assumptions about what the asm does
>   or doesn't do to output operands when an exception is raised
> 
> - the source code can't make any assumption about the values bound
>   to output operands when an exception is raised

And the easiest (and only feasible?) way to do this is for the compiler
to automatically make an input for every output as well, imo.


Segher
Segher Boessenkool March 13, 2020, 3:55 p.m. UTC | #26
On Fri, Mar 13, 2020 at 02:06:13AM +0100, J.W. Jagersma wrote:
> I don't want to unnecessarily pessimize register outputs if it can be
> avoided.  And even if you do change register outputs to in+out, they
> are still more likely to contain some intermediate value that you would
> want to discard on throwing.

But it also may contain values that you should keep, if it threw the
exception on an earlier machine instruction.

> If you think of it in terms of a function call, the closest equivalent
> of register vs memory outputs would be:
>     reg = f ();
> vs.
>     f (&mem);
> If f() throws, then in the first case, 'reg' is never assigned to. For
> the second case, you can't expect 'mem' to remain unmodified.

But you also cannot expect it to be modified: the old value may remain
a live value.

> There is also the case of "=rm" and similar which have not been
> discussed so far, but are (in my experience) the most common operand
> constraint.

On some CISC targets, sure.  Not common on load-store architectures
(like most things from the last 30+ years).


Segher
Li, Pan2 via Gcc-patches March 13, 2020, 4:40 p.m. UTC | #27
On 2020-03-13 14:31, Richard Sandiford wrote:
> "J.W. Jagersma" <jwjagersma@gmail.com> writes:
>> On 2020-03-12 10:59, Richard Sandiford wrote:
>>> The other case I mentioned was equivalent to:
>>>
>>>      int
>>>      test_mem2 ()
>>>      {
>>>        int i = 2;
>>>        try
>>>          {
>>>            asm volatile ("ud2; mov%z0 $1, %0" : "=m" (i));
>>>          }
>>>        catch (const illegal_opcode&)
>>>          {
>>>            if (i == 2) return 0;
>>>          }
>>>        return i;
>>>      }
>>>
>>> Is this test expected to pass?
>>
>> Good point.  Yes, this should pass, and I thought it did, but it seems
>> I was mistaken.  To fix that would require transforming "=m" into "+m"
>> as Segher suggested.
>>
>>> However, these "mem" tests are testing gimple register types, so they'll
>>> still be SSA names outside of the asm.  It would also be good to test what
>>> happens for non-register types, e.g. something like:
>>>
>>>      int
>>>      test_mem3 ()
>>>      {
>>>        int i[5] = { 1, 2, 3, 4, 5 };
>>>        try
>>>          {
>>>            asm volatile ("ud2; <insert asm here>" : "=m" (i));
>>>          }
>>>        catch (const illegal_opcode&)
>>>          {
>>>            if (i[0] == 1 && ...) return 0;
>>>          }
>>>        return ...;
>>>      }
>>>
>>> and similarly for ud2 after the store.
>>
>> I think I see what you mean.  Would such a test not also cover what the
>> current test_mem() function does?  If so then that could be removed.
> 
> I think it's better to have tests for both the is_gimple_reg_type case
> and the !is_gimple_reg_type case, so keeping test_mem sounds better.

Especially because the initial 'int i = 2;' is currently eliminated
for gimple register types that end up in memory, I see.

>> Also in my current patch I used: (tree-eh.c:2104)
>>
>>     if (tree_could_throw_p (opval)
>>         || !is_gimple_reg_type (TREE_TYPE (opval))
>>         || !is_gimple_reg (get_base_address (opval)))
>>
>> to determine the difference between a register and memory type.  Could
>> there potentially be a case where that identifies an operand as gimple
>> register type, but ends up compiled as a memory operand to an asm?
> 
> The constraints can support both registers and memory, e.g. via "rm"
> or "g".  I'm not sure off-hand what we do with those for gimple.

In gimplify_asm_expr (gimplify.c:6281), I see that:

    if (!allows_reg && allows_mem)
      mark_addressable (TREE_VALUE (link));

So TREE_ADDRESSABLE I think is a good predicate to distinguish between
pure memory constraints and everything else.  I have changed the above
to:

       if (!allows_reg && allows_mem)
-	mark_addressable (TREE_VALUE (link));
+	{
+	  mark_addressable (TREE_VALUE (link));
+	  /* If non-call exceptions are enabled, mark each memory output
+	     as inout, so that previous assignments are not eliminated.  */
+	  if (cfun->can_throw_non_call_exceptions)
+	    is_inout = true;
+	}

So that memory operands are converted to in+out.  Then in
lower_eh_constructs_2 we can use TREE_ADDRESSABLE to find non-memory
operands:

+		if (tree_could_throw_p (opval)
+		    || TREE_ADDRESSABLE (opval))
+		  continue;

(tree_could_throw_p may be superfluous here)
Then I have a new test case that forces an "=rm" into memory, and then
confirm that it behaves the same as "=r":

+int
+rm ()
+{
+  int i = 2;
+  try
+    {
+      asm volatile ("mov%z0 $1, %0; ud2" : "=rm" (i) :
+                    : "eax", "ebx", "ecx", "edx", "edi", "esi"
+#ifdef __x86_64__
+                    , "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"
+#endif
+                    );
+    }
+  catch (const illegal_opcode&)
+    {
+      if (i == 2) return 0;
+    }
+  return i;
+}

>>> It wasn't clear from my message above, but: I was mostly worried about
>>> requiring the asm to treat memory operands in a certain way when the
>>> exception is thrown.  IMO it would be better to say that the values of
>>> memory operands are undefined on the exception edge.
>>
>> I'm not sure about the semantic difference between undefined and
>> unspecified.  But gcc should not write to any memory after a throw,
>> because that write operation itself may have side effects.  Likewise
>> asm memory operands should not be redirected to a temporary for the
>> same reason, and also because gcc can't possibly know which parts of
>> an (offsettable) memory operand are written to.
> 
> This might not be what you mean, but for:
> 
> int
> f1 (void)
> {
>   int x = 1;
>   asm volatile ("": "=m" (x));
>   return x;
> }
> 
> struct s { int i[5]; };
> struct s
> f2 (void)
> {
>   struct s x = { 1, 2, 3, 4, 5 };
>   asm volatile ("": "=m" (x));
>   return x;
> }
> 
> we do delete "x = 1" for f1.   I think that's the expected behaviour.
> We don't yet delete the initialisation in f2, but I think in principle
> we could.
> 
> So the kind of contract I was advocating was:
> 
> - the compiler can't make any assumptions about what the asm does
>   or doesn't do to output operands when an exception is raised
> 
> - the source code can't make any assumption about the values bound
>   to output operands when an exception is raised
> 
> Thanks,
> Richard
> 

Right.  The compiler technically can't make any assumptions, but I
think it can at least provide consistent behavior.  The user can read
the asm, so they should be able to know exactly what happened when an
exception is thrown.
Richard Sandiford March 16, 2020, 5:47 p.m. UTC | #28
Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Fri, Mar 13, 2020 at 01:31:19PM +0000, Richard Sandiford wrote:
>> This might not be what you mean, but for:
>> 
>> int
>> f1 (void)
>> {
>>   int x = 1;
>>   asm volatile ("": "=m" (x));
>>   return x;
>> }
>> 
>> struct s { int i[5]; };
>> struct s
>> f2 (void)
>> {
>>   struct s x = { 1, 2, 3, 4, 5 };
>>   asm volatile ("": "=m" (x));
>>   return x;
>> }
>> 
>> we do delete "x = 1" for f1.   I think that's the expected behaviour.
>> We don't yet delete the initialisation in f2, but I think in principle
>> we could.
>
> Right.  And this is incorrect if the asm may throw.

Well...

>
>> So the kind of contract I was advocating was:
>> 
>> - the compiler can't make any assumptions about what the asm does
>>   or doesn't do to output operands when an exception is raised
>> 
>> - the source code can't make any assumption about the values bound
>>   to output operands when an exception is raised

...with this interpretation, the deletions above would be correct even
if the asm throws.

> And the easiest (and only feasible?) way to do this is for the compiler
> to automatically make an input for every output as well, imo.

Modifying the asm like that feels a bit dangerous, And the other problem
still exists: he compiler might assume that the output isn't modified
unless the asm completes normally.

Thanks,
Richard
Segher Boessenkool March 16, 2020, 7:02 p.m. UTC | #29
On Mon, Mar 16, 2020 at 05:47:03PM +0000, Richard Sandiford wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> >> we do delete "x = 1" for f1.   I think that's the expected behaviour.
> >> We don't yet delete the initialisation in f2, but I think in principle
> >> we could.
> >
> > Right.  And this is incorrect if the asm may throw.
> 
> Well...
> 
> >> So the kind of contract I was advocating was:
> >> 
> >> - the compiler can't make any assumptions about what the asm does
> >>   or doesn't do to output operands when an exception is raised
> >> 
> >> - the source code can't make any assumption about the values bound
> >>   to output operands when an exception is raised
> 
> ...with this interpretation, the deletions above would be correct even
> if the asm throws.

The write to "x" *before the asm* is deleted.  I cannot think of any
interpretation where that is correct (this does not involve inline asm
at all: it is deleting an observable side effect before the exception).

> > And the easiest (and only feasible?) way to do this is for the compiler
> > to automatically make an input for every output as well, imo.
> 
> Modifying the asm like that feels a bit dangerous,

Yes, obviously.  The other option is to accept that almost all existing
inline asm will have UB, with -fnon-call-exceptions.  I think that is
an even less desirable option.

> And the other problem
> still exists: he compiler might assume that the output isn't modified
> unless the asm completes normally.

I don't understand what this means?  As far as the compiler is concerned
any asm is just one instruction?  And it all executes completely always.
You need to do things with the constraints to tell the compiler it does
not know some of the values around.  If you have both an input and an
output for a variable, the compiler does not know what value is written
to it, and it might just be the one that was the input already (which is
the same effect as not writing it at all).


Segher
Richard Sandiford March 16, 2020, 7:46 p.m. UTC | #30
Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Mon, Mar 16, 2020 at 05:47:03PM +0000, Richard Sandiford wrote:
>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>> >> we do delete "x = 1" for f1.   I think that's the expected behaviour.
>> >> We don't yet delete the initialisation in f2, but I think in principle
>> >> we could.
>> >
>> > Right.  And this is incorrect if the asm may throw.
>> 
>> Well...
>> 
>> >> So the kind of contract I was advocating was:
>> >> 
>> >> - the compiler can't make any assumptions about what the asm does
>> >>   or doesn't do to output operands when an exception is raised
>> >> 
>> >> - the source code can't make any assumption about the values bound
>> >>   to output operands when an exception is raised
>> 
>> ...with this interpretation, the deletions above would be correct even
>> if the asm throws.
>
> The write to "x" *before the asm* is deleted.  I cannot think of any
> interpretation where that is correct (this does not involve inline asm
> at all: it is deleting an observable side effect before the exception).

It's correct under the contract above :-)

>> > And the easiest (and only feasible?) way to do this is for the compiler
>> > to automatically make an input for every output as well, imo.
>> 
>> Modifying the asm like that feels a bit dangerous,
>
> Yes, obviously.  The other option is to accept that almost all existing
> inline asm will have UB, with -fnon-call-exceptions.  I think that is
> an even less desirable option.
>
>> And the other problem
>> still exists: he compiler might assume that the output isn't modified
>> unless the asm completes normally.
>
> I don't understand what this means?  As far as the compiler is concerned
> any asm is just one instruction?  And it all executes completely always.
> You need to do things with the constraints to tell the compiler it does
> not know some of the values around.  If you have both an input and an
> output for a variable, the compiler does not know what value is written
> to it, and it might just be the one that was the input already (which is
> the same effect as not writing it at all).

Normally, for SSA names in something like:

  _1 = foo ()

the definition of _1 does not take place when foo throws.  Similarly
for non-call exceptions on other statements.  It sounds like what you're
describing requires the corresponding definition to happen for memory
outputs regardless of whether the asm throws or not, so that the memory
appears to change on both excecution paths.  Otherwise, the compiler
would be able to assume that the memory operand still has its original
value in the exception handler.

Richard
Segher Boessenkool March 16, 2020, 8:19 p.m. UTC | #31
On Mon, Mar 16, 2020 at 07:46:57PM +0000, Richard Sandiford wrote:
> > The write to "x" *before the asm* is deleted.  I cannot think of any
> > interpretation where that is correct (this does not involve inline asm
> > at all: it is deleting an observable side effect before the exception).
> 
> It's correct under the contract above :-)

That cannot be, because the generated code does the wrong thing before
the asm is executed already (and without UB there is no time travel).

> > Yes, obviously.  The other option is to accept that almost all existing
> > inline asm will have UB, with -fnon-call-exceptions.  I think that is
> > an even less desirable option.

(I'm assuming we do not want that option).

> Normally, for SSA names in something like:
> 
>   _1 = foo ()
> 
> the definition of _1 does not take place when foo throws.

But that is not how RTL works, afaik.


Segher
Michael Matz March 17, 2020, 3:32 p.m. UTC | #32
Hello,

On Mon, 16 Mar 2020, Richard Sandiford wrote:

> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Mon, Mar 16, 2020 at 05:47:03PM +0000, Richard Sandiford wrote:
> >> Segher Boessenkool <segher@kernel.crashing.org> writes:
> >> >> we do delete "x = 1" for f1.   I think that's the expected behaviour.
> >> >> We don't yet delete the initialisation in f2, but I think in principle
> >> >> we could.
> >> >
> >> > Right.  And this is incorrect if the asm may throw.
> >> 
> >> Well...
> >> 
> >> >> So the kind of contract I was advocating was:
> >> >> 
> >> >> - the compiler can't make any assumptions about what the asm does
> >> >>   or doesn't do to output operands when an exception is raised
> >> >> 
> >> >> - the source code can't make any assumption about the values bound
> >> >>   to output operands when an exception is raised
> >> 
> >> ...with this interpretation, the deletions above would be correct even
> >> if the asm throws.
> >
> > The write to "x" *before the asm* is deleted.  I cannot think of any
> > interpretation where that is correct (this does not involve inline asm
> > at all: it is deleting an observable side effect before the exception).
> 
> It's correct under the contract above :-)
> 
> >> > And the easiest (and only feasible?) way to do this is for the compiler
> >> > to automatically make an input for every output as well, imo.
> >> 
> >> Modifying the asm like that feels a bit dangerous,
> >
> > Yes, obviously.  The other option is to accept that almost all existing
> > inline asm will have UB, with -fnon-call-exceptions.  I think that is
> > an even less desirable option.
> >
> >> And the other problem
> >> still exists: he compiler might assume that the output isn't modified
> >> unless the asm completes normally.
> >
> > I don't understand what this means?  As far as the compiler is concerned
> > any asm is just one instruction?  And it all executes completely always.
> > You need to do things with the constraints to tell the compiler it does
> > not know some of the values around.  If you have both an input and an
> > output for a variable, the compiler does not know what value is written
> > to it, and it might just be the one that was the input already (which is
> > the same effect as not writing it at all).
> 
> Normally, for SSA names in something like:
> 
>   _1 = foo ()
> 
> the definition of _1 does not take place when foo throws.

Mostly, but maybe we need to lift this somewhen.  E.g. when we support 
SSA form for non-registers; the actual return migth then be via invisible 
reference, and hence the result might be changed even if foo throws.  That 
also could happen right now for some return types depending on the 
architecture (think large float types).  Our workaround for some of these 
cases (where it's obvious that the result will lie in memory) is to put 
the real copy-out into an extra gimple insn and make the LHS be a 
temporary; but of course we don't want that with too large types.

> Similarly for non-call exceptions on other statements.  It sounds like 
> what you're describing requires the corresponding definition to happen 
> for memory outputs regardless of whether the asm throws or not, so that 
> the memory appears to change on both excecution paths.  Otherwise, the 
> compiler would be able to assume that the memory operand still has its 
> original value in the exception handler.

Well, it's both: on the exception path the compiler has to assume that the 
the value wasn't changed (so that former defines are regarded as dead) or 
that it already has changed (so that the effects the throwing 
"instruction" had on the result (if any) aren't lost).  The easiest for 
this is to regard the result place as also being an input.

(If broadened to all instructions under -fnon-call-exceptions, and not 
just to asms will have quite a bad effect on optimization capabilities, 
but I believe with enough force it's already possible now to construct 
miscompiling testcases with the right mixtures of return types and ABIs)


Ciao,
Michael.
Li, Pan2 via Gcc-patches March 18, 2020, 3:42 p.m. UTC | #33
Hi Michael, thanks for your response.

On 2020-03-17 16:32, Michael Matz wrote:
> Hello,
> 
> On Mon, 16 Mar 2020, Richard Sandiford wrote:
> 
>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>>> On Mon, Mar 16, 2020 at 05:47:03PM +0000, Richard Sandiford wrote:
>>>> Segher Boessenkool <segher@kernel.crashing.org> writes:
>>>>>> we do delete "x = 1" for f1.   I think that's the expected behaviour.
>>>>>> We don't yet delete the initialisation in f2, but I think in principle
>>>>>> we could.
>>>>>
>>>>> Right.  And this is incorrect if the asm may throw.
>>>>
>>>> Well...
>>>>
>>>>>> So the kind of contract I was advocating was:
>>>>>>
>>>>>> - the compiler can't make any assumptions about what the asm does
>>>>>>   or doesn't do to output operands when an exception is raised
>>>>>>
>>>>>> - the source code can't make any assumption about the values bound
>>>>>>   to output operands when an exception is raised
>>>>
>>>> ...with this interpretation, the deletions above would be correct even
>>>> if the asm throws.
>>>
>>> The write to "x" *before the asm* is deleted.  I cannot think of any
>>> interpretation where that is correct (this does not involve inline asm
>>> at all: it is deleting an observable side effect before the exception).
>>
>> It's correct under the contract above :-)
>>
>>>>> And the easiest (and only feasible?) way to do this is for the compiler
>>>>> to automatically make an input for every output as well, imo.
>>>>
>>>> Modifying the asm like that feels a bit dangerous,
>>>
>>> Yes, obviously.  The other option is to accept that almost all existing
>>> inline asm will have UB, with -fnon-call-exceptions.  I think that is
>>> an even less desirable option.
>>>
>>>> And the other problem
>>>> still exists: he compiler might assume that the output isn't modified
>>>> unless the asm completes normally.
>>>
>>> I don't understand what this means?  As far as the compiler is concerned
>>> any asm is just one instruction?  And it all executes completely always.
>>> You need to do things with the constraints to tell the compiler it does
>>> not know some of the values around.  If you have both an input and an
>>> output for a variable, the compiler does not know what value is written
>>> to it, and it might just be the one that was the input already (which is
>>> the same effect as not writing it at all).
>>
>> Normally, for SSA names in something like:
>>
>>   _1 = foo ()
>>
>> the definition of _1 does not take place when foo throws.
> 
> Mostly, but maybe we need to lift this somewhen.  E.g. when we support 
> SSA form for non-registers; the actual return migth then be via invisible 
> reference, and hence the result might be changed even if foo throws.  That 
> also could happen right now for some return types depending on the 
> architecture (think large float types).  Our workaround for some of these 
> cases (where it's obvious that the result will lie in memory) is to put 
> the real copy-out into an extra gimple insn and make the LHS be a 
> temporary; but of course we don't want that with too large types.
> 
>> Similarly for non-call exceptions on other statements.  It sounds like 
>> what you're describing requires the corresponding definition to happen 
>> for memory outputs regardless of whether the asm throws or not, so that 
>> the memory appears to change on both excecution paths.  Otherwise, the 
>> compiler would be able to assume that the memory operand still has its 
>> original value in the exception handler.
> 
> Well, it's both: on the exception path the compiler has to assume that the 
> the value wasn't changed (so that former defines are regarded as dead) or 
> that it already has changed (so that the effects the throwing 
> "instruction" had on the result (if any) aren't lost).  The easiest for 
> this is to regard the result place as also being an input.

The way I see it, there are two options: either you use the outputs
when an exception is thrown, or you don't.

The first option is more or less what my first patch did, but it was
incomplete.  Changing each output to in+out would make that work
correctly.

The second is what I have implemented now, each output is assigned via
a temporary which is then assigned back to the variable bound to this
output.  On exception, this temporary is discarded.  However this is
not possible for asms that write directly to memory, so those cases are
treated like option #1.

I think the second option is easier on optimization since any previous
assignments can be removed from the normal code path, and registers
don't need to be loaded with known-valid values before the asm.  The
first option is more consistent since all outputs are treated the same,
but more dangerous, as the asm may write incomplete values before
throwing.

So, I prefer the second option, but I don't really have a strong
opinion either way.  As long as I can catch my exceptions I'm fine with
anything.  If the consensus is that the first option is preferable,
then I'll implement and submit that.

> (If broadened to all instructions under -fnon-call-exceptions, and not 
> just to asms will have quite a bad effect on optimization capabilities, 
> but I believe with enough force it's already possible now to construct 
> miscompiling testcases with the right mixtures of return types and ABIs)
> 
> 
> Ciao,
> Michael.
>
Segher Boessenkool March 18, 2020, 11:56 p.m. UTC | #34
On Tue, Mar 17, 2020 at 03:32:34PM +0000, Michael Matz wrote:
> On Mon, 16 Mar 2020, Richard Sandiford wrote:
> > Similarly for non-call exceptions on other statements.  It sounds like 
> > what you're describing requires the corresponding definition to happen 
> > for memory outputs regardless of whether the asm throws or not, so that 
> > the memory appears to change on both excecution paths.  Otherwise, the 
> > compiler would be able to assume that the memory operand still has its 
> > original value in the exception handler.
> 
> Well, it's both: on the exception path the compiler has to assume that the 
> the value wasn't changed (so that former defines are regarded as dead) or 
> that it already has changed (so that the effects the throwing 
> "instruction" had on the result (if any) aren't lost).  The easiest for 
> this is to regard the result place as also being an input.
> 
> (If broadened to all instructions under -fnon-call-exceptions, and not 
> just to asms will have quite a bad effect on optimization capabilities, 
> but I believe with enough force it's already possible now to construct 
> miscompiling testcases with the right mixtures of return types and ABIs)

It's a tradeoff: do we want this to work for almost no one and get PRs
that we cannot solve, or do we generate slightly worse assembler code
for -fnon-call-exceptions?  I don't think this is a difficult decision
to make, considering that you already get pretty bad performance with
that flag (if indeed it works correctly at all).


Segher
Michael Matz March 19, 2020, 4:42 p.m. UTC | #35
Hello,

On Wed, 18 Mar 2020, Segher Boessenkool wrote:

> > > Similarly for non-call exceptions on other statements.  It sounds like 
> > > what you're describing requires the corresponding definition to happen 
> > > for memory outputs regardless of whether the asm throws or not, so that 
> > > the memory appears to change on both excecution paths.  Otherwise, the 
> > > compiler would be able to assume that the memory operand still has its 
> > > original value in the exception handler.
> > 
> > Well, it's both: on the exception path the compiler has to assume that the 
> > the value wasn't changed (so that former defines are regarded as dead) or 
> > that it already has changed (so that the effects the throwing 
> > "instruction" had on the result (if any) aren't lost).  The easiest for 
> > this is to regard the result place as also being an input.
> > 
> > (If broadened to all instructions under -fnon-call-exceptions, and not 
> > just to asms will have quite a bad effect on optimization capabilities, 
> > but I believe with enough force it's already possible now to construct 
> > miscompiling testcases with the right mixtures of return types and ABIs)
> 
> It's a tradeoff: do we want this to work for almost no one and get PRs
> that we cannot solve, or do we generate slightly worse assembler code
> for -fnon-call-exceptions?  I don't think this is a difficult decision
> to make, considering that you already get pretty bad performance with
> that flag (if indeed it works correctly at all).

Oh, I wasn't advocating doing anything else than you suggested (i.e. make 
all operands in-out), I merely pointed out that the inherent problem here 
is not really specific to asms.


Ciao,
Michael.
Li, Pan2 via Gcc-patches March 19, 2020, 4:59 p.m. UTC | #36
On 2020-03-19 00:56, Segher Boessenkool wrote:
> On Tue, Mar 17, 2020 at 03:32:34PM +0000, Michael Matz wrote:
>> On Mon, 16 Mar 2020, Richard Sandiford wrote:
>>> Similarly for non-call exceptions on other statements.  It sounds like 
>>> what you're describing requires the corresponding definition to happen 
>>> for memory outputs regardless of whether the asm throws or not, so that 
>>> the memory appears to change on both excecution paths.  Otherwise, the 
>>> compiler would be able to assume that the memory operand still has its 
>>> original value in the exception handler.
>>
>> Well, it's both: on the exception path the compiler has to assume that the 
>> the value wasn't changed (so that former defines are regarded as dead) or 
>> that it already has changed (so that the effects the throwing 
>> "instruction" had on the result (if any) aren't lost).  The easiest for 
>> this is to regard the result place as also being an input.
>>
>> (If broadened to all instructions under -fnon-call-exceptions, and not 
>> just to asms will have quite a bad effect on optimization capabilities, 
>> but I believe with enough force it's already possible now to construct 
>> miscompiling testcases with the right mixtures of return types and ABIs)
> 
> It's a tradeoff: do we want this to work for almost no one and get PRs
> that we cannot solve, or do we generate slightly worse assembler code
> for -fnon-call-exceptions?  I don't think this is a difficult decision
> to make, considering that you already get pretty bad performance with
> that flag (if indeed it works correctly at all).

I just realized that changing all outputs to in+out would generate
worse code for *every* single asm that has any outputs.  Whereas
changing outputs to a temporary will not generate any extra code if
there is no local try/catch block.  I think that is something that
should be considered.

Then again, if the consensus is that outputs should become in+out, I
will implement that.  Then there is the issue of those debug stmts
being inserted after asms.  I can either patch the code that inserts
these (in tree-into-ssa.c) as I did earlier, or change the verify
routines to ignore debug stmts and the end of a basic block.  Which
would be preferable?
Michael Matz March 19, 2020, 5:06 p.m. UTC | #37
Hello,

On Wed, 18 Mar 2020, J.W. Jagersma via Gcc-patches wrote:

> > Well, it's both: on the exception path the compiler has to assume that the 
> > the value wasn't changed (so that former defines are regarded as dead) or 
> > that it already has changed (so that the effects the throwing 
> > "instruction" had on the result (if any) aren't lost).  The easiest for 
> > this is to regard the result place as also being an input.
> 
> The way I see it, there are two options: either you use the outputs
> when an exception is thrown, or you don't.

Assuming by "use the outputs" you mean "transform them implicitely to 
in-out operands", not "the source code uses the output operands after the 
asm on except and no-except paths".

> The first option is more or less what my first patch did, but it was
> incomplete.  Changing each output to in+out would make that work
> correctly.

Right.

> The second is what I have implemented now, each output is assigned via
> a temporary which is then assigned back to the variable bound to this
> output.  On exception, this temporary is discarded.  However this is
> not possible for asms that write directly to memory, so those cases are
> treated like option #1.

Right again, somewhat.  Except that the determination of which outputs are 
going into memory is a fuzzy notion until reload/LRA (which is very late 
in the compile pipeline).  You can infer some things from the constraint 
letters, but gimple might still put things into memory (e.g. when the 
output place is address taken), even though the constraint only allows a 
register (in which case reloads will be generated), or place something 
into a register, even though the constraint only allows memory (again, 
reloads will be generated).

Some of these reloads will be done early in the gimple pipeline to help 
optimizations (they basically look like the insns that you generate for 
copy-out), some of them will be generated only very late.

> I think the second option is easier on optimization since any previous
> assignments can be removed from the normal code path, and registers
> don't need to be loaded with known-valid values before the asm.

True (easier to optimizations) but immaterial (see below).

> The first option is more consistent since all outputs are treated the 
> same, but more dangerous, as the asm may write incomplete values before 
> throwing.

You have to assume that the author of the asm and its surrounding code is 
written by a knowledgable person, so if the asm possibly writes partially 
to outputs and then throws, then the output must not be accessed on the 
exception path.  If the asm does not write partially, then the author can 
access it.  So, what can or cannot be accessed on the exception path 
really is an inherent property of the contents of the asm.

Now, your second case tries to create additional guarantees: it makes it 
so that for some operands the user can depend on certain behaviour, namely 
that the old value before the asm was entered is available on the except 
path.  As you can't ensure this for all operands (those in memory are the 
problem) you want to tighten the definition to only include the operands 
where you can guarantee it, but this is fairly brittle, as above, because 
some decisions are taken very late.

There's another case to consider: assume I'm writing an asm that writes 
meaningful values to an output before and after a potential exception is 
thrown, ala this:

asm (" mov %2, %0
       xyz %2, %1
       mov $0, %0" : "=r" (status), "+r" (a) : "r" (b));

Assume 'xyz' can fault depending on input.  The intention above would be 
that on the exception path 'status' would contain a meaningful value (here 
the value of input b, and on the non-except path would contain zero.

Your proposal of copyin/out for register values would make the above 
impossible.  (Basically you wouldn't be able to output any reliable 
information out of the asm in the except path).

Given that, and the complication of determining what the safe-for-copy-out 
operands really are, and the fact that being easier on optimizations in 
connection with asms and -fnon-call-exceptions isn't a high priority it 
seems best to opt for the option that is guaranteed to work correctly in 
most cases, and in fact allows more freedom in using the asm.


Ciao,
Michael.
Michael Matz March 19, 2020, 5:17 p.m. UTC | #38
Hello,

On Thu, 19 Mar 2020, J.W. Jagersma via Gcc-patches wrote:

> I just realized that changing all outputs to in+out would generate
> worse code for *every* single asm that has any outputs.

Under -fnon-call-exception only.  And I'm not sure what you mean, the only 
effect of the additional 'in+' part that wasn't there before should be 
that some instructions setting those operands to values before the asm 
aren't deleted.  If there are none, the code should come out the same.

> Whereas changing outputs to a temporary will not generate any extra code 
> if there is no local try/catch block.  I think that is something that 
> should be considered.

But it will also disallow values to be given out of the asm in the 
exception case. (See my other mail).  I think that's more worthwhile than 
some superflous moves (that, if they turn out to be really useless could 
be removed after reload).


Ciao,
Michael.
Li, Pan2 via Gcc-patches March 20, 2020, 7:09 p.m. UTC | #39
On 2020-03-19 18:06, Michael Matz wrote:
> Hello,
> 
> On Wed, 18 Mar 2020, J.W. Jagersma via Gcc-patches wrote:
> 
>>> Well, it's both: on the exception path the compiler has to assume that the 
>>> the value wasn't changed (so that former defines are regarded as dead) or 
>>> that it already has changed (so that the effects the throwing 
>>> "instruction" had on the result (if any) aren't lost).  The easiest for 
>>> this is to regard the result place as also being an input.
>>
>> The way I see it, there are two options: either you use the outputs
>> when an exception is thrown, or you don't.
> 
> Assuming by "use the outputs" you mean "transform them implicitely to 
> in-out operands", not "the source code uses the output operands after the 
> asm on except and no-except paths".
> 
>> The first option is more or less what my first patch did, but it was
>> incomplete.  Changing each output to in+out would make that work
>> correctly.
> 
> Right.
> 
>> The second is what I have implemented now, each output is assigned via
>> a temporary which is then assigned back to the variable bound to this
>> output.  On exception, this temporary is discarded.  However this is
>> not possible for asms that write directly to memory, so those cases are
>> treated like option #1.
> 
> Right again, somewhat.  Except that the determination of which outputs are 
> going into memory is a fuzzy notion until reload/LRA (which is very late 
> in the compile pipeline).  You can infer some things from the constraint 
> letters, but gimple might still put things into memory (e.g. when the 
> output place is address taken), even though the constraint only allows a 
> register (in which case reloads will be generated), or place something 
> into a register, even though the constraint only allows memory (again, 
> reloads will be generated).
> 
> Some of these reloads will be done early in the gimple pipeline to help 
> optimizations (they basically look like the insns that you generate for 
> copy-out), some of them will be generated only very late.
> 
>> I think the second option is easier on optimization since any previous
>> assignments can be removed from the normal code path, and registers
>> don't need to be loaded with known-valid values before the asm.
> 
> True (easier to optimizations) but immaterial (see below).
> 
>> The first option is more consistent since all outputs are treated the 
>> same, but more dangerous, as the asm may write incomplete values before 
>> throwing.
> 
> You have to assume that the author of the asm and its surrounding code is 
> written by a knowledgable person, so if the asm possibly writes partially 
> to outputs and then throws, then the output must not be accessed on the 
> exception path.  If the asm does not write partially, then the author can 
> access it.  So, what can or cannot be accessed on the exception path 
> really is an inherent property of the contents of the asm.
> 
> Now, your second case tries to create additional guarantees: it makes it 
> so that for some operands the user can depend on certain behaviour, namely 
> that the old value before the asm was entered is available on the except 
> path.  As you can't ensure this for all operands (those in memory are the 
> problem) you want to tighten the definition to only include the operands 
> where you can guarantee it, but this is fairly brittle, as above, because 
> some decisions are taken very late.
> 
> There's another case to consider: assume I'm writing an asm that writes 
> meaningful values to an output before and after a potential exception is 
> thrown, ala this:
> 
> asm (" mov %2, %0
>        xyz %2, %1
>        mov $0, %0" : "=r" (status), "+r" (a) : "r" (b));
> 
> Assume 'xyz' can fault depending on input.  The intention above would be 
> that on the exception path 'status' would contain a meaningful value (here 
> the value of input b, and on the non-except path would contain zero.
> 
> Your proposal of copyin/out for register values would make the above 
> impossible.  (Basically you wouldn't be able to output any reliable 
> information out of the asm in the except path).
> 
> Given that, and the complication of determining what the safe-for-copy-out 
> operands really are, and the fact that being easier on optimizations in 
> connection with asms and -fnon-call-exceptions isn't a high priority it 
> seems best to opt for the option that is guaranteed to work correctly in 
> most cases, and in fact allows more freedom in using the asm.
> 
> 
> Ciao,
> Michael.

Thanks.  You bring up some good points which I hadn't yet considered,
and I don't have any counter-arguments.
So using the outputs on exception looks to be the best option then.  I
think the test cases can then also be reduced to a single function, as
all operand types should behave the same.  I will implement this and
resubmit.
Li, Pan2 via Gcc-patches March 24, 2020, 5:54 p.m. UTC | #40
I was looking into this yesterday but I ran into some issues.
The first is that blindly converting each output to in+out breaks
cases where an input already overlaps with an output:

    asm ("..." : "=r" (out) : "0" (in));

This could be worked around by scanning the input list and only
converting outputs which don't already have a matching input.  But then
you could run into cases like:

    asm ("..." : "=r,m" (out) : "0,r" (in));

Which may be unlikely, but this is valid code.  Here you can never
determine during gimplification which alternative will eventually be
chosen.

The second problem is that it sets off -Wuninitialized warnings
everywhere.  Is there some way to determine in gimplify_asm_expr which
outputs have already been assigned to?


On 2020-03-19 18:06, Michael Matz wrote:
> Hello,
> 
> On Wed, 18 Mar 2020, J.W. Jagersma via Gcc-patches wrote:
> 
>>> Well, it's both: on the exception path the compiler has to assume that the 
>>> the value wasn't changed (so that former defines are regarded as dead) or 
>>> that it already has changed (so that the effects the throwing 
>>> "instruction" had on the result (if any) aren't lost).  The easiest for 
>>> this is to regard the result place as also being an input.
>>
>> The way I see it, there are two options: either you use the outputs
>> when an exception is thrown, or you don't.
> 
> Assuming by "use the outputs" you mean "transform them implicitely to 
> in-out operands", not "the source code uses the output operands after the 
> asm on except and no-except paths".
> 
>> The first option is more or less what my first patch did, but it was
>> incomplete.  Changing each output to in+out would make that work
>> correctly.
> 
> Right.
> 
>> The second is what I have implemented now, each output is assigned via
>> a temporary which is then assigned back to the variable bound to this
>> output.  On exception, this temporary is discarded.  However this is
>> not possible for asms that write directly to memory, so those cases are
>> treated like option #1.
> 
> Right again, somewhat.  Except that the determination of which outputs are 
> going into memory is a fuzzy notion until reload/LRA (which is very late 
> in the compile pipeline).  You can infer some things from the constraint 
> letters, but gimple might still put things into memory (e.g. when the 
> output place is address taken), even though the constraint only allows a 
> register (in which case reloads will be generated), or place something 
> into a register, even though the constraint only allows memory (again, 
> reloads will be generated).

Re-reading this and giving it some more thought, I must reconsider what
I previously said.
From looking at the constraints you should be able to tell which
operands *must* be in memory, and which *may* reside in a register.
This is the only distinction that matters.  Only the must-be-memory
cases get special treatment.  Everything else can be discarded on the
exception path, this includes "=rm" and such.

> Some of these reloads will be done early in the gimple pipeline to help 
> optimizations (they basically look like the insns that you generate for 
> copy-out), some of them will be generated only very late.
> 
>> I think the second option is easier on optimization since any previous
>> assignments can be removed from the normal code path, and registers
>> don't need to be loaded with known-valid values before the asm.
> 
> True (easier to optimizations) but immaterial (see below).
> 
>> The first option is more consistent since all outputs are treated the 
>> same, but more dangerous, as the asm may write incomplete values before 
>> throwing.
> 
> You have to assume that the author of the asm and its surrounding code is 
> written by a knowledgable person, so if the asm possibly writes partially 
> to outputs and then throws, then the output must not be accessed on the 
> exception path.  If the asm does not write partially, then the author can 
> access it.  So, what can or cannot be accessed on the exception path 
> really is an inherent property of the contents of the asm.
> 
> Now, your second case tries to create additional guarantees: it makes it 
> so that for some operands the user can depend on certain behaviour, namely 
> that the old value before the asm was entered is available on the except 
> path.  As you can't ensure this for all operands (those in memory are the 
> problem) you want to tighten the definition to only include the operands 
> where you can guarantee it, but this is fairly brittle, as above, because 
> some decisions are taken very late.
> 
> There's another case to consider: assume I'm writing an asm that writes 
> meaningful values to an output before and after a potential exception is 
> thrown, ala this:
> 
> asm (" mov %2, %0
>        xyz %2, %1
>        mov $0, %0" : "=r" (status), "+r" (a) : "r" (b));
> 
> Assume 'xyz' can fault depending on input.  The intention above would be 
> that on the exception path 'status' would contain a meaningful value (here 
> the value of input b, and on the non-except path would contain zero.
> 
> Your proposal of copyin/out for register values would make the above 
> impossible.  (Basically you wouldn't be able to output any reliable 
> information out of the asm in the except path).

I do agree that it could potentially be useful to keep certain outputs
on the exception path, although I can't think of any real-world use
cases.  But with my current patch this is already possible, as long as
the constraints force it into memory.

The example you gave looks useful at first, but is it really?  With
exceptions, there isn't much use for this.  The fact that you get an
exception should be enough to tell the user what happened.  Having a
separate status variable could only be useful in asms with multiple
instructions that can potentially throw.  But there are probably few,
if any, such cases which couldn't be split up into multiple stmts.

> Given that, and the complication of determining what the safe-for-copy-out 
> operands really are, and the fact that being easier on optimizations in 
> connection with asms and -fnon-call-exceptions isn't a high priority it 
> seems best to opt for the option that is guaranteed to work correctly in 
> most cases, and in fact allows more freedom in using the asm.
> 
> 
> Ciao,
> Michael.
>
diff mbox series

Patch

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 11b79a5852c..adcdd123011 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9577,6 +9577,11 @@  errors during compilation if your @code{asm} code defines symbols or labels.
 Using @samp{%=} 
 (@pxref{AssemblerTemplate}) may help resolve this problem.
 
+When non-call exceptions (@option{-fnon-call-exceptions}) are enabled, a
+@code{volatile asm} statement is also allowed to throw exceptions.  If it
+does, then the compiler assumes that its output operands have not been written
+yet.
+
 @anchor{AssemblerTemplate}
 @subsubsection Assembler Template
 @cindex @code{asm} assembler template
diff --git a/gcc/testsuite/g++.target/i386/pr93981.C b/gcc/testsuite/g++.target/i386/pr93981.C
new file mode 100644
index 00000000000..ca444c5db3b
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr93981.C
@@ -0,0 +1,28 @@ 
+// PR inline-asm/93981
+// { dg-do run }
+// { dg-options "-fnon-call-exceptions" }
+
+#include <csignal>
+
+struct illegal_opcode { };
+
+extern "C" void
+sigill (int)
+{
+  throw illegal_opcode ( );
+}
+
+int
+main ()
+{
+  std::signal (SIGILL, sigill);
+  try
+    {
+      asm ("ud2");
+    }
+  catch (const illegal_opcode&)
+    {
+      return 0;
+    }
+  return 1;
+}
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index f7b817d94e6..c21a7978493 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -913,6 +913,8 @@  make_edges_bb (basic_block bb, struct omp_region **pcur_region, int *pomp_index)
       break;
 
     case GIMPLE_ASM:
+      if (stmt_can_throw_internal (cfun, last))
+	make_eh_edges (last);
       make_gimple_asm_edges (bb);
       fallthru = true;
       break;
diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
index 2a409dcaffe..cfdd0fdcf23 100644
--- a/gcc/tree-eh.c
+++ b/gcc/tree-eh.c
@@ -2077,6 +2077,8 @@  lower_eh_constructs_2 (struct leh_state *state, gimple_stmt_iterator *gsi)
 	    DECL_GIMPLE_REG_P (tmp) = 1;
 	  gsi_insert_after (gsi, s, GSI_SAME_STMT);
 	}
+
+    record_throwing_stmt:
       /* Look for things that can throw exceptions, and record them.  */
       if (state->cur_region && stmt_could_throw_p (cfun, stmt))
 	{
@@ -2085,6 +2087,36 @@  lower_eh_constructs_2 (struct leh_state *state, gimple_stmt_iterator *gsi)
 	}
       break;
 
+    case GIMPLE_ASM:
+      {
+	/* As above with GIMPLE_ASSIGN.  Change each register output operand
+	   to a temporary and insert a new stmt to assign this to the original
+	   operand.  */
+	gasm *asm_stmt = as_a <gasm *> (stmt);
+	if (stmt_could_throw_p (cfun, stmt)
+	    && gimple_asm_noutputs (asm_stmt) > 0
+	    && gimple_stmt_may_fallthru (stmt))
+	  {
+	    for (unsigned i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	      {
+		tree op = gimple_asm_output_op (asm_stmt, i);
+		tree opval = TREE_VALUE (op);
+		if (tree_could_throw_p (opval)
+		    || !is_gimple_reg_type (TREE_TYPE (opval))
+		    || !is_gimple_reg (get_base_address (opval)))
+		  continue;
+
+		tree tmp = create_tmp_reg (TREE_TYPE (opval));
+		gimple *s = gimple_build_assign (opval, tmp);
+		gimple_set_location (s, gimple_location (stmt));
+		gimple_set_block (s, gimple_block (stmt));
+		TREE_VALUE (op) = tmp;
+		gsi_insert_after (gsi, s, GSI_SAME_STMT);
+	      }
+	  }
+      }
+      goto record_throwing_stmt;
+
     case GIMPLE_COND:
     case GIMPLE_GOTO:
     case GIMPLE_RETURN: