Patchwork [RFC] trans-mem: mark transaction begins as returns-twice

login
register
mail settings
Submitter Torvald Riegel
Date Jan. 2, 2012, 6:10 p.m.
Message ID <1325527819.7636.552.camel@triegel.csb>
Download mbox | patch
Permalink /patch/133896/
State New
Headers show

Comments

Torvald Riegel - Jan. 2, 2012, 6:10 p.m.
Attached is Richard Henderson's patch that marks calls that start a
transaction as returns-twice.  In turn, these calls will be treated like
a call to setjmp.

This was motivated by the miscompilation of one of the STAMP
applications (Genome), where a stack slot was used as temp storage for a
CPU register but not restored when the transaction got aborted and
restarted (then, after restart, the program crashed because it used
inconsistent data).  With the attached patch and in this particular
example, the stack slots that are written to in the transaction do not
get read during the transaction.  (-fno-caller-saves was not a
sufficient solution, BTW.)

AFAICT, we previously wanted to handle "restart safety" by adding
abnormal edges to all calls to the TM runtime library (which could in
turn call the libitm longjmp that actually restarts a transaction).
Richard mentioned that we could drop this code (I think he meant this,
and others pieces perhaps) if the returns-twice approach works.  BTW,
did we also add abnormal edges to any call to transactional clones,
which could in turn call the TM runtime library?

Conceptually, what we would need to enforce is that all stack slots that
are live-in into a transaction (i.e., the live range crosses a call to
_ITM_beginTransaction) do not get reused until the matching call to
_ITM_commitTransaction.  Alternatively, we can reuse those, but need to
save and restore them after aborts.

As Richard suggested, I looked at the handling of REG_SETJMP (see the
summary below).  This looks like a sufficient solution to me (eg, don't
share stack slots if regs cross a setjmp) but I don't know nearly enough
about this to really understand whether it is.  Can somebody else have a
look please?  In particular:
- Is it really sufficient?
- Is this too conservative (e.g., no CSE at all)?  If so, we should look
  at this again for 4.8 I guess, otherwise every function that uses a
  transaction will be slower, and slowing down nontransactional code
  isn't nice.  Or won't that be much of a slowdown?
- Do we potentially get unnecessary warnings (if vars are live across
  a transaction begin)?  I didn't get such warnings in the STAMP app
  that wasn't working though.  Does anyone has suggestions for a test
  case?
- For the long-term, should we try to limit the setjmp effects to the
  TM region?  Contrary to standard setjmp/longjmp, we know about the
  regions and where the longjmps matching a setjmp are.  Any thoughts
  on how much this might matter performance-wise?

Overall, I think we should still use it, unless we can come up with
another fix in time for 4.7.

Opinions?


Torvald

Summary:
- cprop.c: constant propagation only runs if !cfun->calls_setjmp
- gcse.c: Same.
- cse.c: we don't CSE over a call to setjmp. But comment associates this
  just with some weird longjmp on VAX and other systems.
- cselib.c (cselib_process_insn): Forget everything at a setjmp.
  - callers of this function are: dse_step1(), local_cprop_pass(),
    reload_cse_regs(), thread_jump(), vt_initialize()
- ira-lives.c (process_bb_node_lives): ??? Don't allocate allocnos
  that cross setjmps.
- regstat.c (regstat_bb_compute_ri): Marks all pseudo(?) regs that
  cross a setjmp. Clients of this information:
  - (regstat_compute_ri): ??? REG_LIVE_LENGTH(regno) = -1; 
    REG_BASIC_BLOCK(regno) = REG_BLOCK_UNKNOWN;
  - function.c (generate_setjmp_warnings): Warnings if vars are live
    across setjmp.
  - ira-color.c (coalesce_spill_slots): ??? Don't share stack slots if
    one of the regs crosses a setjmp?
 - reload1.c (reload_as_needed): Don't reuse any reload reg contents
   across a setjmp.
 - reload.c (find_equiv_reg): Don't reuse register contents from before
   a setjmp-type function call.
 - resource.c (mark_referenced_resources, mark_set_resources): setjmp
   uses all registers.
 - sched-deps.c: setjmp acts as a MOVE_BARRIER.  What does this barrier
   do precisely?
commit 53e0ab5b86ca38d55e4c4fd927be33143586ad15
Author: Torvald Riegel <triegel@redhat.com>
Date:   Mon Dec 19 23:36:45 2011 +0100

    rth's returns-twice fix
Eric Botcazou - Jan. 3, 2012, 8:36 a.m.
> AFAICT, we previously wanted to handle "restart safety" by adding
> abnormal edges to all calls to the TM runtime library (which could in
> turn call the libitm longjmp that actually restarts a transaction).
> Richard mentioned that we could drop this code (I think he meant this,
> and others pieces perhaps) if the returns-twice approach works.

Why does the explicit CFG approach not work exactly?  cfun->calls_setjmp is 
thought to be quite a big hammer.
Richard Guenther - Jan. 3, 2012, 9:41 a.m.
On Tue, Jan 3, 2012 at 9:36 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> AFAICT, we previously wanted to handle "restart safety" by adding
>> abnormal edges to all calls to the TM runtime library (which could in
>> turn call the libitm longjmp that actually restarts a transaction).
>> Richard mentioned that we could drop this code (I think he meant this,
>> and others pieces perhaps) if the returns-twice approach works.
>
> Why does the explicit CFG approach not work exactly?  cfun->calls_setjmp is
> thought to be quite a big hammer.

Indeed.  And I wonder if it really works - given the restrictions on automatic
variable use with these functions (IIRC you have to be careful and mark them
all 'volatile').

Richard.

> --
> Eric Botcazou
Torvald Riegel - Jan. 3, 2012, 10:42 a.m.
On Tue, 2012-01-03 at 09:36 +0100, Eric Botcazou wrote:
> > AFAICT, we previously wanted to handle "restart safety" by adding
> > abnormal edges to all calls to the TM runtime library (which could in
> > turn call the libitm longjmp that actually restarts a transaction).
> > Richard mentioned that we could drop this code (I think he meant this,
> > and others pieces perhaps) if the returns-twice approach works.
> 
> Why does the explicit CFG approach not work exactly?  cfun->calls_setjmp is 
> thought to be quite a big hammer.

I don't know, actually.  When I looked at the miscompilation case, all
abnormal edges seemed to be in place.

@rth: Do you have an idea what could be going wrong?  I haven't tried
the other thing you sent me, what was it supposed to fix?
Torvald Riegel - Jan. 3, 2012, 10:49 a.m.
On Tue, 2012-01-03 at 10:41 +0100, Richard Guenther wrote:
> On Tue, Jan 3, 2012 at 9:36 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> >> AFAICT, we previously wanted to handle "restart safety" by adding
> >> abnormal edges to all calls to the TM runtime library (which could in
> >> turn call the libitm longjmp that actually restarts a transaction).
> >> Richard mentioned that we could drop this code (I think he meant this,
> >> and others pieces perhaps) if the returns-twice approach works.
> >
> > Why does the explicit CFG approach not work exactly?  cfun->calls_setjmp is
> > thought to be quite a big hammer.
> 
> Indeed.  And I wonder if it really works - given the restrictions on automatic
> variable use with these functions (IIRC you have to be careful and mark them
> all 'volatile').

Well, every variable that does get its address taken will have its
accesses re-routed through libitm, so those locations will be handled
properly using undo-logging and rollback by libitm.  For everything that
is only on the stack or in registers, it should work too because we
don't reuse the stack slots that cross the transaction begin, so we
always have those values available as a snapshot.
Also, the (non-working?) abnormal edges are still active in what I
tried, so perhaps there's some overlap there.
Torvald Riegel - Jan. 3, 2012, 7:20 p.m.
On Mon, 2012-01-02 at 19:10 +0100, Torvald Riegel wrote:
> - Do we potentially get unnecessary warnings (if vars are live across
>   a transaction begin)?  I didn't get such warnings in the STAMP app
>   that wasn't working though.  Does anyone has suggestions for a test
>   case?

Attached is a test that raises a warning.  This requires g++ and
-Wclobbered (-Wall is not sufficient), so it might still be okay until
we have a proper solution.

Interestingly (at least to me), I wasn't able to construct a test on my
own (see main2() for my attempts...), but had to web-search for one (see
main()).  Is the warning actually supposed to always trigger?  We only
seem to have a test for a case where there should be _no_ warning.
Richard Henderson - Jan. 3, 2012, 10:59 p.m.
On 01/03/2012 09:42 PM, Torvald Riegel wrote:
>> Why does the explicit CFG approach not work exactly?  cfun->calls_setjmp is 
>> thought to be quite a big hammer.
> 
> I don't know, actually.  When I looked at the miscompilation case, all
> abnormal edges seemed to be in place.
> 
> @rth: Do you have an idea what could be going wrong?  I haven't tried
> the other thing you sent me, what was it supposed to fix?

There are several places where those edges (currently) get lost going
from gimple to rtl.  In addition, return value copy from the hard reg
to the pseudo is in the wrong basic block wrt the abnormal edges.

My inclination at this point is to use returns_twice for the 4.7 release
and fix the abcall edges and associated fiddlery in 4.8 stage1.  I'm not
especially confident that we'd clear out all the bugs in time otherwise.


r~
Patrick Marlier - Jan. 4, 2012, 3:53 p.m.
On 01/02/2012 01:10 PM, Torvald Riegel wrote:
> This was motivated by the miscompilation of one of the STAMP
> applications (Genome), where a stack slot was used as temp storage for a
> CPU register but not restored when the transaction got aborted and
> restarted (then, after restart, the program crashed because it used
> inconsistent data).  With the attached patch and in this particular
> example, the stack slots that are written to in the transaction do not
> get read during the transaction.  (-fno-caller-saves was not a
> sufficient solution, BTW.)

Are you sure this not due to the missing of tm-logging? We xfail until 
now for the testsuite but this should be addressed. (Note that I tested 
genome months ago and it was working correctly.)

By the way, what's the status of this problem of tm-logging?

Patrick.
Aldy Hernandez - Jan. 4, 2012, 4:40 p.m.
On 01/04/12 09:53, Patrick Marlier wrote:
> On 01/02/2012 01:10 PM, Torvald Riegel wrote:
>> This was motivated by the miscompilation of one of the STAMP
>> applications (Genome), where a stack slot was used as temp storage for a
>> CPU register but not restored when the transaction got aborted and
>> restarted (then, after restart, the program crashed because it used
>> inconsistent data). With the attached patch and in this particular
>> example, the stack slots that are written to in the transaction do not
>> get read during the transaction. (-fno-caller-saves was not a
>> sufficient solution, BTW.)
>
> Are you sure this not due to the missing of tm-logging? We xfail until
> now for the testsuite but this should be addressed. (Note that I tested
> genome months ago and it was working correctly.)
>
> By the way, what's the status of this problem of tm-logging?

I'm chugging along on the TM PR's, but so far the bug reporters are 
beating me :).  I can move this problem to the top of the list if you 
want.  If so, what is the PR?
Patrick Marlier - Jan. 4, 2012, 5:13 p.m.
On 01/04/2012 11:40 AM, Aldy Hernandez wrote:
> On 01/04/12 09:53, Patrick Marlier wrote:
>> On 01/02/2012 01:10 PM, Torvald Riegel wrote:
>>> This was motivated by the miscompilation of one of the STAMP
>>> applications (Genome), where a stack slot was used as temp storage for a
>>> CPU register but not restored when the transaction got aborted and
>>> restarted (then, after restart, the program crashed because it used
>>> inconsistent data). With the attached patch and in this particular
>>> example, the stack slots that are written to in the transaction do not
>>> get read during the transaction. (-fno-caller-saves was not a
>>> sufficient solution, BTW.)
>>
>> Are you sure this not due to the missing of tm-logging? We xfail until
>> now for the testsuite but this should be addressed. (Note that I tested
>> genome months ago and it was working correctly.)
>>
>> By the way, what's the status of this problem of tm-logging?
>
> I'm chugging along on the TM PR's, but so far the bug reporters are
> beating me :). I can move this problem to the top of the list if you
> want. If so, what is the PR?

PR: 51165 51166 51167 51168
I let Torvald and you decide about the priority.

Patrick.
Torvald Riegel - Jan. 4, 2012, 6:46 p.m.
On Wed, 2012-01-04 at 12:13 -0500, Patrick Marlier wrote:
> On 01/04/2012 11:40 AM, Aldy Hernandez wrote:
> > On 01/04/12 09:53, Patrick Marlier wrote:
> >> On 01/02/2012 01:10 PM, Torvald Riegel wrote:
> >>> This was motivated by the miscompilation of one of the STAMP
> >>> applications (Genome), where a stack slot was used as temp storage for a
> >>> CPU register but not restored when the transaction got aborted and
> >>> restarted (then, after restart, the program crashed because it used
> >>> inconsistent data). With the attached patch and in this particular
> >>> example, the stack slots that are written to in the transaction do not
> >>> get read during the transaction. (-fno-caller-saves was not a
> >>> sufficient solution, BTW.)
> >>
> >> Are you sure this not due to the missing of tm-logging? We xfail until
> >> now for the testsuite but this should be addressed. (Note that I tested
> >> genome months ago and it was working correctly.)
> >>
> >> By the way, what's the status of this problem of tm-logging?
> >
> > I'm chugging along on the TM PR's, but so far the bug reporters are
> > beating me :). I can move this problem to the top of the list if you
> > want. If so, what is the PR?
> 
> PR: 51165 51166 51167 51168
> I let Torvald and you decide about the priority.

If that's indeed just a missed optimization as Aldy says on those
reports, then I think this has lower priority.  51752 is higher
priority, for example, even though I'm not very optimistic that we can
fix this quickly...
Torvald Riegel - Jan. 4, 2012, 6:53 p.m.
On Wed, 2012-01-04 at 10:53 -0500, Patrick Marlier wrote:
> On 01/02/2012 01:10 PM, Torvald Riegel wrote:
> > This was motivated by the miscompilation of one of the STAMP
> > applications (Genome), where a stack slot was used as temp storage for a
> > CPU register but not restored when the transaction got aborted and
> > restarted (then, after restart, the program crashed because it used
> > inconsistent data).  With the attached patch and in this particular
> > example, the stack slots that are written to in the transaction do not
> > get read during the transaction.  (-fno-caller-saves was not a
> > sufficient solution, BTW.)
> 
> Are you sure this not due to the missing of tm-logging? We xfail until 
> now for the testsuite but this should be addressed. (Note that I tested 
> genome months ago and it was working correctly.)

I think so.  This could be fixed with logging, but I don't think that's
the right way to handle this.  Perhaps its related though (unless the
PR's you cited are indeed just missed optimizations).

Patrick, I forwarded you the details of the issue...
Torvald Riegel - Jan. 6, 2012, 12:03 a.m.
On Wed, 2012-01-04 at 09:59 +1100, Richard Henderson wrote:
> On 01/03/2012 09:42 PM, Torvald Riegel wrote:
> >> Why does the explicit CFG approach not work exactly?  cfun->calls_setjmp is 
> >> thought to be quite a big hammer.
> > 
> > I don't know, actually.  When I looked at the miscompilation case, all
> > abnormal edges seemed to be in place.
> > 
> > @rth: Do you have an idea what could be going wrong?  I haven't tried
> > the other thing you sent me, what was it supposed to fix?
> 
> There are several places where those edges (currently) get lost going
> from gimple to rtl.  In addition, return value copy from the hard reg
> to the pseudo is in the wrong basic block wrt the abnormal edges.
> 
> My inclination at this point is to use returns_twice for the 4.7 release
> and fix the abcall edges and associated fiddlery in 4.8 stage1.  I'm not
> especially confident that we'd clear out all the bugs in time otherwise.

I committed Richard's returns-twice patch as a work-around for now
(r182937), and created PR 51771 for this.

Patch

diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def
index 619794e..c3132cc 100644
--- a/gcc/builtin-attrs.def
+++ b/gcc/builtin-attrs.def
@@ -98,6 +98,7 @@  DEF_ATTR_IDENT (ATTR_STRFTIME, "strftime")
 DEF_ATTR_IDENT (ATTR_TYPEGENERIC, "type generic")
 DEF_ATTR_IDENT (ATTR_TM_REGPARM, "*tm regparm")
 DEF_ATTR_IDENT (ATTR_TM_TMPURE, "transaction_pure")
+DEF_ATTR_IDENT (ATTR_RETURNS_TWICE, "returns_twice")
 
 DEF_ATTR_TREE_LIST (ATTR_NOVOPS_LIST, ATTR_NOVOPS, ATTR_NULL, ATTR_NULL)
 
@@ -241,6 +242,8 @@  DEF_ATTR_TREE_LIST (ATTR_TM_NORETURN_NOTHROW_LIST,
 		    ATTR_TM_REGPARM, ATTR_NULL, ATTR_NORETURN_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_TM_CONST_NOTHROW_LIST,
 		    ATTR_TM_REGPARM, ATTR_NULL, ATTR_CONST_NOTHROW_LIST)
+DEF_ATTR_TREE_LIST (ATTR_TM_NOTHROW_RT_LIST,
+		    ATTR_RETURNS_TWICE, ATTR_NULL, ATTR_TM_NOTHROW_LIST)
 
 /* Same attributes used for BUILT_IN_MALLOC except with TM_PURE thrown in.  */
 DEF_ATTR_TREE_LIST (ATTR_TMPURE_MALLOC_NOTHROW_LIST,
diff --git a/gcc/gtm-builtins.def b/gcc/gtm-builtins.def
index 9fcbdb0..1630a0e 100644
--- a/gcc/gtm-builtins.def
+++ b/gcc/gtm-builtins.def
@@ -1,5 +1,5 @@ 
 DEF_TM_BUILTIN (BUILT_IN_TM_START, "_ITM_beginTransaction",
-		BT_FN_UINT_UINT, ATTR_TM_NOTHROW_LIST)
+		BT_FN_UINT_UINT, ATTR_TM_NOTHROW_RT_LIST)
 
 DEF_TM_BUILTIN (BUILT_IN_TM_COMMIT, "_ITM_commitTransaction",
 		BT_FN_VOID, ATTR_TM_NOTHROW_LIST)