Patchwork [11/13] tcg: sync globals for pure helpers instead of saving them

login
register
mail settings
Submitter Aurelien Jarno
Date Sept. 27, 2012, 5:15 p.m.
Message ID <1348766113-18373-12-git-send-email-aurelien@aurel32.net>
Download mbox | patch
Permalink /patch/187440/
State New
Headers show

Comments

Aurelien Jarno - Sept. 27, 2012, 5:15 p.m.
Pure helpers are allowed to read globals, but not to modify them. In
that case, instead of moving all the globals to memory, just synchronize
them.

Also update the documentation to make it more clear.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 tcg/README |   15 ++++++++++-----
 tcg/tcg.c  |   33 ++++++++++++++++++++++++++++-----
 2 files changed, 38 insertions(+), 10 deletions(-)

Note: this is a change of behavior, but the new one is still compliant
with the documentation. Currently only cris, i386 and s390x have PURE
only helpers. I have checked they are really PURE, and also did some 
tests on cris and i386.
Richard Henderson - Sept. 27, 2012, 7:39 p.m.
On 09/27/2012 10:15 AM, Aurelien Jarno wrote:
> Note: this is a change of behavior, but the new one is still compliant
> with the documentation. Currently only cris, i386 and s390x have PURE
> only helpers. I have checked they are really PURE, and also did some 
> tests on cris and i386.
> 
> diff --git a/tcg/README b/tcg/README
> index 27846f1..d61daa1 100644
> --- a/tcg/README
> +++ b/tcg/README
> @@ -77,11 +77,16 @@ destroyed, but local temporaries and globals are preserved.
>  Using the tcg_gen_helper_x_y it is possible to call any function
>  taking i32, i64 or pointer types. By default, before calling a helper,
>  all globals are stored at their canonical location and it is assumed
> -that the function can modify them. This can be overridden by the
> -TCG_CALL_CONST function modifier. By default, the helper is allowed to
> -modify the CPU state or raise an exception. This can be overridden by
> -the TCG_CALL_PURE function modifier, in which case the call to the
> -function is removed if the return value is not used.
> +that the function can modify them. The helper is allowed to modify
> +the CPU state or raise an exception.
> +
> +This can be overridden by the TCG_CALL_CONST and TCG_CALL_PURE function
> +modifiers. A PURE helper can read globals but cannot modify them nor the
> +CPU state and cannot raise an exception. It can be removed if the return
> +value is not used. For that the globals are synchronized with their canonical
> +location, but the associated registers are not freed nor reloaded afterwise.
> +A CONST helper is a PURE helper which in addition cannot read globals, they
> +are not synchronized nor stored. Note that CONST implies PURE.

If we're going to re-org these flags, lets please do it right.

In particular the terms "const" and "pure" were clearly stolen from gcc,
but then given different meanings (!).  This, to me at least, is incredibly
confusing.  I have to go back and re-read the docs every time I touch
one of the helper declarations.

There are really three flags we care about:

(1) Helper may read globals.  Sometimes indirectly via exception.
    Implies that globals must be synced, but may remain in REG/CONST state.

(2) Helper may write globals.  Implies globals must be synced, and
    returned to MEM state.

(3) Helper has no side effects at all.  Implies that it can be deleted if
    its outputs are dead.

For the sake of discussion, lets call these READG, WRITEG, NOSIDE.

Our current default is READG | WRITEG.

Our current definition of PURE is READG | NOSIDE.  Note that the gcc
definition of pure actually talks about main memory.

Our current definition of CONST is none of these bits.  Note that the
gcc definition of const is a superset of pure.

What we'd like for all fp helpers is READG only.  Something that we
cannot get at the moment.

There are several cases in which the helper needs to return more than
64 bits in which we use a "temp" allocated in the env struct.  Sparc
does this for some of its 128-bit arithmetic; I'm planning to do just
the same in my s390x rewrite.  For this, the helper neither reads nor
writes globals, but it has non-global-register side effects.  There
will generally be a post-helper load from env to get the "extra" bits.
While the "true" output of the helper may be dead, the side load from
env may not be.  So we want none of READG | WRITEG | NOSIDE.

If we do reorg, we can certainly add compatibility defines so that we
need not update all translators at once.


r~
Aurelien Jarno - Sept. 27, 2012, 8:34 p.m.
On Thu, Sep 27, 2012 at 12:39:10PM -0700, Richard Henderson wrote:
> On 09/27/2012 10:15 AM, Aurelien Jarno wrote:
> > Note: this is a change of behavior, but the new one is still compliant
> > with the documentation. Currently only cris, i386 and s390x have PURE
> > only helpers. I have checked they are really PURE, and also did some 
> > tests on cris and i386.
> > 
> > diff --git a/tcg/README b/tcg/README
> > index 27846f1..d61daa1 100644
> > --- a/tcg/README
> > +++ b/tcg/README
> > @@ -77,11 +77,16 @@ destroyed, but local temporaries and globals are preserved.
> >  Using the tcg_gen_helper_x_y it is possible to call any function
> >  taking i32, i64 or pointer types. By default, before calling a helper,
> >  all globals are stored at their canonical location and it is assumed
> > -that the function can modify them. This can be overridden by the
> > -TCG_CALL_CONST function modifier. By default, the helper is allowed to
> > -modify the CPU state or raise an exception. This can be overridden by
> > -the TCG_CALL_PURE function modifier, in which case the call to the
> > -function is removed if the return value is not used.
> > +that the function can modify them. The helper is allowed to modify
> > +the CPU state or raise an exception.
> > +
> > +This can be overridden by the TCG_CALL_CONST and TCG_CALL_PURE function
> > +modifiers. A PURE helper can read globals but cannot modify them nor the
> > +CPU state and cannot raise an exception. It can be removed if the return
> > +value is not used. For that the globals are synchronized with their canonical
> > +location, but the associated registers are not freed nor reloaded afterwise.
> > +A CONST helper is a PURE helper which in addition cannot read globals, they
> > +are not synchronized nor stored. Note that CONST implies PURE.
> 
> If we're going to re-org these flags, lets please do it right.

Well it was not a re-org, but more an improvement of the documentation.

> In particular the terms "const" and "pure" were clearly stolen from gcc,
> but then given different meanings (!).  This, to me at least, is incredibly
> confusing.  I have to go back and re-read the docs every time I touch
> one of the helper declarations.
> 
> There are really three flags we care about:
> 
> (1) Helper may read globals.  Sometimes indirectly via exception.
>     Implies that globals must be synced, but may remain in REG/CONST state.

I actually do wonder if we shouldn't use a different flag for "may
trigger exception", even if we end-up handling it the same way as "may
read globals". I am not sure it would be clear without knowledge of the
QEMU internals that a function triggering exceptions should be flagged
as "read globals".

> (2) Helper may write globals.  Implies globals must be synced, and
>     returned to MEM state.
> 
> (3) Helper has no side effects at all.  Implies that it can be deleted if
>     its outputs are dead.

This is basically the flags I have used for the TCG op. That said the
problem with this way of doing for helpers is that most helpers will now
need flags, while it was more or less the contrary up to now.

It's a bit like in GCC, a default function implies nothing, but you can
add more flags latter.

> For the sake of discussion, lets call these READG, WRITEG, NOSIDE.
> 
> Our current default is READG | WRITEG.
> 
> Our current definition of PURE is READG | NOSIDE.  Note that the gcc
> definition of pure actually talks about main memory.
> 
> Our current definition of CONST is none of these bits.  Note that the
> gcc definition of const is a superset of pure.
> 
> What we'd like for all fp helpers is READG only.  Something that we
> cannot get at the moment.

Agreed, as long as the fp env is not mapped as a global.

> There are several cases in which the helper needs to return more than
> 64 bits in which we use a "temp" allocated in the env struct.  Sparc
> does this for some of its 128-bit arithmetic; I'm planning to do just
> the same in my s390x rewrite.  For this, the helper neither reads nor
> writes globals, but it has non-global-register side effects.  There
> will generally be a post-helper load from env to get the "extra" bits.
> While the "true" output of the helper may be dead, the side load from
> env may not be.  So we want none of READG | WRITEG | NOSIDE.
> 
> If we do reorg, we can certainly add compatibility defines so that we
> need not update all translators at once.
> 

Given the default is READG | WRITEG, it would be difficult to do that
unless we negate the flags. Actually NOWRITEG, NOREADG, NOSIDE might be
a good way to go, and is more consistent that by default nothing should
be implied.
Richard Henderson - Sept. 27, 2012, 10:02 p.m.
On 09/27/2012 01:34 PM, Aurelien Jarno wrote:
>> (1) Helper may read globals.  Sometimes indirectly via exception.
>>     Implies that globals must be synced, but may remain in REG/CONST state.
> 
> I actually do wonder if we shouldn't use a different flag for "may
> trigger exception", even if we end-up handling it the same way as "may
> read globals". I am not sure it would be clear without knowledge of the
> QEMU internals that a function triggering exceptions should be flagged
> as "read globals".

I agree it is better to have a "may trigger exception" flag
if for nothing else besides documentation.

> Given the default is READG | WRITEG, it would be difficult to do that
> unless we negate the flags. Actually NOWRITEG, NOREADG, NOSIDE might be
> a good way to go, and is more consistent that by default nothing should
> be implied.

Sure.


r~

Patch

diff --git a/tcg/README b/tcg/README
index 27846f1..d61daa1 100644
--- a/tcg/README
+++ b/tcg/README
@@ -77,11 +77,16 @@  destroyed, but local temporaries and globals are preserved.
 Using the tcg_gen_helper_x_y it is possible to call any function
 taking i32, i64 or pointer types. By default, before calling a helper,
 all globals are stored at their canonical location and it is assumed
-that the function can modify them. This can be overridden by the
-TCG_CALL_CONST function modifier. By default, the helper is allowed to
-modify the CPU state or raise an exception. This can be overridden by
-the TCG_CALL_PURE function modifier, in which case the call to the
-function is removed if the return value is not used.
+that the function can modify them. The helper is allowed to modify
+the CPU state or raise an exception.
+
+This can be overridden by the TCG_CALL_CONST and TCG_CALL_PURE function
+modifiers. A PURE helper can read globals but cannot modify them nor the
+CPU state and cannot raise an exception. It can be removed if the return
+value is not used. For that the globals are synchronized with their canonical
+location, but the associated registers are not freed nor reloaded afterwise.
+A CONST helper is a PURE helper which in addition cannot read globals, they
+are not synchronized nor stored. Note that CONST implies PURE.
 
 On some TCG targets (e.g. x86), several calling conventions are
 supported.
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 2f973e8..1c5ab81 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1280,11 +1280,14 @@  static void tcg_liveness_analysis(TCGContext *s)
                         dead_temps[arg] = 1;
                         mem_temps[arg] = 0;
                     }
-                    
+
                     if (!(call_flags & TCG_CALL_CONST)) {
+                        /* globals should be synced to memory */
+                        memset(mem_temps, 1, s->nb_globals);
+                    }
+                    if (!(call_flags & (TCG_CALL_CONST | TCG_CALL_PURE))) {
                         /* globals should go back to memory */
                         memset(dead_temps, 1, s->nb_globals);
-                        memset(mem_temps, 1, s->nb_globals);
                     }
 
                     /* input args are live */
@@ -1623,6 +1626,23 @@  static void save_globals(TCGContext *s, TCGRegSet allocated_regs)
     }
 }
 
+/* sync globals to their canonical location and assume they can be
+   read by the following code. 'allocated_regs' is used in case a
+   temporary registers needs to be allocated to store a constant. */
+static void sync_globals(TCGContext *s, TCGRegSet allocated_regs)
+{
+    int i;
+
+    for (i = 0; i < s->nb_globals; i++) {
+#ifdef USE_LIVENESS_ANALYSIS
+        assert(s->temps[i].val_type != TEMP_VAL_REG || s->temps[i].fixed_reg ||
+               s->temps[i].mem_coherent);
+#else
+        temp_sync(s, i, allocated_regs);
+#endif
+    }
+}
+
 /* at the end of a basic block, we assume all temporaries are dead and
    all globals are stored at their canonical location. */
 static void tcg_reg_alloc_bb_end(TCGContext *s, TCGRegSet allocated_regs)
@@ -2070,9 +2090,12 @@  static int tcg_reg_alloc_call(TCGContext *s, const TCGOpDef *def,
         }
     }
     
-    /* store globals and free associated registers (we assume the call
-       can modify any global. */
-    if (!(flags & TCG_CALL_CONST)) {
+    /* sync or save globals */
+    if (flags & TCG_CALL_CONST) {
+        /* nothing to do */
+    } else if (flags & TCG_CALL_PURE) {
+        sync_globals(s, allocated_regs);
+    } else {
         save_globals(s, allocated_regs);
     }