TCG problem with cpu_{st,ld}x_data ?

Message ID	1469364745.8568.254.camel@kernel.crashing.org
State	New
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> Message-ID: <1469364745.8568.254.camel@kernel.crashing.org> From: Benjamin Herrenschmidt <benh@kernel.crashing.org> To: qemu-devel@nongnu.org Date: Sun, 24 Jul 2016 22:52:25 +1000 In-Reply-To: <1469364697.8568.253.camel@kernel.crashing.org> References: <1469364141.8568.251.camel@kernel.crashing.org> <1469364697.8568.253.camel@kernel.crashing.org> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] TCG problem with cpu_{st,ld}x_data ? Precedence: list Cc: Paolo Bonzini <pbonzini@redhat.com>, Christian Borntraeger <borntraeger@de.ibm.com>, Richard Henderson <rth@twiddle.net> Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>

Benjamin Herrenschmidt July 24, 2016, 12:52 p.m. UTC

On Sun, 2016-07-24 at 22:51 +1000, Benjamin Herrenschmidt wrote:
> 
> FYI: This probably completely wrong patch (but it was easier than
> hacking all the helpers) fixed the problem for me. With this (and the
> video driver I wrote that I will publish asap), I can now reliably
> boot
> various versions of MacOS X in qemu ppc using a 7400 CPU.

And here's the patch:

Richard Henderson July 25, 2016, 12:36 a.m. UTC | #1

On 07/24/2016 06:22 PM, Benjamin Herrenschmidt wrote:
> On Sun, 2016-07-24 at 22:51 +1000, Benjamin Herrenschmidt wrote:
>> >
>> > FYI: This probably completely wrong patch (but it was easier than
>> > hacking all the helpers) fixed the problem for me. With this (and the
>> > video driver I wrote that I will publish asap), I can now reliably
>> > boot
>> > various versions of MacOS X in qemu ppc using a 7400 CPU.
> And here's the patch:
>
> diff --git a/include/exec/cpu_ldst_template.h b/include/exec/cpu_ldst_template.h
> index eaf69a1..13e8881 100644
> --- a/include/exec/cpu_ldst_template.h
> +++ b/include/exec/cpu_ldst_template.h
> @@ -111,7 +111,7 @@ glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
>  static inline RES_TYPE
>  glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
>  {
> -    return glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(env, ptr, 0);
> +    return glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(env, ptr, GETPC());
>  }
>

These functions would have to be always_inline for this to work.  Otherwise you 
could get the helper's PC, not the TCG caller's PC.

But let's try to fix this the other way.



r~

Benjamin Herrenschmidt July 25, 2016, 12:46 a.m. UTC | #2

On Mon, 2016-07-25 at 06:06 +0530, Richard Henderson wrote:
> These functions would have to be always_inline for this to work. 
> Otherwise you 
> could get the helper's PC, not the TCG caller's PC.
> 
> But let's try to fix this the other way.

I could use some help there as I don't really understand the PC fixup /
adjustment mechanism in qemu... 

Cheers,
Ben.

Benjamin Herrenschmidt July 25, 2016, 12:51 a.m. UTC | #3

On Mon, 2016-07-25 at 10:46 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2016-07-25 at 06:06 +0530, Richard Henderson wrote:
> > 
> > These functions would have to be always_inline for this to work. 
> > Otherwise you 
> > could get the helper's PC, not the TCG caller's PC.
> > 
> > But let's try to fix this the other way.
> 
> I could use some help there as I don't really understand the PC fixup
> adjustment mechanism in qemu... 

One thing I can do, but I don't know whether that's worthwhile (you
tell me), is change all translation helpers in powerpc to do like
x86, which is to pass the RA along and never use the non_ra() variants.

But that's quite a bit of churn, so let me know if your plan is better.

Are those functions always meant to be called within translation
helpers ? IE, the fault can safely longjmp out and it's just a matter
of finding the proper instruction PC to report ?

Or can they also be called outside of that context ?

Cheers,
Ben.

Richard Henderson July 25, 2016, 2 p.m. UTC | #4

On 07/25/2016 06:21 AM, Benjamin Herrenschmidt wrote:
> On Mon, 2016-07-25 at 10:46 +1000, Benjamin Herrenschmidt wrote:
>> On Mon, 2016-07-25 at 06:06 +0530, Richard Henderson wrote:
>>>
>>> These functions would have to be always_inline for this to work.
>>> Otherwise you
>>> could get the helper's PC, not the TCG caller's PC.
>>>
>>> But let's try to fix this the other way.
>>
>> I could use some help there as I don't really understand the PC fixup
>> adjustment mechanism in qemu...
>
> One thing I can do, but I don't know whether that's worthwhile (you
> tell me), is change all translation helpers in powerpc to do like
> x86, which is to pass the RA along and never use the non_ra() variants.
>
> But that's quite a bit of churn, so let me know if your plan is better.

It is worthwhile, because that allows you to drop some state saving within the 
translation.  With the assumption being, of course, that the exceptional case 
is rare and so we save on the state saving most of the time.

Or, as appears to be the case for e.g. LVEBX, there currently is no such state 
save, and so any exceptions raised by this insn will point to the wrong insn, 
and so the conversion fixes a bug.

> Are those functions always meant to be called within translation
> helpers ? IE, the fault can safely longjmp out and it's just a matter
> of finding the proper instruction PC to report ?

Correct.

> Or can they also be called outside of that context ?

No, not without a valid return address.

E.g. it's not valid to have one helper call another, and for the second helper 
use GETPC.  For this, typically, one must factor out a common function which 
accepts a "retaddr" argument, which the callers must fill in with GETPC.

r~

Benjamin Herrenschmidt July 25, 2016, 9:42 p.m. UTC | #5

On Mon, 2016-07-25 at 19:30 +0530, Richard Henderson wrote:
> > Or can they also be called outside of that context ?
> 
> No, not without a valid return address.
> 
> E.g. it's not valid to have one helper call another, and for the second helper 
> use GETPC.  For this, typically, one must factor out a common function which 
> accepts a "retaddr" argument, which the callers must fill in with GETPC.

Right, I've figured that out. I notice that the cpu_ldl_code() are
sprinkled in parts that are "chancy".

For example we have one in powerpc_excp() to read the faulting
instruction, though that *should* never fail it's till not great.

I haven't completely figured out what code path instruction translation
faults take, I assume we longjmp out of the translation loop the same
was as we do out of the execution loop ?

Note: I've started cleaning that on ppc (but not fixing the -2 bug yet)
in there: very much a work in progress but I'd be happy to have initial
feedback (ignore patch 1 about MOL OSI, it's unrelated):

https://github.com/ozbenh/qemu/commits/wip

Cheers,
Ben.

Peter Maydell July 25, 2016, 10:19 p.m. UTC | #6

On 25 July 2016 at 22:42, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Mon, 2016-07-25 at 19:30 +0530, Richard Henderson wrote:
>> > Or can they also be called outside of that context ?
>>
>> No, not without a valid return address.
>>
>> E.g. it's not valid to have one helper call another, and for the second helper
>> use GETPC.  For this, typically, one must factor out a common function which
>> accepts a "retaddr" argument, which the callers must fill in with GETPC.
>
> Right, I've figured that out. I notice that the cpu_ldl_code() are
> sprinkled in parts that are "chancy".
>
> For example we have one in powerpc_excp() to read the faulting
> instruction, though that *should* never fail it's till not great.

Mmm. Strictly speaking you can't guarantee that that load will
work, because the code you're executing could be still cached
in the QEMU TLB while the (perverse) guest rewrites the page table
so the ldl_code won't work.

You might like to take a look at how target-arm handles the
similar case (wanting to know information about source register
numbers etc for some memory faults). We use a TCG instruction
parameter to save the information as we translate the code
in target-arm/translate-a64.c, then in restore_state_to_opc()
(called after the fault) we can fish it out and put it in the
CPU state struct the same way we regain the faulting PC. Then
in the exception handler it's available for use. Commit aaa1f954d4
is the meat of it.

Alternatively if you just want to do "a load at virtual address
but don't longjump anywhere" you probably need to:
 * do the virt-to-phys translation manually (ie by calling
   some ppc specific function), so you can capture success/failure
   of the MMU access checks (and without setting guest visible
   MMU fault status info)
 * use address_space_ldl() on the result, so you can pass in
   a MemTxResult* to capture success/failure of the phys access

That's pretty clunky. I meant at some point to look at
whether we could provide versions of the cpu_ldl* type
functions that allowed a MemTxResult*, but never got to it.

> I haven't completely figured out what code path instruction translation
> faults take, I assume we longjmp out of the translation loop the same
> was as we do out of the execution loop ?

I suggest you step through it in a debugger if you fancy a laugh.
As I recall it works more by luck than judgement and includes
rather more repeated work than is actually necessary. (That
is, we do longjump out, but not necessarily to the most
sensible point.)

thanks
-- PMM

Benjamin Herrenschmidt July 25, 2016, 10:42 p.m. UTC | #7

On Mon, 2016-07-25 at 23:19 +0100, Peter Maydell wrote:
> > For example we have one in powerpc_excp() to read the faulting
> > instruction, though that *should* never fail it's till not great.
> 
> Mmm. Strictly speaking you can't guarantee that that load will
> work, because the code you're executing could be still cached
> in the QEMU TLB while the (perverse) guest rewrites the page table
> so the ldl_code won't work.

Right, it worries me, it's unlikely but possible.

> You might like to take a look at how target-arm handles the
> similar case (wanting to know information about source register
> numbers etc for some memory faults). We use a TCG instruction
> parameter to save the information as we translate the code
> in target-arm/translate-a64.c, then in restore_state_to_opc()
> (called after the fault) we can fish it out and put it in the
> CPU state struct the same way we regain the faulting PC. Then
> in the exception handler it's available for use. Commit aaa1f954d4
> is the meat of it.

We do something a bit different on ppc where we store the access type
before every access, however the DSISR case is special in that on older
CPUs, it's expected to contains a whole subset of the opcode which is
quite a bit more info than what you want here...

I'm thinking maybe we should use a form of load that returns an error
instead of longjmp'ing, and if we do error out, flush the tb for that
instruction and replay which should cause the translate path to reload
the TLB for it but it's still fishy.

> Alternatively if you just want to do "a load at virtual address
> but don't longjump anywhere" you probably need to:
>  * do the virt-to-phys translation manually (ie by calling
>    some ppc specific function), so you can capture success/failure
>    of the MMU access checks (and without setting guest visible
>    MMU fault status info)
>  * use address_space_ldl() on the result, so you can pass in
>    a MemTxResult* to capture success/failure of the phys access

Yup.

> That's pretty clunky. I meant at some point to look at
> whether we could provide versions of the cpu_ldl* type
> functions that allowed a MemTxResult*, but never got to it.
> 
> > I haven't completely figured out what code path instruction
> translation faults take, I assume we longjmp out of the translation
> loop the same
> > was as we do out of the execution loop ?
> 
> I suggest you step through it in a debugger if you fancy a laugh.
> As I recall it works more by luck than judgement and includes
> rather more repeated work than is actually necessary. (That
> is, we do longjump out, but not necessarily to the most
> sensible point.)

Haha, I was starting to guess that :-) But that isn't a battle I want
to fight today, enough on my plate already !

Cheers,
Ben.

Benjamin Herrenschmidt July 25, 2016, 11:22 p.m. UTC | #8

On Tue, 2016-07-26 at 08:42 +1000, Benjamin Herrenschmidt wrote:

> We do something a bit different on ppc where we store the access type
> before every access, however the DSISR case is special in that on
> older
> CPUs, it's expected to contains a whole subset of the opcode which is
> quite a bit more info than what you want here...
> 
> I'm thinking maybe we should use a form of load that returns an error
> instead of longjmp'ing, and if we do error out, flush the tb for that
> instruction and replay which should cause the translate path to
> reload
> the TLB for it but it's still fishy.

I have a better idea !

This is only a problem for alignment interrupts, and those are very
rare, we only generate them in some cases of broken forms like
trying to do a ll/sc on an unaligned address.

So I'm thinking I'm just going to pass the opcode to the helper
in the error_code field.

Cheers,
Ben.

TCG problem with cpu_{st,ld}x_data ?

Commit Message

Comments

Patch