diff mbox

cpu-exec: Don't mask out external interrupts when single-stepping an invalid instruction.

Message ID CAOKbPbYGhyFQVCVBWpQTr0728kYU9bkX+rLKhp1+FtBEDiAbWA@mail.gmail.com
State New
Headers show

Commit Message

Martin Galvan Sept. 11, 2014, 9:02 p.m. UTC
When using Gdb to remote-debug a program, if we try to single-step an
invalid instruction,
Qemu will never return control to the remote Gdb.
The source of this problem is external interrupts being masked out
in cpu_exec if cpu->singlestep_enabled has the SSTEP_NOIRQ flag set.
To solve this I've added an additional flag, SSTEP_EXCEPTION,
that will be set in the exception_with_syndrome instruction generated
when trying to translate the invalid instruction and will be cleared
after checking for cpu->singlestep_enabled in cpu_exec.

Signed-off-by: Martin Galvan <martin.galvan@tallertechnologies.com>
---

The long story: Qemu generates an exception_with_syndrome instruction
when it realizes the instruction it's trying to translate is invalid.
That instruction in turn modifies cs->exception_index and calls cpu_loop_exit.
Normally, the value in cs->exception_index would cause do_interrupt to set
the PC to point to the corresponding exception handler.
However, since we're masking out IRQs, the PC will never be set correctly.
Even worse, since Qemu will have generated an internal exception
to return control back to the remote Gdb *after* it generated the syndrome one,
its code will appear after the call to cpu_loop_exit.
Since the PC won't have changed, we'll try to excecute
the same translation block as before, thus calling cpu_loop_exit
again, and so on.

Here's the bug tracker link: https://bugs.launchpad.net/qemu/+bug/1364501

 cpu-exec.c             | 6 +++++-
 include/qom/cpu.h      | 5 +++++
 target-arm/op_helper.c | 5 +++++
 3 files changed, 15 insertions(+), 1 deletion(-)

     cpu_loop_exit(cs);

Comments

Peter Maydell Sept. 11, 2014, 10:13 p.m. UTC | #1
[cc'ing RTH who may have a better grasp on how the builtin single step is
supposed to work.]

On 11 September 2014 22:02, Martin Galvan
<martin.galvan@tallertechnologies.com> wrote:
> When using Gdb to remote-debug a program, if we try to single-step an
> invalid instruction,
> Qemu will never return control to the remote Gdb.
> The source of this problem is external interrupts being masked out
> in cpu_exec if cpu->singlestep_enabled has the SSTEP_NOIRQ flag set.
> To solve this I've added an additional flag, SSTEP_EXCEPTION,
> that will be set in the exception_with_syndrome instruction generated
> when trying to translate the invalid instruction and will be cleared
> after checking for cpu->singlestep_enabled in cpu_exec.
>
> Signed-off-by: Martin Galvan <martin.galvan@tallertechnologies.com>
> ---
>
> The long story: Qemu generates an exception_with_syndrome instruction
> when it realizes the instruction it's trying to translate is invalid.
> That instruction in turn modifies cs->exception_index and calls cpu_loop_exit.
> Normally, the value in cs->exception_index would cause do_interrupt to set
> the PC to point to the corresponding exception handler.
> However, since we're masking out IRQs, the PC will never be set correctly.
> Even worse, since Qemu will have generated an internal exception
> to return control back to the remote Gdb *after* it generated the syndrome one,
> its code will appear after the call to cpu_loop_exit.
> Since the PC won't have changed, we'll try to excecute
> the same translation block as before, thus calling cpu_loop_exit
> again, and so on.
>
> Here's the bug tracker link: https://bugs.launchpad.net/qemu/+bug/1364501
>
>  cpu-exec.c             | 6 +++++-
>  include/qom/cpu.h      | 5 +++++
>  target-arm/op_helper.c | 5 +++++
>  3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 5fa172c..5d9f231 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -448,9 +448,13 @@ int cpu_exec(CPUArchState *env)
>              for(;;) {
>                  interrupt_request = cpu->interrupt_request;
>                  if (unlikely(interrupt_request)) {
> -                    if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
> +                    if (unlikely((cpu->singlestep_enabled & SSTEP_NOIRQ) &&
> +                            !(cpu->singlestep_enabled & SSTEP_EXCEPTION))) {
>                          /* Mask out external interrupts for this step. */
>                          interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
> +                    } else { /* An exception occured; don't mask out
> this one */
> +                        /* Mask out any future external interrupts */
> +                        cpu->singlestep_enabled &= ~SSTEP_EXCEPTION;
>                      }
>                      if (interrupt_request & CPU_INTERRUPT_DEBUG) {
>                          cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index f2df033..a57800f 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -607,6 +607,11 @@ void qemu_init_vcpu(CPUState *cpu);
>  #define SSTEP_ENABLE    0x1  /* Enable simulated HW single stepping */
>  #define SSTEP_NOIRQ     0x2  /* Do not use IRQ while single stepping */
>  #define SSTEP_NOTIMER   0x4  /* Do not Timers while single stepping */
> +#define SSTEP_EXCEPTION 0x8  /* Don't mask out exception-related
> IRQs. Set only if
> +                              * we have to process an exception while single-
> +                              * stepping (such as when
> single-stepping an invalid
> +                              * instruction).
> +                              */
>
>  /**
>   * cpu_single_step:
> diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
> index fe40358..2139ea6 100644
> --- a/target-arm/op_helper.c
> +++ b/target-arm/op_helper.c
> @@ -251,6 +251,11 @@ void HELPER(exception_with_syndrome)(CPUARMState
> *env, uint32_t excp,
>      CPUState *cs = CPU(arm_env_get_cpu(env));
>
>      assert(!excp_is_internal(excp));
> +
> +    if (unlikely(cs->singlestep_enabled & SSTEP_NOIRQ)) {
> +        cs->singlestep_enabled |= SSTEP_EXCEPTION;
> +    }
> +
>      cs->exception_index = excp;
>      env->exception.syndrome = syndrome;
>      cpu_loop_exit(cs);
> --
> 1.9.1
>
> --
>
> Martín Galván
>
> Software Engineer
>
> Taller Technologies Argentina
>
> San Lorenzo 47, 3rd Floor, Office 5
>
> Córdoba, Argentina
>
> Phone: 54 351 4217888 / +54 351 4218211


thanks
-- PMM
Richard Henderson Sept. 11, 2014, 11:52 p.m. UTC | #2
On 09/11/2014 03:13 PM, Peter Maydell wrote:
>> The long story: Qemu generates an exception_with_syndrome instruction
>> when it realizes the instruction it's trying to translate is invalid.
>> That instruction in turn modifies cs->exception_index and calls cpu_loop_exit.
>> Normally, the value in cs->exception_index would cause do_interrupt to set
>> the PC to point to the corresponding exception handler.
>> However, since we're masking out IRQs, the PC will never be set correctly.
>> Even worse, since Qemu will have generated an internal exception
>> to return control back to the remote Gdb *after* it generated the syndrome one,
>> its code will appear after the call to cpu_loop_exit.
>> Since the PC won't have changed, we'll try to excecute
>> the same translation block as before, thus calling cpu_loop_exit
>> again, and so on.

This suggests that target-arm (64 or 32-bit?) isn't implementing illegal
instruction traping in the way that I'd expect.

In particular, I'd expect the invalid exception to be recognized, and then the
cpu loop exited, before the single-step exception could overwrite it.  E.g. how
things work on alpha:

$ cat z.s
	.globl _start
_start:
	nop
	.long (1 << 26)
	nop
$ alphaev67-linux-as -o z.o z.s
$ alphaev67-linux-ld -Ttext-segment 0xfffffc0000100000 -o z z.o
$ ./run/bin/qemu-system-alpha -kernel z -S -gdb tcp::12345 &
$ alphaev67-linux-gdb ./z

Set a breakpoint at _start, continue, swap over to qemu monitor and enable
logging of int,op,in_asm.  Now single-step a couple of times and we get:

IN:
0xfffffc0000100078:  nop

OP:
 ld_i32 tmp0,env,$0xfffffffffffffffc
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$0x0
 ---- 0xfffffc0000100078
 movi_i64 pc,$0xfffffc000010007c
 movi_i32 tmp0,$0x10002
 movi_i32 tmp1,$0x0
 call excp,$0x0,$0,env,tmp0,tmp1
 set_label $0x0
 exit_tb $0x7f54d4fee013

Notice here that we end the single-step of the insn with a call to excp with
0x10002=EXCP_DEBUG.

IN:
0xfffffc000010007c:  .long 0x4000000

OP:
 ld_i32 tmp0,env,$0xfffffffffffffffc
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$0x0
 ---- 0xfffffc000010007c
 movi_i64 pc,$0xfffffc0000100080
 movi_i32 tmp0,$0x7
 movi_i32 tmp1,$0x0
 call excp,$0x0,$0,env,tmp0,tmp1
 set_label $0x0
 exit_tb $0x7f54d4fee013

Notice here that we end the single-step of the invalid insn with a call to excp
with 7=EXCP_OPCDEC.  We never attempt to single-step, because we know that
single-step isn't going to be reachable.

INT      1: opcdec(0) pc=fffffc0000100080 sp=fffffc0000010000
IN: Pal_OpcDec
0xfffffc0000000380:  hw_mfpr    t7,0

But gdb does in fact stop at the very next insn, which is of course the PALcode
(bios) OPCDEC exception entry point.

>> +#define SSTEP_EXCEPTION 0x8  /* Don't mask out exception-related
>> IRQs. Set only if
>> +                              * we have to process an exception while single-
>> +                              * stepping (such as when
>> single-stepping an invalid
>> +                              * instruction).
>> +                              */

I'm really not sure what you're doing with this, or why you'd need it.
Probably target-arm wants fixing in some way, but I don't have time to look
into it this evening.


r~
Martin Galvan Sept. 12, 2014, 2:33 p.m. UTC | #3
On Thu, Sep 11, 2014 at 8:52 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 09/11/2014 03:13 PM, Peter Maydell wrote:
> In particular, I'd expect the invalid exception to be recognized, and then the
> cpu loop exited, before the single-step exception could overwrite it.  E.g. how
> things work on alpha:
>
> $ cat z.s
>         .globl _start
> _start:
>         nop
>         .long (1 << 26)
>         nop
> $ alphaev67-linux-as -o z.o z.s
> $ alphaev67-linux-ld -Ttext-segment 0xfffffc0000100000 -o z z.o
> $ ./run/bin/qemu-system-alpha -kernel z -S -gdb tcp::12345 &
> $ alphaev67-linux-gdb ./z

Can't test this myself since I don't have an alpha cross-toolchain at
hand right now.

> Set a breakpoint at _start, continue, swap over to qemu monitor and enable
> logging of int,op,in_asm.  Now single-step a couple of times and we get:
>
> IN:
> 0xfffffc0000100078:  nop
>
> OP:
>  ld_i32 tmp0,env,$0xfffffffffffffffc
>  movi_i32 tmp1,$0x0
>  brcond_i32 tmp0,tmp1,ne,$0x0
>  ---- 0xfffffc0000100078
>  movi_i64 pc,$0xfffffc000010007c
>  movi_i32 tmp0,$0x10002
>  movi_i32 tmp1,$0x0
>  call excp,$0x0,$0,env,tmp0,tmp1
>  set_label $0x0
>  exit_tb $0x7f54d4fee013
>
> Notice here that we end the single-step of the insn with a call to excp with
> 0x10002=EXCP_DEBUG.
>
> IN:
> 0xfffffc000010007c:  .long 0x4000000
>
> OP:
>  ld_i32 tmp0,env,$0xfffffffffffffffc
>  movi_i32 tmp1,$0x0
>  brcond_i32 tmp0,tmp1,ne,$0x0
>  ---- 0xfffffc000010007c
>  movi_i64 pc,$0xfffffc0000100080
>  movi_i32 tmp0,$0x7
>  movi_i32 tmp1,$0x0
>  call excp,$0x0,$0,env,tmp0,tmp1
>  set_label $0x0
>  exit_tb $0x7f54d4fee013
>
> Notice here that we end the single-step of the invalid insn with a call to excp
> with 7=EXCP_OPCDEC.  We never attempt to single-step, because we know that
> single-step isn't going to be reachable.
>
> INT      1: opcdec(0) pc=fffffc0000100080 sp=fffffc0000010000
> IN: Pal_OpcDec
> 0xfffffc0000000380:  hw_mfpr    t7,0
>
> But gdb does in fact stop at the very next insn, which is of course the PALcode
> (bios) OPCDEC exception entry point.

So if I understood right, what you do in Alpha is:

- With your PC pointing to the invalid instruction, single-step once.
- The generated assembly will contain a call to excp with EXCP_OPCDEC.
- On excp, it sets cs->exception_index to EXCP_OPCDEC and then does a
cpu_loop_exit.
- As it advances through the loop again, it'll notice exception_index
is greater than 0, thus calling do_interrupt.
- Inside do_interrupt it sets the PC to point to the exception handler
entry point.
- It sets cpu->exception_index to EXCP_DEBUG somehow, thus returning
control back to gdb.
- The net result is that single-stepping with the PC pointing to an
invalid instruction
  immediately leads us to the exception handler.

That's exactly what I'm trying to achieve. However, in target-arm we
end up calling do_interrupt twice:
the first time in the outer for(;;) to set cpu->interrupt_request to
CPU_INTERRUPT_HARD,
and the second time inside the inner for(;;) to actually set the PC to
point to the exception handler.

Silly me, I thought every architecture did the same.

>>> +#define SSTEP_EXCEPTION 0x8  /* Don't mask out exception-related
>>> IRQs. Set only if
>>> +                              * we have to process an exception while single-
>>> +                              * stepping (such as when
>>> single-stepping an invalid
>>> +                              * instruction).
>>> +                              */
>
> I'm really not sure what you're doing with this, or why you'd need it.
> Probably target-arm wants fixing in some way, but I don't have time to look
> into it this evening.
>

Like I said, in the inner for(;;) the call to do_interrupt wasn't
being done because
our CPU_INTERRUPT_HARD was being masked out because of the very fact we have
the SSTEP_NOIRQ flag set in singlestep_enabled. The purpose of my new
flag was to avoid that,
so the PC would be set correctly.

However, like you said this should probably be fixed inside the ARM
code instead of
touching cpu-exec.

Besides, my fix was taking advantage of the fact that we'd get a new
translation block
after translating the first instruction of the exception handler. As
we still have singlestep_enabled,
it would generate a new internal exception that would set
cpu->exception_index to EXCP_DEBUG
and return control back to gdb. However, I just noticed that if we
were to have an invalid instruction inside
the interrupt handler itself, Qemu won't generate a new translation
block and once again
we'll never reach the internal exception, thus never returning control
back to gdb.

How do you set cpu->exception_index in to EXCP_DEBUG after calling
do_interrupt with EXCP_OPCDEC?
Richard Henderson Sept. 12, 2014, 3:37 p.m. UTC | #4
On 09/12/2014 07:33 AM, Martin Galvan wrote:
> On Thu, Sep 11, 2014 at 8:52 PM, Richard Henderson <rth@twiddle.net> wrote:
>> On 09/11/2014 03:13 PM, Peter Maydell wrote:
>> In particular, I'd expect the invalid exception to be recognized, and then the
>> cpu loop exited, before the single-step exception could overwrite it.  E.g. how
>> things work on alpha:
>>
>> $ cat z.s
>>         .globl _start
>> _start:
>>         nop
>>         .long (1 << 26)
>>         nop
>> $ alphaev67-linux-as -o z.o z.s
>> $ alphaev67-linux-ld -Ttext-segment 0xfffffc0000100000 -o z z.o
>> $ ./run/bin/qemu-system-alpha -kernel z -S -gdb tcp::12345 &
>> $ alphaev67-linux-gdb ./z
> 
> Can't test this myself since I don't have an alpha cross-toolchain at
> hand right now.

It's pretty simple to build a cross binutils; much easier and quicker than
building a cross gcc.  And you'd need the cross gdb that gets built in the
process anyway...

> So if I understood right, what you do in Alpha is:
> 
> - With your PC pointing to the invalid instruction, single-step once.
> - The generated assembly will contain a call to excp with EXCP_OPCDEC.
> - On excp, it sets cs->exception_index to EXCP_OPCDEC and then does a
> cpu_loop_exit.
> - As it advances through the loop again, it'll notice exception_index
> is greater than 0, thus calling do_interrupt.
> - Inside do_interrupt it sets the PC to point to the exception handler
> entry point.

All true and correct so far.

> - It sets cpu->exception_index to EXCP_DEBUG somehow, thus returning
> control back to gdb.

Ah, well, no.  Certainly target-alpha does no such thing.
I admit I have no idea exactly what happens here, because...

> - The net result is that single-stepping with the PC pointing to an
> invalid instruction
>   immediately leads us to the exception handler.

... it just worked, yes.

> That's exactly what I'm trying to achieve. However, in target-arm we
> end up calling do_interrupt twice:
> the first time in the outer for(;;) to set cpu->interrupt_request to
> CPU_INTERRUPT_HARD,
> and the second time inside the inner for(;;) to actually set the PC to
> point to the exception handler.

Ah, well, that's where we begin to differ.

Alpha do_interrupt doesn't mess with cpu->interrupt_request at all, and doesn't
generate two calls to do_interrupt.  The one call finds the vector for the
given interrupt, modifies the PC, and swaps to the shadow register bank.

Fin.

(Which reminds me, we really, Really, need to get those ifdefs in cpu_exec
factored out into a nice single cpu callback.  Every time I read this code, I
feel I've missed something.)

> How do you set cpu->exception_index in to EXCP_DEBUG after calling
> do_interrupt with EXCP_OPCDEC?

I still don't know, but I know that _I_ don't do it.  I tried stepping through
qemu itself here, but I managed to crash the guest gdb in the process; dunno
what happened there.


r~
Martin Galvan Sept. 12, 2014, 3:50 p.m. UTC | #5
On Fri, Sep 12, 2014 at 12:37 PM, Richard Henderson <rth@twiddle.net> wrote:
> Alpha do_interrupt doesn't mess with cpu->interrupt_request at all, and doesn't
> generate two calls to do_interrupt.  The one call finds the vector for the
> given interrupt, modifies the PC, and swaps to the shadow register bank.
>

Indeed it does. The problem is, even if I changed the code inside
ARM's do_interrupt to set the correct PC in the first call (so the IRQ
masking wasn't an issue anymore) I'd still be stuck in the loop
without returning control back to gdb.

> (Which reminds me, we really, Really, need to get those ifdefs in cpu_exec
> factored out into a nice single cpu callback.  Every time I read this code, I
> feel I've missed something.)

I totally agree with you there.

>> How do you set cpu->exception_index in to EXCP_DEBUG after calling
>> do_interrupt with EXCP_OPCDEC?
>
> I still don't know, but I know that _I_ don't do it.  I tried stepping through
> qemu itself here, but I managed to crash the guest gdb in the process; dunno
> what happened there.
>

In that case I'll see if I can step through Qemu myself and find out
how Alpha returns control to gdb. Perhaps we can do the same on ARM.

Thanks a lot!
Martin Galvan Sept. 15, 2014, 4:02 p.m. UTC | #6
I did a bit more research on how Cortex-M handles nested interrupts.
In particular, I noticed if we were to have an invalid instruction
inside the exception handler itself, trying to single-step it would
again cause Qemu not to return control back to gdb. Reading up a bit I
found out that when the CPU is handling an exception, all other
exceptions with the same or lower priority will be blocked.

Qemu seems to implement this by setting the exception as pending, yet
not changing cpu->interrupt_request so that we modify the PC in the
second call to do_interrupt. That seems fine to me, but since the
translation block is the same as before (and if that one contained an
exception_with_syndrome ending in a call to cpu_loop_exit) we'll be
stuck in the loop and we'll never return control to Gdb.

Therefore I have two questions:
1) Is this a bug?
2) If it is, what should the behaviour be? I think we should return
control back to gdb, but we shouldn't try to translate that
instruction again. How could we handle this in a clean way?

On Fri, Sep 12, 2014 at 12:50 PM, Martin Galvan
<martin.galvan@tallertechnologies.com> wrote:
> On Fri, Sep 12, 2014 at 12:37 PM, Richard Henderson <rth@twiddle.net> wrote:
>> Alpha do_interrupt doesn't mess with cpu->interrupt_request at all, and doesn't
>> generate two calls to do_interrupt.  The one call finds the vector for the
>> given interrupt, modifies the PC, and swaps to the shadow register bank.
>>
>
> Indeed it does. The problem is, even if I changed the code inside
> ARM's do_interrupt to set the correct PC in the first call (so the IRQ
> masking wasn't an issue anymore) I'd still be stuck in the loop
> without returning control back to gdb.
>
>> (Which reminds me, we really, Really, need to get those ifdefs in cpu_exec
>> factored out into a nice single cpu callback.  Every time I read this code, I
>> feel I've missed something.)
>
> I totally agree with you there.
>
>>> How do you set cpu->exception_index in to EXCP_DEBUG after calling
>>> do_interrupt with EXCP_OPCDEC?
>>
>> I still don't know, but I know that _I_ don't do it.  I tried stepping through
>> qemu itself here, but I managed to crash the guest gdb in the process; dunno
>> what happened there.
>>
>
> In that case I'll see if I can step through Qemu myself and find out
> how Alpha returns control to gdb. Perhaps we can do the same on ARM.
>
> Thanks a lot!
>
> --
>
> Martín Galván
>
> Software Engineer
>
> Taller Technologies Argentina
>
>
> San Lorenzo 47, 3rd Floor, Office 5
>
> Córdoba, Argentina
>
> Phone: 54 351 4217888 / +54 351 4218211
Peter Maydell Sept. 15, 2014, 4:10 p.m. UTC | #7
On 15 September 2014 09:02, Martin Galvan
<martin.galvan@tallertechnologies.com> wrote:
> I did a bit more research on how Cortex-M handles nested interrupts.
> In particular, I noticed if we were to have an invalid instruction
> inside the exception handler itself, trying to single-step it would
> again cause Qemu not to return control back to gdb. Reading up a bit I
> found out that when the CPU is handling an exception, all other
> exceptions with the same or lower priority will be blocked.
>
> Qemu seems to implement this by setting the exception as pending, yet
> not changing cpu->interrupt_request so that we modify the PC in the
> second call to do_interrupt. That seems fine to me, but since the
> translation block is the same as before (and if that one contained an
> exception_with_syndrome ending in a call to cpu_loop_exit) we'll be
> stuck in the loop and we'll never return control to Gdb.
>
> Therefore I have two questions:
> 1) Is this a bug?
> 2) If it is, what should the behaviour be? I think we should return
> control back to gdb, but we shouldn't try to translate that
> instruction again. How could we handle this in a clean way?

Our handling of exceptions and exception priorities for
Cortex-M CPUs is totally wrong. We attempt to reuse the
GIC priority handling code for interrupts, which then leaves
the exceptions without the correct priority handling. Ideally
we should rewrite this completely to give the M profile
cores the architecturally mandated exception priority
handling and masking, and stop trying to reuse the
A/R profile core code at all. The two profiles are just
too different for that to make sense.

Any bugs or weirdness in the current code are likely
symptoms of attempting to shoehorn M profile into the
A/R profile handling.

-- PMM
Andreas Färber Sept. 15, 2014, 4:17 p.m. UTC | #8
Am 15.09.2014 um 18:10 schrieb Peter Maydell:
> On 15 September 2014 09:02, Martin Galvan
> <martin.galvan@tallertechnologies.com> wrote:
>> I did a bit more research on how Cortex-M handles nested interrupts.
>> In particular, I noticed if we were to have an invalid instruction
>> inside the exception handler itself, trying to single-step it would
>> again cause Qemu not to return control back to gdb. Reading up a bit I
>> found out that when the CPU is handling an exception, all other
>> exceptions with the same or lower priority will be blocked.
>>
>> Qemu seems to implement this by setting the exception as pending, yet
>> not changing cpu->interrupt_request so that we modify the PC in the
>> second call to do_interrupt. That seems fine to me, but since the
>> translation block is the same as before (and if that one contained an
>> exception_with_syndrome ending in a call to cpu_loop_exit) we'll be
>> stuck in the loop and we'll never return control to Gdb.
>>
>> Therefore I have two questions:
>> 1) Is this a bug?
>> 2) If it is, what should the behaviour be? I think we should return
>> control back to gdb, but we shouldn't try to translate that
>> instruction again. How could we handle this in a clean way?
> 
> Our handling of exceptions and exception priorities for
> Cortex-M CPUs is totally wrong. We attempt to reuse the
> GIC priority handling code for interrupts, which then leaves
> the exceptions without the correct priority handling. Ideally
> we should rewrite this completely to give the M profile
> cores the architecturally mandated exception priority
> handling and masking, and stop trying to reuse the
> A/R profile core code at all. The two profiles are just
> too different for that to make sense.

Are you saying we should dump the inheritence completely?

Andreas

> 
> Any bugs or weirdness in the current code are likely
> symptoms of attempting to shoehorn M profile into the
> A/R profile handling.
> 
> -- PMM
>
Peter Maydell Sept. 15, 2014, 4:22 p.m. UTC | #9
On 15 September 2014 09:17, Andreas Färber <afaerber@suse.de> wrote:
> Am 15.09.2014 um 18:10 schrieb Peter Maydell:
>> Our handling of exceptions and exception priorities for
>> Cortex-M CPUs is totally wrong. We attempt to reuse the
>> GIC priority handling code for interrupts, which then leaves
>> the exceptions without the correct priority handling. Ideally
>> we should rewrite this completely to give the M profile
>> cores the architecturally mandated exception priority
>> handling and masking, and stop trying to reuse the
>> A/R profile core code at all. The two profiles are just
>> too different for that to make sense.
>
> Are you saying we should dump the inheritence completely?

From the GIC common class? Yes. You'd want the
QOM CPU object to directly expose a bunch of input
IRQ/GPIO lines for the various interrupt lines an M
profile core has (and not the IRQ/FIQ lines we do now).
Exception/interrupt handling and prioritization is much
more closely coupled to the core in M profile so
trying to model it as a separate device is part of our
problems here.

I should mention that this is all a bit academic as far
as I'm personally concerned since M profile is not
something I can spend much time on (better to
think of it as "occasional fixes" mode rather than
"mantained/supported"). If anybody wants to take on
the rework I'm happy to review code but I'm unlikely
to ever do it myself.

-- PMM
diff mbox

Patch

diff --git a/cpu-exec.c b/cpu-exec.c
index 5fa172c..5d9f231 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -448,9 +448,13 @@  int cpu_exec(CPUArchState *env)
             for(;;) {
                 interrupt_request = cpu->interrupt_request;
                 if (unlikely(interrupt_request)) {
-                    if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
+                    if (unlikely((cpu->singlestep_enabled & SSTEP_NOIRQ) &&
+                            !(cpu->singlestep_enabled & SSTEP_EXCEPTION))) {
                         /* Mask out external interrupts for this step. */
                         interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
+                    } else { /* An exception occured; don't mask out
this one */
+                        /* Mask out any future external interrupts */
+                        cpu->singlestep_enabled &= ~SSTEP_EXCEPTION;
                     }
                     if (interrupt_request & CPU_INTERRUPT_DEBUG) {
                         cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index f2df033..a57800f 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -607,6 +607,11 @@  void qemu_init_vcpu(CPUState *cpu);
 #define SSTEP_ENABLE    0x1  /* Enable simulated HW single stepping */
 #define SSTEP_NOIRQ     0x2  /* Do not use IRQ while single stepping */
 #define SSTEP_NOTIMER   0x4  /* Do not Timers while single stepping */
+#define SSTEP_EXCEPTION 0x8  /* Don't mask out exception-related
IRQs. Set only if
+                              * we have to process an exception while single-
+                              * stepping (such as when
single-stepping an invalid
+                              * instruction).
+                              */

 /**
  * cpu_single_step:
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index fe40358..2139ea6 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -251,6 +251,11 @@  void HELPER(exception_with_syndrome)(CPUARMState
*env, uint32_t excp,
     CPUState *cs = CPU(arm_env_get_cpu(env));

     assert(!excp_is_internal(excp));
+
+    if (unlikely(cs->singlestep_enabled & SSTEP_NOIRQ)) {
+        cs->singlestep_enabled |= SSTEP_EXCEPTION;
+    }
+
     cs->exception_index = excp;
     env->exception.syndrome = syndrome;