diff mbox

target-arm: add dump-guest-memory support

Message ID 1419377358-17222-1-git-send-email-rabin@rab.in
State New
Headers show

Commit Message

Rabin Vincent Dec. 23, 2014, 11:29 p.m. UTC
Enable support for the dump-guest-memory command on ARM and AArch64.
The dumped files can be analyzed with crash or similar tools.

Signed-off-by: Rabin Vincent <rabin@rab.in>
---
 target-arm/Makefile.objs |   2 +-
 target-arm/arch_dump.c   | 148 +++++++++++++++++++++++++++++++++++++++++++++++
 target-arm/cpu-qom.h     |   8 +++
 target-arm/cpu.c         |   4 ++
 4 files changed, 161 insertions(+), 1 deletion(-)
 create mode 100644 target-arm/arch_dump.c

Comments

Peter Maydell Dec. 23, 2014, 11:45 p.m. UTC | #1
On 23 December 2014 at 23:29, Rabin Vincent <rabin@rab.in> wrote:
> Enable support for the dump-guest-memory command on ARM and AArch64.
> The dumped files can be analyzed with crash or similar tools.
>
> Signed-off-by: Rabin Vincent <rabin@rab.in>

Thanks -- looks pretty good. One nit and some annoying
corner cases you may not have thought of...

> +static size_t round4(size_t size)
> +{
> +    return ((size + 3) / 4) * 4;
> +}

Is this different from ROUND_UP(size, 4) ?
If we can use the standard macro from the headers we should;
if there's a real difference we should comment about what it is.


> +int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
> +                             int cpuid, void *opaque)
> +{
> +    arm_elf_prstatus prstatus = {.pid = cpuid};
> +    ARMCPU *cpu = ARM_CPU(cs);
> +
> +    memcpy(&(prstatus.regs), cpu->env.regs, sizeof(cpu->env.regs));
> +    prstatus.regs[16] = cpsr_read(&cpu->env);
> +
> +    return write_elf_note("CORE", NT_PRSTATUS,
> +                          &prstatus, sizeof(prstatus),
> +                          f, opaque);
> +}
> +
> +int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
> +                             int cpuid, void *opaque)
> +{
> +    aarch64_elf_prstatus prstatus = {.pid = cpuid};
> +    ARMCPU *cpu = ARM_CPU(cs);
> +
> +    memcpy(&(prstatus.regs), cpu->env.xregs, sizeof(cpu->env.xregs));
> +    prstatus.pc = cpu->env.pc;
> +    prstatus.pstate = cpu->env.pstate;

You need to use the correct accessor function for pstate, not
all the bits are kept in env.pstate. Call pstate_read().

Can we get here when a 64-bit CPU is in AArch32 mode? (eg,
64 bit guest OS running a 32 bit compat process at the
point of taking the memory dump). If so, what sort of
core file should we be writing?

Assuming the answer is "still 64 bit core dump" you need
to do something here to sync the 32 bit TCG state into the
64 bit xregs array. (KVM can take care of itself.)

> +int cpu_get_dump_info(ArchDumpInfo *info,
> +                      const struct GuestPhysBlockList *guest_phys_blocks)
> +{
> +    info->d_machine = ELF_MACHINE;
> +    info->d_class = (info->d_machine == EM_ARM) ? ELFCLASS32 : ELFCLASS64;
> +
> +#ifdef TARGET_WORDS_BIGENDIAN
> +    info->d_endian = ELFDATA2MSB;
> +#else
> +    info->d_endian = ELFDATA2LSB;
> +#endif

Note that in fact ARM is never going to be TARGET_WORDS_BIGENDIAN,
even if the guest is big-endian, because the #define represents
the bus endianness, not whether the CPU happens to currently be
doing byte-swizzling. Do you need to key d_endian off the CPU's
current endianness setting? The current endianness of EL1?
Something else?

thanks
-- PMM
Rabin Vincent Dec. 24, 2014, 4:54 p.m. UTC | #2
On Tue, Dec 23, 2014 at 11:45:00PM +0000, Peter Maydell wrote:
> On 23 December 2014 at 23:29, Rabin Vincent <rabin@rab.in> wrote:
> > +static size_t round4(size_t size)
> > +{
> > +    return ((size + 3) / 4) * 4;
> > +}
> 
> Is this different from ROUND_UP(size, 4) ?
> If we can use the standard macro from the headers we should;
> if there's a real difference we should comment about what it is.

No, I'll use ROUND_UP.

> > +int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
> > +                             int cpuid, void *opaque)
> > +{
> > +    aarch64_elf_prstatus prstatus = {.pid = cpuid};
> > +    ARMCPU *cpu = ARM_CPU(cs);
> > +
> > +    memcpy(&(prstatus.regs), cpu->env.xregs, sizeof(cpu->env.xregs));
> > +    prstatus.pc = cpu->env.pc;
> > +    prstatus.pstate = cpu->env.pstate;
> 
> You need to use the correct accessor function for pstate, not
> all the bits are kept in env.pstate. Call pstate_read().

OK.

> Can we get here when a 64-bit CPU is in AArch32 mode? (eg,
> 64 bit guest OS running a 32 bit compat process at the
> point of taking the memory dump). If so, what sort of
> core file should we be writing?

I'd say still 64-bit.

> Assuming the answer is "still 64 bit core dump" you need
> to do something here to sync the 32 bit TCG state into the
> 64 bit xregs array. (KVM can take care of itself.)

I have now tested this by triggering a dump while a 32-bit process is
incrementing a register in a tight loop, and the following, which I
lifted off the exception handling code, appears to work:

    if (!is_a64(&cpu->env)) {   
        int i;

        for (i = 0; i < 15; i++) {
            prstatus.regs[i] = cpu->env.regs[i];
        }
    }

> > +int cpu_get_dump_info(ArchDumpInfo *info,
> > +                      const struct GuestPhysBlockList *guest_phys_blocks)
> > +{
> > +    info->d_machine = ELF_MACHINE;
> > +    info->d_class = (info->d_machine == EM_ARM) ? ELFCLASS32 : ELFCLASS64;
> > +
> > +#ifdef TARGET_WORDS_BIGENDIAN
> > +    info->d_endian = ELFDATA2MSB;
> > +#else
> > +    info->d_endian = ELFDATA2LSB;
> > +#endif
> 
> Note that in fact ARM is never going to be TARGET_WORDS_BIGENDIAN,
> even if the guest is big-endian, because the #define represents
> the bus endianness, not whether the CPU happens to currently be
> doing byte-swizzling. Do you need to key d_endian off the CPU's
> current endianness setting? The current endianness of EL1?
> Something else?

IIUC we don't currently support anything other than little endian in
system emulation?  Attempting to boot a BE ARMv7 vexpress kernel hits
the unimplementation of setend pretty quickly, and I don't see any
machine initializing the bswap_code to big endian.

According the the ELF specification for ARM, the choice between
ELFDATA2LSB and ELFDATA2MSB "will be governed by the default data order
in the execution environment".  Since we dump the full system memory I
would interpret this to be the "lowest" execution environment.  So I
guess for ARM this would mean setting big endian if (SCTLR.EE ||
SCTLR.B) and for AArch64 if SCTLR_EL1.E0E is set?

(I had assumed that post-analysis tools would refuse to open a dump if
the endianness does not match but this does not seem to be the case.  I
tested by generating dumps with this d_endian hardcoded to both
ELFDATA2LSB and ELFDATA2MSB and gdb appears to open them and show the
registers and memory without complaining.)
Peter Maydell Dec. 24, 2014, 5:12 p.m. UTC | #3
On 24 December 2014 at 16:54, Rabin Vincent <rabin@rab.in> wrote:
> On Tue, Dec 23, 2014 at 11:45:00PM +0000, Peter Maydell wrote:
>> Assuming the answer is "still 64 bit core dump" you need
>> to do something here to sync the 32 bit TCG state into the
>> 64 bit xregs array. (KVM can take care of itself.)
>
> I have now tested this by triggering a dump while a 32-bit process is
> incrementing a register in a tight loop, and the following, which I
> lifted off the exception handling code, appears to work:
>
>     if (!is_a64(&cpu->env)) {
>         int i;
>
>         for (i = 0; i < 15; i++) {
>             prstatus.regs[i] = cpu->env.regs[i];
>         }
>     }

Yes, that looks OK, but can you factor it out to a function
in target-arm/, please? When we add support for 32-bit EL1
in 64-bit EL2/EL3 it'll need to get more complicated.

Also you need to use the read_cpsr() value for pstate.

>> Note that in fact ARM is never going to be TARGET_WORDS_BIGENDIAN,
>> even if the guest is big-endian, because the #define represents
>> the bus endianness, not whether the CPU happens to currently be
>> doing byte-swizzling. Do you need to key d_endian off the CPU's
>> current endianness setting? The current endianness of EL1?
>> Something else?
>
> IIUC we don't currently support anything other than little endian in
> system emulation?  Attempting to boot a BE ARMv7 vexpress kernel hits
> the unimplementation of setend pretty quickly, and I don't see any
> machine initializing the bswap_code to big endian.

We don't currently in emulation; there are patches on-list to
support it for KVM, though, which I expect we'll be merging
shortly.

(I really should resurrect that patchseries Paolo sent out to
add BE support, it was pretty close but I think needed a few
more tweaks.)

> According the the ELF specification for ARM, the choice between
> ELFDATA2LSB and ELFDATA2MSB "will be governed by the default data order
> in the execution environment".  Since we dump the full system memory I
> would interpret this to be the "lowest" execution environment.  So I
> guess for ARM this would mean setting big endian if (SCTLR.EE ||
> SCTLR.B) and for AArch64 if SCTLR_EL1.E0E is set?

I think AArch64 should be SCTLR_EL1.EE, shouldn't it?

What are the semantics of the dump if we support EL2/EL3?
Do we still just dump from the perspective of EL1? That's
probably the best approximation to useful for a user I guess.

-- PMM
diff mbox

Patch

diff --git a/target-arm/Makefile.objs b/target-arm/Makefile.objs
index 9460b40..38ba48c 100644
--- a/target-arm/Makefile.objs
+++ b/target-arm/Makefile.objs
@@ -7,6 +7,6 @@  obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
 obj-y += translate.o op_helper.o helper.o cpu.o
 obj-y += neon_helper.o iwmmxt_helper.o
 obj-y += gdbstub.o
-obj-$(CONFIG_SOFTMMU) += psci.o
+obj-$(CONFIG_SOFTMMU) += psci.o arch_dump.o
 obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
 obj-y += crypto_helper.o
diff --git a/target-arm/arch_dump.c b/target-arm/arch_dump.c
new file mode 100644
index 0000000..4e7b9a2
--- /dev/null
+++ b/target-arm/arch_dump.c
@@ -0,0 +1,148 @@ 
+#include "cpu.h"
+
+#include "sysemu/dump.h"
+#include "elf.h"
+
+/* Matches elf_prstatus in <linux/elfcore.h> */
+
+typedef struct {
+    char pad1[24];
+    uint32_t pid;
+    char pad2[44];
+    uint32_t regs[18];
+    char pad3[4];
+} arm_elf_prstatus;
+
+typedef struct {
+    char pad1[32];
+    uint32_t pid;
+    char pad2[76];
+    uint64_t regs[32];
+    uint64_t pc;
+    uint64_t pstate;
+    char pad3[4];
+} aarch64_elf_prstatus;
+
+static size_t round4(size_t size)
+{
+    return ((size + 3) / 4) * 4;
+}
+
+static int write_elf_note(const char *name, uint32_t type,
+                          void *desc, size_t descsz,
+                          WriteCoreDumpFunction f, void *opaque)
+{
+    size_t note_size, name_size, note_head_size;
+    DumpState *s = opaque;
+    int class = s->dump_info.d_class;
+    Elf64_Nhdr *note64;
+    Elf32_Nhdr *note32;
+    void *note;
+    char *buf;
+    int ret;
+
+    name_size = strlen(name) + 1;
+
+    if (class == ELFCLASS32) {
+        note_head_size = sizeof(Elf32_Nhdr);
+    } else {
+        note_head_size = sizeof(Elf64_Nhdr);
+    }
+
+    note_size = note_head_size + round4(name_size) + descsz;
+    note = g_malloc0(note_size);
+
+    if (class == ELFCLASS32) {
+        note32 = note;
+        note32->n_namesz = cpu_to_dump32(s, name_size);
+        note32->n_descsz = cpu_to_dump32(s, descsz);
+        note32->n_type = cpu_to_dump32(s, type);
+    } else {
+        note64 = note;
+        note64->n_namesz = cpu_to_dump64(s, name_size);
+        note64->n_descsz = cpu_to_dump64(s, descsz);
+        note64->n_type = cpu_to_dump64(s, type);
+    }
+
+    buf = note;
+    buf += note_head_size;
+
+    memcpy(buf, name, name_size);
+    buf += round4(name_size);
+
+    memcpy(buf, desc, descsz);
+
+    ret = f(note, note_size, opaque);
+    g_free(note);
+
+    return ret;
+}
+
+int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
+                             int cpuid, void *opaque)
+{
+    arm_elf_prstatus prstatus = {.pid = cpuid};
+    ARMCPU *cpu = ARM_CPU(cs);
+
+    memcpy(&(prstatus.regs), cpu->env.regs, sizeof(cpu->env.regs));
+    prstatus.regs[16] = cpsr_read(&cpu->env);
+
+    return write_elf_note("CORE", NT_PRSTATUS,
+                          &prstatus, sizeof(prstatus),
+                          f, opaque);
+}
+
+int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
+                             int cpuid, void *opaque)
+{
+    aarch64_elf_prstatus prstatus = {.pid = cpuid};
+    ARMCPU *cpu = ARM_CPU(cs);
+
+    memcpy(&(prstatus.regs), cpu->env.xregs, sizeof(cpu->env.xregs));
+    prstatus.pc = cpu->env.pc;
+    prstatus.pstate = cpu->env.pstate;
+
+    return write_elf_note("CORE", NT_PRSTATUS,
+                          &prstatus, sizeof(prstatus),
+                          f, opaque);
+}
+
+int arm_cpu_write_elf32_qemunote(WriteCoreDumpFunction f,
+                                  CPUState *cpu, void *opaque)
+{
+    return 0;
+}
+
+int arm_cpu_write_elf64_qemunote(WriteCoreDumpFunction f,
+                                  CPUState *cpu, void *opaque)
+{
+    return 0;
+}
+
+int cpu_get_dump_info(ArchDumpInfo *info,
+                      const struct GuestPhysBlockList *guest_phys_blocks)
+{
+    info->d_machine = ELF_MACHINE;
+    info->d_class = (info->d_machine == EM_ARM) ? ELFCLASS32 : ELFCLASS64;
+
+#ifdef TARGET_WORDS_BIGENDIAN
+    info->d_endian = ELFDATA2MSB;
+#else
+    info->d_endian = ELFDATA2LSB;
+#endif
+
+    return 0;
+}
+
+ssize_t cpu_get_note_size(int class, int machine, int nr_cpus)
+{
+    size_t elf_note_size = round4(sizeof("CORE"));
+
+    if (machine == EM_ARM) {
+        elf_note_size += sizeof(Elf32_Nhdr) + sizeof(arm_elf_prstatus);
+    } else {
+        elf_note_size += sizeof(Elf64_Nhdr) + sizeof(aarch64_elf_prstatus);
+    }
+
+    return elf_note_size * nr_cpus;
+}
diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index ed5a644..7de8ba1 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -205,6 +205,14 @@  bool arm_cpu_exec_interrupt(CPUState *cpu, int int_req);
 
 void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
                         int flags);
+int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
+                             int cpuid, void *opaque);
+int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
+                             int cpuid, void *opaque);
+int arm_cpu_write_elf32_qemunote(WriteCoreDumpFunction f,
+                                 CPUState *cpu, void *opaque);
+int arm_cpu_write_elf64_qemunote(WriteCoreDumpFunction f,
+                                 CPUState *cpu, void *opaque);
 
 hwaddr arm_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 285947f..679387a 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -1188,6 +1188,10 @@  static void arm_cpu_class_init(ObjectClass *oc, void *data)
 #else
     cc->do_interrupt = arm_cpu_do_interrupt;
     cc->get_phys_page_debug = arm_cpu_get_phys_page_debug;
+    cc->write_elf32_note = arm_cpu_write_elf32_note;
+    cc->write_elf64_note = arm_cpu_write_elf64_note;
+    cc->write_elf32_qemunote = arm_cpu_write_elf32_qemunote;
+    cc->write_elf64_qemunote = arm_cpu_write_elf64_qemunote;
     cc->vmsd = &vmstate_arm_cpu;
 #endif
     cc->gdb_num_core_regs = 26;