diff mbox series

[v3,6/7] loader: Implement .hex file loader

Message ID 20180725085944.11856-7-stefanha@redhat.com
State New
Headers show
Series arm: add Cortex M0 CPU model and hex file loader | expand

Commit Message

Stefan Hajnoczi July 25, 2018, 8:59 a.m. UTC
From: Su Hang <suhang16@mails.ucas.ac.cn>

This patch adds Intel Hexadecimal Object File format support to the
loader.  The file format specification is available here:
http://www.piclist.com/techref/fileext/hex/intel.htm

This file format is often used with microcontrollers such as the
micro:bit, Arduino, STM32, etc.  Users expect to be able to run them
directly with qemu -kernel program.hex instead of converting to ELF or
binary.

Signed-off-by: Su Hang <suhang16@mails.ucas.ac.cn>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/hw/loader.h    |  12 ++
 hw/arm/arm-m-profile.c |   3 +
 hw/core/loader.c       | 243 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 258 insertions(+)

Comments

Peter Maydell July 30, 2018, 6:01 p.m. UTC | #1
On 25 July 2018 at 09:59, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> From: Su Hang <suhang16@mails.ucas.ac.cn>
>
> This patch adds Intel Hexadecimal Object File format support to the
> loader.  The file format specification is available here:
> http://www.piclist.com/techref/fileext/hex/intel.htm
>
> This file format is often used with microcontrollers such as the
> micro:bit, Arduino, STM32, etc.  Users expect to be able to run them
> directly with qemu -kernel program.hex instead of converting to ELF or
> binary.

I'm still not convinced we want to add another random
special case only-works-on-one-architecture-and-some-boards
feature to the -kernel command line option.

Adding it to the "generic loader" device might be more plausible?
Or just get the user to create ELF files or plain binary files,
both of which we already handle.

thanks
-- PMM
Stefan Hajnoczi Aug. 2, 2018, 6:51 a.m. UTC | #2
On Mon, Jul 30, 2018 at 07:01:53PM +0100, Peter Maydell wrote:
> On 25 July 2018 at 09:59, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > From: Su Hang <suhang16@mails.ucas.ac.cn>
> >
> > This patch adds Intel Hexadecimal Object File format support to the
> > loader.  The file format specification is available here:
> > http://www.piclist.com/techref/fileext/hex/intel.htm
> >
> > This file format is often used with microcontrollers such as the
> > micro:bit, Arduino, STM32, etc.  Users expect to be able to run them
> > directly with qemu -kernel program.hex instead of converting to ELF or
> > binary.
> 
> I'm still not convinced we want to add another random
> special case only-works-on-one-architecture-and-some-boards
> feature to the -kernel command line option.
> 
> Adding it to the "generic loader" device might be more plausible?

I'll look into this and let you know what I find.

> Or just get the user to create ELF files or plain binary files,
> both of which we already handle.

Plain binary definitely won't work because the .hex files contain sparse
(discontiguous) data.

But even using ELF would be very inconvenient to users since .hex is
*the* format that micro:bit IDEs and toolchains use.  Our goal is for
less technical users (even kids learning to program) to use QEMU has a
micro:bit simulator, so the ability to directly use .hex files is very
important.

Stefan
Stefan Hajnoczi Aug. 2, 2018, 12:43 p.m. UTC | #3
On Mon, Jul 30, 2018 at 07:01:53PM +0100, Peter Maydell wrote:
> On 25 July 2018 at 09:59, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > From: Su Hang <suhang16@mails.ucas.ac.cn>
> >
> > This patch adds Intel Hexadecimal Object File format support to the
> > loader.  The file format specification is available here:
> > http://www.piclist.com/techref/fileext/hex/intel.htm
> >
> > This file format is often used with microcontrollers such as the
> > micro:bit, Arduino, STM32, etc.  Users expect to be able to run them
> > directly with qemu -kernel program.hex instead of converting to ELF or
> > binary.
> 
> I'm still not convinced we want to add another random
> special case only-works-on-one-architecture-and-some-boards
> feature to the -kernel command line option.
> 
> Adding it to the "generic loader" device might be more plausible?

I'm not sure I understand the purpose of the generic loader.

As a user -kernel <myfile> is easier than -device
loader,file=<myfile>,cpu-num=1.

Can you explain the advantage to moving hex file loading to the generic
loader?

Stefan
Peter Maydell Aug. 2, 2018, 10:04 p.m. UTC | #4
On 2 August 2018 at 13:43, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Mon, Jul 30, 2018 at 07:01:53PM +0100, Peter Maydell wrote:
>> I'm still not convinced we want to add another random
>> special case only-works-on-one-architecture-and-some-boards
>> feature to the -kernel command line option.
>>
>> Adding it to the "generic loader" device might be more plausible?
>
> I'm not sure I understand the purpose of the generic loader.
>
> As a user -kernel <myfile> is easier than -device
> loader,file=<myfile>,cpu-num=1.
>
> Can you explain the advantage to moving hex file loading to the generic
> loader?

It means we have a command line option for loading hex files
that works for every board and every CPU architecture.
(Similarly, if you want a way to load an ELF file that
works the same way for all boards and CPUs, the generic
loader is it -- -kernel will not reliably do the job.
You can also use it to load more than one ELF file or
to load different ELF files for different CPUs, neither
of which you can do with -kernel.)

-kernel, like all our legacy short options, is, yes,
easier to use; it's also a twisted mess of different
"do what I mean" functionality that varies depending
on the guest CPU architecture and subtype, the machine
being emulated, and other random things like "did the
user also tell us to start a BIOS image". It mostly means
"run a Linux kernel", with some extras wedged in on the
side where we thought we could do it without breaking the
kernel case. Oh, and we don't document anywhere what
it actually does. I'm reluctant to add yet another layer of
"do what I mean" to it that only has an effect on a subset
of Arm boards.

thanks
-- PMM
Stefan Hajnoczi Aug. 3, 2018, 10:32 a.m. UTC | #5
On Thu, Aug 02, 2018 at 11:04:46PM +0100, Peter Maydell wrote:
> On 2 August 2018 at 13:43, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> > On Mon, Jul 30, 2018 at 07:01:53PM +0100, Peter Maydell wrote:
> >> I'm still not convinced we want to add another random
> >> special case only-works-on-one-architecture-and-some-boards
> >> feature to the -kernel command line option.
> >>
> >> Adding it to the "generic loader" device might be more plausible?
> >
> > I'm not sure I understand the purpose of the generic loader.
> >
> > As a user -kernel <myfile> is easier than -device
> > loader,file=<myfile>,cpu-num=1.
> >
> > Can you explain the advantage to moving hex file loading to the generic
> > loader?
> 
> It means we have a command line option for loading hex files
> that works for every board and every CPU architecture.
> (Similarly, if you want a way to load an ELF file that
> works the same way for all boards and CPUs, the generic
> loader is it -- -kernel will not reliably do the job.
> You can also use it to load more than one ELF file or
> to load different ELF files for different CPUs, neither
> of which you can do with -kernel.)
> 
> -kernel, like all our legacy short options, is, yes,
> easier to use; it's also a twisted mess of different
> "do what I mean" functionality that varies depending
> on the guest CPU architecture and subtype, the machine
> being emulated, and other random things like "did the
> user also tell us to start a BIOS image". It mostly means
> "run a Linux kernel", with some extras wedged in on the
> side where we thought we could do it without breaking the
> kernel case. Oh, and we don't document anywhere what
> it actually does. I'm reluctant to add yet another layer of
> "do what I mean" to it that only has an effect on a subset
> of Arm boards.

Okay, I understand better now.  Thanks!

I'll resend with the generic loader instead of -kernel.

Stefan
diff mbox series

Patch

diff --git a/include/hw/loader.h b/include/hw/loader.h
index 5235f119a3..3c112975f4 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -28,6 +28,18 @@  ssize_t load_image_size(const char *filename, void *addr, size_t size);
 int load_image_targphys_as(const char *filename,
                            hwaddr addr, uint64_t max_sz, AddressSpace *as);
 
+/**load_targphys_hex_as:
+ * @filename: Path to the .hex file
+ * @entry: Store the entry point given by the .hex file
+ * @as: The AddressSpace to load the .hex file to. The value of
+ *      address_space_memory is used if nothing is supplied here.
+ *
+ * Load a fixed .hex file into memory.
+ *
+ * Returns the size of the loaded .hex file on success, -1 otherwise.
+ */
+int load_targphys_hex_as(const char *filename, hwaddr *entry, AddressSpace *as);
+
 /** load_image_targphys:
  * Same as load_image_targphys_as(), but doesn't allow the caller to specify
  * an AddressSpace.
diff --git a/hw/arm/arm-m-profile.c b/hw/arm/arm-m-profile.c
index 8bafd6602d..441bc1a9c2 100644
--- a/hw/arm/arm-m-profile.c
+++ b/hw/arm/arm-m-profile.c
@@ -302,6 +302,9 @@  void arm_m_profile_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem
     if (kernel_filename) {
         image_size = load_elf_as(kernel_filename, NULL, NULL, &entry, &lowaddr,
                                  NULL, big_endian, EM_ARM, 1, 0, as);
+        if (image_size < 0) {
+            image_size = load_targphys_hex_as(kernel_filename, &entry, as);
+        }
         if (image_size < 0) {
             image_size = load_image_targphys_as(kernel_filename, 0,
                                                 mem_size, as);
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 182abcce28..9ce5f45d5c 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -1323,3 +1323,246 @@  void hmp_info_roms(Monitor *mon, const QDict *qdict)
         }
     }
 }
+
+typedef enum HexRecord HexRecord;
+enum HexRecord {
+    DATA_RECORD = 0,
+    EOF_RECORD,
+    EXT_SEG_ADDR_RECORD,
+    START_SEG_ADDR_RECORD,
+    EXT_LINEAR_ADDR_RECORD,
+    START_LINEAR_ADDR_RECORD,
+};
+
+#define DATA_FIELD_MAX_LEN 0xff
+#define LEN_EXCEPT_DATA 0x5
+/* 0x5 = sizeof(byte_count) + sizeof(address) + sizeof(record_type) +
+ *       sizeof(checksum) */
+typedef struct {
+    uint8_t byte_count;
+    uint16_t address;
+    uint8_t record_type;
+    uint8_t data[DATA_FIELD_MAX_LEN];
+    uint8_t checksum;
+} HexLine;
+
+/* return 0 or -1 if error */
+static bool parse_record(HexLine *line, uint8_t *our_checksum, const uint8_t c,
+                         uint32_t *index, const bool in_process)
+{
+    /* +-------+---------------+-------+---------------------+--------+
+     * | byte  |               |record |                     |        |
+     * | count |    address    | type  |        data         |checksum|
+     * +-------+---------------+-------+---------------------+--------+
+     * ^       ^               ^       ^                     ^        ^
+     * |1 byte |    2 bytes    |1 byte |     0-255 bytes     | 1 byte |
+     */
+    uint8_t value = 0;
+    uint32_t idx = *index;
+    /* ignore space */
+    if (g_ascii_isspace(c)) {
+        return true;
+    }
+    if (!g_ascii_isxdigit(c) || !in_process) {
+        return false;
+    }
+    value = g_ascii_xdigit_value(c);
+    value = idx & 0x1 ? value & 0xf : value << 4;
+    if (idx < 2) {
+        line->byte_count |= value;
+    } else if (2 <= idx && idx < 6) {
+        line->address <<= 4;
+        line->address += g_ascii_xdigit_value(c);
+    } else if (6 <= idx && idx < 8) {
+        line->record_type |= value;
+    } else if (8 <= idx && idx < 8 + 2 * line->byte_count) {
+        line->data[(idx - 8) >> 1] |= value;
+    } else if (8 + 2 * line->byte_count <= idx &&
+               idx < 10 + 2 * line->byte_count) {
+        line->checksum |= value;
+    } else {
+        return false;
+    }
+    *our_checksum += value;
+    ++(*index);
+    return true;
+}
+
+typedef struct {
+    const char *filename;
+    HexLine line;
+    uint8_t *bin_buf;
+    hwaddr *start_addr;
+    int total_size;
+    uint32_t next_address_to_write;
+    uint32_t current_address;
+    uint32_t current_rom_index;
+    uint32_t rom_start_address;
+    AddressSpace *as;
+} HexParser;
+
+/* return size or -1 if error */
+static int handle_record_type(HexParser *parser)
+{
+    HexLine *line = &(parser->line);
+    switch (line->record_type) {
+    case DATA_RECORD:
+        parser->current_address =
+            (parser->next_address_to_write & 0xffff0000) | line->address;
+        /* verify this is a contiguous block of memory */
+        if (parser->current_address != parser->next_address_to_write) {
+            if (parser->current_rom_index != 0) {
+                rom_add_blob_fixed_as(parser->filename, parser->bin_buf,
+                                      parser->current_rom_index,
+                                      parser->rom_start_address, parser->as);
+            }
+            parser->rom_start_address = parser->current_address;
+            parser->current_rom_index = 0;
+        }
+
+        /* copy from line buffer to output bin_buf */
+        memcpy(parser->bin_buf + parser->current_rom_index, line->data,
+               line->byte_count);
+        parser->current_rom_index += line->byte_count;
+        parser->total_size += line->byte_count;
+        /* save next address to write */
+        parser->next_address_to_write =
+            parser->current_address + line->byte_count;
+        break;
+
+    case EOF_RECORD:
+        if (parser->current_rom_index != 0) {
+            rom_add_blob_fixed_as(parser->filename, parser->bin_buf,
+                                  parser->current_rom_index,
+                                  parser->rom_start_address, parser->as);
+        }
+        return parser->total_size;
+    case EXT_SEG_ADDR_RECORD:
+    case EXT_LINEAR_ADDR_RECORD:
+        if (line->byte_count != 2 && line->address != 0) {
+            return -1;
+        }
+
+        if (parser->current_rom_index != 0) {
+            rom_add_blob_fixed_as(parser->filename, parser->bin_buf,
+                                  parser->current_rom_index,
+                                  parser->rom_start_address, parser->as);
+        }
+
+        /* save next address to write,
+         * in case of non-contiguous block of memory */
+        parser->next_address_to_write =
+            line->record_type == EXT_SEG_ADDR_RECORD
+                ? ((line->data[0] << 12) | (line->data[1] << 4))
+                : ((line->data[0] << 24) | (line->data[1] << 16));
+        parser->rom_start_address = parser->next_address_to_write;
+        parser->current_rom_index = 0;
+        break;
+
+    case START_SEG_ADDR_RECORD:
+        if (line->byte_count != 4 && line->address != 0) {
+            return -1;
+        }
+
+        /* x86 16-bit CS:IP segmented addressing */
+        *(parser->start_addr) = (((line->data[0] << 8) | line->data[1]) << 4) |
+                                (line->data[2] << 8) | line->data[3];
+        break;
+
+    case START_LINEAR_ADDR_RECORD:
+        if (line->byte_count != 4 && line->address != 0) {
+            return -1;
+        }
+
+        *(parser->start_addr) = (line->data[0] << 24) | (line->data[1] << 16) |
+                                (line->data[2] << 8)  | (line->data[3]);
+        break;
+
+    default:
+        return -1;
+    }
+
+    return parser->total_size;
+}
+
+/* return size or -1 if error */
+static int parse_hex_blob(const char *filename, hwaddr *addr, uint8_t *hex_blob,
+                          size_t hex_blob_size, AddressSpace *as)
+{
+    bool in_process = false; /* avoid re-enter and
+                              * check whether record begin with ':' */
+    uint8_t *end = hex_blob + hex_blob_size;
+    uint8_t our_checksum = 0;
+    uint32_t record_index = 0;
+    HexParser parser = {
+        .filename = filename,
+        .bin_buf = g_malloc(hex_blob_size),
+        .start_addr = addr,
+        .as = as,
+    };
+
+    rom_transaction_begin();
+
+    for (; hex_blob < end; ++hex_blob) {
+        switch (*hex_blob) {
+        case '\r':
+        case '\n':
+            if (!in_process) {
+                break;
+            }
+
+            in_process = false;
+            if ((LEN_EXCEPT_DATA + parser.line.byte_count) * 2 !=
+                    record_index ||
+                our_checksum != 0) {
+                parser.total_size = -1;
+                goto out;
+            }
+
+            if (handle_record_type(&parser) == -1) {
+                parser.total_size = -1;
+                goto out;
+            }
+            break;
+
+        /* start of a new record. */
+        case ':':
+            memset(&parser.line, 0, sizeof(HexLine));
+            in_process = true;
+            record_index = 0;
+            break;
+
+        /* decoding lines */
+        default:
+            if (!parse_record(&parser.line, &our_checksum, *hex_blob,
+                              &record_index, in_process)) {
+                parser.total_size = -1;
+                goto out;
+            }
+            break;
+        }
+    }
+
+out:
+    g_free(parser.bin_buf);
+    rom_transaction_end(parser.total_size != -1);
+    return parser.total_size;
+}
+
+/* return size or -1 if error */
+int load_targphys_hex_as(const char *filename, hwaddr *entry, AddressSpace *as)
+{
+    gsize hex_blob_size;
+    gchar *hex_blob;
+    int total_size = 0;
+
+    if (!g_file_get_contents(filename, &hex_blob, &hex_blob_size, NULL)) {
+        return -1;
+    }
+
+    total_size = parse_hex_blob(filename, entry, (uint8_t *)hex_blob,
+                                hex_blob_size, as);
+
+    g_free(hex_blob);
+    return total_size;
+}