diff mbox

[v2] arm: add fw_cfg to "virt" board

Message ID 5485AF57.6050902@redhat.com
State New
Headers show

Commit Message

Laszlo Ersek Dec. 8, 2014, 2:01 p.m. UTC
On 11/30/14 17:59, Laszlo Ersek wrote:
> fw_cfg already supports exposure over MMIO (used in ppc/mac_newworld.c,
> ppc/mac_oldworld.c, sparc/sun4m.c); we can easily add it to the "virt"
> board.
> 
> The mmio register block of fw_cfg is advertized in the device tree. As
> base address we pick 0x09020000, which conforms to the comment preceding
> "a15memmap": it falls in the miscellaneous device I/O range 128MB..256MB,
> and it is aligned at 64KB. The DTB properties follow the documentation in
> the Linux source file "Documentation/devicetree/bindings/arm/fw-cfg.txt".
> 
> fw_cfg automatically exports a number of files to the guest; for example,
> "bootorder" (see fw_cfg_machine_reset()).
> 
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
> 
> Notes:
>     v2:
>     - use a single mmio region of size 0x1000
>     - set "compatible" property to "qemu,fw-cfg-mmio"
> 
>  hw/arm/virt.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 314e55b..af794ea 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -68,6 +68,7 @@ enum {
>      VIRT_UART,
>      VIRT_MMIO,
>      VIRT_RTC,
> +    VIRT_FW_CFG,
>  };
>  
>  typedef struct MemMapEntry {
> @@ -107,6 +108,7 @@ static const MemMapEntry a15memmap[] = {
>      [VIRT_GIC_CPU] =    { 0x08010000, 0x00010000 },
>      [VIRT_UART] =       { 0x09000000, 0x00001000 },
>      [VIRT_RTC] =        { 0x09010000, 0x00001000 },
> +    [VIRT_FW_CFG] =     { 0x09020000, 0x00001000 },
>      [VIRT_MMIO] =       { 0x0a000000, 0x00000200 },
>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>      /* 0x10000000 .. 0x40000000 reserved for PCI */
> @@ -519,6 +521,23 @@ static void create_flash(const VirtBoardInfo *vbi)
>      g_free(nodename);
>  }
>  
> +static void create_fw_cfg(const VirtBoardInfo *vbi)
> +{
> +    hwaddr base = vbi->memmap[VIRT_FW_CFG].base;
> +    hwaddr size = vbi->memmap[VIRT_FW_CFG].size;
> +    char *nodename;
> +
> +    fw_cfg_init(0, 0, base, base + 2);
> +
> +    nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base);
> +    qemu_fdt_add_subnode(vbi->fdt, nodename);
> +    qemu_fdt_setprop_string(vbi->fdt, nodename,
> +                            "compatible", "qemu,fw-cfg-mmio");
> +    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
> +                                 2, base, 2, size);
> +    g_free(nodename);
> +}
> +
>  static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
>  {
>      const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
> @@ -604,6 +623,8 @@ static void machvirt_init(MachineState *machine)
>       */
>      create_virtio_devices(vbi, pic);
>  
> +    create_fw_cfg(vbi);
> +
>      vbi->bootinfo.ram_size = machine->ram_size;
>      vbi->bootinfo.kernel_filename = machine->kernel_filename;
>      vbi->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> 

So... after playing with this thing for some time, it's become clear
that "MMIO traps" are painfully slow on the aarch64 platform we've been
working on (using KVM).

The original approach in my guest UEFI patch was a simple loop that
exerted byte-wise access to the fw_cfg device's data register (the only
kind of access that fw_cfg allows ATM). Downloading a kernel image plus
an initrd image byte for byte, which together can total between 30MB and
50MB, takes simply forever.

Drew suggested to try wider accesses. With the following PoC patch:

-------------------
-------------------

(and a corresponding firmware-side patch), we managed to speed it up by
a factor of 7.5.

Peter, can we introduce a second, 64-bit wide, data register, for
fw_cfg? (Or even two -- Drew suggested the LDP instruction for the guest.)

... BTW, re fw_cfg slowness, Rich mentioned this thread from 2010:

http://thread.gmane.org/gmane.comp.emulators.qemu/77582

Thanks
Laszlo

Comments

Peter Maydell Dec. 8, 2014, 2:41 p.m. UTC | #1
On 8 December 2014 at 14:01, Laszlo Ersek <lersek@redhat.com> wrote:
> Peter, can we introduce a second, 64-bit wide, data register, for
> fw_cfg? (Or even two -- Drew suggested the LDP instruction for the guest.)

I don't have an in-principle objection. I do require that
whatever mechanism we provide is usable by 32 bit guests
as well as 64 bit guests. You'll also want the access instruction
to be one where the hardware provides instruction syndrome information
to KVM about the faulting load/store, which means you can only use
AArch64 loads and stores of a single general-purpose register, and
AArch32 instructions LDR, LDA, LDRT, LDRSH, LDRSHT, LDRH, LDAH, LDRHT,
LDRSB, LDRSBT, LDRB, LDAB, LDRBT, STR, STL, STRT, STRH, STLH, STRHT,
STRB, STLB, or STRBT.

Note that this rules out LDP.

We also need to make sure it works with TCG QEMU. (64-bit access
to devices is something we've needed previously in ARM QEMU, so
though in theory it should work it would need testing.)

The other thing you could do is put image and initrd in to guest
memory as we do currently and have the fw_cfg just pass their
addresses. That would require that UEFI not have already stomped
on that part of RAM by the time you get round to looking at it,
though, and I can imagine that might not be feasible.

-- PMM
Peter Maydell Dec. 8, 2014, 2:43 p.m. UTC | #2
On 8 December 2014 at 14:41, Peter Maydell <peter.maydell@linaro.org> wrote:
> We also need to make sure it works with TCG QEMU. (64-bit access
> to devices is something we've needed previously in ARM QEMU, so
> though in theory it should work it would need testing.)

"is not something we've needed previously", obviously.

-- PMM
Laszlo Ersek Dec. 8, 2014, 7:39 p.m. UTC | #3
On 12/08/14 15:41, Peter Maydell wrote:
> On 8 December 2014 at 14:01, Laszlo Ersek <lersek@redhat.com> wrote:
>> Peter, can we introduce a second, 64-bit wide, data register, for
>> fw_cfg? (Or even two -- Drew suggested the LDP instruction for the guest.)
> 
> I don't have an in-principle objection. I do require that
> whatever mechanism we provide is usable by 32 bit guests
> as well as 64 bit guests.

The idea is that in the following:

  fw_cfg_data_mem_read()
    fw_cfg_read()

fw_cfg_data_mem_read() doesn't care about either "addr" or "size", which
happens to be the best for the current problem.

According to the documentation on "MemoryRegionOps" in
"include/exec/memory.h" (and also to my testing in practice), if you set
.valid.max_access_size to something large (definitely up to 8, but maybe
even up to 16?), but keep .impl.max_access_size at something smaller,
then "memory.c" will split up the access for you automatically.
(Ultimately turning the bigger access to several sequential calls to
fw_cfg_read(), which happens to be correct.)

So, if a 32-bit guest never "submits" an access wider than 32-bit, then
that's the largest that "memory.c" will slice and dice.

> You'll also want the access instruction
> to be one where the hardware provides instruction syndrome information
> to KVM about the faulting load/store,

Yes, I definitely remembered this; I just didn't know how to name it
precisely.

> which means you can only use
> AArch64 loads and stores of a single general-purpose register, and
> AArch32 instructions LDR, LDA, LDRT, LDRSH, LDRSHT, LDRH, LDAH, LDRHT,
> LDRSB, LDRSBT, LDRB, LDAB, LDRBT, STR, STL, STRT, STRH, STLH, STRHT,
> STRB, STLB, or STRBT.
> 
> Note that this rules out LDP.

Yes, I was concerned about this (Drew can confirm :)) Thank you very
much for making this clear; this will definitely help with the guest
code. Also we don't have to figure out if .valid.max_access_size=16
would work at all.

> We also need to make sure it works with TCG QEMU. (64-bit access
> to devices is something we've needed previously in ARM QEMU, so
> though in theory it should work it would need testing.)

(Factoring in the typo correction from your next mail:)

Good point. I just tested TCG. While the speedup is similar, the data
that is transferred looks corrupt.

> The other thing you could do is put image and initrd in to guest
> memory as we do currently and have the fw_cfg just pass their
> addresses. That would require that UEFI not have already stomped
> on that part of RAM by the time you get round to looking at it,
> though, and I can imagine that might not be feasible.

Yes, I thought of this. It's not very different from how the initial DTB
is handled right now, theoretically.

The difference is that the DTB is small. It is placed at 0x4000_0000
(1GB, base of DRAM). The earliest phase of the firmare (SEC and first
half of PEI) uses memory at the fixed address 0x4007_c000 (1GB + 496
KB). I would really not want to change that in the firmware.

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/8969/focus=9029

We could place the kernel image and the initrd elsewhere, but (a)
placing stuff in the DRAM in advance puts minimum DRAM size requirements
on the guest (*), which is hard to verify / enforce in the firmware with
blobs of wildly varying size, and more importantly, (b) auditing the
safety of the initial DTB was already the hardest part of the firmware
patchset review by far. (No doubt it was one of the hardest things to
implement for Ard too.)

(*) See later parts in the same message, wrt. "second world". Later
phases need a good chunk of memory counting downwards from the top. See
also the SIZE_128MB assert near
<https://github.com/tianocore/edk2/blob/master/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c#L174>.

The nice thing about fw_cfg is that it lives in the "128MB..256MB [...]
miscellaneous device I/O" space. It's basically impossible to collide
with that in UEFI, yet it can give you various blobs (and qemu can even
compute those dynamically, see ACPI rebuild after PCI enumeration on
x86). I'd *really* like to stick with that.

I'll try to figure out what's wrong with TCG and come up with a patch
for the second data register.

Thanks!
Laszlo
Christopher Covington Dec. 8, 2014, 11:18 p.m. UTC | #4
Hi Laszlo,

On 12/08/2014 09:01 AM, Laszlo Ersek wrote:
> On 11/30/14 17:59, Laszlo Ersek wrote:
>> fw_cfg already supports exposure over MMIO (used in ppc/mac_newworld.c,
>> ppc/mac_oldworld.c, sparc/sun4m.c); we can easily add it to the "virt"
>> board.
>>
>> The mmio register block of fw_cfg is advertized in the device tree. As
>> base address we pick 0x09020000, which conforms to the comment preceding
>> "a15memmap": it falls in the miscellaneous device I/O range 128MB..256MB,
>> and it is aligned at 64KB. The DTB properties follow the documentation in
>> the Linux source file "Documentation/devicetree/bindings/arm/fw-cfg.txt".
>>
>> fw_cfg automatically exports a number of files to the guest; for example,
>> "bootorder" (see fw_cfg_machine_reset()).
>>
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>
>> Notes:
>>     v2:
>>     - use a single mmio region of size 0x1000
>>     - set "compatible" property to "qemu,fw-cfg-mmio"
>>
>>  hw/arm/virt.c | 21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 314e55b..af794ea 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -68,6 +68,7 @@ enum {
>>      VIRT_UART,
>>      VIRT_MMIO,
>>      VIRT_RTC,
>> +    VIRT_FW_CFG,
>>  };
>>  
>>  typedef struct MemMapEntry {
>> @@ -107,6 +108,7 @@ static const MemMapEntry a15memmap[] = {
>>      [VIRT_GIC_CPU] =    { 0x08010000, 0x00010000 },
>>      [VIRT_UART] =       { 0x09000000, 0x00001000 },
>>      [VIRT_RTC] =        { 0x09010000, 0x00001000 },
>> +    [VIRT_FW_CFG] =     { 0x09020000, 0x00001000 },
>>      [VIRT_MMIO] =       { 0x0a000000, 0x00000200 },
>>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>>      /* 0x10000000 .. 0x40000000 reserved for PCI */
>> @@ -519,6 +521,23 @@ static void create_flash(const VirtBoardInfo *vbi)
>>      g_free(nodename);
>>  }
>>  
>> +static void create_fw_cfg(const VirtBoardInfo *vbi)
>> +{
>> +    hwaddr base = vbi->memmap[VIRT_FW_CFG].base;
>> +    hwaddr size = vbi->memmap[VIRT_FW_CFG].size;
>> +    char *nodename;
>> +
>> +    fw_cfg_init(0, 0, base, base + 2);
>> +
>> +    nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base);
>> +    qemu_fdt_add_subnode(vbi->fdt, nodename);
>> +    qemu_fdt_setprop_string(vbi->fdt, nodename,
>> +                            "compatible", "qemu,fw-cfg-mmio");
>> +    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
>> +                                 2, base, 2, size);
>> +    g_free(nodename);
>> +}
>> +
>>  static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
>>  {
>>      const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
>> @@ -604,6 +623,8 @@ static void machvirt_init(MachineState *machine)
>>       */
>>      create_virtio_devices(vbi, pic);
>>  
>> +    create_fw_cfg(vbi);
>> +
>>      vbi->bootinfo.ram_size = machine->ram_size;
>>      vbi->bootinfo.kernel_filename = machine->kernel_filename;
>>      vbi->bootinfo.kernel_cmdline = machine->kernel_cmdline;
>>
> 
> So... after playing with this thing for some time, it's become clear
> that "MMIO traps" are painfully slow on the aarch64 platform we've been
> working on (using KVM).
> 
> The original approach in my guest UEFI patch was a simple loop that
> exerted byte-wise access to the fw_cfg device's data register (the only
> kind of access that fw_cfg allows ATM). Downloading a kernel image plus
> an initrd image byte for byte, which together can total between 30MB and
> 50MB, takes simply forever.

Just a thought--would it be possible to add a
DMA-the-whole-thing-to-this-address register to the simulated device?

Chris
Peter Maydell Dec. 8, 2014, 11:51 p.m. UTC | #5
On 8 December 2014 at 23:18, Christopher Covington <cov@codeaurora.org> wrote:
> Just a thought--would it be possible to add a
> DMA-the-whole-thing-to-this-address register to the simulated device?

That was an idea raised in the 2010 thread that Laszlo mentioned:
http://thread.gmane.org/gmane.comp.emulators.qemu/77582

It's a big old thread so I only skimmed it, but there seemed to
be some fairly heavy pushback on the idea -- has anything much
changed in the intervening time to make it a better plan this
time around?

-- PMM
Laszlo Ersek Dec. 9, 2014, 12:05 a.m. UTC | #6
On 12/09/14 00:18, Christopher Covington wrote:
> Hi Laszlo,
> 
> On 12/08/2014 09:01 AM, Laszlo Ersek wrote:
>> On 11/30/14 17:59, Laszlo Ersek wrote:
>>> fw_cfg already supports exposure over MMIO (used in ppc/mac_newworld.c,
>>> ppc/mac_oldworld.c, sparc/sun4m.c); we can easily add it to the "virt"
>>> board.
>>>
>>> The mmio register block of fw_cfg is advertized in the device tree. As
>>> base address we pick 0x09020000, which conforms to the comment preceding
>>> "a15memmap": it falls in the miscellaneous device I/O range 128MB..256MB,
>>> and it is aligned at 64KB. The DTB properties follow the documentation in
>>> the Linux source file "Documentation/devicetree/bindings/arm/fw-cfg.txt".
>>>
>>> fw_cfg automatically exports a number of files to the guest; for example,
>>> "bootorder" (see fw_cfg_machine_reset()).
>>>
>>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>>> ---
>>>
>>> Notes:
>>>     v2:
>>>     - use a single mmio region of size 0x1000
>>>     - set "compatible" property to "qemu,fw-cfg-mmio"
>>>
>>>  hw/arm/virt.c | 21 +++++++++++++++++++++
>>>  1 file changed, 21 insertions(+)
>>>
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 314e55b..af794ea 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -68,6 +68,7 @@ enum {
>>>      VIRT_UART,
>>>      VIRT_MMIO,
>>>      VIRT_RTC,
>>> +    VIRT_FW_CFG,
>>>  };
>>>  
>>>  typedef struct MemMapEntry {
>>> @@ -107,6 +108,7 @@ static const MemMapEntry a15memmap[] = {
>>>      [VIRT_GIC_CPU] =    { 0x08010000, 0x00010000 },
>>>      [VIRT_UART] =       { 0x09000000, 0x00001000 },
>>>      [VIRT_RTC] =        { 0x09010000, 0x00001000 },
>>> +    [VIRT_FW_CFG] =     { 0x09020000, 0x00001000 },
>>>      [VIRT_MMIO] =       { 0x0a000000, 0x00000200 },
>>>      /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
>>>      /* 0x10000000 .. 0x40000000 reserved for PCI */
>>> @@ -519,6 +521,23 @@ static void create_flash(const VirtBoardInfo *vbi)
>>>      g_free(nodename);
>>>  }
>>>  
>>> +static void create_fw_cfg(const VirtBoardInfo *vbi)
>>> +{
>>> +    hwaddr base = vbi->memmap[VIRT_FW_CFG].base;
>>> +    hwaddr size = vbi->memmap[VIRT_FW_CFG].size;
>>> +    char *nodename;
>>> +
>>> +    fw_cfg_init(0, 0, base, base + 2);
>>> +
>>> +    nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base);
>>> +    qemu_fdt_add_subnode(vbi->fdt, nodename);
>>> +    qemu_fdt_setprop_string(vbi->fdt, nodename,
>>> +                            "compatible", "qemu,fw-cfg-mmio");
>>> +    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
>>> +                                 2, base, 2, size);
>>> +    g_free(nodename);
>>> +}
>>> +
>>>  static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
>>>  {
>>>      const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
>>> @@ -604,6 +623,8 @@ static void machvirt_init(MachineState *machine)
>>>       */
>>>      create_virtio_devices(vbi, pic);
>>>  
>>> +    create_fw_cfg(vbi);
>>> +
>>>      vbi->bootinfo.ram_size = machine->ram_size;
>>>      vbi->bootinfo.kernel_filename = machine->kernel_filename;
>>>      vbi->bootinfo.kernel_cmdline = machine->kernel_cmdline;
>>>
>>
>> So... after playing with this thing for some time, it's become clear
>> that "MMIO traps" are painfully slow on the aarch64 platform we've been
>> working on (using KVM).
>>
>> The original approach in my guest UEFI patch was a simple loop that
>> exerted byte-wise access to the fw_cfg device's data register (the only
>> kind of access that fw_cfg allows ATM). Downloading a kernel image plus
>> an initrd image byte for byte, which together can total between 30MB and
>> 50MB, takes simply forever.
> 
> Just a thought--would it be possible to add a
> DMA-the-whole-thing-to-this-address register to the simulated device?

That's exactly what Rich proposed in the 2010 thread that I referenced a
few hours ago.

http://thread.gmane.org/gmane.comp.emulators.qemu/77582/focus=77589

(search for '"DMA"-like')

The original patchset is archived at

http://thread.gmane.org/gmane.comp.emulators.qemu/76610/focus=76673

which was shot down; the rebased one got an (unrelated) nack three
months later.

http://thread.gmane.org/gmane.comp.emulators.qemu/77625

Thanks
Laszlo
Gerd Hoffmann Dec. 9, 2014, 9:31 a.m. UTC | #7
Hi,

> So... after playing with this thing for some time, it's become clear
> that "MMIO traps" are painfully slow on the aarch64 platform we've been
> working on (using KVM).

So, as we don't have compatibility requirements, and we also can't play
tricks like using x86 string instructions:  How about a completely
different, dma-style interface for fw_cfg access?

One register for the (physical) target address.
One register for the transfer size.
One register for the fw_cfg entry.
Possibly one register for the fw_cfg offset (not really needed, but
avoids the need for side effects such as writing fw_cfg entry register
clearing the offset).
One register to kick the transfer.

cheers,
  Gerd
Laszlo Ersek Dec. 9, 2014, 3:41 p.m. UTC | #8
On 12/09/14 10:31, Gerd Hoffmann wrote:
>   Hi,
> 
>> So... after playing with this thing for some time, it's become clear
>> that "MMIO traps" are painfully slow on the aarch64 platform we've been
>> working on (using KVM).
> 
> So, as we don't have compatibility requirements, and we also can't play
> tricks like using x86 string instructions:  How about a completely
> different, dma-style interface for fw_cfg access?
> 
> One register for the (physical) target address.
> One register for the transfer size.
> One register for the fw_cfg entry.
> Possibly one register for the fw_cfg offset (not really needed, but
> avoids the need for side effects such as writing fw_cfg entry register
> clearing the offset).
> One register to kick the transfer.

Again, this was the idea that Rich had in 2010 (see the links in the
discussion thus far). It was rejected back then (which is why I didn't
even try to resurrect it now), and Peter has now asked if anything has
changed that would make that approach more acceptable now.

"Avi and Gleb not hanging around so they can't block the approach now"
might or might not be such a change; I don't know. But, I tend to trust
their opinion, even if dates back to 2010 in this case.

Anyway I'm certainly not opposed to performance, so if someone can
thoroughly refute everything they said in that thread, go ahead.

Thanks!
Laszlo
Peter Maydell Dec. 9, 2014, 3:47 p.m. UTC | #9
On 9 December 2014 at 15:41, Laszlo Ersek <lersek@redhat.com> wrote:
> Again, this was the idea that Rich had in 2010 (see the links in the
> discussion thus far). It was rejected back then (which is why I didn't
> even try to resurrect it now), and Peter has now asked if anything has
> changed that would make that approach more acceptable now.
>
> "Avi and Gleb not hanging around so they can't block the approach now"
> might or might not be such a change; I don't know. But, I tend to trust
> their opinion, even if dates back to 2010 in this case.

Yeah, this is about my point of view. A direct-write-to-memory
fw_cfg doesn't seem too unreasonable to me, but it would be nice
to see a solidly argued case from its proponents to counterbalance
the previous rejection.

> Anyway I'm certainly not opposed to performance, so if someone can
> thoroughly refute everything they said in that thread, go ahead.

As far as I can tell the main line of argument was "if you're
using fw_cfg to transfer huge amounts of data you're doing
something wrong".

Is ARM much higher overhead than x86 for these accesses?

(Added Alex on cc as one of the few people from the previous
round of argument who's still active...)

-- PMM
Richard W.M. Jones Dec. 9, 2014, 3:54 p.m. UTC | #10
On Tue, Dec 09, 2014 at 03:47:44PM +0000, Peter Maydell wrote:
> On 9 December 2014 at 15:41, Laszlo Ersek <lersek@redhat.com> wrote:
> > Again, this was the idea that Rich had in 2010 (see the links in the
> > discussion thus far). It was rejected back then (which is why I didn't
> > even try to resurrect it now), and Peter has now asked if anything has
> > changed that would make that approach more acceptable now.
> >
> > "Avi and Gleb not hanging around so they can't block the approach now"
> > might or might not be such a change; I don't know. But, I tend to trust
> > their opinion, even if dates back to 2010 in this case.
> 
> Yeah, this is about my point of view. A direct-write-to-memory
> fw_cfg doesn't seem too unreasonable to me, but it would be nice
> to see a solidly argued case from its proponents to counterbalance
> the previous rejection.

The argument for is that it's essentially instantaneous.

This is a big deal for libguestfs where even our new initrd (1.5 MB)
takes a noticable fraction of a second to load, and we want to get our
total start-up time down to 3 seconds to match x86.

> > Anyway I'm certainly not opposed to performance, so if someone can
> > thoroughly refute everything they said in that thread, go ahead.
> 
> As far as I can tell the main line of argument was "if you're
> using fw_cfg to transfer huge amounts of data you're doing
> something wrong".
> 
> Is ARM much higher overhead than x86 for these accesses?

On a slightly related topic, virtio-mmio traps are slow:

https://bugzilla.redhat.com/show_bug.cgi?id=1123555

This also kills libguestfs performance -- throughput this time, rather
than start-up time.

Rich.
Paolo Bonzini Dec. 10, 2014, 11:13 a.m. UTC | #11
On 09/12/2014 16:54, Richard W.M. Jones wrote:
> On a slightly related topic, virtio-mmio traps are slow:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1123555

I suspect it's not virtio-mmio, but just KVM on ARM.

It would be nice to have a timing test for vmexits in kvm-unit-tests,
similar to what is already there for x86.

Paolo
Andrew Jones Dec. 10, 2014, 1:10 p.m. UTC | #12
On Wed, Dec 10, 2014 at 12:13:34PM +0100, Paolo Bonzini wrote:
> On 09/12/2014 16:54, Richard W.M. Jones wrote:
> > On a slightly related topic, virtio-mmio traps are slow:
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1123555
> 
> I suspect it's not virtio-mmio, but just KVM on ARM.
> 
> It would be nice to have a timing test for vmexits in kvm-unit-tests,
> similar to what is already there for x86.
> 

Christoffer had some for his initial kernel selftests, and at one
point he was porting them to an early version of kvm-unit-tests/arm.
We can resurrect that shortly, as I'll be posting kvm-unit-tests/arm64
today.

drew
diff mbox

Patch

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 584c40d..739e19c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -527,7 +527,7 @@  static void create_fw_cfg(const VirtBoardInfo *vbi)
     hwaddr size = vbi->memmap[VIRT_FW_CFG].size;
     char *nodename;

-    fw_cfg_init(0, 0, base, base + 2);
+    fw_cfg_init(0, 0, base, base + 8);

     nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base);
     qemu_fdt_add_subnode(vbi->fdt, nodename);
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index a7122ee..7147fea 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -322,8 +322,9 @@  static const MemoryRegionOps fw_cfg_data_mem_ops = {
     .endianness = DEVICE_NATIVE_ENDIAN,
     .valid = {
         .min_access_size = 1,
-        .max_access_size = 1,
+        .max_access_size = 8,
     },
+    .impl.max_access_size = 1,
 };

 static const MemoryRegionOps fw_cfg_comb_mem_ops = {