Patchwork Seabios - read e820 table from qemu_cfg

login
register
mail settings
Submitter Jes Sorensen
Date Jan. 26, 2010, 9:52 p.m.
Message ID <4B5F640C.2030907@redhat.com>
Download mbox | patch
Permalink /patch/43749/
State New
Headers show

Comments

Jes Sorensen - Jan. 26, 2010, 9:52 p.m.
Hi,

Based on the feedback I received over the e820 reserve patch, I have
changed it to have QEMU pass in a list of entries that can cover more
than just the TSS/EPT range. This should provide the flexibility that
people were asking for.

The Seabios portion should allow for unlimited sized tables in theory,
whereas for QEMU I have set a fixed limit for now, but it can easily
be extended.

Please let me know what you think of this version!

Cheers,
Jes
Read optional table of e820 entries from qemu_cfg

Read optional table of e820 entries through qemu_cfg, allowing QEMU to
provide the location of KVM's switch area etc. rather than rely on
hard coded values.

For now, fall back to the old hard coded values for the TSS and EPT
switch page for compatibility reasons. Compatibility code could
possibly be removed in the future.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>

---
 src/paravirt.c |   17 +++++++++++++++++
 src/paravirt.h |    9 +++++++++
 src/post.c     |   23 +++++++++++++++++++----
 3 files changed, 45 insertions(+), 4 deletions(-)
Alexander Graf - Jan. 26, 2010, 9:55 p.m.
On 26.01.2010, at 22:53, Jes Sorensen wrote:

> Hi,
> 
> This is the QEMU-KVM part of the patch. If we can agree on this
> approach, I will do a version for upstream QEMU as well.

It shows as attachment again :(.


Alex

> 
> Cheers,
> Jes
> 
> <0011-qemu-kvm-e820-table.patch>
Kevin O'Connor - Jan. 28, 2010, 4:39 a.m.
On Tue, Jan 26, 2010 at 10:52:12PM +0100, Jes Sorensen wrote:
> Read optional table of e820 entries from qemu_cfg
[...]
> --- seabios.orig/src/paravirt.c
> +++ seabios/src/paravirt.c
> @@ -132,6 +132,23 @@ u16 qemu_cfg_smbios_entries(void)
>      return cnt;
>  }
>  
> +u32 qemu_cfg_e820_entries(void)
> +{
> +    u32 cnt;
> +
> +    if (!qemu_cfg_present)
> +        return 0;
> +
> +    qemu_cfg_read_entry(&cnt, QEMU_CFG_E820_TABLE, sizeof(cnt));
> +    return cnt;
> +}
> +
> +void* qemu_cfg_e820_load_next(void *addr)
> +{
> +    qemu_cfg_read(addr, sizeof(struct e820_entry));
> +    return addr;
> +}

I think defining accessor functions for every piece of data passed
through qemu-cfg interface is going to get tiring.  I'd prefer to
extend the existing qemu-cfg "file" interface for new content.

For example, add a helper with something like:

int qemu_cfg_get_file(const char *name, void *dest, int maxsize);

> -    if (kvm_para_available())
> -        // 4 pages before the bios, 3 pages for vmx tss pages, the
> -        // other page for EPT real mode pagetable
> -        add_e820(0xfffbc000, 4*4096, E820_RESERVED);
> +    if (kvm_para_available()) {
> +        u32 count;
> +
> +        count = qemu_cfg_e820_entries();
> +        if (count) {
> +            struct e820_entry entry;
> +            int i;
> +
> +            for (i = 0; i < count; i++) {
> +                qemu_cfg_e820_load_next(&entry);
> +                add_e820(entry.address, entry.length, entry.type);
> +            }

and then this becomes:

struct e820entry map[128];
int len = qemu_cfg_get_file("e820map", &map, sizeof(map));
if (len >= 0)
    for (i=0; i<len / sizeof(map[0]); i++)
        add_e820(map[i].start, map[i].size, map[i].type);

The advantage being that it should be possible to write one set of
helper functions in both qemu and seabios that can then be used to
pass arbitrary content.

As a side note, it should probably do the e820 map check even for qemu
users (ie, not just kvm).

-Kevin
Jes Sorensen - Jan. 29, 2010, 9:03 a.m.
On 01/28/10 05:39, Kevin O'Connor wrote:
> I think defining accessor functions for every piece of data passed
> through qemu-cfg interface is going to get tiring.  I'd prefer to
> extend the existing qemu-cfg "file" interface for new content.
>
> For example, add a helper with something like:
>
> int qemu_cfg_get_file(const char *name, void *dest, int maxsize);

Hi Kevin,

I think switching qemu_cfg to use a file name based interface would be
a nice feature, but I think it should be independent of this patch. I am
CC'ing Gleb on this as he did the original design I believe.

>> -    if (kvm_para_available())
>> -        // 4 pages before the bios, 3 pages for vmx tss pages, the
>> -        // other page for EPT real mode pagetable
>> -        add_e820(0xfffbc000, 4*4096, E820_RESERVED);
>> +    if (kvm_para_available()) {
>> +        u32 count;
>> +
>> +        count = qemu_cfg_e820_entries();
>> +        if (count) {
>> +            struct e820_entry entry;
>> +            int i;
>> +
>> +            for (i = 0; i<  count; i++) {
>> +                qemu_cfg_e820_load_next(&entry);
>> +                add_e820(entry.address, entry.length, entry.type);
>> +            }
>
> and then this becomes:
>
> struct e820entry map[128];
> int len = qemu_cfg_get_file("e820map",&map, sizeof(map));
> if (len>= 0)
>      for (i=0; i<len / sizeof(map[0]); i++)
>          add_e820(map[i].start, map[i].size, map[i].type);
>
> The advantage being that it should be possible to write one set of
> helper functions in both qemu and seabios that can then be used to
> pass arbitrary content.

The only issue here is that I designed the Seabios portion to not rely
on the size of the struct, to avoid having to statically reserve it like
in your example. Having the qemu_cfg_get_file() function return a
pointer to a file descriptor and then have a qemu_cfg_read() helper that
takes the descriptor as it's first argument would avoid this problem.

> As a side note, it should probably do the e820 map check even for qemu
> users (ie, not just kvm).

Ah I didn't realize Seabios would try to use the fw_cfg interface if it
wasn't running on top of QEMU. That would be good to do.

Cheers,
Jes
Gleb Natapov - Jan. 29, 2010, 4:08 p.m.
On Fri, Jan 29, 2010 at 10:03:55AM +0100, Jes Sorensen wrote:
> On 01/28/10 05:39, Kevin O'Connor wrote:
> >I think defining accessor functions for every piece of data passed
> >through qemu-cfg interface is going to get tiring.  I'd prefer to
> >extend the existing qemu-cfg "file" interface for new content.
> >
> >For example, add a helper with something like:
> >
> >int qemu_cfg_get_file(const char *name, void *dest, int maxsize);
> 
> Hi Kevin,
> 
> I think switching qemu_cfg to use a file name based interface would be
> a nice feature, but I think it should be independent of this patch. I am
> CC'ing Gleb on this as he did the original design I believe.
> 
There is already file like interface on top of fw_cfg. Look for
qemu_cfg_read_file(). I am not sure this is a good idea to start using
it for something that is not actually a file. I have no problem with
adding accessors for each new data time. As you noted below this way we
don't need to load the whole e820 map into the memory, but can do entry
by entry.

--
			Gleb.
Kevin O'Connor - Jan. 30, 2010, 3:35 a.m.
On Fri, Jan 29, 2010 at 10:03:55AM +0100, Jes Sorensen wrote:
> On 01/28/10 05:39, Kevin O'Connor wrote:
> >The advantage being that it should be possible to write one set of
> >helper functions in both qemu and seabios that can then be used to
> >pass arbitrary content.
> 
> The only issue here is that I designed the Seabios portion to not rely
> on the size of the struct, to avoid having to statically reserve it like
> in your example. Having the qemu_cfg_get_file() function return a
> pointer to a file descriptor and then have a qemu_cfg_read() helper that
> takes the descriptor as it's first argument would avoid this problem.

SeaBIOS already has a maximum size for the e820 map (32) - see
CONFIG_MAX_E820.

> >As a side note, it should probably do the e820 map check even for qemu
> >users (ie, not just kvm).
> 
> Ah I didn't realize Seabios would try to use the fw_cfg interface if it
> wasn't running on top of QEMU. That would be good to do.

Your patch only used it for kvm.  SeaBIOS will use fw_cfg on both qemu
and kvm.

-Kevin

Patch

Index: seabios/src/paravirt.c
===================================================================
--- seabios.orig/src/paravirt.c
+++ seabios/src/paravirt.c
@@ -132,6 +132,23 @@  u16 qemu_cfg_smbios_entries(void)
     return cnt;
 }
 
+u32 qemu_cfg_e820_entries(void)
+{
+    u32 cnt;
+
+    if (!qemu_cfg_present)
+        return 0;
+
+    qemu_cfg_read_entry(&cnt, QEMU_CFG_E820_TABLE, sizeof(cnt));
+    return cnt;
+}
+
+void* qemu_cfg_e820_load_next(void *addr)
+{
+    qemu_cfg_read(addr, sizeof(struct e820_entry));
+    return addr;
+}
+
 struct smbios_header {
     u16 length;
     u8 type;
Index: seabios/src/paravirt.h
===================================================================
--- seabios.orig/src/paravirt.h
+++ seabios/src/paravirt.h
@@ -36,6 +36,7 @@  static inline int kvm_para_available(voi
 #define QEMU_CFG_ACPI_TABLES		(QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES		(QEMU_CFG_ARCH_LOCAL + 1)
 #define QEMU_CFG_IRQ0_OVERRIDE		(QEMU_CFG_ARCH_LOCAL + 2)
+#define QEMU_CFG_E820_TABLE		(QEMU_CFG_ARCH_LOCAL + 3)
 
 extern int qemu_cfg_present;
 
@@ -61,8 +62,16 @@  typedef struct QemuCfgFile {
     char name[56];
 } QemuCfgFile;
 
+struct e820_entry {
+    u64 address;
+    u64 length;
+    u32 type;
+};
+
 u16 qemu_cfg_first_file(QemuCfgFile *entry);
 u16 qemu_cfg_next_file(QemuCfgFile *entry);
 u32 qemu_cfg_read_file(QemuCfgFile *entry, void *dst, u32 maxlen);
+u32 qemu_cfg_e820_entries(void);
+void* qemu_cfg_e820_load_next(void *addr);
 
 #endif
Index: seabios/src/post.c
===================================================================
--- seabios.orig/src/post.c
+++ seabios/src/post.c
@@ -135,10 +135,25 @@  ram_probe(void)
              , E820_RESERVED);
     add_e820(BUILD_BIOS_ADDR, BUILD_BIOS_SIZE, E820_RESERVED);
 
-    if (kvm_para_available())
-        // 4 pages before the bios, 3 pages for vmx tss pages, the
-        // other page for EPT real mode pagetable
-        add_e820(0xfffbc000, 4*4096, E820_RESERVED);
+    if (kvm_para_available()) {
+        u32 count;
+
+        count = qemu_cfg_e820_entries();
+        if (count) {
+            struct e820_entry entry;
+            int i;
+
+            for (i = 0; i < count; i++) {
+                qemu_cfg_e820_load_next(&entry);
+                add_e820(entry.address, entry.length, entry.type);
+            }
+        } else {
+            // Backwards compatibility - provide hard coded range.
+            // 4 pages before the bios, 3 pages for vmx tss pages, the
+            // other page for EPT real mode pagetable
+            add_e820(0xfffbc000, 4*4096, E820_RESERVED);
+        }
+    }
 
     dprintf(1, "Ram Size=0x%08x (0x%08x%08x high)\n"
             , RamSize, (u32)(RamSizeOver4G >> 32), (u32)RamSizeOver4G);