Message ID | 3d9d904a1e4939a147f8954c9e0d4cdaf3d44c31.1314033132.git.jan.kiszka@siemens.com |
---|---|
State | New |
On 08/22/2011 08:12 PM, Jan Kiszka wrote:
> Most VGA memory access modes require MMIO handling as they demand weird
> logic to get a byte from or into the video RAM. However, there is one
> exception: chain 4 mode with all memory planes enabled for writing. This
> mode actually allows linear mapping, which can then be combined with
> dirty logging to accelerate KVM.
>
> This patch accelerates specifically VBE accesses like they are used by
> grub in graphical mode. Not only the standard VGA adapter benefits from
> this, also vmware and spice in VGA mode.
>

On which version of grub does this work? This isn't having any effect on my Fedora grub.
On 2011-08-25 09:19, Avi Kivity wrote:
> On 08/22/2011 08:12 PM, Jan Kiszka wrote:
>> Most VGA memory access modes require MMIO handling as they demand weird
>> logic to get a byte from or into the video RAM. However, there is one
>> exception: chain 4 mode with all memory planes enabled for writing. This
>> mode actually allows linear mapping, which can then be combined with
>> dirty logging to accelerate KVM.
>>
>> This patch accelerates specifically VBE accesses like they are used by
>> grub in graphical mode. Not only the standard VGA adapter benefits from
>> this, also vmware and spice in VGA mode.
>>
>
> On which version of grub does this work? This isn't having any effect
> on my Fedora grub.

It's both grub1 (0.97) in graphical mode as used by OpenSUSE 11.4 and grub2 (1.99-rc1) of Ubuntu 11.04.

Is Fedora's grub still slow or was it already fast before? Which Fedora?

Jan
On 08/25/2011 12:07 PM, Jan Kiszka wrote:
> > On which version of grub does this work? This isn't having any effect
> > on my Fedora grub.
>
> It's both grub1 (0.97) in graphical mode as used by OpenSUSE 11.4 and
> grub2 (1.99-rc1) of Ubuntu 11.04.
>
> Is Fedora's grub still slow or was it already fast before? Which Fedora?

Still slow. I tried an old F11 image I had lying around, and -snapshot /dev/sda, but this laptop was installed many years ago. I'll download some more images and try.
On 2011-08-25 11:16, Avi Kivity wrote:
> On 08/25/2011 12:07 PM, Jan Kiszka wrote:
>>> On which version of grub does this work? This isn't having any effect
>>> on my Fedora grub.
>>
>> It's both grub1 (0.97) in graphical mode as used by OpenSUSE 11.4 and
>> grub2 (1.99-rc1) of Ubuntu 11.04.
>>
>> Is Fedora's grub still slow or was it already fast before? Which Fedora?
>>
>
> Still slow. I tried an old F11 image I had lying around, and -snapshot
> /dev/sda, but this laptop was installed many years ago. I'll download
> some more images and try.

You may also want to instrument vga_update_memory_access if some requirement for linear mapping is not fulfilled.

Jan
On 08/25/2011 12:21 PM, Jan Kiszka wrote:
> > Still slow. I tried an old F11 image I had lying around, and -snapshot
> > /dev/sda, but this laptop was installed many years ago. I'll download
> > some more images and try.
>
> You may also want to instrument vga_update_memory_access if some
> requirement for linear mapping is not fulfilled.

Plain F15 is slow. SR2 = SR4 = 0.
On 2011-08-25 12:45, Avi Kivity wrote:
> On 08/25/2011 12:21 PM, Jan Kiszka wrote:
>>> Still slow. I tried an old F11 image I had lying around, and -snapshot
>>> /dev/sda, but this laptop was installed many years ago. I'll download
>>> some more images and try.
>>
>> You may also want to instrument vga_update_memory_access if some
>> requirement for linear mapping is not fulfilled.
>>
>
> Plain F15 is slow. SR2 = SR4 = 0.

So it's not using chain4 mode. Can you check what mode the adapter is actually in and how VRAM is accessed? Likely, there is nothing we can do about it. /me just wonders what makes F15 grub behave differently from other distro's versions...

Jan
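For readers following along, the check that Avi's register values fail can be sketched in isolation. This is only an illustration of the condition used in vga_update_memory_access() in the patch (SR2 low nibble equal to 0xf and SR4 bit 3 set); the enum and the function name chain4_blockers are hypothetical and do not exist in hw/vga.c.

```c
#include <stdint.h>

/* Hypothetical flag names for illustration; not part of hw/vga.c. */
enum chain4_blocker {
    CHAIN4_PLANES_MASKED = 1 << 0, /* SR2[3:0] != 0xf: not all planes writable */
    CHAIN4_BIT_CLEAR     = 1 << 1  /* SR4 bit 3 clear: chain 4 not selected */
};

/*
 * Report which requirements for the linear alias are not met.
 * Mirrors the test in the patch:
 *   (s->sr[0x02] & 0xf) == 0xf && s->sr[0x04] & 0x08
 * Returns 0 when the alias would be installed.
 */
static unsigned chain4_blockers(uint8_t sr2, uint8_t sr4)
{
    unsigned why = 0;

    if ((sr2 & 0x0f) != 0x0f) {
        why |= CHAIN4_PLANES_MASKED;
    }
    if (!(sr4 & 0x08)) {
        why |= CHAIN4_BIT_CLEAR;
    }
    return why;
}
```

With the values reported for F15 (SR2 = SR4 = 0) both flags come back set, so neither half of the condition holds and the MMIO path stays in use.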
On 08/25/2011 01:51 PM, Jan Kiszka wrote:
> > Plain F15 is slow. SR2 = SR4 = 0.
>
> So it's not using chain4 mode. Can you check what mode the adapter is
> actually in and how VRAM is accessed? Likely, there is nothing we can do
> about it. /me just wonders what makes F15 grub behave differently from
> other distro's versions...

Do you remember offhand which registers I need to look at?

We need a collection of gdb scripts to find a qdev, dump the memory hierarchy, look up gpas, and get info about specific devices.
On 2011-08-25 13:19, Avi Kivity wrote:
> On 08/25/2011 01:51 PM, Jan Kiszka wrote:
>>> Plain F15 is slow. SR2 = SR4 = 0.
>>
>> So it's not using chain4 mode. Can you check what mode the adapter is
>> actually in and how VRAM is accessed? Likely, there is nothing we can do
>> about it. /me just wonders what makes F15 grub behave differently from
>> other distro's versions...
>>
>
> Do you remember offhand which registers I need to look at?

No, just that there are quite a few. You could check e.g. http://www.osdever.net/FreeVGA/vga/vga.htm for descriptions. Also check what paths the mmio handlers take.

> We need a collection of gdb scripts to find a qdev, dump the memory
> hierarchy, look up gpas, and get info about specific devices.

...or finally have a "device-show" monitor command to dump the device state. More stable /wrt QEMU internals.

Jan
Jan Kiszka wrote:
> Most VGA memory access modes require MMIO handling as they demand weird
> logic to get a byte from or into the video RAM. However, there is one
> exception: chain 4 mode with all memory planes enabled for writing. This
> mode actually allows linear mapping, which can then be combined with
> dirty logging to accelerate KVM.
>
> This patch accelerates specifically VBE accesses like they are used by
> grub in graphical mode. Not only the standard VGA adapter benefits from
> this, also vmware and spice in VGA mode.
>
> CC: Gerd Hoffmann <kraxel@redhat.com>
> CC: Avi Kivity <avi@redhat.com>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>

[...]

> +static void vga_update_memory_access(VGACommonState *s)
> +{
> +    MemoryRegion *region, *old_region = s->chain4_alias;
> +    target_phys_addr_t base, offset, size;
> +
> +    s->chain4_alias = NULL;
> +
> +    if ((s->sr[0x02] & 0xf) == 0xf && s->sr[0x04] & 0x08) {
> +        offset = 0;
> +        switch ((s->gr[6] >> 2) & 3) {
> +        case 0:
> +            base = 0xa0000;
> +            size = 0x20000;
> +            break;
> +        case 1:
> +            base = 0xa0000;
> +            size = 0x10000;
> +            offset = s->bank_offset;
> +            break;
> +        case 2:
> +            base = 0xb0000;
> +            size = 0x8000;
> +            break;
> +        case 3:
> +            base = 0xb8000;
> +            size = 0x8000;
> +            break;
> +        }
> +        region = g_malloc(sizeof(*region));
> +        memory_region_init_alias(region, "vga.chain4", &s->vram, offset, size);
> +        memory_region_add_subregion_overlap(s->legacy_address_space, base,
> +                                            region, 2);
>

This one eventually gives me the following in info mtree with -M g3beige on qemu-system-ppc:

(qemu) info mtree
memory
system addr 00000000 off 00000000 size 7fffffffffffffff
-vga.chain4 addr 000a0000 off 00000000 size 10000
-macio addr 80880000 off 00000000 size 80000
--macio-nvram addr 00060000 off 00000000 size 20000
--pmac-ide addr 00020000 off 00000000 size 1000
--cuda addr 00016000 off 00000000 size 2000
--escc-bar addr 00013000 off 00000000 size 40
--dbdma addr 00008000 off 00000000 size 1000
--heathrow-pic addr 00000000 off 00000000 size 1000
-vga.rom addr 80800000 off 00000000 size 10000
-vga.vram addr 80000000 off 00000000 size 800000
-vga-lowmem addr 800a0000 off 00000000 size 20000
-escc addr 80013000 off 00000000 size 40
-isa-mmio addr fe000000 off 00000000 size 200000
I/O
io addr 00000000 off 00000000 size 10000
-cmd646-bmdma addr 00000700 off 00000000 size 10
--cmd646-bmdma-ioport addr 0000000c off 00000000 size 4
--cmd646-bmdma-bus addr 00000008 off 00000000 size 4
--cmd646-bmdma-ioport addr 00000004 off 00000000 size 4
--cmd646-bmdma-bus addr 00000000 off 00000000 size 4
-cmd646-cmd addr 00000680 off 00000000 size 4
-cmd646-data addr 00000600 off 00000000 size 8
-cmd646-cmd addr 00000580 off 00000000 size 4
-cmd646-data addr 00000500 off 00000000 size 8
-ne2000 addr 00000400 off 00000000 size 100

This ends up overmapping 0xa0000, effectively overwriting kernel data. If I #if 0 the offending chunk out, everything is fine. I would assume that chain4 really needs to be inside of lowmem? No idea about VGA, but I'm sure you know what's going on :).

Alex
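The mismatch visible in the mtree dump can be reduced to a few lines of arithmetic. The sketch below only restates what the patch and the dump show: vga_init() registers the MMIO window at isa_mem_base + 0x000a0000, while the new alias is registered at the bare legacy base. The function names are hypothetical, and the value 0x80000000 for isa_mem_base on g3beige is inferred from the vga-lowmem address in the dump rather than taken from the board code.

```c
#include <stdint.h>

#define VGA_LEGACY_BASE 0x000a0000u

/* Address of the vga-lowmem MMIO window: vga_init() adds isa_mem_base. */
static uint64_t lowmem_addr(uint64_t isa_mem_base)
{
    return isa_mem_base + VGA_LEGACY_BASE;
}

/*
 * Address of the vga.chain4 alias with the patch as posted: the base from
 * the GR6 decode is used directly, so isa_mem_base is ignored.
 */
static uint64_t chain4_addr_as_posted(uint64_t isa_mem_base)
{
    (void)isa_mem_base;
    return VGA_LEGACY_BASE;
}
```

With isa_mem_base at 0 (the PC case) both functions return 0xa0000 and the alias sits on top of the MMIO window as intended. With 0x80000000 they return 0x800a0000 and 0xa0000 respectively, which matches the two entries in the dump and explains why guest RAM at 0xa0000 gets shadowed.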
On Mon, Sep 12, 2011 at 3:20 PM, Alexander Graf <agraf@suse.de> wrote:
> Jan Kiszka wrote:
>> [...]
>> +        memory_region_add_subregion_overlap(s->legacy_address_space, base,
>> +                                            region, 2);
>>
>
> This one eventually gives me the following in info mtree with -M g3beige
> on qemu-system-ppc:
>
> (qemu) info mtree

:-)

> memory
> system addr 00000000 off 00000000 size 7fffffffffffffff
> -vga.chain4 addr 000a0000 off 00000000 size 10000
> [...]
> -vga-lowmem addr 800a0000 off 00000000 size 20000
> [...]
>
> This ends up overmapping 0xa0000, effectively overwriting kernel data.
> If I #if 0 the offending chunk out, everything is fine. I would assume
> that chain4 really needs to be inside of lowmem? No idea about VGA, but
> I'm sure you know what's going on :).

I had similar problems with sun4u, fixed with f69539b14bdba7a5cd22e1f4bed439b476b17286. I think also here, PCI should be given a memory range at 0x80000000 and VGA should automatically use that like.
On 12.09.2011, at 22:21, Blue Swirl wrote:
> On Mon, Sep 12, 2011 at 3:20 PM, Alexander Graf <agraf@suse.de> wrote:
>> [...]
>> This ends up overmapping 0xa0000, effectively overwriting kernel data.
>> If I #if 0 the offending chunk out, everything is fine. I would assume
>> that chain4 really needs to be inside of lowmem? No idea about VGA, but
>> I'm sure you know what's going on :).
>
> I had similar problems with sun4u, fixed with
> f69539b14bdba7a5cd22e1f4bed439b476b17286. I think also here, PCI
> should be given a memory range at 0x80000000 and VGA should
> automatically use that like.

Yeah, usually the ISA bus is behind an ISA-PCI bridge, so it should inherit the offset from its parent. Or do you mean something different?

Alex
On 09/13/2011 09:54 AM, Alexander Graf wrote:
> > I had similar problems with sun4u, fixed with
> > f69539b14bdba7a5cd22e1f4bed439b476b17286. I think also here, PCI
> > should be given a memory range at 0x80000000 and VGA should
> > automatically use that like.
>
> Yeah, usually the ISA bus is behind an ISA-PCI bridge, so it should inherit the offset from its parent. Or do you mean something different?

He means that isa_mem_base should go away; instead isa_address_space() should be a subregion at offset 0x80000000.

Which vga variant are you using?
On 13.09.2011, at 09:51, Avi Kivity wrote:
> On 09/13/2011 09:54 AM, Alexander Graf wrote:
>> > I had similar problems with sun4u, fixed with
>> > f69539b14bdba7a5cd22e1f4bed439b476b17286. I think also here, PCI
>> > should be given a memory range at 0x80000000 and VGA should
>> > automatically use that like.
>>
>> Yeah, usually the ISA bus is behind an ISA-PCI bridge, so it should inherit the offset from its parent. Or do you mean something different?
>>
>
> He means that isa_mem_base should go away; instead isa_address_space() should be a subregion at offset 0x80000000.

So we are talking about the same thing. Logically speaking, ISA devices are behind the ISA-PCI bridge, so the parent would be the bridge, right?

> Which vga variant are you using?

This is stdvga.

Alex
On 09/13/2011 10:54 AM, Alexander Graf wrote:
> >> Yeah, usually the ISA bus is behind an ISA-PCI bridge, so it should inherit the offset from its parent. Or do you mean something different?
> >>
> >
> > He means that isa_mem_base should go away; instead isa_address_space() should be a subregion at offset 0x80000000.
>
> So we are talking about the same thing. Logically speaking, ISA devices are behind the ISA-PCI bridge, so the parent would be the bridge, right?

Right. system_memory -> pci_address_space() -> isa_address_space() -> various vga areas.

> > Which vga variant are you using?
>
> This is stdvga.

Don't see the call to vga_init() in that path (this is what passes isa_address_space() to the vga core).
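The nesting Avi describes (system_memory -> pci_address_space() -> isa_address_space() -> vga areas) can be modelled with a toy structure in which every region stores only its offset inside its parent. This is not QEMU's MemoryRegion API; struct region, absolute_addr() and g3beige_chain4_addr() are hypothetical names, and putting the 0x80000000 offset on the ISA region is an assumption taken from Avi's wording above.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of a nested address space; not QEMU's MemoryRegion. */
struct region {
    const char *name;
    const struct region *parent; /* NULL for the root */
    uint64_t offset;             /* position inside the parent */
};

/* Resolve a region-local address by summing offsets up to the root. */
static uint64_t absolute_addr(const struct region *r, uint64_t local)
{
    uint64_t addr = local;

    for (; r != NULL; r = r->parent) {
        addr += r->offset;
    }
    return addr;
}

/* Example hierarchy: the VGA code only knows the legacy base 0xa0000. */
static uint64_t g3beige_chain4_addr(void)
{
    const struct region sys    = { "system_memory", NULL, 0 };
    const struct region pci    = { "pci", &sys, 0 };
    const struct region isa    = { "isa", &pci, 0x80000000u };
    const struct region chain4 = { "vga.chain4", &isa, 0x000a0000u };

    return absolute_addr(&chain4, 0);
}
```

In this model the VGA core never adds a platform offset itself: it registers at 0xa0000 relative to the ISA space and the parent chain places it at 0x800a0000, which is why a global such as isa_mem_base becomes unnecessary.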
diff --git a/hw/vga.c b/hw/vga.c
index 432d2cb..851fd68 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -152,6 +152,48 @@ static void vga_screen_dump(void *opaque, const char *filename);
 static char *screen_dump_filename;
 static DisplayChangeListener *screen_dump_dcl;
 
+static void vga_update_memory_access(VGACommonState *s)
+{
+    MemoryRegion *region, *old_region = s->chain4_alias;
+    target_phys_addr_t base, offset, size;
+
+    s->chain4_alias = NULL;
+
+    if ((s->sr[0x02] & 0xf) == 0xf && s->sr[0x04] & 0x08) {
+        offset = 0;
+        switch ((s->gr[6] >> 2) & 3) {
+        case 0:
+            base = 0xa0000;
+            size = 0x20000;
+            break;
+        case 1:
+            base = 0xa0000;
+            size = 0x10000;
+            offset = s->bank_offset;
+            break;
+        case 2:
+            base = 0xb0000;
+            size = 0x8000;
+            break;
+        case 3:
+            base = 0xb8000;
+            size = 0x8000;
+            break;
+        }
+        region = g_malloc(sizeof(*region));
+        memory_region_init_alias(region, "vga.chain4", &s->vram, offset, size);
+        memory_region_add_subregion_overlap(s->legacy_address_space, base,
+                                            region, 2);
+        s->chain4_alias = region;
+    }
+    if (old_region) {
+        memory_region_del_subregion(s->legacy_address_space, old_region);
+        memory_region_destroy(old_region);
+        g_free(old_region);
+        s->plane_updated = 0xf;
+    }
+}
+
 static void vga_dumb_update_retrace_info(VGACommonState *s)
 {
     (void) s;
@@ -445,6 +487,7 @@ void vga_ioport_write(void *opaque, uint32_t addr, uint32_t val)
 #endif
         s->sr[s->sr_index] = val & sr_mask[s->sr_index];
         if (s->sr_index == 1) s->update_retrace_info(s);
+        vga_update_memory_access(s);
         break;
     case 0x3c7:
         s->dac_read_index = val;
@@ -472,6 +515,7 @@ void vga_ioport_write(void *opaque, uint32_t addr, uint32_t val)
         printf("vga: write GR%x = 0x%02x\n", s->gr_index, val);
 #endif
         s->gr[s->gr_index] = val & gr_mask[s->gr_index];
+        vga_update_memory_access(s);
         break;
     case 0x3b4:
     case 0x3d4:
@@ -605,6 +649,7 @@ static void vbe_ioport_write_data(void *opaque, uint32_t addr, uint32_t val)
             }
             s->vbe_regs[s->vbe_index] = val;
             s->bank_offset = (val << 16);
+            vga_update_memory_access(s);
             break;
         case VBE_DISPI_INDEX_ENABLE:
             if ((val & VBE_DISPI_ENABLED) &&
@@ -664,6 +709,7 @@ static void vbe_ioport_write_data(void *opaque, uint32_t addr, uint32_t val)
             }
             s->dac_8bit = (val & VBE_DISPI_8BIT_DAC) > 0;
             s->vbe_regs[s->vbe_index] = val;
+            vga_update_memory_access(s);
             break;
         case VBE_DISPI_INDEX_VIRT_WIDTH:
             {
@@ -1238,7 +1284,7 @@ static void vga_draw_text(VGACommonState *s, int full_update)
         s->font_offsets[1] = offset;
         full_update = 1;
     }
-    if (s->plane_updated & (1 << 2)) {
+    if (s->plane_updated & (1 << 2) || s->chain4_alias) {
         /* if the plane 2 was modified since the last display, it
            indicates the font may have been modified */
         s->plane_updated = 0;
@@ -1885,6 +1931,7 @@ void vga_common_reset(VGACommonState *s)
         memset(&s->retrace_info, 0, sizeof (s->retrace_info));
         break;
     }
+    vga_update_memory_access(s);
 }
 
 static void vga_reset(void *opaque)
@@ -2242,6 +2289,8 @@ void vga_init(VGACommonState *s, MemoryRegion *address_space)
 
     s->bank_offset = 0;
 
+    s->legacy_address_space = address_space;
+
     vga_io_memory = vga_init_io(s);
     memory_region_add_subregion_overlap(address_space,
                                         isa_mem_base + 0x000a0000,
diff --git a/hw/vga_int.h b/hw/vga_int.h
index d2dd7dd..28b9287 100644
--- a/hw/vga_int.h
+++ b/hw/vga_int.h
@@ -105,11 +105,13 @@ typedef uint8_t (* vga_retrace_fn)(struct VGACommonState *s);
 typedef void (* vga_update_retrace_info_fn)(struct VGACommonState *s);
 
 typedef struct VGACommonState {
+    MemoryRegion *legacy_address_space;
     uint8_t *vram_ptr;
     MemoryRegion vram;
     uint32_t vram_size;
     uint32_t latch;
     uint32_t lfb_vram_mapped; /* whether 0xa0000 is mapped as ram */
+    MemoryRegion *chain4_alias;
     uint8_t sr_index;
     uint8_t sr[256];
     uint8_t gr_index;
Most VGA memory access modes require MMIO handling as they demand weird
logic to get a byte from or into the video RAM. However, there is one
exception: chain 4 mode with all memory planes enabled for writing. This
mode actually allows linear mapping, which can then be combined with
dirty logging to accelerate KVM.

This patch accelerates specifically VBE accesses like they are used by
grub in graphical mode. Not only the standard VGA adapter benefits from
this, also vmware and spice in VGA mode.

CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Avi Kivity <avi@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/vga.c     |   51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/vga_int.h |    2 ++
 2 files changed, 52 insertions(+), 1 deletions(-)
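As a reading aid, the window selection inside vga_update_memory_access() can be pulled out into a standalone helper. The table of bases and sizes is copied from the switch on bits 3:2 of GR6 in the patch; struct vga_window and decode_window() are hypothetical names used only for this sketch, and the helper deliberately leaves out the SR2/SR4 precondition.

```c
#include <stdint.h>

/* Hypothetical container for the result of the GR6 decode. */
struct vga_window {
    uint32_t base;   /* guest-physical start of the legacy window */
    uint32_t size;   /* length of the window in bytes */
    uint32_t offset; /* offset into VRAM that the alias starts at */
};

/*
 * Decode bits 3:2 of GR6 the same way the patch does. Only the 64 KiB
 * window at 0xa0000 (case 1) honours the VBE bank offset; the other
 * windows always alias the start of VRAM.
 */
static struct vga_window decode_window(uint8_t gr6, uint32_t bank_offset)
{
    struct vga_window w = { 0, 0, 0 };

    switch ((gr6 >> 2) & 3) {
    case 0:
        w.base = 0xa0000;
        w.size = 0x20000;
        break;
    case 1:
        w.base = 0xa0000;
        w.size = 0x10000;
        w.offset = bank_offset;
        break;
    case 2:
        w.base = 0xb0000;
        w.size = 0x8000;
        break;
    case 3:
        w.base = 0xb8000;
        w.size = 0x8000;
        break;
    }
    return w;
}
```

Because the alias depends on GR6, SR2, SR4 and the bank register, the patch re-evaluates it after every write to those registers and on reset, which is why vga_update_memory_access() is called from each of those paths.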