From patchwork Tue Sep 13 08:40:31 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 114473 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [140.186.70.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 0613BB71B9 for ; Tue, 13 Sep 2011 18:40:53 +1000 (EST) Received: from localhost ([::1]:34976 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1R3OXv-000076-3O for incoming@patchwork.ozlabs.org; Tue, 13 Sep 2011 04:40:47 -0400 Received: from eggs.gnu.org ([140.186.70.92]:35632) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1R3OXn-000071-Nf for qemu-devel@nongnu.org; Tue, 13 Sep 2011 04:40:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1R3OXi-0007Kt-Rx for qemu-devel@nongnu.org; Tue, 13 Sep 2011 04:40:39 -0400 Received: from cantor2.suse.de ([195.135.220.15]:44279 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1R3OXi-0007Kj-E6 for qemu-devel@nongnu.org; Tue, 13 Sep 2011 04:40:34 -0400 Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.221.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 2CAFF8C5DF; Tue, 13 Sep 2011 10:40:33 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) From: Alexander Graf In-Reply-To: <4E6F10DD.6060606@siemens.com> Date: Tue, 13 Sep 2011 10:40:31 +0200 Message-Id: <6BA6355D-D77A-40F4-A8C4-61901A926E71@suse.de> References: <3d9d904a1e4939a147f8954c9e0d4cdaf3d44c31.1314033132.git.jan.kiszka@siemens.com> <4E6E2329.9050109@suse.de> <4E6E2631.4060300@siemens.com> <4E6E28FD.7070907@web.de> <4E6E2A06.8010900@siemens.com> <4E6E2BDC.3060702@siemens.com> <4E6F10DD.6060606@siemens.com> To: Jan Kiszka X-Mailer: Apple Mail (2.1084) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4-2.6 X-Received-From: 195.135.220.15 Cc: Avi Kivity , =?iso-8859-1?Q?Andreas_F=E4rber?= , Gerd Hoffmann , Anthony Liguori , qemu-devel Subject: Re: [Qemu-devel] [PATCH v3 5/6] vga: Use linear mapping + dirty logging in chain 4 memory access mode X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org On 13.09.2011, at 10:14, Jan Kiszka wrote: > On 2011-09-13 09:39, Alexander Graf wrote: >> >> On 12.09.2011, at 17:57, Jan Kiszka wrote: >> >>> On 2011-09-12 17:49, Jan Kiszka wrote: >>>> On 2011-09-12 17:45, Andreas Färber wrote: >>>>> Am 12.09.2011 17:33, schrieb Jan Kiszka: >>>>>> On 2011-09-12 17:20, Alexander Graf wrote: >>>>>>> Jan Kiszka wrote: >>>>>>>> Most VGA memory access modes require MMIO handling as they demand weird >>>>>>>> logic to get a byte from or into the video RAM. However, there is one >>>>>>>> exception: chain 4 mode with all memory planes enabled for writing. This >>>>>>>> mode actually allows lineary mapping, which can then be combined with >>>>>>>> dirty logging to accelerate KVM. >>>>>>>> >>>>>>>> This patch accelerates specifically VBE accesses like they are used by >>>>>>>> grub in graphical mode. Not only the standard VGA adapter benefits from >>>>>>>> this, also vmware and spice in VGA mode. >>>>>>>> >>>>>>>> CC: Gerd Hoffmann >>>>>>>> CC: Avi Kivity >>>>>>>> Signed-off-by: Jan Kiszka >>>>>>>> >>>>>>> [...] >>>>>>> >>>>>>>> +static void vga_update_memory_access(VGACommonState *s) >>>>>>>> +{ >>>>>>>> + MemoryRegion *region, *old_region = s->chain4_alias; >>>>>>>> + target_phys_addr_t base, offset, size; >>>>>>>> + >>>>>>>> + s->chain4_alias = NULL; >>>>>>>> + >>>>>>>> + if ((s->sr[0x02] & 0xf) == 0xf && s->sr[0x04] & 0x08) { >>>>>>>> + offset = 0; >>>>>>>> + switch ((s->gr[6] >> 2) & 3) { >>>>>>>> + case 0: >>>>>>>> + base = 0xa0000; >>>>>>>> + size = 0x20000; >>>>>>>> + break; >>>>>>>> + case 1: >>>>>>>> + base = 0xa0000; >>>>>>>> + size = 0x10000; >>>>>>>> + offset = s->bank_offset; >>>>>>>> + break; >>>>>>>> + case 2: >>>>>>>> + base = 0xb0000; >>>>>>>> + size = 0x8000; >>>>>>>> + break; >>>>>>>> + case 3: >>>>>>>> + base = 0xb8000; >>>>>>>> + size = 0x8000; >>>>>>>> + break; >>>>>>>> + } >>>>>>>> + region = g_malloc(sizeof(*region)); >>>>>>>> + memory_region_init_alias(region, "vga.chain4", &s->vram, offset, size); >>>>>>>> + memory_region_add_subregion_overlap(s->legacy_address_space, base, >>>>>>>> + region, 2); >>>>>>>> >>>>>>> This one eventually gives me the following in info mtree with -M g3beige >>>>>>> on qemu-system-ppc: >>>>>>> >>>>>>> (qemu) info mtree >>>>>>> memory >>>>>>> system addr 00000000 off 00000000 size 7fffffffffffffff >>>>>>> -vga.chain4 addr 000a0000 off 00000000 size 10000 >>>>>>> -macio addr 80880000 off 00000000 size 80000 >>>>>>> --macio-nvram addr 00060000 off 00000000 size 20000 >>>>>>> --pmac-ide addr 00020000 off 00000000 size 1000 >>>>>>> --cuda addr 00016000 off 00000000 size 2000 >>>>>>> --escc-bar addr 00013000 off 00000000 size 40 >>>>>>> --dbdma addr 00008000 off 00000000 size 1000 >>>>>>> --heathrow-pic addr 00000000 off 00000000 size 1000 >>>>>>> -vga.rom addr 80800000 off 00000000 size 10000 >>>>>>> -vga.vram addr 80000000 off 00000000 size 800000 >>>>>>> -vga-lowmem addr 800a0000 off 00000000 size 20000 >>>>>>> -escc addr 80013000 off 00000000 size 40 >>>>>>> -isa-mmio addr fe000000 off 00000000 size 200000 >>>>>>> I/O >>>>>>> io addr 00000000 off 00000000 size 10000 >>>>>>> -cmd646-bmdma addr 00000700 off 00000000 size 10 >>>>>>> --cmd646-bmdma-ioport addr 0000000c off 00000000 size 4 >>>>>>> --cmd646-bmdma-bus addr 00000008 off 00000000 size 4 >>>>>>> --cmd646-bmdma-ioport addr 00000004 off 00000000 size 4 >>>>>>> --cmd646-bmdma-bus addr 00000000 off 00000000 size 4 >>>>>>> -cmd646-cmd addr 00000680 off 00000000 size 4 >>>>>>> -cmd646-data addr 00000600 off 00000000 size 8 >>>>>>> -cmd646-cmd addr 00000580 off 00000000 size 4 >>>>>>> -cmd646-data addr 00000500 off 00000000 size 8 >>>>>>> -ne2000 addr 00000400 off 00000000 size 100 >>>>>>> >>>>>>> This ends up overmapping 0xa0000, effectively overwriting kernel data. >>>>>>> If I #if 0 the offending chunk out, everything is fine. I would assume >>>>>>> that chain4 really needs to be inside of lowmem? No idea about VGA, but >>>>>>> I'm sure you know what's going on :). >>>>>> Does this help? >>>>>> >>>>>> diff --git a/hw/vga.c b/hw/vga.c >>>>>> index 125fb29..0a0c5a6 100644 >>>>>> --- a/hw/vga.c >>>>>> +++ b/hw/vga.c >>>>>> @@ -181,6 +181,7 @@ static void vga_update_memory_access(VGACommonState *s) >>>>>> size = 0x8000; >>>>>> break; >>>>>> } >>>>>> + base += isa_mem_base; >>>>>> region = g_malloc(sizeof(*region)); >>>>>> memory_region_init_alias(region, "vga.chain4", &s->vram, offset, size); >>>>>> memory_region_add_subregion_overlap(s->legacy_address_space, base, >>>>> >>>>> No longer oopses, but the screen looks chaotic now (black bar at bottom, >>>>> part of contents at top etc.). >>>> >>>> Does this PPC machine map the ISA range and forward VGA accesses to the >>>> adapter in general? >>> >>> If it does, please post a dump of the VGACommonState while the screen is >>> corrupted (gdb or via device_show [1]. Maybe I missed some condition >>> that prevents chain4 optimizations, and your guest triggers this. >> >> Picture: http://dl.dropbox.com/u/8976842/qemu/Screen%20shot%202011-09-13%20at%2009.37.15.png >> >> (qemu) info qtree >> bus: main-system-bus >> type System >> dev: fw_cfg, id "#8" >> dev-prop: ctl_iobase = 0x0 >> dev-prop: data_iobase = 0x0 >> irq 0 >> mmio f0000510/00000002 >> mmio f0000512/00000002 >> dev: escc, id "#4" >> dev-prop: frequency = 3686400 >> dev-prop: it_shift = 4 >> dev-prop: disabled = 0 >> dev-prop: disabled = 0 >> dev-prop: chnBtype = 0 >> dev-prop: chnAtype = 0 >> dev-prop: chrB = >> dev-prop: chrA = serial0 >> irq 2 >> mmio 80013000/00000040 >> dev: grackle, id "#1" >> irq 0 >> mmio fec00000/00001000 >> mmio fee00000/00001000 >> bus: pci >> type PCI >> dev: cmd646-ide, id "#6" >> dev-prop: secondary = 0 >> bus-prop: addr = 03.0 >> bus-prop: romfile = >> bus-prop: rombar = 1 >> bus-prop: multifunction = off >> bus-prop: command_serr_enable = on >> class IDE controller, addr 00:03.0, pci id 1095:0646 (sub 1af4:1100) >> bar 0: i/o at 0x500 [0x507] >> bar 1: i/o at 0x580 [0x583] >> bar 2: i/o at 0x600 [0x607] >> bar 3: i/o at 0x680 [0x683] >> bar 4: i/o at 0x700 [0x70f] >> bus: ide.1 >> type IDE >> bus: ide.0 >> type IDE >> dev: ide-cd, id "#7" >> dev-prop: drive = ide1-cd0 >> dev-prop: logical_block_size = 512 >> dev-prop: physical_block_size = 512 >> dev-prop: min_io_size = 0 >> dev-prop: opt_io_size = 0 >> dev-prop: bootindex = -1 >> dev-prop: discard_granularity = 0 >> dev-prop: ver = "0.15.50" >> dev-prop: serial = "QM00003" >> bus-prop: unit = 0 >> dev: ne2k_pci, id "#5" >> dev-prop: mac = 52:54:00:12:34:56 >> dev-prop: vlan = 0 >> dev-prop: netdev = >> dev-prop: bootindex = -1 >> bus-prop: addr = 02.0 >> bus-prop: romfile = >> bus-prop: rombar = 1 >> bus-prop: multifunction = off >> bus-prop: command_serr_enable = on >> class Ethernet controller, addr 00:02.0, pci id 10ec:8029 (sub 1af4:1100) >> bar 0: i/o at 0x400 [0x4ff] >> dev: VGA, id "#3" >> bus-prop: addr = 01.0 >> bus-prop: romfile = "vgabios-stdvga.bin" >> bus-prop: rombar = 1 >> bus-prop: multifunction = off >> bus-prop: command_serr_enable = on >> class VGA controller, addr 00:01.0, pci id 1234:1111 (sub 1af4:1100) >> bar 0: mem at 0x80000000 [0x807fffff] >> bar 6: mem at 0x80800000 [0x8080ffff] >> dev: grackle, id "#2" >> bus-prop: addr = 00.0 >> bus-prop: romfile = >> bus-prop: rombar = 1 >> bus-prop: multifunction = off >> bus-prop: command_serr_enable = on >> class Host bridge, addr 00:00.0, pci id 1057:0002 (sub 1af4:1100) >> >> >> >> (qemu) device_show #3 >> dev: VGA, id "#3", version 2 >> dev. >> version_id: 00000002 >> config: 00 00 00 00 10 d1 cf 20 - 00 00 00 00 10 d1 d0 30 >> ... >> irq_state: 00 00 00 00 00 00 00 10 - 00 00 00 00 00 00 00 00 >> vga. >> latch: 00000000 >> sr_index: 00 >> sr: 00 00 0f 00 08 00 00 00 >> gr_index: 00 >> gr: 00 00 00 00 00 40 05 00 - 00 00 00 00 00 00 00 00 >> ar_index: 20 >> ar: 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 >> ... >> ar_flip_flop: 00000001 >> cr_index: 00 >> cr: 00 63 00 00 00 00 00 50 - 00 40 00 00 00 00 00 00 >> ... >> msr: 00 >> fcr: 00 >> st00: 00 >> st01: 00 >> dac_state: 00 >> dac_sub_index: 00 >> dac_read_index: 00 >> dac_write_index: 10 >> dac_cache: 3f 3f 3f >> palette: 00 00 00 00 00 2a 00 2a - 00 00 2a 2a 2a 00 00 2a >> ... >> bank_offset: 00000000 >> is_vbe_vmstate: 01 >> vbe_index: 0004 >> vbe_regs[00]: b0c5 >> vbe_regs[01]: 0320 >> vbe_regs[02]: 0258 >> vbe_regs[03]: 000f >> vbe_regs[04]: 0001 >> vbe_regs[05]: 0000 >> vbe_regs[06]: 0320 >> vbe_regs[07]: 0258 >> vbe_regs[08]: 0000 >> vbe_regs[09]: 0000 >> vbe_start_addr: 00000000 >> vbe_line_offset: 00000640 >> vbe_bank_mask: 0000007f >> > > Makes no sense, must work with this setup. Maybe it's dynamic effect > when switching modes of bank offsets. Do you have some test image for me? Btw, it still tries to execute invalid code even with your patch. #if 0'ing out the memory region updates at least get the guest booting for me. Btw, to get it working you also need a patch for the interrupt controller (another breakage thanks to memory api). Alex diff --git a/hw/heathrow_pic.c b/hw/heathrow_pic.c index 51996ab..16f48d1 100644 --- a/hw/heathrow_pic.c +++ b/hw/heathrow_pic.c @@ -126,7 +126,7 @@ static uint64_t pic_read(void *opaque, target_phys_addr_t addr, static const MemoryRegionOps heathrow_pic_ops = { .read = pic_read, .write = pic_write, - .endianness = DEVICE_NATIVE_ENDIAN, + .endianness = DEVICE_LITTLE_ENDIAN, }; static void heathrow_pic_set_irq(void *opaque, int num, int level)