From patchwork Tue Aug 24 16:52:36 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Cam Macdonell X-Patchwork-Id: 62612 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [199.232.76.165]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 3E8CAB70A7 for ; Wed, 25 Aug 2010 02:53:23 +1000 (EST) Received: from localhost ([127.0.0.1]:47325 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OnwkR-0007wH-Bc for incoming@patchwork.ozlabs.org; Tue, 24 Aug 2010 12:53:19 -0400 Received: from [140.186.70.92] (port=45333 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Onwjp-0007v7-4M for qemu-devel@nongnu.org; Tue, 24 Aug 2010 12:52:42 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1Onwjn-00063h-18 for qemu-devel@nongnu.org; Tue, 24 Aug 2010 12:52:40 -0400 Received: from mail-vw0-f45.google.com ([209.85.212.45]:35928) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Onwjm-00063M-Sb for qemu-devel@nongnu.org; Tue, 24 Aug 2010 12:52:38 -0400 Received: by vws19 with SMTP id 19so661248vws.4 for ; Tue, 24 Aug 2010 09:52:37 -0700 (PDT) MIME-Version: 1.0 Received: by 10.220.59.202 with SMTP id m10mr4497151vch.59.1282668756712; Tue, 24 Aug 2010 09:52:36 -0700 (PDT) Received: by 10.220.170.207 with HTTP; Tue, 24 Aug 2010 09:52:36 -0700 (PDT) In-Reply-To: <20100721034918.GA6285@valinux.co.jp> References: <4C2997C5.1020302@redhat.com> <20100630032901.GA19142@valinux.co.jp> <20100713204157.GI31689@valinux.co.jp> <20100714025216.GK31689@valinux.co.jp> <20100720095223.GC24957@valinux.co.jp> <20100721033101.GD11146@redhat.com> <20100721034918.GA6285@valinux.co.jp> Date: Tue, 24 Aug 2010 10:52:36 -0600 X-Google-Sender-Auth: C2pIfGKfWCjqboCqh-OqDE06UHs Message-ID: Subject: Re: [Qemu-devel] Re: Unusual physical address when using 64-bit BAR From: Cam Macdonell To: Isaku Yamahata X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2) Cc: Avi Kivity , seabios@seabios.org, "qemu-devel@nongnu.org Developers" , "Michael S. Tsirkin" X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org On Tue, Jul 20, 2010 at 9:49 PM, Isaku Yamahata wrote: > Added Cc: seabios@seabios.org > > On Wed, Jul 21, 2010 at 06:31:01AM +0300, Michael S. Tsirkin wrote: >> On Tue, Jul 20, 2010 at 06:52:23PM +0900, Isaku Yamahata wrote: >> > On Wed, Jul 14, 2010 at 09:10:28AM -0600, Cam Macdonell wrote: >> > > On Tue, Jul 13, 2010 at 8:52 PM, Isaku Yamahata wrote: >> > > > On Tue, Jul 13, 2010 at 04:48:19PM -0600, Cam Macdonell wrote: >> > > >> On Tue, Jul 13, 2010 at 2:41 PM, Isaku Yamahata wrote: >> > > >> > On Tue, Jul 13, 2010 at 02:05:51PM -0600, Cam Macdonell wrote: >> > > >> >> >> > Seabios completely ignore the 64-bitness of the BAR. ?Looks like it also >> > > >> >> >> > thinks the second half of the BAR is an I/O region instead of memory (hence >> > > >> >> >> > the c200, that's part of the pci portio region. >> > > >> >> > >> > > >> >> > I've sent the patches to address it. But they haven't been merged yet. >> > > >> >> > seabios doesn't map BARs beyond 4GB. >> > > >> >> > If bar is mapped beyond 4GB, guest BIOS does it. >> > > >> >> >> > > >> >> Have those patches been merged yet? >> > > >> > >> > > >> > They have been merged into seabios upstream now. >> > > >> > qemu seabios fork hasn't pulled for a while, though. >> > > >> > >> > > >> > >> > > >> >> > To see how seabios works, it would help to increase CONFIG_DEBUG_LEVEL >> > > >> >> > in config.h of seabios >> > > >> >> >> > > >> >> Where does the output from seabios end up? ?Inside dmesg? >> > > >> > >> > > >> > It outputs them to the serial console which qemu emulates. >> > > >> > seabios is out of kernel control, so dmesg doesn't show it. >> > > >> > >> > > >> > >> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr) >> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > >> >> >> pci_read_config: (val) 0xffffffff <- 0x1c (addr) >> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr) >> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > >> >> > >> > > >> >> > seabios BAR3. Not sure how it is mapped from this >> > > >> >> > message. >> > > >> >> >> > > >> >> Isn't the BAR3 from the fact that a 64-bit BAR would use both BAR2 and >> > > >> >> BAR3 to store all 64-bits? >> > > >> > >> > > >> > Yes. Seabios misbehaves. 64bit bar is(was) a missing feature. >> > > >> > -- >> > > >> > yamahata >> > > >> > >> > > >> > >> > > >> >> > > >> With the latest seabios git passed via -bios, I no longer see the >> > > >> 48-bit address, but instead a 32-bit address and then >> > > >> ffffffff00000000. ?This guest has 1gb of RAM so the address isn't be >> > > >> mapped beyond 4g. >> > > > >> > > > Can I see the debug log like before? >> > > > (hopefully seabios with CONFIG_DEBUG_LEVEL enabled.) >> > > >> > > Here's the dump from SeaBIOS in the region related to the PCI devices. >> > >  The SeaBIOS output is identical whether the BAR is 32-bit or 64-bit. >> > > >> > > PCI: bus=0 devfn=0x10: vendor_id=0x1013 device_id=0x00b8 >> > > region 0: 0xf0000000 >> > > region 1: 0xf2000000 >> > > region 6: 0xf2010000 >> > > PCI: bus=0 devfn=0x18: vendor_id=0x1af4 device_id=0x1000 >> > > region 0: 0x0000c020 >> > > region 1: 0xf2020000 >> > > region 6: 0xf2030000 >> > > PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110 >> > > region 0: 0xf2040000 >> > > region 1: 0xf2041000 >> > > region 2: 0x00000000 >> > >> > Is this region (region 2 of devfn=0x20: vendor_id=0x1af4 device_id=0x1110) >> > the BAR in quistion? >> > The value 0 seems odd. Probably BAR address calculation overflowed. >> > Currently seabios doesn't check overflow. I attached the patch. >> > >> > >> > > > Do you know who sets the BAR to ffffffff00000000? >> > > >> > > Here are the config reads/writes related to the 0x18/1c, the 'IVSHMEM' >> > > lines are from the map function passed to pci_register_bar().  It >> > > looks like SeaBIOS sets the address to 0 and then the potentially >> > > useful e0000000 address gets mangled into ffffffff000000. >> > >> > There is something wrong with the debug message of write case, I suppose. >> > All written value are 0, but the resulted effect doesn't seems so. >> > >> > > >> > > IVSHMEM: guest pci addr = 0, guest h/w addr = 1090912256, size = 536870912 >> > > >> > > ...snip... >> > > >> > > pci_read_config: (val) 0x4 <- 0x18 (addr) >> > > pci_write_config: (val) 0x0 -> 0x18 (addr) >> > > IVSHMEM: guest pci addr = e0000000, guest h/w addr = 1090912256, size = 20000000 >> > >> > If 0 is written to 0x18, the bar address should be 0, but it says e0000000. >> > >> > > pci_read_config: (val) 0xe0000004 <- 0x18 (addr) >> > >> > The read value isn't 0. and so on... >> > >> > > pci_write_config: (val) 0x0 -> 0x18 (addr) >> > > pci_read_config: (val) 0x0 <- 0x1c (addr) >> > > pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr = >> > > 1090912256, size = 20000000 >> > > pci_read_config: (val) 0xffffffff <- 0x1c (addr) >> > > pci_write_config: (val) 0x0 -> 0x1c (addr) >> > > >> > > and with the 64-bit guest I get this error as well (recall the guest >> > > fails to boot on 64-bit) >> > > >> > > BUG: kvm_dirty_pages_log_change: invalid parameters >> > > 00000000f0000000-00000000f0ffffff >> > >> > >> > diff --git a/src/pciinit.c b/src/pciinit.c >> > index b110531..6eca2ce 100644 >> > --- a/src/pciinit.c >> > +++ b/src/pciinit.c >> > @@ -90,7 +90,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num) >> >                   /* If pci_bios_prefmem_addr == 0, keep old behaviour */ >> >                   pci_bios_prefmem_addr != 0) { >> >              paddr = &pci_bios_prefmem_addr; >> > -            if (ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) { >> > +            if (ALIGN(*paddr, size) + size < *paddr || >> > +                ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) { >> >                  dprintf(1, >> >                          "prefmem region of (bdf 0x%x bar %d) can't be mapped. " >> >                          "decrease BUILD_PCIMEM_SIZE and recompile. size %x\n", >> > @@ -99,7 +100,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num) >> >              } >> >          } else { >> >              paddr = &pci_bios_mem_addr; >> > -            if (ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) { >> > +            if (ALIGN(*paddr, size) + size < *paddr || >> > +                ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) { >> >                  dprintf(1, >> >                          "mem region of (bdf 0x%x bar %d) can't be mapped. " >> >                          "increase BUILD_PCIMEM_SIZE and recompile. size %x\n", >> >> Looking at the source, all of the values like pci_bios_prefmem_addr seem to be >> 32 bit. Since in the spec prefetcheable memory is up to 64 bit, >> can't the math overflow, here and elsewhere? >> Maybe we should switch to 64 bit values all over ... > > Make sense. I'll create a patch to convert them into u64. > >> >> > @@ -116,12 +118,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num) >> > >> >      int is_64bit = !(val & PCI_BASE_ADDRESS_SPACE_IO) && >> >          (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64; >> > -    if (is_64bit) { >> > -        if (size > 0) { >> > -            pci_config_writel(bdf, ofs + 4, 0); >> > -        } else { >> > -            pci_config_writel(bdf, ofs + 4, ~0); >> > -        } >> > +    if (is_64bit && size > 0) { >> > +        pci_config_writel(bdf, ofs + 4, 0); >> >      } >> >      return is_64bit; >> >  } >> >> >> Was there any reason we wrote all-ones there on size 0? >> BAR sizing? > > No reason. It's just left over from debugging. > So I'd like to remove it. > > -- > yamahata > > Hi, 64-bit BARs still do not seem to be working. When using the latest seabios the guest does not hit a "BUG:" statement, but booting still fails HPET: 1 timers in total, 0 timers will be used for per-cpu timer divide error: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.35+ #299 /Bochs RIP: 0010:[] [] hpet_alloc+0x12c/0x35b RSP: 0018:ffff88007d7b3d80 EFLAGS: 00010246 RAX: 00038d7ea4c68000 RBX: ffff88007d062cc0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff817bb9b0 RBP: ffff88007d7b3dc0 R08: 00000000000080d0 R09: ffffc90000000000 R10: ffff88007d72b5a0 R11: 0000000000000000 R12: ffff88007d7b3dd0 R13: ffffc90000000000 R14: 0000000000000000 R15: ffffffff817a41c3 FS: 0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001a42000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 1, threadinfo ffff88007d7b2000, task ffff88007d7b8000) Stack: ffff88007f43ab90 ffff88007f43ab90 ffffffff81ca6174 ffffffff81b1f5e1 <0> 0000000000000000 0000000000000100 0000000000000100 0000000000000000 <0> ffff88007d7b3e80 ffffffff810294ea 00000000fed00000 ffffc90000000000 Call Trace: [] ? hpet_late_init+0x0/0xea [] hpet_reserve_platform_timers+0x10b/0x115 [] ? hpet_late_init+0x0/0xea [] hpet_late_init+0x6b/0xea [] ? hpet_late_init+0x0/0xea [] do_one_initcall+0x5e/0x159 [] kernel_init+0x19a/0x228 [] kernel_thread_helper+0x4/0x10 [] ? kernel_init+0x0/0x228 [] ? kernel_thread_helper+0x0/0x10 Code: 89 1d ca b2 b3 00 48 c1 ea 21 8b 73 34 49 c7 c7 c3 41 7a 81 48 8d 04 02 4c 89 f2 48 c7 c7 b0 b9 7b 81 48 c1 ea 20 48 89 d1 31 d2 <48> f7 f1 83 7b 30 01 48 c7 c1 86 1c 7d 81 49 0f 46 cf 48 89 43 RIP [] hpet_alloc+0x12c/0x35b RSP ---[ end trace a7919e7f17c0a725 ]--- Kernel panic - not syncing: Attempted to kill init! Pid: 1, comm: swapper Tainted: G D 2.6.35+ #299 Call Trace: [] panic+0x8b/0x10b [] ? exit_ptrace+0x38/0x121 [] do_exit+0x7a/0x722 [] ? spin_unlock_irqrestore+0xe/0x10 [] ? kmsg_dump+0x12b/0x145 [] oops_end+0xbf/0xc7 [] die+0x5a/0x63 [] do_trap+0x121/0x130 [] do_divide_error+0x96/0x9f [] ? hpet_alloc+0x12c/0x35b [] ? radix_tree_preload+0x34/0x88 [] divide_error+0x1b/0x20 [] ? hpet_alloc+0x12c/0x35b [] ? hpet_late_init+0x0/0xea [] hpet_reserve_platform_timers+0x10b/0x115 [] ? hpet_late_init+0x0/0xea [] hpet_late_init+0x6b/0xea [] ? hpet_late_init+0x0/0xea [] do_one_initcall+0x5e/0x159 [] kernel_init+0x19a/0x228 [] kernel_thread_helper+0x4/0x10 [] ? kernel_init+0x0/0x228 [] ? kernel_thread_helper+0x0/0x10 seabios output for the device: PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110 region 0: 0xf1020000 region 2: 0x00000000 init smm Running the latest seabios, the debug output only remaps the BAR twice, once with a potentially correct address of e00000000 pci_read_config: (val) 0xe0000004 <- 0x18 (addr) ...snip... pci_default_write_config: (val) 0x0 -> 0x18 (addr) IVSHMEM: guest pci addr = e0000000, guest h/w addr = 2164588544, size = 20000000 pci_read_config: (val) 0xe0000004 <- 0x18 (addr) pci_default_write_config: (val) 0x0 -> 0x18 (addr) pci_read_config: (val) 0x0 <- 0x1c (addr) pci_default_write_config: (val) 0x0 -> 0x1c (addr) IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr = 2164588544, size = 20000000 pci_read_config: (val) 0xffffffff <- 0x1c (addr) pci_default_write_config: (val) 0x0 -> 0x1c (addr) pci_read_config: (val) 0x0 <- 0x20 (addr) the pci writes are all still 0, I can't see how my debug statements are incorrect though. Below is my trivial pci config debugging patch. addr >= PIIX_CONFIG_IRQ_ROUTE && Cam diff --git a/hw/pci.c b/hw/pci.c index 70dbace..01087b1 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -1159,6 +1159,8 @@ static uint32_t pci_read_config(PCIDevice *d, len = MIN(len, pci_config_size(d) - address); memcpy(&val, d->config + address, len); + if (strncmp(d->name, "ivshmem", 7) == 0) + printf("pci_read_config: (val) 0x%x <- 0x%x (addr)\n", val, address); return le32_to_cpu(val); } @@ -1219,6 +1221,8 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l) d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask); } + if (strncmp(d->name, "ivshmem", 7) == 0) + printf("pci_write_config: (val) 0x%x -> 0x%x (addr)\n", val, addr); #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT if (kvm_enabled() && kvm_irqchip_in_kernel() &&