Re: Unusual physical address when using 64-bit BAR

Message ID AANLkTinw-3cErM-ZAfXK_-S4F3oEkn49HJgwAu5=y_YJ@mail.gmail.com
State New

Commit Message

Cam Macdonell Aug. 24, 2010, 4:52 p.m. UTC
On Tue, Jul 20, 2010 at 9:49 PM, Isaku Yamahata <yamahata@valinux.co.jp> wrote:
> Added Cc: seabios@seabios.org
>
> On Wed, Jul 21, 2010 at 06:31:01AM +0300, Michael S. Tsirkin wrote:
>> On Tue, Jul 20, 2010 at 06:52:23PM +0900, Isaku Yamahata wrote:
>> > On Wed, Jul 14, 2010 at 09:10:28AM -0600, Cam Macdonell wrote:
>> > > On Tue, Jul 13, 2010 at 8:52 PM, Isaku Yamahata <yamahata@valinux.co.jp> wrote:
>> > > > On Tue, Jul 13, 2010 at 04:48:19PM -0600, Cam Macdonell wrote:
>> > > >> On Tue, Jul 13, 2010 at 2:41 PM, Isaku Yamahata <yamahata@valinux.co.jp> wrote:
>> > > >> > On Tue, Jul 13, 2010 at 02:05:51PM -0600, Cam Macdonell wrote:
>> > > >> >> >> > Seabios completely ignores the 64-bitness of the BAR. Looks like it also
>> > > >> >> >> > thinks the second half of the BAR is an I/O region instead of memory (hence
>> > > >> >> >> > the c200, that's part of the pci portio region).
>> > > >> >> >
>> > > >> >> > I've sent the patches to address it. But they haven't been merged yet.
>> > > >> >> > seabios doesn't map BARs beyond 4GB.
>> > > >> >> > If bar is mapped beyond 4GB, guest BIOS does it.
>> > > >> >>
>> > > >> >> Have those patches been merged yet?
>> > > >> >
>> > > >> > They have been merged into seabios upstream now.
>> > > >> > qemu seabios fork hasn't pulled for a while, though.
>> > > >> >
>> > > >> >
>> > > >> >> > To see how seabios works, it would help to increase CONFIG_DEBUG_LEVEL
>> > > >> >> > in config.h of seabios
>> > > >> >>
>> > > >> >> Where does the output from seabios end up? Inside dmesg?
>> > > >> >
>> > > >> > It outputs them to the serial console which qemu emulates.
>> > > >> > seabios is out of kernel control, so dmesg doesn't show it.
>> > > >> >
>> > > >> >
>> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >> pci_read_config: (val) 0xffffffff <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >
>> > > >> >> > seabios BAR3. Not sure how it is mapped from this
>> > > >> >> > message.
>> > > >> >>
>> > > >> >> Isn't the BAR3 from the fact that a 64-bit BAR would use both BAR2 and
>> > > >> >> BAR3 to store all 64-bits?
>> > > >> >
>> > > >> > Yes. Seabios misbehaves. 64bit bar is(was) a missing feature.
>> > > >> > --
>> > > >> > yamahata
>> > > >> >
>> > > >> >
>> > > >>
>> > > >> With the latest seabios git passed via -bios, I no longer see the
>> > > >> 48-bit address, but instead a 32-bit address and then
>> > > >> ffffffff00000000. This guest has 1GB of RAM, so the address shouldn't be
>> > > >> mapped beyond 4GB.
>> > > >
>> > > > Can I see the debug log like before?
>> > > > (hopefully seabios with CONFIG_DEBUG_LEVEL enabled.)
>> > >
>> > > Here's the dump from SeaBIOS in the region related to the PCI devices.
>> > >  The SeaBIOS output is identical whether the BAR is 32-bit or 64-bit.
>> > >
>> > > PCI: bus=0 devfn=0x10: vendor_id=0x1013 device_id=0x00b8
>> > > region 0: 0xf0000000
>> > > region 1: 0xf2000000
>> > > region 6: 0xf2010000
>> > > PCI: bus=0 devfn=0x18: vendor_id=0x1af4 device_id=0x1000
>> > > region 0: 0x0000c020
>> > > region 1: 0xf2020000
>> > > region 6: 0xf2030000
>> > > PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
>> > > region 0: 0xf2040000
>> > > region 1: 0xf2041000
>> > > region 2: 0x00000000
>> >
>> > Is this region (region 2 of devfn=0x20: vendor_id=0x1af4 device_id=0x1110)
>> > the BAR in question?
>> > The value 0 seems odd. Probably BAR address calculation overflowed.
>> > Currently seabios doesn't check overflow. I attached the patch.
>> >
>> >
>> > > > Do you know who sets the BAR to ffffffff00000000?
>> > >
>> > > Here are the config reads/writes related to the 0x18/1c, the 'IVSHMEM'
>> > > lines are from the map function passed to pci_register_bar().  It
>> > > looks like SeaBIOS sets the address to 0 and then the potentially
>> > > useful e0000000 address gets mangled into ffffffff00000000.
>> >
>> > There is something wrong with the debug message in the write case, I suppose.
>> > All written values are 0, but the resulting effect doesn't seem so.
>> >
>> > >
>> > > IVSHMEM: guest pci addr = 0, guest h/w addr = 1090912256, size = 536870912
>> > >
>> > > ...snip...
>> > >
>> > > pci_read_config: (val) 0x4 <- 0x18 (addr)
>> > > pci_write_config: (val) 0x0 -> 0x18 (addr)
>> > > IVSHMEM: guest pci addr = e0000000, guest h/w addr = 1090912256, size = 20000000
>> >
>> > If 0 is written to 0x18, the bar address should be 0, but it says e0000000.
>> >
>> > > pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
>> >
>> > The read value isn't 0. and so on...
>> >
>> > > pci_write_config: (val) 0x0 -> 0x18 (addr)
>> > > pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr =
>> > > 1090912256, size = 20000000
>> > > pci_read_config: (val) 0xffffffff <- 0x1c (addr)
>> > > pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > >
>> > > and with the 64-bit guest I get this error as well (recall the guest
>> > > fails to boot on 64-bit)
>> > >
>> > > BUG: kvm_dirty_pages_log_change: invalid parameters
>> > > 00000000f0000000-00000000f0ffffff
>> >
>> >
>> > diff --git a/src/pciinit.c b/src/pciinit.c
>> > index b110531..6eca2ce 100644
>> > --- a/src/pciinit.c
>> > +++ b/src/pciinit.c
>> > @@ -90,7 +90,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
>> >                   /* If pci_bios_prefmem_addr == 0, keep old behaviour */
>> >                   pci_bios_prefmem_addr != 0) {
>> >              paddr = &pci_bios_prefmem_addr;
>> > -            if (ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) {
>> > +            if (ALIGN(*paddr, size) + size < *paddr ||
>> > +                ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) {
>> >                  dprintf(1,
>> >                          "prefmem region of (bdf 0x%x bar %d) can't be mapped. "
>> >                          "decrease BUILD_PCIMEM_SIZE and recompile. size %x\n",
>> > @@ -99,7 +100,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
>> >              }
>> >          } else {
>> >              paddr = &pci_bios_mem_addr;
>> > -            if (ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) {
>> > +            if (ALIGN(*paddr, size) + size < *paddr ||
>> > +                ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) {
>> >                  dprintf(1,
>> >                          "mem region of (bdf 0x%x bar %d) can't be mapped. "
>> >                          "increase BUILD_PCIMEM_SIZE and recompile. size %x\n",
>>
>> Looking at the source, all of the values like pci_bios_prefmem_addr seem to be
>> 32 bit. Since in the spec prefetchable memory is up to 64 bit,
>> can't the math overflow, here and elsewhere?
>> Maybe we should switch to 64 bit values all over ...
>
> Makes sense. I'll create a patch to convert them into u64.
>
>>
>> > @@ -116,12 +118,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
>> >
>> >      int is_64bit = !(val & PCI_BASE_ADDRESS_SPACE_IO) &&
>> >          (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64;
>> > -    if (is_64bit) {
>> > -        if (size > 0) {
>> > -            pci_config_writel(bdf, ofs + 4, 0);
>> > -        } else {
>> > -            pci_config_writel(bdf, ofs + 4, ~0);
>> > -        }
>> > +    if (is_64bit && size > 0) {
>> > +        pci_config_writel(bdf, ofs + 4, 0);
>> >      }
>> >      return is_64bit;
>> >  }
>>
>>
>> Was there any reason we wrote all-ones there on size 0?
>> BAR sizing?
>
> No reason. It's just left over from debugging.
> So I'd like to remove it.
>
> --
> yamahata
>
>

Hi, 64-bit BARs still do not seem to be working.

When using the latest seabios the guest does not hit a "BUG:"
statement, but booting still fails

HPET: 1 timers in total, 0 timers will be used for per-cpu timer
divide error: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:

Pid: 1, comm: swapper Not tainted 2.6.35+ #299 /Bochs
RIP: 0010:[<ffffffff812a9b5b>]  [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b
RSP: 0018:ffff88007d7b3d80  EFLAGS: 00010246
RAX: 00038d7ea4c68000 RBX: ffff88007d062cc0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff817bb9b0
RBP: ffff88007d7b3dc0 R08: 00000000000080d0 R09: ffffc90000000000
R10: ffff88007d72b5a0 R11: 0000000000000000 R12: ffff88007d7b3dd0
R13: ffffc90000000000 R14: 0000000000000000 R15: ffffffff817a41c3
FS:  0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001a42000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88007d7b2000, task ffff88007d7b8000)
Stack:
 ffff88007f43ab90 ffff88007f43ab90 ffffffff81ca6174 ffffffff81b1f5e1
<0> 0000000000000000 0000000000000100 0000000000000100 0000000000000000
<0> ffff88007d7b3e80 ffffffff810294ea 00000000fed00000 ffffc90000000000
Call Trace:
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff81002069>] do_one_initcall+0x5e/0x159
 [<ffffffff81b0d72a>] kernel_init+0x19a/0x228
 [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
 [<ffffffff81b0d590>] ? kernel_init+0x0/0x228
 [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10
Code: 89 1d ca b2 b3 00 48 c1 ea 21 8b 73 34 49 c7 c7 c3 41 7a 81 48
8d 04 02 4c 89 f2 48 c7 c7 b0 b9 7b 81 48 c1 ea 20 48 89 d1 31 d2 <48>
f7 f1 83 7b 30 01 48 c7 c1 86 1c 7d 81 49 0f 46 cf 48 89 43
RIP  [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b
 RSP <ffff88007d7b3d80>
---[ end trace a7919e7f17c0a725 ]---
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G      D     2.6.35+ #299
Call Trace:
 [<ffffffff81459a85>] panic+0x8b/0x10b
 [<ffffffff81056a83>] ? exit_ptrace+0x38/0x121
 [<ffffffff8104f9e8>] do_exit+0x7a/0x722
 [<ffffffff8104c3bd>] ? spin_unlock_irqrestore+0xe/0x10
 [<ffffffff8104cfd6>] ? kmsg_dump+0x12b/0x145
 [<ffffffff8145ccc8>] oops_end+0xbf/0xc7
 [<ffffffff8100d299>] die+0x5a/0x63
 [<ffffffff8145c6d2>] do_trap+0x121/0x130
 [<ffffffff8100b560>] do_divide_error+0x96/0x9f
 [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b
 [<ffffffff8120cf80>] ? radix_tree_preload+0x34/0x88
 [<ffffffff8100a83b>] divide_error+0x1b/0x20
 [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea
 [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
 [<ffffffff81002069>] do_one_initcall+0x5e/0x159
 [<ffffffff81b0d72a>] kernel_init+0x19a/0x228
 [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
 [<ffffffff81b0d590>] ? kernel_init+0x0/0x228
 [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10

seabios output for the device:

PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
region 0: 0xf1020000
region 2: 0x00000000
init smm

Running the latest seabios, the debug output only remaps the BAR
twice, once with a potentially correct address of e0000000

pci_read_config: (val) 0xe0000004 <- 0x18 (addr)

...snip...

pci_default_write_config: (val) 0x0 -> 0x18 (addr)
IVSHMEM: guest pci addr = e0000000, guest h/w addr = 2164588544, size = 20000000
pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
pci_default_write_config: (val) 0x0 -> 0x18 (addr)
pci_read_config: (val) 0x0 <- 0x1c (addr)
pci_default_write_config: (val) 0x0 -> 0x1c (addr)
IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr =
2164588544, size = 20000000
pci_read_config: (val) 0xffffffff <- 0x1c (addr)
pci_default_write_config: (val) 0x0 -> 0x1c (addr)
pci_read_config: (val) 0x0 <- 0x20 (addr)

the pci writes are all still 0, I can't see how my debug statements
are incorrect though.  Below is my trivial pci config debugging patch.

Cam

Comments

Isaku Yamahata Aug. 25, 2010, 2:21 a.m. UTC | #1
On Tue, Aug 24, 2010 at 10:52:36AM -0600, Cam Macdonell wrote:
> Hi, 64-bit BARs still do not seem to be working.
> 
> When using the latest seabios the guest does not hit a "BUG:"
> statement, but booting still fails
> 
> HPET: 1 timers in total, 0 timers will be used for per-cpu timer
> divide error: 0000 [#1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> 
> Pid: 1, comm: swapper Not tainted 2.6.35+ #299 /Bochs
> RIP: 0010:[<ffffffff812a9b5b>]  [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b
> RSP: 0018:ffff88007d7b3d80  EFLAGS: 00010246
> RAX: 00038d7ea4c68000 RBX: ffff88007d062cc0 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff817bb9b0
> RBP: ffff88007d7b3dc0 R08: 00000000000080d0 R09: ffffc90000000000
> R10: ffff88007d72b5a0 R11: 0000000000000000 R12: ffff88007d7b3dd0
> R13: ffffc90000000000 R14: 0000000000000000 R15: ffffffff817a41c3
> FS:  0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000001a42000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 1, threadinfo ffff88007d7b2000, task ffff88007d7b8000)
> Stack:
>  ffff88007f43ab90 ffff88007f43ab90 ffffffff81ca6174 ffffffff81b1f5e1
> <0> 0000000000000000 0000000000000100 0000000000000100 0000000000000000
> <0> ffff88007d7b3e80 ffffffff810294ea 00000000fed00000 ffffc90000000000
> Call Trace:
>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>  [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115
>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>  [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea
>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>  [<ffffffff81002069>] do_one_initcall+0x5e/0x159
>  [<ffffffff81b0d72a>] kernel_init+0x19a/0x228
>  [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81b0d590>] ? kernel_init+0x0/0x228
>  [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10
> Code: 89 1d ca b2 b3 00 48 c1 ea 21 8b 73 34 49 c7 c7 c3 41 7a 81 48
> 8d 04 02 4c 89 f2 48 c7 c7 b0 b9 7b 81 48 c1 ea 20 48 89 d1 31 d2 <48>
> f7 f1 83 7b 30 01 48 c7 c1 86 1c 7d 81 49 0f 46 cf 48 89 43
> RIP  [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b
>  RSP <ffff88007d7b3d80>
> ---[ end trace a7919e7f17c0a725 ]---
> Kernel panic - not syncing: Attempted to kill init!
> Pid: 1, comm: swapper Tainted: G      D     2.6.35+ #299
> Call Trace:
>  [<ffffffff81459a85>] panic+0x8b/0x10b
>  [<ffffffff81056a83>] ? exit_ptrace+0x38/0x121
>  [<ffffffff8104f9e8>] do_exit+0x7a/0x722
>  [<ffffffff8104c3bd>] ? spin_unlock_irqrestore+0xe/0x10
>  [<ffffffff8104cfd6>] ? kmsg_dump+0x12b/0x145
>  [<ffffffff8145ccc8>] oops_end+0xbf/0xc7
>  [<ffffffff8100d299>] die+0x5a/0x63
>  [<ffffffff8145c6d2>] do_trap+0x121/0x130
>  [<ffffffff8100b560>] do_divide_error+0x96/0x9f
>  [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b
>  [<ffffffff8120cf80>] ? radix_tree_preload+0x34/0x88
>  [<ffffffff8100a83b>] divide_error+0x1b/0x20
>  [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b
>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>  [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115
>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>  [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea
>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>  [<ffffffff81002069>] do_one_initcall+0x5e/0x159
>  [<ffffffff81b0d72a>] kernel_init+0x19a/0x228
>  [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81b0d590>] ? kernel_init+0x0/0x228
>  [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10
> 
> seabios output for the device:
> 
> PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
> region 0: 0xf1020000
> region 2: 0x00000000
> init smm
> 
> Running the latest seabios, the debug output only remaps the BAR
> twice, once with a potentially correct address of e0000000
> 
> pci_read_config: (val) 0xe0000004 <- 0x18 (addr)

The upstream seabios lacks an overflow check at the moment.
I haven't found time to address PMM yet.


> pci_default_write_config: (val) 0x0 -> 0x18 (addr)
> IVSHMEM: guest pci addr = e0000000, guest h/w addr = 2164588544, size = 20000000
> pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
> pci_default_write_config: (val) 0x0 -> 0x18 (addr)
> pci_read_config: (val) 0x0 <- 0x1c (addr)
> pci_default_write_config: (val) 0x0 -> 0x1c (addr)
> IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr =
> 2164588544, size = 20000000
> pci_read_config: (val) 0xffffffff <- 0x1c (addr)
> pci_default_write_config: (val) 0x0 -> 0x1c (addr)
> pci_read_config: (val) 0x0 <- 0x20 (addr)
> 
> the pci writes are all still 0, I can't see how my debug statements
> are incorrect though.  Below is my trivial pci config debugging patch.

The debug output should be printed before the for-loop.

    for (i = 0; i < l && addr + i < config_size; val >>= 8, ++i) {
                                                 ^^^^^^^^^ 
                                                 Here val becomes 0
        uint8_t wmask = d->wmask[addr + i];
        d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
    }
Cam Macdonell Aug. 27, 2010, 7:35 p.m. UTC | #2
On Tue, Aug 24, 2010 at 8:21 PM, Isaku Yamahata <yamahata@valinux.co.jp> wrote:
> On Tue, Aug 24, 2010 at 10:52:36AM -0600, Cam Macdonell wrote:
>> Hi, 64-bit BARs still do not seem to be working.
>>
>> When using the latest seabios the guest does not hit a "BUG:"
>> statement, but booting still fails
>>
>> HPET: 1 timers in total, 0 timers will be used for per-cpu timer
>> divide error: 0000 [#1] SMP
>> last sysfs file:
>> CPU 0
>> Modules linked in:
>>
>> Pid: 1, comm: swapper Not tainted 2.6.35+ #299 /Bochs
>> RIP: 0010:[<ffffffff812a9b5b>]  [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b
>> RSP: 0018:ffff88007d7b3d80  EFLAGS: 00010246
>> RAX: 00038d7ea4c68000 RBX: ffff88007d062cc0 RCX: 0000000000000000
>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff817bb9b0
>> RBP: ffff88007d7b3dc0 R08: 00000000000080d0 R09: ffffc90000000000
>> R10: ffff88007d72b5a0 R11: 0000000000000000 R12: ffff88007d7b3dd0
>> R13: ffffc90000000000 R14: 0000000000000000 R15: ffffffff817a41c3
>> FS:  0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: 0000000000000000 CR3: 0000000001a42000 CR4: 00000000000006f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process swapper (pid: 1, threadinfo ffff88007d7b2000, task ffff88007d7b8000)
>> Stack:
>>  ffff88007f43ab90 ffff88007f43ab90 ffffffff81ca6174 ffffffff81b1f5e1
>> <0> 0000000000000000 0000000000000100 0000000000000100 0000000000000000
>> <0> ffff88007d7b3e80 ffffffff810294ea 00000000fed00000 ffffc90000000000
>> Call Trace:
>>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>>  [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115
>>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>>  [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea
>>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>>  [<ffffffff81002069>] do_one_initcall+0x5e/0x159
>>  [<ffffffff81b0d72a>] kernel_init+0x19a/0x228
>>  [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
>>  [<ffffffff81b0d590>] ? kernel_init+0x0/0x228
>>  [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10
>> Code: 89 1d ca b2 b3 00 48 c1 ea 21 8b 73 34 49 c7 c7 c3 41 7a 81 48
>> 8d 04 02 4c 89 f2 48 c7 c7 b0 b9 7b 81 48 c1 ea 20 48 89 d1 31 d2 <48>
>> f7 f1 83 7b 30 01 48 c7 c1 86 1c 7d 81 49 0f 46 cf 48 89 43
>> RIP  [<ffffffff812a9b5b>] hpet_alloc+0x12c/0x35b
>>  RSP <ffff88007d7b3d80>
>> ---[ end trace a7919e7f17c0a725 ]---
>> Kernel panic - not syncing: Attempted to kill init!
>> Pid: 1, comm: swapper Tainted: G      D     2.6.35+ #299
>> Call Trace:
>>  [<ffffffff81459a85>] panic+0x8b/0x10b
>>  [<ffffffff81056a83>] ? exit_ptrace+0x38/0x121
>>  [<ffffffff8104f9e8>] do_exit+0x7a/0x722
>>  [<ffffffff8104c3bd>] ? spin_unlock_irqrestore+0xe/0x10
>>  [<ffffffff8104cfd6>] ? kmsg_dump+0x12b/0x145
>>  [<ffffffff8145ccc8>] oops_end+0xbf/0xc7
>>  [<ffffffff8100d299>] die+0x5a/0x63
>>  [<ffffffff8145c6d2>] do_trap+0x121/0x130
>>  [<ffffffff8100b560>] do_divide_error+0x96/0x9f
>>  [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b
>>  [<ffffffff8120cf80>] ? radix_tree_preload+0x34/0x88
>>  [<ffffffff8100a83b>] divide_error+0x1b/0x20
>>  [<ffffffff812a9b5b>] ? hpet_alloc+0x12c/0x35b
>>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>>  [<ffffffff810294ea>] hpet_reserve_platform_timers+0x10b/0x115
>>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>>  [<ffffffff81b1f64c>] hpet_late_init+0x6b/0xea
>>  [<ffffffff81b1f5e1>] ? hpet_late_init+0x0/0xea
>>  [<ffffffff81002069>] do_one_initcall+0x5e/0x159
>>  [<ffffffff81b0d72a>] kernel_init+0x19a/0x228
>>  [<ffffffff8100aa24>] kernel_thread_helper+0x4/0x10
>>  [<ffffffff81b0d590>] ? kernel_init+0x0/0x228
>>  [<ffffffff8100aa20>] ? kernel_thread_helper+0x0/0x10
>>
>> seabios output for the device:
>>
>> PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
>> region 0: 0xf1020000
>> region 2: 0x00000000
>> init smm
>>
>> Running the latest seabios, the debug output only remaps the BAR
>> twice, once with a potentially correct address of e0000000
>>
>> pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
>
> The upstream seabios lacks an overflow check at the moment.
> I haven't found time to address PMM yet.
>
>
>> pci_default_write_config: (val) 0x0 -> 0x18 (addr)
>> IVSHMEM: guest pci addr = e0000000, guest h/w addr = 2164588544, size = 20000000
>> pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
>> pci_default_write_config: (val) 0x0 -> 0x18 (addr)
>> pci_read_config: (val) 0x0 <- 0x1c (addr)
>> pci_default_write_config: (val) 0x0 -> 0x1c (addr)
>> IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr =
>> 2164588544, size = 20000000
>> pci_read_config: (val) 0xffffffff <- 0x1c (addr)
>> pci_default_write_config: (val) 0x0 -> 0x1c (addr)
>> pci_read_config: (val) 0x0 <- 0x20 (addr)
>>
>> the pci writes are all still 0, I can't see how my debug statements
>> are incorrect though.  Below is my trivial pci config debugging patch.
>
> The debug output should be printed before the for-loop.
>
>    for (i = 0; i < l && addr + i < config_size; val >>= 8, ++i) {
>                                                 ^^^^^^^^^
>                                                 Here val becomes 0
>        uint8_t wmask = d->wmask[addr + i];
>        d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>    }
>
> --
> yamahata
>
>

Ah, thanks.  I moved the debug statement and now the writes are the
proper values.

In seabios-kvm, it seems the guest is writing the value c040 to
0x1c, which results in the 48-bit address c04000000000.

pci_read_config: (val) 0x1af4 <- 0x0 (addr)
pci_read_config: (val) 0x0 <- 0xe (addr)
pci_read_config: (val) 0x1af4 <- 0x0 (addr)
pci_read_config: (val) 0x1110 <- 0x2 (addr)
pci_read_config: (val) 0x0 <- 0xe (addr)
pci_read_config: (val) 0x1af4 <- 0x0 (addr)
pci_read_config: (val) 0x0 <- 0xe (addr)
pci_read_config: (val) 0x500 <- 0xa (addr)
pci_read_config: (val) 0x1af4 <- 0x0 (addr)
pci_read_config: (val) 0x1110 <- 0x2 (addr)
pci_read_config: (val) 0x0 <- 0x10 (addr)
pci_write_config: (val) 0xffffffff -> 0x10 (addr)
pci_read_config: (val) 0xffffff00 <- 0x10 (addr)
pci_write_config: (val) 0x0 -> 0x10 (addr)
pci_read_config: (val) 0x0 <- 0x10 (addr)
pci_write_config: (val) 0xf1020000 -> 0x10 (addr)
pci_read_config: (val) 0x0 <- 0x14 (addr)
pci_write_config: (val) 0xffffffff -> 0x14 (addr)
pci_read_config: (val) 0x0 <- 0x14 (addr)
pci_write_config: (val) 0x0 -> 0x14 (addr)
pci_read_config: (val) 0x4 <- 0x18 (addr)
pci_write_config: (val) 0xffffffff -> 0x18 (addr)
pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
pci_write_config: (val) 0x4 -> 0x18 (addr)
pci_read_config: (val) 0x4 <- 0x18 (addr)
pci_write_config: (val) 0x0 -> 0x18 (addr)
pci_read_config: (val) 0x0 <- 0x1c (addr)
pci_write_config: (val) 0xffffffff -> 0x1c (addr)
pci_read_config: (val) 0xffffffff <- 0x1c (addr)
pci_write_config: (val) 0x0 -> 0x1c (addr)
pci_read_config: (val) 0x0 <- 0x1c (addr)
pci_write_config: (val) 0xc040 -> 0x1c (addr)

<snip>

pci_read_config: (val) 0x0 <- 0x4 (addr)
pci_write_config: (val) 0x3 -> 0x4 (addr)
IVSHMEM: guest pci addr = c04000000000, guest h/w addr = 2164588544,
size = 20000000
pci_read_config: (val) 0x1 <- 0x3d (addr)

<snip>

pci_read_config: (val) 0x4 <- 0x18 (addr)
pci_write_config: (val) 0xffffffff -> 0x18 (addr)
IVSHMEM: guest pci addr = c040e0000000, guest h/w addr = 2164588544,
size = 20000000
pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
pci_write_config: (val) 0x4 -> 0x18 (addr)
IVSHMEM: guest pci addr = c04000000000, guest h/w addr = 2164588544,
size = 20000000
pci_read_config: (val) 0xc040 <- 0x1c (addr)
pci_write_config: (val) 0xffffffff -> 0x1c (addr)
IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr =
2164588544, size = 20000000
pci_read_config: (val) 0xffffffff <- 0x1c (addr)
pci_write_config: (val) 0xc040 -> 0x1c (addr)
IVSHMEM: guest pci addr = c04000000000, guest h/w addr = 2164588544,
size = 20000000
pci_read_config: (val) 0x0 <- 0x20 (addr)
pci_write_config: (val) 0xffffffff -> 0x20 (addr)
pci_read_config: (val) 0x0 <- 0x20 (addr)
pci_write_config: (val) 0x0 -> 0x20 (addr)

In upstream seabios.git, the c040 is not written, but the device
returns ffffffff from 0x1c (only reads and writes to 0x18 and 0x1c are
shown below)

pci_read_config: (val) 0x4 <- 0x18 (addr)
pci_write_config: (val) 0xffffffff -> 0x18 (addr)
pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
pci_write_config: (val) 0x4 -> 0x18 (addr)
pci_read_config: (val) 0x4 <- 0x18 (addr)
pci_write_config: (val) 0x0 -> 0x18 (addr)
pci_write_config: (val) 0x0 -> 0x1c (addr)
pci_read_config: (val) 0x4 <- 0x18 (addr)
pci_write_config: (val) 0xffffffff -> 0x18 (addr)
IVSHMEM: guest pci addr = e0000000, guest h/w addr = 2164588544, size = 20000000
pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
pci_write_config: (val) 0x4 -> 0x18 (addr)
pci_read_config: (val) 0x0 <- 0x1c (addr)
pci_write_config: (val) 0xffffffff -> 0x1c (addr)
IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr =
2164588544, size = 20000000
pci_read_config: (val) 0xffffffff <- 0x1c (addr)
pci_write_config: (val) 0x0 -> 0x1c (addr)

Thanks,
Cam
Isaku Yamahata Aug. 30, 2010, 2:36 a.m. UTC | #3
On Fri, Aug 27, 2010 at 01:35:23PM -0600, Cam Macdonell wrote:
> In upstream seabios.git, the c040 is not written, but the device
> returns ffffffff from 0x1c (only reads and writes to 0x18 and 0x1c are
> shown below)
> 
> pci_read_config: (val) 0x4 <- 0x18 (addr)
> pci_write_config: (val) 0xffffffff -> 0x18 (addr)
> pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
> pci_write_config: (val) 0x4 -> 0x18 (addr)
> pci_read_config: (val) 0x4 <- 0x18 (addr)
	this read is useless. I sent out the patch.

> pci_write_config: (val) 0x0 -> 0x18 (addr)
> pci_write_config: (val) 0x0 -> 0x1c (addr)

It looks like an overflow occurred. You need the overflow patch.
As the size is huge, seabios can't map it.
Anyway, even with the overflow patch, the expected result is
to leave the BAR unmodified (at its initial value).


> pci_read_config: (val) 0x4 <- 0x18 (addr)
> pci_write_config: (val) 0xffffffff -> 0x18 (addr)
> IVSHMEM: guest pci addr = e0000000, guest h/w addr = 2164588544, size = 20000000
> pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
> pci_write_config: (val) 0x4 -> 0x18 (addr)
> pci_read_config: (val) 0x0 <- 0x1c (addr)
> pci_write_config: (val) 0xffffffff -> 0x1c (addr)
> IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr =
> 2164588544, size = 20000000
> pci_read_config: (val) 0xffffffff <- 0x1c (addr)
> pci_write_config: (val) 0x0 -> 0x1c (addr)

Is the above done by the guest OS?
The guest OS doesn't seem to know about 64-bit BARs.

Patch

diff --git a/hw/pci.c b/hw/pci.c
index 70dbace..01087b1 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1159,6 +1159,8 @@  static uint32_t pci_read_config(PCIDevice *d,

     len = MIN(len, pci_config_size(d) - address);
     memcpy(&val, d->config + address, len);
+    if (strncmp(d->name, "ivshmem", 7) == 0)
+        printf("pci_read_config: (val) 0x%x <- 0x%x (addr)\n", val, address);
     return le32_to_cpu(val);
 }

@@ -1219,6 +1221,8 @@  void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
     }

+    if (strncmp(d->name, "ivshmem", 7) == 0)
+        printf("pci_write_config: (val) 0x%x -> 0x%x (addr)\n", val, addr);
 #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT
     if (kvm_enabled() && kvm_irqchip_in_kernel() &&
         addr >= PIIX_CONFIG_IRQ_ROUTE &&