diff mbox

pci: do not pci_update_mappings when guest gets bar size

Message ID 1413284654-13420-1-git-send-email-arei.gonglei@huawei.com
State New
Headers show

Commit Message

Gonglei (Arei) Oct. 14, 2014, 11:04 a.m. UTC
From: ChenLiang <chenliang88@huawei.com>

Power-up software can determine how much address space the device
requires by writing a value of all 1's to the register and then
reading the value back(PCI specification). Qemu should not do
pci_update_mappings. Qemu may exit, because the wrong address of
this bar is overlap with other memslots.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 hw/pci/pci.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Michael S. Tsirkin Oct. 14, 2014, 11:20 a.m. UTC | #1
On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Power-up software can determine how much address space the device
> requires by writing a value of all 1's to the register and then
> reading the value back(PCI specification). Qemu should not do
> pci_update_mappings. Qemu may exit, because the wrong address of
> this bar is overlap with other memslots.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>

This is at best a work-around.
Overlapping is observed in practice, qemu really shouldn't exit when
this happens.
So we should find the root cause and fix it there instead of
adding work-arounds in PCI core.

With which device do you observe this?


> ---
>  hw/pci/pci.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 6ce75aa..4d44b44 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>      }
> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> -        range_covers_byte(addr, l, PCI_COMMAND))
> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
>          pci_update_mappings(d);
> -
> +    }
>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
>          pci_update_irq_disabled(d, was_irq_disabled);
>          memory_region_set_enabled(&d->bus_master_enable_region,
> -- 
> 1.7.12.4
>
ChenLiang Oct. 14, 2014, 11:41 a.m. UTC | #2
We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
of pci bar is enough big, it also  overlaps with other memslots.

the root cause is:

pci_default_write_config will do that:
    for (i = 0; i < l; val >>= 8, ++i) {
        uint8_t wmask = d->wmask[addr + i];
        uint8_t w1cmask = d->w1cmask[addr + i];
        assert(!(wmask & w1cmask));
        d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
        d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
    }

*(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
This range overlap with private memslot in the old kmod.

then pci_update_mappings will update memslot.

On 2014/10/14 19:20, Michael S. Tsirkin wrote:

> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
>> From: ChenLiang <chenliang88@huawei.com>
>>
>> Power-up software can determine how much address space the device
>> requires by writing a value of all 1's to the register and then
>> reading the value back(PCI specification). Qemu should not do
>> pci_update_mappings. Qemu may exit, because the wrong address of
>> this bar is overlap with other memslots.
>>
>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> 
> This is at best a work-around.
> Overlapping is observed in practice, qemu really shouldn't exit when
> this happens.
> So we should find the root cause and fix it there instead of
> adding work-arounds in PCI core.
> 
> With which device do you observe this?
> 
> 
>> ---
>>  hw/pci/pci.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index 6ce75aa..4d44b44 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>      }
>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
>> -        range_covers_byte(addr, l, PCI_COMMAND))
>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
>>          pci_update_mappings(d);
>> -
>> +    }
>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
>>          pci_update_irq_disabled(d, was_irq_disabled);
>>          memory_region_set_enabled(&d->bus_master_enable_region,
>> -- 
>> 1.7.12.4
>>
> 
> .
>
Michael S. Tsirkin Oct. 14, 2014, 11:48 a.m. UTC | #3
On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
> of pci bar is enough big, it also  overlaps with other memslots.

Of course but it should not cause a crash.
If you need the overlapping memslot available during the programming
process, increase it's priority.

> the root cause is:
> 
> pci_default_write_config will do that:
>     for (i = 0; i < l; val >>= 8, ++i) {
>         uint8_t wmask = d->wmask[addr + i];
>         uint8_t w1cmask = d->w1cmask[addr + i];
>         assert(!(wmask & w1cmask));
>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>     }
> 
> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
> This range overlap with private memslot in the old kmod.
> 
> then pci_update_mappings will update memslot.
> 
> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
> 
> > On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
> >> From: ChenLiang <chenliang88@huawei.com>
> >>
> >> Power-up software can determine how much address space the device
> >> requires by writing a value of all 1's to the register and then
> >> reading the value back(PCI specification). Qemu should not do
> >> pci_update_mappings. Qemu may exit, because the wrong address of
> >> this bar is overlap with other memslots.
> >>
> >> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > 
> > This is at best a work-around.
> > Overlapping is observed in practice, qemu really shouldn't exit when
> > this happens.
> > So we should find the root cause and fix it there instead of
> > adding work-arounds in PCI core.
> > 
> > With which device do you observe this?
> > 
> > 
> >> ---
> >>  hw/pci/pci.c | 8 ++++----
> >>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >> index 6ce75aa..4d44b44 100644
> >> --- a/hw/pci/pci.c
> >> +++ b/hw/pci/pci.c
> >> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
> >>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>      }
> >> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> >> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> >> -        range_covers_byte(addr, l, PCI_COMMAND))
> >> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
> >> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
> >>          pci_update_mappings(d);
> >> -
> >> +    }
> >>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
> >>          pci_update_irq_disabled(d, was_irq_disabled);
> >>          memory_region_set_enabled(&d->bus_master_enable_region,
> >> -- 
> >> 1.7.12.4
> >>
> > 
> > .
> > 
> 
>
Michael S. Tsirkin Oct. 14, 2014, 11:58 a.m. UTC | #4
On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
> of pci bar is enough big, it also  overlaps with other memslots.
> 
> the root cause is:
> 
> pci_default_write_config will do that:
>     for (i = 0; i < l; val >>= 8, ++i) {
>         uint8_t wmask = d->wmask[addr + i];
>         uint8_t w1cmask = d->w1cmask[addr + i];
>         assert(!(wmask & w1cmask));
>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>     }
> 
> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
> This range overlap with private memslot in the old kmod.
> 
> then pci_update_mappings will update memslot.


In fact, ever since
	83d08f2673504a299194dcac1657a13754b5932a
    pc: map PCI address space as catchall region for not mapped addresses

all pci memory has lower priority than ioapic at 0xfe0000000.

so ioapic will win, there should be no issue.

IOW this is not the root cause.



> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
> 
> > On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
> >> From: ChenLiang <chenliang88@huawei.com>
> >>
> >> Power-up software can determine how much address space the device
> >> requires by writing a value of all 1's to the register and then
> >> reading the value back(PCI specification). Qemu should not do
> >> pci_update_mappings. Qemu may exit, because the wrong address of
> >> this bar is overlap with other memslots.
> >>
> >> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > 
> > This is at best a work-around.
> > Overlapping is observed in practice, qemu really shouldn't exit when
> > this happens.
> > So we should find the root cause and fix it there instead of
> > adding work-arounds in PCI core.
> > 
> > With which device do you observe this?
> > 
> > 
> >> ---
> >>  hw/pci/pci.c | 8 ++++----
> >>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >> index 6ce75aa..4d44b44 100644
> >> --- a/hw/pci/pci.c
> >> +++ b/hw/pci/pci.c
> >> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
> >>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>      }
> >> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> >> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> >> -        range_covers_byte(addr, l, PCI_COMMAND))
> >> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
> >> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
> >>          pci_update_mappings(d);
> >> -
> >> +    }
> >>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
> >>          pci_update_irq_disabled(d, was_irq_disabled);
> >>          memory_region_set_enabled(&d->bus_master_enable_region,
> >> -- 
> >> 1.7.12.4
> >>
> > 
> > .
> > 
> 
>
ChenLiang Oct. 14, 2014, 12:15 p.m. UTC | #5
On 2014/10/14 19:48, Michael S. Tsirkin wrote:

> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
>> of pci bar is enough big, it also  overlaps with other memslots.
> 
> Of course but it should not cause a crash.
> If you need the overlapping memslot available during the programming
> process, increase it's priority.
> 

Yeah, I know the priority of memory region.
The problem is overlaping should not happen when one pci bar is not
overlap with any other memslots. But Qemu always do pci_update_mappings
when guest os writes pci bar. Actually, should not do pci_update_mappings
if var is 0xffffffff.

>> the root cause is:
>>
>> pci_default_write_config will do that:
>>     for (i = 0; i < l; val >>= 8, ++i) {
>>         uint8_t wmask = d->wmask[addr + i];
>>         uint8_t w1cmask = d->w1cmask[addr + i];
>>         assert(!(wmask & w1cmask));
>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>     }
>>
>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
>> This range overlap with private memslot in the old kmod.
>>
>> then pci_update_mappings will update memslot.
>>
>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
>>
>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
>>>> From: ChenLiang <chenliang88@huawei.com>
>>>>
>>>> Power-up software can determine how much address space the device
>>>> requires by writing a value of all 1's to the register and then
>>>> reading the value back(PCI specification). Qemu should not do
>>>> pci_update_mappings. Qemu may exit, because the wrong address of
>>>> this bar is overlap with other memslots.
>>>>
>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>>
>>> This is at best a work-around.
>>> Overlapping is observed in practice, qemu really shouldn't exit when
>>> this happens.
>>> So we should find the root cause and fix it there instead of
>>> adding work-arounds in PCI core.
>>>
>>> With which device do you observe this?
>>>
>>>
>>>> ---
>>>>  hw/pci/pci.c | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 6ce75aa..4d44b44 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>>      }
>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>          pci_update_mappings(d);
>>>> -
>>>> +    }
>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>          pci_update_irq_disabled(d, was_irq_disabled);
>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
>>>> -- 
>>>> 1.7.12.4
>>>>
>>>
>>> .
>>>
>>
>>
> 
> .
>
ChenLiang Oct. 14, 2014, 12:19 p.m. UTC | #6
On 2014/10/14 19:58, Michael S. Tsirkin wrote:

> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
>> of pci bar is enough big, it also  overlaps with other memslots.
>>
>> the root cause is:
>>
>> pci_default_write_config will do that:
>>     for (i = 0; i < l; val >>= 8, ++i) {
>>         uint8_t wmask = d->wmask[addr + i];
>>         uint8_t w1cmask = d->w1cmask[addr + i];
>>         assert(!(wmask & w1cmask));
>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>     }
>>
>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
>> This range overlap with private memslot in the old kmod.
>>
>> then pci_update_mappings will update memslot.
> 
> 
> In fact, ever since
> 	83d08f2673504a299194dcac1657a13754b5932a
>     pc: map PCI address space as catchall region for not mapped addresses
> 
> all pci memory has lower priority than ioapic at 0xfe0000000.
> 
> so ioapic will win, there should be no issue.
> 
> IOW this is not the root cause.
> 
> 
> 
>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
>>
>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
>>>> From: ChenLiang <chenliang88@huawei.com>
>>>>
>>>> Power-up software can determine how much address space the device
>>>> requires by writing a value of all 1's to the register and then
>>>> reading the value back(PCI specification). Qemu should not do
>>>> pci_update_mappings. Qemu may exit, because the wrong address of
>>>> this bar is overlap with other memslots.
>>>>
>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>>
>>> This is at best a work-around.
>>> Overlapping is observed in practice, qemu really shouldn't exit when
>>> this happens.
>>> So we should find the root cause and fix it there instead of
>>> adding work-arounds in PCI core.
>>>
>>> With which device do you observe this?
>>>
>>>
>>>> ---
>>>>  hw/pci/pci.c | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 6ce75aa..4d44b44 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>>      }
>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>          pci_update_mappings(d);
>>>> -
>>>> +    }
>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>          pci_update_irq_disabled(d, was_irq_disabled);
>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
>>>> -- 
>>>> 1.7.12.4
>>>>
>>>
>>> .
>>>
>>
>>
> 
> .
>
ChenLiang Oct. 14, 2014, 12:19 p.m. UTC | #7
On 2014/10/14 19:58, Michael S. Tsirkin wrote:

> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
>> of pci bar is enough big, it also  overlaps with other memslots.
>>
>> the root cause is:
>>
>> pci_default_write_config will do that:
>>     for (i = 0; i < l; val >>= 8, ++i) {
>>         uint8_t wmask = d->wmask[addr + i];
>>         uint8_t w1cmask = d->w1cmask[addr + i];
>>         assert(!(wmask & w1cmask));
>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>     }
>>
>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
>> This range overlap with private memslot in the old kmod.
>>
>> then pci_update_mappings will update memslot.
> 
> 
> In fact, ever since
> 	83d08f2673504a299194dcac1657a13754b5932a
>     pc: map PCI address space as catchall region for not mapped addresses
> 
> all pci memory has lower priority than ioapic at 0xfe0000000.
> 
> so ioapic will win, there should be no issue.
> 
> IOW this is not the root cause.


TSS_PRIVATE_MEMSLOT and IDENTITY_PAGETABLE_PRIVATE_MEMSLOT also will overlap.

> 
> 
> 
>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
>>
>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
>>>> From: ChenLiang <chenliang88@huawei.com>
>>>>
>>>> Power-up software can determine how much address space the device
>>>> requires by writing a value of all 1's to the register and then
>>>> reading the value back(PCI specification). Qemu should not do
>>>> pci_update_mappings. Qemu may exit, because the wrong address of
>>>> this bar is overlap with other memslots.
>>>>
>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>>
>>> This is at best a work-around.
>>> Overlapping is observed in practice, qemu really shouldn't exit when
>>> this happens.
>>> So we should find the root cause and fix it there instead of
>>> adding work-arounds in PCI core.
>>>
>>> With which device do you observe this?
>>>
>>>
>>>> ---
>>>>  hw/pci/pci.c | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 6ce75aa..4d44b44 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>>      }
>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>          pci_update_mappings(d);
>>>> -
>>>> +    }
>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>          pci_update_irq_disabled(d, was_irq_disabled);
>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
>>>> -- 
>>>> 1.7.12.4
>>>>
>>>
>>> .
>>>
>>
>>
> 
> .
>
Gonglei (Arei) Oct. 14, 2014, 12:23 p.m. UTC | #8
On 2014/10/14 20:15, chenliang (T) wrote:

> On 2014/10/14 19:48, Michael S. Tsirkin wrote:
> 
>> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
>>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
>>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
>>> of pci bar is enough big, it also  overlaps with other memslots.
>>
>> Of course but it should not cause a crash.
>> If you need the overlapping memslot available during the programming
>> process, increase it's priority.
>>
> 
> Yeah, I know the priority of memory region.
> The problem is overlaping should not happen when one pci bar is not
> overlap with any other memslots. But Qemu always do pci_update_mappings
> when guest os writes pci bar. Actually, should not do pci_update_mappings
> if var is 0xffffffff.
> 

The PCI spec says [Page 222]:
Decode (I/O or memory) of a register is disabled via the command register before sizing a
Base Address register. Software saves the original value of the Base Address register,
writes 0FFFFFFFFh to the register, then reads it back.

>>> the root cause is:
>>>
>>> pci_default_write_config will do that:
>>>     for (i = 0; i < l; val >>= 8, ++i) {
>>>         uint8_t wmask = d->wmask[addr + i];
>>>         uint8_t w1cmask = d->w1cmask[addr + i];
>>>         assert(!(wmask & w1cmask));
>>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>     }
>>>
>>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
>>> This range overlap with private memslot in the old kmod.
>>>
>>> then pci_update_mappings will update memslot.
>>>
>>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
>>>
>>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
>>>>> From: ChenLiang <chenliang88@huawei.com>
>>>>>
>>>>> Power-up software can determine how much address space the device
>>>>> requires by writing a value of all 1's to the register and then
>>>>> reading the value back(PCI specification). Qemu should not do
>>>>> pci_update_mappings. Qemu may exit, because the wrong address of
>>>>> this bar is overlap with other memslots.
>>>>>
>>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>>>
>>>> This is at best a work-around.
>>>> Overlapping is observed in practice, qemu really shouldn't exit when
>>>> this happens.
>>>> So we should find the root cause and fix it there instead of
>>>> adding work-arounds in PCI core.
>>>>
>>>> With which device do you observe this?

ivshmem device with 32M' bar2 size.

Best regards,
-Gonglei

>>>>
>>>>
>>>>> ---
>>>>>  hw/pci/pci.c | 8 ++++----
>>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>> index 6ce75aa..4d44b44 100644
>>>>> --- a/hw/pci/pci.c
>>>>> +++ b/hw/pci/pci.c
>>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
>>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>>>      }
>>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
>>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
>>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
>>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
>>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>>          pci_update_mappings(d);
>>>>> -
>>>>> +    }
>>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>>          pci_update_irq_disabled(d, was_irq_disabled);
>>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
>>>>> -- 
>>>>> 1.7.12.4
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>
>> .
>>
> 
> 
>
Michael S. Tsirkin Oct. 14, 2014, 12:27 p.m. UTC | #9
On Tue, Oct 14, 2014 at 08:15:10PM +0800, ChenLiang wrote:
> On 2014/10/14 19:48, Michael S. Tsirkin wrote:
> 
> > On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
> >> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
> >> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
> >> of pci bar is enough big, it also  overlaps with other memslots.
> > 
> > Of course but it should not cause a crash.
> > If you need the overlapping memslot available during the programming
> > process, increase it's priority.
> > 
> 
> Yeah, I know the priority of memory region.
> The problem is overlaping should not happen when one pci bar is not
> overlap with any other memslots. But Qemu always do pci_update_mappings
> when guest os writes pci bar. Actually, should not do pci_update_mappings
> if var is 0xffffffff.

Unfortunately your hack is not robust, so we can not include it.
PCI devices should support arbitrary addresses.
For example, if a device is programmed with an address
0xfe000000 then this is exactly the address it should claim.
So I am sorry, you will have to either debug the problem to understand what
is causing a crash, or tell us on the list how to reproduce it
so others on the list can debug it.


> >> the root cause is:
> >>
> >> pci_default_write_config will do that:
> >>     for (i = 0; i < l; val >>= 8, ++i) {
> >>         uint8_t wmask = d->wmask[addr + i];
> >>         uint8_t w1cmask = d->w1cmask[addr + i];
> >>         assert(!(wmask & w1cmask));
> >>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>     }
> >>
> >> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
> >> This range overlap with private memslot in the old kmod.
> >>
> >> then pci_update_mappings will update memslot.
> >>
> >> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
> >>
> >>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
> >>>> From: ChenLiang <chenliang88@huawei.com>
> >>>>
> >>>> Power-up software can determine how much address space the device
> >>>> requires by writing a value of all 1's to the register and then
> >>>> reading the value back(PCI specification). Qemu should not do
> >>>> pci_update_mappings. Qemu may exit, because the wrong address of
> >>>> this bar is overlap with other memslots.
> >>>>
> >>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >>>
> >>> This is at best a work-around.
> >>> Overlapping is observed in practice, qemu really shouldn't exit when
> >>> this happens.
> >>> So we should find the root cause and fix it there instead of
> >>> adding work-arounds in PCI core.
> >>>
> >>> With which device do you observe this?
> >>>
> >>>
> >>>> ---
> >>>>  hw/pci/pci.c | 8 ++++----
> >>>>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>>> index 6ce75aa..4d44b44 100644
> >>>> --- a/hw/pci/pci.c
> >>>> +++ b/hw/pci/pci.c
> >>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
> >>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>>>      }
> >>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> >>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> >>>> -        range_covers_byte(addr, l, PCI_COMMAND))
> >>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
> >>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>          pci_update_mappings(d);
> >>>> -
> >>>> +    }
> >>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>          pci_update_irq_disabled(d, was_irq_disabled);
> >>>>          memory_region_set_enabled(&d->bus_master_enable_region,
> >>>> -- 
> >>>> 1.7.12.4
> >>>>
> >>>
> >>> .
> >>>
> >>
> >>
> > 
> > .
> > 
> 
>
ChenLiang Oct. 14, 2014, 12:27 p.m. UTC | #10
On 2014/10/14 20:28, Michael S. Tsirkin wrote:

> On Tue, Oct 14, 2014 at 08:19:56PM +0800, ChenLiang wrote:
>> On 2014/10/14 19:58, Michael S. Tsirkin wrote:
>>
>>> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
>>>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
>>>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
>>>> of pci bar is enough big, it also  overlaps with other memslots.
>>>>
>>>> the root cause is:
>>>>
>>>> pci_default_write_config will do that:
>>>>     for (i = 0; i < l; val >>= 8, ++i) {
>>>>         uint8_t wmask = d->wmask[addr + i];
>>>>         uint8_t w1cmask = d->w1cmask[addr + i];
>>>>         assert(!(wmask & w1cmask));
>>>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>>     }
>>>>
>>>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
>>>> This range overlap with private memslot in the old kmod.
>>>>
>>>> then pci_update_mappings will update memslot.
>>>
>>>
>>> In fact, ever since
>>> 	83d08f2673504a299194dcac1657a13754b5932a
>>>     pc: map PCI address space as catchall region for not mapped addresses
>>>
>>> all pci memory has lower priority than ioapic at 0xfe0000000.
>>>
>>> so ioapic will win, there should be no issue.
>>>
>>> IOW this is not the root cause.
>>
>>
>> TSS_PRIVATE_MEMSLOT and IDENTITY_PAGETABLE_PRIVATE_MEMSLOT also will overlap.
> 
> I'm sorry, I can't find these symbols in qemu source.


These symbols is in kmod source.

Best regards
chenliang

> 
>>>
>>>
>>>
>>>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
>>>>
>>>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
>>>>>> From: ChenLiang <chenliang88@huawei.com>
>>>>>>
>>>>>> Power-up software can determine how much address space the device
>>>>>> requires by writing a value of all 1's to the register and then
>>>>>> reading the value back(PCI specification). Qemu should not do
>>>>>> pci_update_mappings. Qemu may exit, because the wrong address of
>>>>>> this bar is overlap with other memslots.
>>>>>>
>>>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>>>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>>>>
>>>>> This is at best a work-around.
>>>>> Overlapping is observed in practice, qemu really shouldn't exit when
>>>>> this happens.
>>>>> So we should find the root cause and fix it there instead of
>>>>> adding work-arounds in PCI core.
>>>>>
>>>>> With which device do you observe this?
>>>>>
>>>>>
>>>>>> ---
>>>>>>  hw/pci/pci.c | 8 ++++----
>>>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>> index 6ce75aa..4d44b44 100644
>>>>>> --- a/hw/pci/pci.c
>>>>>> +++ b/hw/pci/pci.c
>>>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
>>>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>>>>      }
>>>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
>>>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
>>>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
>>>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
>>>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>>>          pci_update_mappings(d);
>>>>>> -
>>>>>> +    }
>>>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>>>          pci_update_irq_disabled(d, was_irq_disabled);
>>>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
>>>>>> -- 
>>>>>> 1.7.12.4
>>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>>
>>>
>>> .
>>>
>>
>>
> 
> .
>
Michael S. Tsirkin Oct. 14, 2014, 12:28 p.m. UTC | #11
On Tue, Oct 14, 2014 at 08:19:56PM +0800, ChenLiang wrote:
> On 2014/10/14 19:58, Michael S. Tsirkin wrote:
> 
> > On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
> >> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
> >> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
> >> of pci bar is enough big, it also  overlaps with other memslots.
> >>
> >> the root cause is:
> >>
> >> pci_default_write_config will do that:
> >>     for (i = 0; i < l; val >>= 8, ++i) {
> >>         uint8_t wmask = d->wmask[addr + i];
> >>         uint8_t w1cmask = d->w1cmask[addr + i];
> >>         assert(!(wmask & w1cmask));
> >>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>     }
> >>
> >> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
> >> This range overlap with private memslot in the old kmod.
> >>
> >> then pci_update_mappings will update memslot.
> > 
> > 
> > In fact, ever since
> > 	83d08f2673504a299194dcac1657a13754b5932a
> >     pc: map PCI address space as catchall region for not mapped addresses
> > 
> > all pci memory has lower priority than ioapic at 0xfe0000000.
> > 
> > so ioapic will win, there should be no issue.
> > 
> > IOW this is not the root cause.
> 
> 
> TSS_PRIVATE_MEMSLOT and IDENTITY_PAGETABLE_PRIVATE_MEMSLOT also will overlap.

I'm sorry, I can't find these symbols in qemu source.

> > 
> > 
> > 
> >> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
> >>
> >>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
> >>>> From: ChenLiang <chenliang88@huawei.com>
> >>>>
> >>>> Power-up software can determine how much address space the device
> >>>> requires by writing a value of all 1's to the register and then
> >>>> reading the value back(PCI specification). Qemu should not do
> >>>> pci_update_mappings. Qemu may exit, because the wrong address of
> >>>> this bar is overlap with other memslots.
> >>>>
> >>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >>>
> >>> This is at best a work-around.
> >>> Overlapping is observed in practice, qemu really shouldn't exit when
> >>> this happens.
> >>> So we should find the root cause and fix it there instead of
> >>> adding work-arounds in PCI core.
> >>>
> >>> With which device do you observe this?
> >>>
> >>>
> >>>> ---
> >>>>  hw/pci/pci.c | 8 ++++----
> >>>>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>>> index 6ce75aa..4d44b44 100644
> >>>> --- a/hw/pci/pci.c
> >>>> +++ b/hw/pci/pci.c
> >>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
> >>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>>>      }
> >>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> >>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> >>>> -        range_covers_byte(addr, l, PCI_COMMAND))
> >>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
> >>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>          pci_update_mappings(d);
> >>>> -
> >>>> +    }
> >>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>          pci_update_irq_disabled(d, was_irq_disabled);
> >>>>          memory_region_set_enabled(&d->bus_master_enable_region,
> >>>> -- 
> >>>> 1.7.12.4
> >>>>
> >>>
> >>> .
> >>>
> >>
> >>
> > 
> > .
> > 
> 
>
Michael S. Tsirkin Oct. 14, 2014, 12:31 p.m. UTC | #12
On Tue, Oct 14, 2014 at 08:23:08PM +0800, Gonglei wrote:
> On 2014/10/14 20:15, chenliang (T) wrote:
> 
> > On 2014/10/14 19:48, Michael S. Tsirkin wrote:
> > 
> >> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
> >>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
> >>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
> >>> of pci bar is enough big, it also  overlaps with other memslots.
> >>
> >> Of course but it should not cause a crash.
> >> If you need the overlapping memslot available during the programming
> >> process, increase it's priority.
> >>
> > 
> > Yeah, I know the priority of memory region.
> > The problem is overlaping should not happen when one pci bar is not
> > overlap with any other memslots. But Qemu always do pci_update_mappings
> > when guest os writes pci bar. Actually, should not do pci_update_mappings
> > if var is 0xffffffff.
> > 
> 
> The PCI spec says [Page 222]:
> Decode (I/O or memory) of a register is disabled via the command register before sizing a
> Base Address register. Software saves the original value of the Base Address register,
> writes 0FFFFFFFFh to the register, then reads it back.

Yes. Unfortunately many guests ignore the spec, they
size an enabled BAR and hope for the best.
This typically works on real hardware and should work
within a VM as well.



> >>> the root cause is:
> >>>
> >>> pci_default_write_config will do that:
> >>>     for (i = 0; i < l; val >>= 8, ++i) {
> >>>         uint8_t wmask = d->wmask[addr + i];
> >>>         uint8_t w1cmask = d->w1cmask[addr + i];
> >>>         assert(!(wmask & w1cmask));
> >>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>>     }
> >>>
> >>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
> >>> This range overlap with private memslot in the old kmod.
> >>>
> >>> then pci_update_mappings will update memslot.
> >>>
> >>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
> >>>
> >>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
> >>>>> From: ChenLiang <chenliang88@huawei.com>
> >>>>>
> >>>>> Power-up software can determine how much address space the device
> >>>>> requires by writing a value of all 1's to the register and then
> >>>>> reading the value back(PCI specification). Qemu should not do
> >>>>> pci_update_mappings. Qemu may exit, because the wrong address of
> >>>>> this bar is overlap with other memslots.
> >>>>>
> >>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >>>>
> >>>> This is at best a work-around.
> >>>> Overlapping is observed in practice, qemu really shouldn't exit when
> >>>> this happens.
> >>>> So we should find the root cause and fix it there instead of
> >>>> adding work-arounds in PCI core.
> >>>>
> >>>> With which device do you observe this?
> 
> ivshmem device with 32M' bar2 size.
> 
> Best regards,
> -Gonglei
> 
> >>>>
> >>>>
> >>>>> ---
> >>>>>  hw/pci/pci.c | 8 ++++----
> >>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>>
> >>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>>>> index 6ce75aa..4d44b44 100644
> >>>>> --- a/hw/pci/pci.c
> >>>>> +++ b/hw/pci/pci.c
> >>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
> >>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>>>>      }
> >>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> >>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> >>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
> >>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
> >>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>>          pci_update_mappings(d);
> >>>>> -
> >>>>> +    }
> >>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>>          pci_update_irq_disabled(d, was_irq_disabled);
> >>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
> >>>>> -- 
> >>>>> 1.7.12.4
> >>>>>
> >>>>
> >>>> .
> >>>>
> >>>
> >>>
> >>
> >> .
> >>
> > 
> > 
> > 
> 
>
ChenLiang Oct. 14, 2014, 12:59 p.m. UTC | #13
On 2014/10/14 20:27, Michael S. Tsirkin wrote:

> On Tue, Oct 14, 2014 at 08:15:10PM +0800, ChenLiang wrote:
>> On 2014/10/14 19:48, Michael S. Tsirkin wrote:
>>
>>> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
>>>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
>>>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
>>>> of pci bar is enough big, it also  overlaps with other memslots.
>>>
>>> Of course but it should not cause a crash.
>>> If you need the overlapping memslot available during the programming
>>> process, increase it's priority.
>>>
>>
>> Yeah, I know the priority of memory region.
>> The problem is overlaping should not happen when one pci bar is not
>> overlap with any other memslots. But Qemu always do pci_update_mappings
>> when guest os writes pci bar. Actually, should not do pci_update_mappings
>> if var is 0xffffffff.
> 
> Unfortunately your hack is not robust, so we can not include it.
> PCI devices should support arbitrary addresses.
> For example, if a device is programmed with an address
> 0xfe000000 then this is exactly the address it should claim.
> So I am sorry, you will have to either debug the problem to understand what
> is causing a crash, or tell us on the list how to reproduce it
> so others on the list can debug it.
> 
> 

For example:
guest os: suse10sp1
device: ivshmem 32MB
kmod: not include patch "KVM: Fix user memslot overlap check"

qemu will exit when vm start.

But guest os suse 11 sp2 will be ok.

Because the old linux kernel don't disable response in Memory space when it gets the size of pci bar.

ps: I am not sure whether "KVM: Fix user memslot overlap check" skip check overlaping with private memslot is safe.

The root cause is when guest write 0xffffffff to get size of pci bar. We should not do pci_update_mappings in pci_default_write_config.

>>>> the root cause is:
>>>>
>>>> pci_default_write_config will do that:
>>>>     for (i = 0; i < l; val >>= 8, ++i) {
>>>>         uint8_t wmask = d->wmask[addr + i];
>>>>         uint8_t w1cmask = d->w1cmask[addr + i];
>>>>         assert(!(wmask & w1cmask));
>>>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>>     }
>>>>
>>>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
>>>> This range overlap with private memslot in the old kmod.
>>>>
>>>> then pci_update_mappings will update memslot.
>>>>
>>>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
>>>>
>>>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
>>>>>> From: ChenLiang <chenliang88@huawei.com>
>>>>>>
>>>>>> Power-up software can determine how much address space the device
>>>>>> requires by writing a value of all 1's to the register and then
>>>>>> reading the value back(PCI specification). Qemu should not do
>>>>>> pci_update_mappings. Qemu may exit, because the wrong address of
>>>>>> this bar is overlap with other memslots.
>>>>>>
>>>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>>>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>>>>
>>>>> This is at best a work-around.
>>>>> Overlapping is observed in practice, qemu really shouldn't exit when
>>>>> this happens.
>>>>> So we should find the root cause and fix it there instead of
>>>>> adding work-arounds in PCI core.
>>>>>
>>>>> With which device do you observe this?
>>>>>
>>>>>
>>>>>> ---
>>>>>>  hw/pci/pci.c | 8 ++++----
>>>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>> index 6ce75aa..4d44b44 100644
>>>>>> --- a/hw/pci/pci.c
>>>>>> +++ b/hw/pci/pci.c
>>>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
>>>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
>>>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
>>>>>>      }
>>>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
>>>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
>>>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
>>>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
>>>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
>>>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>>>          pci_update_mappings(d);
>>>>>> -
>>>>>> +    }
>>>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
>>>>>>          pci_update_irq_disabled(d, was_irq_disabled);
>>>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
>>>>>> -- 
>>>>>> 1.7.12.4
>>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>>
>>>
>>> .
>>>
>>
>>
> 
> .
>
Michael S. Tsirkin Oct. 14, 2014, 2:21 p.m. UTC | #14
On Tue, Oct 14, 2014 at 08:27:25PM +0800, ChenLiang wrote:
> On 2014/10/14 20:28, Michael S. Tsirkin wrote:
> 
> > On Tue, Oct 14, 2014 at 08:19:56PM +0800, ChenLiang wrote:
> >> On 2014/10/14 19:58, Michael S. Tsirkin wrote:
> >>
> >>> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
> >>>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
> >>>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
> >>>> of pci bar is enough big, it also  overlaps with other memslots.
> >>>>
> >>>> the root cause is:
> >>>>
> >>>> pci_default_write_config will do that:
> >>>>     for (i = 0; i < l; val >>= 8, ++i) {
> >>>>         uint8_t wmask = d->wmask[addr + i];
> >>>>         uint8_t w1cmask = d->w1cmask[addr + i];
> >>>>         assert(!(wmask & w1cmask));
> >>>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>>>     }
> >>>>
> >>>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
> >>>> This range overlap with private memslot in the old kmod.
> >>>>
> >>>> then pci_update_mappings will update memslot.
> >>>
> >>>
> >>> In fact, ever since
> >>> 	83d08f2673504a299194dcac1657a13754b5932a
> >>>     pc: map PCI address space as catchall region for not mapped addresses
> >>>
> >>> all pci memory has lower priority than ioapic at 0xfe0000000.
> >>>
> >>> so ioapic will win, there should be no issue.
> >>>
> >>> IOW this is not the root cause.
> >>
> >>
> >> TSS_PRIVATE_MEMSLOT and IDENTITY_PAGETABLE_PRIVATE_MEMSLOT also will overlap.
> > 
> > I'm sorry, I can't find these symbols in qemu source.
> 
> 
> These symbols is in kmod source.
> 
> Best regards
> chenliang


OK, I see.
Thus the fix is to make memory core aware of the holes at identity
base and TSS base.

Maybe create an MR to cover this range.


> > 
> >>>
> >>>
> >>>
> >>>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
> >>>>
> >>>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
> >>>>>> From: ChenLiang <chenliang88@huawei.com>
> >>>>>>
> >>>>>> Power-up software can determine how much address space the device
> >>>>>> requires by writing a value of all 1's to the register and then
> >>>>>> reading the value back(PCI specification). Qemu should not do
> >>>>>> pci_update_mappings. Qemu may exit, because the wrong address of
> >>>>>> this bar is overlap with other memslots.
> >>>>>>
> >>>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >>>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >>>>>
> >>>>> This is at best a work-around.
> >>>>> Overlapping is observed in practice, qemu really shouldn't exit when
> >>>>> this happens.
> >>>>> So we should find the root cause and fix it there instead of
> >>>>> adding work-arounds in PCI core.
> >>>>>
> >>>>> With which device do you observe this?
> >>>>>
> >>>>>
> >>>>>> ---
> >>>>>>  hw/pci/pci.c | 8 ++++----
> >>>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>>>
> >>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>>>>> index 6ce75aa..4d44b44 100644
> >>>>>> --- a/hw/pci/pci.c
> >>>>>> +++ b/hw/pci/pci.c
> >>>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
> >>>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>>>>>      }
> >>>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> >>>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> >>>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
> >>>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
> >>>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>>>          pci_update_mappings(d);
> >>>>>> -
> >>>>>> +    }
> >>>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>>>          pci_update_irq_disabled(d, was_irq_disabled);
> >>>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
> >>>>>> -- 
> >>>>>> 1.7.12.4
> >>>>>>
> >>>>>
> >>>>> .
> >>>>>
> >>>>
> >>>>
> >>>
> >>> .
> >>>
> >>
> >>
> > 
> > .
> > 
> 
>
Michael S. Tsirkin Oct. 14, 2014, 2:31 p.m. UTC | #15
On Tue, Oct 14, 2014 at 08:59:56PM +0800, ChenLiang wrote:
> On 2014/10/14 20:27, Michael S. Tsirkin wrote:
> 
> > On Tue, Oct 14, 2014 at 08:15:10PM +0800, ChenLiang wrote:
> >> On 2014/10/14 19:48, Michael S. Tsirkin wrote:
> >>
> >>> On Tue, Oct 14, 2014 at 07:41:14PM +0800, ChenLiang wrote:
> >>>> We find overlap when the size of pci bar is bigger then 16MB, it overlaps with private
> >>>> memslot in the kmod. By the way, the new kmod skip private memslot. But I think if the size
> >>>> of pci bar is enough big, it also  overlaps with other memslots.
> >>>
> >>> Of course but it should not cause a crash.
> >>> If you need the overlapping memslot available during the programming
> >>> process, increase it's priority.
> >>>
> >>
> >> Yeah, I know the priority of memory region.
> >> The problem is overlaping should not happen when one pci bar is not
> >> overlap with any other memslots. But Qemu always do pci_update_mappings
> >> when guest os writes pci bar. Actually, should not do pci_update_mappings
> >> if var is 0xffffffff.
> > 
> > Unfortunately your hack is not robust, so we can not include it.
> > PCI devices should support arbitrary addresses.
> > For example, if a device is programmed with an address
> > 0xfe000000 then this is exactly the address it should claim.
> > So I am sorry, you will have to either debug the problem to understand what
> > is causing a crash, or tell us on the list how to reproduce it
> > so others on the list can debug it.
> > 
> > 
> 
> For example:
> guest os: suse10sp1
> device: ivshmem 32MB
> kmod: not include patch "KVM: Fix user memslot overlap check"
> 
> qemu will exit when vm start.
> 
> But guest os suse 11 sp2 will be ok.
> 
> Because the old linux kernel don't disable response in Memory space when it gets the size of pci bar.
> 
> ps: I am not sure whether "KVM: Fix user memslot overlap check" skip check overlaping with private memslot is safe.

So there's a problem with kvm on hosts with older kernels?
Workarounds for this belong in kvm code, not in PCI core.



> 
> The root cause is when guest write 0xffffffff to get size of pci bar. We should not do pci_update_mappings in pci_default_write_config.

Judging from what you say, it's just a symptom not the cause.
The root cause is private mappings in kvm, nothing to do with pci.

> >>>> the root cause is:
> >>>>
> >>>> pci_default_write_config will do that:
> >>>>     for (i = 0; i < l; val >>= 8, ++i) {
> >>>>         uint8_t wmask = d->wmask[addr + i];
> >>>>         uint8_t w1cmask = d->w1cmask[addr + i];
> >>>>         assert(!(wmask & w1cmask));
> >>>>         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>>>         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>>>     }
> >>>>
> >>>> *(int*)(d->config[addr]) will be 0xfe00000c, if val is 0xffffffff and the size of bar is 32MB.
> >>>> This range overlap with private memslot in the old kmod.
> >>>>
> >>>> then pci_update_mappings will update memslot.
> >>>>
> >>>> On 2014/10/14 19:20, Michael S. Tsirkin wrote:
> >>>>
> >>>>> On Tue, Oct 14, 2014 at 07:04:14PM +0800, arei.gonglei@huawei.com wrote:
> >>>>>> From: ChenLiang <chenliang88@huawei.com>
> >>>>>>
> >>>>>> Power-up software can determine how much address space the device
> >>>>>> requires by writing a value of all 1's to the register and then
> >>>>>> reading the value back(PCI specification). Qemu should not do
> >>>>>> pci_update_mappings. Qemu may exit, because the wrong address of
> >>>>>> this bar is overlap with other memslots.
> >>>>>>
> >>>>>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> >>>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >>>>>
> >>>>> This is at best a work-around.
> >>>>> Overlapping is observed in practice, qemu really shouldn't exit when
> >>>>> this happens.
> >>>>> So we should find the root cause and fix it there instead of
> >>>>> adding work-arounds in PCI core.
> >>>>>
> >>>>> With which device do you observe this?
> >>>>>
> >>>>>
> >>>>>> ---
> >>>>>>  hw/pci/pci.c | 8 ++++----
> >>>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>>>>>
> >>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>>>>> index 6ce75aa..4d44b44 100644
> >>>>>> --- a/hw/pci/pci.c
> >>>>>> +++ b/hw/pci/pci.c
> >>>>>> @@ -1158,12 +1158,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
> >>>>>>          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> >>>>>>          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> >>>>>>      }
> >>>>>> -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>>>> +    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> >>>>>>          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> >>>>>> -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> >>>>>> -        range_covers_byte(addr, l, PCI_COMMAND))
> >>>>>> +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
> >>>>>> +        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>>>          pci_update_mappings(d);
> >>>>>> -
> >>>>>> +    }
> >>>>>>      if (range_covers_byte(addr, l, PCI_COMMAND)) {
> >>>>>>          pci_update_irq_disabled(d, was_irq_disabled);
> >>>>>>          memory_region_set_enabled(&d->bus_master_enable_region,
> >>>>>> -- 
> >>>>>> 1.7.12.4
> >>>>>>
> >>>>>
> >>>>> .
> >>>>>
> >>>>
> >>>>
> >>>
> >>> .
> >>>
> >>
> >>
> > 
> > .
> > 
> 
>
diff mbox

Patch

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 6ce75aa..4d44b44 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1158,12 +1158,12 @@  void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
     }
-    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
+    if (((ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
         ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
-        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
-        range_covers_byte(addr, l, PCI_COMMAND))
+        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4)) &&
+        val_in != 0xffffffff) || range_covers_byte(addr, l, PCI_COMMAND)) {
         pci_update_mappings(d);
-
+    }
     if (range_covers_byte(addr, l, PCI_COMMAND)) {
         pci_update_irq_disabled(d, was_irq_disabled);
         memory_region_set_enabled(&d->bus_master_enable_region,