Patchwork [v3,23/23] pci: honor PCI_COMMAND_MASTER

login
register
mail settings
Submitter pingfan liu
Date Oct. 11, 2012, 3:38 a.m.
Message ID <CAJnKYQkGC+mPDwMTEc9+oayXwbDoWu3O8=cbZo=ZDKxTsh+X6w@mail.gmail.com>
Download mbox | patch
Permalink /patch/190782/
State New
Headers show

Comments

pingfan liu - Oct. 11, 2012, 3:38 a.m.
On Wed, Oct 10, 2012 at 12:32 AM, Avi Kivity <avi@redhat.com> wrote:
> Currently we ignore PCI_COMMAND_MASTER completely: DMA succeeds even when
> the bit is clear.
>
> Honor PCI_COMMAND_MASTER by inserting a memory region into the device's
> bus master address space, and tying its enable status to PCI_COMMAND_MASTER.
>
> Tested using
>
>   setpci -s 03 COMMAND=3
>
> while a ping was running on a NIC in slot 3.  The kernel (Linux) detected
> the stall and recovered after the command
>
>   setpci -s 03 COMMAND=7
>
> was issued.
>
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
>  hw/pci.c | 13 +++++++++++--
>  hw/pci.h |  1 +
>  2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/hw/pci.c b/hw/pci.c
> index 8e8e030..7adf61b 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -782,7 +782,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>          /* FIXME: Make dma_context_fn use MemoryRegions instead, so this path is
>           * taken unconditionally */
>          /* FIXME: inherit memory region from bus creator */
> -        address_space_init(&pci_dev->bus_master_as, get_system_memory());
> +        memory_region_init_alias(&pci_dev->bus_master_enable_region, "bus master",
> +                                 get_system_memory(), 0,
> +                                 memory_region_size(get_system_memory()));

Could we achieve that  hiding special mr from some address space?  I
think one approach is to limit the iommu's lookup table, but that is
the guest's willing.  The other one is use dummy-mr to overlap some
piece of region of system_memory.
This method will require changing the render sequence of alias mr.


> +        memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
> +        address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region);
>          pci_dev->dma = g_new(DMAContext, 1);
>          dma_context_init(pci_dev->dma, &pci_dev->bus_master_as, NULL, NULL, NULL);
>      }
> @@ -841,6 +845,7 @@ static void do_pci_unregister_device(PCIDevice *pci_dev)
>
>      if (!pci_dev->bus->dma_context_fn) {
>          address_space_destroy(&pci_dev->bus_master_as);
> +        memory_region_destroy(&pci_dev->bus_master_enable_region);
>          g_free(pci_dev->dma);
>          pci_dev->dma = NULL;
>      }
> @@ -1065,8 +1070,12 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
>          range_covers_byte(addr, l, PCI_COMMAND))
>          pci_update_mappings(d);
>
> -    if (range_covers_byte(addr, l, PCI_COMMAND))
> +    if (range_covers_byte(addr, l, PCI_COMMAND)) {
>          pci_update_irq_disabled(d, was_irq_disabled);
> +        memory_region_set_enabled(&d->bus_master_enable_region,
> +                                  pci_get_word(d->config + PCI_COMMAND)
> +                                    & PCI_COMMAND_MASTER);
> +    }
>
>      msi_write_config(d, addr, val, l);
>      msix_write_config(d, addr, val, l);
> diff --git a/hw/pci.h b/hw/pci.h
> index 3192d81..a65e490 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -212,6 +212,7 @@ struct PCIDevice {
>      char name[64];
>      PCIIORegion io_regions[PCI_NUM_REGIONS];
>      AddressSpace bus_master_as;
> +    MemoryRegion bus_master_enable_region;
>      DMAContext *dma;
>
>      /* do not access the following fields */
> --
> 1.7.12
>
Avi Kivity - Oct. 11, 2012, 8:34 a.m.
On 10/11/2012 05:38 AM, liu ping fan wrote:
> On Wed, Oct 10, 2012 at 12:32 AM, Avi Kivity <avi@redhat.com> wrote:
>> Currently we ignore PCI_COMMAND_MASTER completely: DMA succeeds even when
>> the bit is clear.
>>
>> Honor PCI_COMMAND_MASTER by inserting a memory region into the device's
>> bus master address space, and tying its enable status to PCI_COMMAND_MASTER.
>>
>> Tested using
>>
>>   setpci -s 03 COMMAND=3
>>
>> while a ping was running on a NIC in slot 3.  The kernel (Linux) detected
>> the stall and recovered after the command
>>
>>   setpci -s 03 COMMAND=7
>>
>> was issued.
>>
>> Signed-off-by: Avi Kivity <avi@redhat.com>
>> ---
>>  hw/pci.c | 13 +++++++++++--
>>  hw/pci.h |  1 +
>>  2 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/pci.c b/hw/pci.c
>> index 8e8e030..7adf61b 100644
>> --- a/hw/pci.c
>> +++ b/hw/pci.c
>> @@ -782,7 +782,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>>          /* FIXME: Make dma_context_fn use MemoryRegions instead, so this path is
>>           * taken unconditionally */
>>          /* FIXME: inherit memory region from bus creator */
>> -        address_space_init(&pci_dev->bus_master_as, get_system_memory());
>> +        memory_region_init_alias(&pci_dev->bus_master_enable_region, "bus master",
>> +                                 get_system_memory(), 0,
>> +                                 memory_region_size(get_system_memory()));
> 
> Could we achieve that  hiding special mr from some address space?  I
> think one approach is to limit the iommu's lookup table, but that is
> the guest's willing.  The other one is use dummy-mr to overlap some
> piece of region of system_memory.
> This method will require changing the render sequence of alias mr.
> 
> diff --git a/memory.c b/memory.c
> index 2f68d67..cf67c66 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -499,6 +499,11 @@ static void render_memory_region(FlatView *view,
> 
>      clip = addrrange_intersection(tmp, clip);
> 
> +    /* Render subregions in priority order. */
> +    QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
> +        render_memory_region(view, subregion, base, clip, readonly);
> +    }
> +
>      if (mr->alias) {
>          int128_subfrom(&base, int128_make64(mr->alias->addr));
>          int128_subfrom(&base, int128_make64(mr->alias_offset));
> @@ -506,11 +511,6 @@ static void render_memory_region(FlatView *view,
>          return;
>      }
> 
> -    /* Render subregions in priority order. */
> -    QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
> -        render_memory_region(view, subregion, base, clip, readonly);
> -    }
> -
> 

I don't really follow.  We don't have any memory region that is both a
container and an alias, so this change has no effect.  Can you describe
what you intend?
pingfan liu - Oct. 11, 2012, 8:49 a.m.
On Thu, Oct 11, 2012 at 4:34 PM, Avi Kivity <avi@redhat.com> wrote:
> On 10/11/2012 05:38 AM, liu ping fan wrote:
>> On Wed, Oct 10, 2012 at 12:32 AM, Avi Kivity <avi@redhat.com> wrote:
>>> Currently we ignore PCI_COMMAND_MASTER completely: DMA succeeds even when
>>> the bit is clear.
>>>
>>> Honor PCI_COMMAND_MASTER by inserting a memory region into the device's
>>> bus master address space, and tying its enable status to PCI_COMMAND_MASTER.
>>>
>>> Tested using
>>>
>>>   setpci -s 03 COMMAND=3
>>>
>>> while a ping was running on a NIC in slot 3.  The kernel (Linux) detected
>>> the stall and recovered after the command
>>>
>>>   setpci -s 03 COMMAND=7
>>>
>>> was issued.
>>>
>>> Signed-off-by: Avi Kivity <avi@redhat.com>
>>> ---
>>>  hw/pci.c | 13 +++++++++++--
>>>  hw/pci.h |  1 +
>>>  2 files changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/pci.c b/hw/pci.c
>>> index 8e8e030..7adf61b 100644
>>> --- a/hw/pci.c
>>> +++ b/hw/pci.c
>>> @@ -782,7 +782,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>>>          /* FIXME: Make dma_context_fn use MemoryRegions instead, so this path is
>>>           * taken unconditionally */
>>>          /* FIXME: inherit memory region from bus creator */
>>> -        address_space_init(&pci_dev->bus_master_as, get_system_memory());
>>> +        memory_region_init_alias(&pci_dev->bus_master_enable_region, "bus master",
>>> +                                 get_system_memory(), 0,
>>> +                                 memory_region_size(get_system_memory()));
>>
>> Could we achieve that  hiding special mr from some address space?  I
>> think one approach is to limit the iommu's lookup table, but that is
>> the guest's willing.  The other one is use dummy-mr to overlap some
>> piece of region of system_memory.
>> This method will require changing the render sequence of alias mr.
>>
>> diff --git a/memory.c b/memory.c
>> index 2f68d67..cf67c66 100644
>> --- a/memory.c
>> +++ b/memory.c
>> @@ -499,6 +499,11 @@ static void render_memory_region(FlatView *view,
>>
>>      clip = addrrange_intersection(tmp, clip);
>>
>> +    /* Render subregions in priority order. */
>> +    QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
>> +        render_memory_region(view, subregion, base, clip, readonly);
>> +    }
>> +
>>      if (mr->alias) {
>>          int128_subfrom(&base, int128_make64(mr->alias->addr));
>>          int128_subfrom(&base, int128_make64(mr->alias_offset));
>> @@ -506,11 +511,6 @@ static void render_memory_region(FlatView *view,
>>          return;
>>      }
>>
>> -    /* Render subregions in priority order. */
>> -    QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
>> -        render_memory_region(view, subregion, base, clip, readonly);
>> -    }
>> -
>>
>
> I don't really follow.  We don't have any memory region that is both a
> container and an alias, so this change has no effect.  Can you describe
> what you intend?
>
Suppose iommuA takes over only a subsection of system_memory, and
forbidden to touch other space. How can we achieve this model after
adopting this series?

>
> --
> error compiling committee.c: too many arguments to function
Avi Kivity - Oct. 11, 2012, 8:53 a.m.
On 10/11/2012 10:49 AM, liu ping fan wrote:
> On Thu, Oct 11, 2012 at 4:34 PM, Avi Kivity <avi@redhat.com> wrote:
>> On 10/11/2012 05:38 AM, liu ping fan wrote:
>>> On Wed, Oct 10, 2012 at 12:32 AM, Avi Kivity <avi@redhat.com> wrote:
>>>> Currently we ignore PCI_COMMAND_MASTER completely: DMA succeeds even when
>>>> the bit is clear.
>>>>
>>>> Honor PCI_COMMAND_MASTER by inserting a memory region into the device's
>>>> bus master address space, and tying its enable status to PCI_COMMAND_MASTER.
>>>>
>>>> Tested using
>>>>
>>>>   setpci -s 03 COMMAND=3
>>>>
>>>> while a ping was running on a NIC in slot 3.  The kernel (Linux) detected
>>>> the stall and recovered after the command
>>>>
>>>>   setpci -s 03 COMMAND=7
>>>>
>>>> was issued.
>>>>
>>>> Signed-off-by: Avi Kivity <avi@redhat.com>
>>>> ---
>>>>  hw/pci.c | 13 +++++++++++--
>>>>  hw/pci.h |  1 +
>>>>  2 files changed, 12 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/pci.c b/hw/pci.c
>>>> index 8e8e030..7adf61b 100644
>>>> --- a/hw/pci.c
>>>> +++ b/hw/pci.c
>>>> @@ -782,7 +782,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
>>>>          /* FIXME: Make dma_context_fn use MemoryRegions instead, so this path is
>>>>           * taken unconditionally */
>>>>          /* FIXME: inherit memory region from bus creator */
>>>> -        address_space_init(&pci_dev->bus_master_as, get_system_memory());
>>>> +        memory_region_init_alias(&pci_dev->bus_master_enable_region, "bus master",
>>>> +                                 get_system_memory(), 0,
>>>> +                                 memory_region_size(get_system_memory()));
>>>
>>> Could we achieve that  hiding special mr from some address space?  I
>>> think one approach is to limit the iommu's lookup table, but that is
>>> the guest's willing.  The other one is use dummy-mr to overlap some
>>> piece of region of system_memory.
>>> This method will require changing the render sequence of alias mr.
>>>
>>> diff --git a/memory.c b/memory.c
>>> index 2f68d67..cf67c66 100644
>>> --- a/memory.c
>>> +++ b/memory.c
>>> @@ -499,6 +499,11 @@ static void render_memory_region(FlatView *view,
>>>
>>>      clip = addrrange_intersection(tmp, clip);
>>>
>>> +    /* Render subregions in priority order. */
>>> +    QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
>>> +        render_memory_region(view, subregion, base, clip, readonly);
>>> +    }
>>> +
>>>      if (mr->alias) {
>>>          int128_subfrom(&base, int128_make64(mr->alias->addr));
>>>          int128_subfrom(&base, int128_make64(mr->alias_offset));
>>> @@ -506,11 +511,6 @@ static void render_memory_region(FlatView *view,
>>>          return;
>>>      }
>>>
>>> -    /* Render subregions in priority order. */
>>> -    QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
>>> -        render_memory_region(view, subregion, base, clip, readonly);
>>> -    }
>>> -
>>>
>>
>> I don't really follow.  We don't have any memory region that is both a
>> container and an alias, so this change has no effect.  Can you describe
>> what you intend?
>>
> Suppose iommuA takes over only a subsection of system_memory, and
> forbidden to touch other space. How can we achieve this model after
> adopting this series?

I will soon post patches.  Basically an iommu is just another memory
region, and it can be anywhere in the memory hierarchy except it cannot
be a leaf (like an alias).

Patch

diff --git a/memory.c b/memory.c
index 2f68d67..cf67c66 100644
--- a/memory.c
+++ b/memory.c
@@ -499,6 +499,11 @@  static void render_memory_region(FlatView *view,

     clip = addrrange_intersection(tmp, clip);

+    /* Render subregions in priority order. */
+    QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
+        render_memory_region(view, subregion, base, clip, readonly);
+    }
+
     if (mr->alias) {
         int128_subfrom(&base, int128_make64(mr->alias->addr));
         int128_subfrom(&base, int128_make64(mr->alias_offset));
@@ -506,11 +511,6 @@  static void render_memory_region(FlatView *view,
         return;
     }

-    /* Render subregions in priority order. */
-    QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
-        render_memory_region(view, subregion, base, clip, readonly);
-    }
-

Regards,
pingfan