Patchwork Re: [PATCH] qemu/pci: optimize pci config handling

login
register
mail settings
Submitter Paolo Bonzini
Date Oct. 7, 2009, 1:01 p.m.
Message ID <4ACC9133.9060903@redhat.com>
Download mbox | patch
Permalink /patch/35285/
State RFC
Headers show

Comments

Paolo Bonzini - Oct. 7, 2009, 1:01 p.m.
On 10/07/2009 02:33 PM, Michael S. Tsirkin wrote:
> There's no need to save all of config space before each config cycle:
> just the 64 byte header is enough for our purposes.  This will become
> more important as we add pci express support, which has 4K config space.

You can even go a step further and save it only if something is actually 
being changed.  Untested though.

Not-quite-signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo
Michael S. Tsirkin - Oct. 7, 2009, 1:48 p.m.
On Wed, Oct 07, 2009 at 03:01:39PM +0200, Paolo Bonzini wrote:
> On 10/07/2009 02:33 PM, Michael S. Tsirkin wrote:
>> There's no need to save all of config space before each config cycle:
>> just the 64 byte header is enough for our purposes.  This will become
>> more important as we add pci express support, which has 4K config space.
>
> You can even go a step further and save it only if something is actually  
> being changed.  Untested though.

I'm actively trying to get out of range-checking address.
What you porpose here is certainly more code than we had.
So why is this a good idea?

> Not-quite-signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>
> Paolo

> diff --git a/hw/pci.c b/hw/pci.c
> index 41e99a9..f9959fc 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -541,19 +541,26 @@ uint32_t pci_default_read_config(PCIDevice *d,
>  
>  void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
>  {
> -    uint8_t orig[PCI_CONFIG_SPACE_SIZE];
> +    uint8_t orig[PCI_CONFIG_HEADER_SIZE];
>      int i;
>  
> -    /* not efficient, but simple */
> -    memcpy(orig, d->config, PCI_CONFIG_SPACE_SIZE);
> -    for(i = 0; i < l && addr < PCI_CONFIG_SPACE_SIZE; val >>= 8, ++i, ++addr) {
> -        uint8_t wmask = d->wmask[addr];
> -        d->config[addr] = (d->config[addr] & ~wmask) | (val & wmask);
> +    /* not efficient, but simple.  If modifying the header, save it so we
> +       can compare its contents later.  */
> +    if (addr < sizeof orig) {
> +        memcpy(orig, d->config, sizeof orig);
> +    }
> +
> +    for(i = 0; i < l && addr+i < PCI_CONFIG_SPACE_SIZE; val >>= 8, ++i) {
> +        uint8_t wmask = d->wmask[addr+i];
> +        d->config[addr+i] = (d->config[addr+i] & ~wmask) | (val & wmask);
> +    }
> +
> +    if (addr < sizeof orig) {
> +        if (memcmp(orig + PCI_BASE_ADDRESS_0, d->config + PCI_BASE_ADDRESS_0, 24)
> +            || ((orig[PCI_COMMAND] ^ d->config[PCI_COMMAND])
> +                & (PCI_COMMAND_MEMORY | PCI_COMMAND_IO)))
> +            pci_update_mappings(d);
>      }
> -    if (memcmp(orig + PCI_BASE_ADDRESS_0, d->config + PCI_BASE_ADDRESS_0, 24)
> -        || ((orig[PCI_COMMAND] ^ d->config[PCI_COMMAND])
> -            & (PCI_COMMAND_MEMORY | PCI_COMMAND_IO)))
> -        pci_update_mappings(d);
>  }
>  
>  void pci_data_write(void *opaque, uint32_t addr, uint32_t val, int len)
Paolo Bonzini - Oct. 7, 2009, 2:24 p.m.
> I'm actively trying to get out of range-checking address.

I don't understand what you mean, sorry.

> What you porpose here is certainly more code than we had.
> So why is this a good idea?

Because it avoids the memcpy/memcmp most of the time (when the memcmp 
would surely succeed).  I supposed that would also matter more as the 
config space size increases---correct me and dismiss the patch if I am 
mistaken.

Paolo
Michael S. Tsirkin - Oct. 7, 2009, 7:21 p.m.
On Wed, Oct 07, 2009 at 04:24:45PM +0200, Paolo Bonzini wrote:
>> I'm actively trying to get out of range-checking address.
>
> I don't understand what you mean, sorry.
>
>> What you porpose here is certainly more code than we had.
>> So why is this a good idea?
>
> Because it avoids the memcpy/memcmp most of the time (when the memcmp  
> would surely succeed).

Yes :) But at the cost of more code.  I don't think speed
matters there, so less code is good.  But even if it did, extra
read of a single cache line might be cheaper than an extra branch,
and generated code will be more compact.

>  I supposed that would also matter more as the  
> config space size increases---correct me and dismiss the patch if I am  
> mistaken.

No, we'll always only look need to look at the header, whatever the size
of the config space.  That's the point of the patch I posted - future
proof against config space size increases, not optimization.

> Paolo
Paolo Bonzini - Oct. 8, 2009, 8:23 a.m.
>>> What you porpose here is certainly more code than we had.
>>> So why is this a good idea?
>>
>> Because it avoids the memcpy/memcmp most of the time (when the memcmp
>> would surely succeed).
>
> Yes :) But at the cost of more code.  I don't think speed
> matters there, so less code is good.

Fine.

>>   I supposed that would also matter more as the
>> config space size increases---correct me and dismiss the patch if I am
>> mistaken.
>
> No, we'll always only look need to look at the header, whatever the size
> of the config space.  That's the point of the patch I posted - future
> proof against config space size increases, not optimization.

But fewer reads on average will not modify the header, so there will be 
even fewer memcpy with my patch when the config space will be 4k.

Paolo
Michael S. Tsirkin - Oct. 8, 2009, 8:57 a.m.
On Thu, Oct 08, 2009 at 10:23:41AM +0200, Paolo Bonzini wrote:
>
>>>> What you porpose here is certainly more code than we had.
>>>> So why is this a good idea?
>>>
>>> Because it avoids the memcpy/memcmp most of the time (when the memcmp
>>> would surely succeed).
>>
>> Yes :) But at the cost of more code.  I don't think speed
>> matters there, so less code is good.
>
> Fine.
>
>>>   I supposed that would also matter more as the
>>> config space size increases---correct me and dismiss the patch if I am
>>> mistaken.
>>
>> No, we'll always only look need to look at the header, whatever the size
>> of the config space.  That's the point of the patch I posted - future
>> proof against config space size increases, not optimization.
>
> But fewer reads on average will not modify the header, so there will be  
> even fewer memcpy with my patch when the config space will be 4k.

Oh, I see. I was worrying about us adding some side effect where you
write into PCI extended register and header changes.  Maybe that won't
ever happen.

> Paolo

Patch

diff --git a/hw/pci.c b/hw/pci.c
index 41e99a9..f9959fc 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -541,19 +541,26 @@  uint32_t pci_default_read_config(PCIDevice *d,
 
 void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
 {
-    uint8_t orig[PCI_CONFIG_SPACE_SIZE];
+    uint8_t orig[PCI_CONFIG_HEADER_SIZE];
     int i;
 
-    /* not efficient, but simple */
-    memcpy(orig, d->config, PCI_CONFIG_SPACE_SIZE);
-    for(i = 0; i < l && addr < PCI_CONFIG_SPACE_SIZE; val >>= 8, ++i, ++addr) {
-        uint8_t wmask = d->wmask[addr];
-        d->config[addr] = (d->config[addr] & ~wmask) | (val & wmask);
+    /* not efficient, but simple.  If modifying the header, save it so we
+       can compare its contents later.  */
+    if (addr < sizeof orig) {
+        memcpy(orig, d->config, sizeof orig);
+    }
+
+    for(i = 0; i < l && addr+i < PCI_CONFIG_SPACE_SIZE; val >>= 8, ++i) {
+        uint8_t wmask = d->wmask[addr+i];
+        d->config[addr+i] = (d->config[addr+i] & ~wmask) | (val & wmask);
+    }
+
+    if (addr < sizeof orig) {
+        if (memcmp(orig + PCI_BASE_ADDRESS_0, d->config + PCI_BASE_ADDRESS_0, 24)
+            || ((orig[PCI_COMMAND] ^ d->config[PCI_COMMAND])
+                & (PCI_COMMAND_MEMORY | PCI_COMMAND_IO)))
+            pci_update_mappings(d);
     }
-    if (memcmp(orig + PCI_BASE_ADDRESS_0, d->config + PCI_BASE_ADDRESS_0, 24)
-        || ((orig[PCI_COMMAND] ^ d->config[PCI_COMMAND])
-            & (PCI_COMMAND_MEMORY | PCI_COMMAND_IO)))
-        pci_update_mappings(d);
 }
 
 void pci_data_write(void *opaque, uint32_t addr, uint32_t val, int len)