Message ID | 1448016301-20944-1-git-send-email-caoj.fnst@cn.fujitsu.com |
---|---|
State | New |
Headers | show |
On Fri, Nov 20, 2015 at 06:45:01PM +0800, Cao jin wrote: > 1. Do param check in pci_add_capability2(), as it is a public API. Separate patch pls. > 2. As spec says, each capability must be DWORD aligned, so an optimization can > be done via Loop Unrolling. Why do we want to optimize it? > > Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> > --- > hw/pci/pci.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index 168b9cc..1e99603 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -1924,13 +1924,15 @@ PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name) > static uint8_t pci_find_space(PCIDevice *pdev, uint8_t size) > { > int offset = PCI_CONFIG_HEADER_SIZE; > - int i; > - for (i = PCI_CONFIG_HEADER_SIZE; i < PCI_CONFIG_SPACE_SIZE; ++i) { > + int i = PCI_CONFIG_HEADER_SIZE;; > + > + for (; i < PCI_CONFIG_SPACE_SIZE; i = i + 4) { > if (pdev->used[i]) > - offset = i + 1; > - else if (i - offset + 1 == size) > + offset = i + 4; > + else if (i - offset >= size) > return offset; > } > + > return 0; > } > > @@ -2144,6 +2146,8 @@ int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id, > uint8_t *config; > int i, overlapping_cap; > > + assert(size > 0); > + > if (!offset) { > offset = pci_find_space(pdev, size); > if (!offset) { > -- > 2.1.0
On 11/20/2015 06:45 PM, Michael S. Tsirkin wrote: > On Fri, Nov 20, 2015 at 06:45:01PM +0800, Cao jin wrote: >> 1. Do param check in pci_add_capability2(), as it is a public API. > > Separate patch pls. OK > >> 2. As spec says, each capability must be DWORD aligned, so an optimization can >> be done via Loop Unrolling. > > Why do we want to optimize it? > For tiny performance improvement via less loop. take pcie express capability(60 bytes at most) for example, it may loop 60 times, now we just need 15 times, a quarter of before. >> >> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> >> --- >> hw/pci/pci.c | 12 ++++++++---- >> 1 file changed, 8 insertions(+), 4 deletions(-) >> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >> index 168b9cc..1e99603 100644 >> --- a/hw/pci/pci.c >> +++ b/hw/pci/pci.c >> @@ -1924,13 +1924,15 @@ PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name) >> static uint8_t pci_find_space(PCIDevice *pdev, uint8_t size) >> { >> int offset = PCI_CONFIG_HEADER_SIZE; >> - int i; >> - for (i = PCI_CONFIG_HEADER_SIZE; i < PCI_CONFIG_SPACE_SIZE; ++i) { >> + int i = PCI_CONFIG_HEADER_SIZE;; >> + >> + for (; i < PCI_CONFIG_SPACE_SIZE; i = i + 4) { >> if (pdev->used[i]) >> - offset = i + 1; >> - else if (i - offset + 1 == size) >> + offset = i + 4; >> + else if (i - offset >= size) >> return offset; >> } >> + >> return 0; >> } >> >> @@ -2144,6 +2146,8 @@ int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id, >> uint8_t *config; >> int i, overlapping_cap; >> >> + assert(size > 0); >> + >> if (!offset) { >> offset = pci_find_space(pdev, size); >> if (!offset) { >> -- >> 2.1.0 > . >
On Fri, Nov 20, 2015 at 07:04:07PM +0800, Cao jin wrote: > > > On 11/20/2015 06:45 PM, Michael S. Tsirkin wrote: > >On Fri, Nov 20, 2015 at 06:45:01PM +0800, Cao jin wrote: > >>1. Do param check in pci_add_capability2(), as it is a public API. > > > >Separate patch pls. > > OK > > > > >>2. As spec says, each capability must be DWORD aligned, so an optimization can > >> be done via Loop Unrolling. > > > >Why do we want to optimize it? > > > > For tiny performance improvement via less loop. take pcie express > capability(60 bytes at most) for example, it may loop 60 times, now we just > need 15 times, a quarter of before. But who cares? This is not a data path operation. > >> > >>Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> > >>--- > >> hw/pci/pci.c | 12 ++++++++---- > >> 1 file changed, 8 insertions(+), 4 deletions(-) > >> > >>diff --git a/hw/pci/pci.c b/hw/pci/pci.c > >>index 168b9cc..1e99603 100644 > >>--- a/hw/pci/pci.c > >>+++ b/hw/pci/pci.c > >>@@ -1924,13 +1924,15 @@ PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name) > >> static uint8_t pci_find_space(PCIDevice *pdev, uint8_t size) > >> { > >> int offset = PCI_CONFIG_HEADER_SIZE; > >>- int i; > >>- for (i = PCI_CONFIG_HEADER_SIZE; i < PCI_CONFIG_SPACE_SIZE; ++i) { > >>+ int i = PCI_CONFIG_HEADER_SIZE;; > >>+ > >>+ for (; i < PCI_CONFIG_SPACE_SIZE; i = i + 4) { > >> if (pdev->used[i]) > >>- offset = i + 1; > >>- else if (i - offset + 1 == size) > >>+ offset = i + 4; > >>+ else if (i - offset >= size) > >> return offset; > >> } > >>+ > >> return 0; > >> } > >> > >>@@ -2144,6 +2146,8 @@ int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id, > >> uint8_t *config; > >> int i, overlapping_cap; > >> > >>+ assert(size > 0); > >>+ > >> if (!offset) { > >> offset = pci_find_space(pdev, size); > >> if (!offset) { > >>-- > >>2.1.0 > >. > > > > -- > Yours Sincerely, > > Cao Jin
On 11/20/2015 07:26 PM, Michael S. Tsirkin wrote: > On Fri, Nov 20, 2015 at 07:04:07PM +0800, Cao jin wrote: >> >> >> On 11/20/2015 06:45 PM, Michael S. Tsirkin wrote: >>> On Fri, Nov 20, 2015 at 06:45:01PM +0800, Cao jin wrote: >>> >>>> 2. As spec says, each capability must be DWORD aligned, so an optimization can >>>> be done via Loop Unrolling. >>> >>> Why do we want to optimize it? >>> >> >> For tiny performance improvement via less loop. take pcie express >> capability(60 bytes at most) for example, it may loop 60 times, now we just >> need 15 times, a quarter of before. > > But who cares? This is not a data path operation. It is tiny thing I found when browsing code. When found there are several places looks like this, I think maybe it does good to qemu to do this and CCed to you because it don`t look like a simple trivial patch. So, hey Michael, if you don`t like this kind of optimization, that`t ok, forget it. But I think it make me little confused when determine which kind of patch should be CCed to you. > >>>> >>>> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> >>>> --- >>>> hw/pci/pci.c | 12 ++++++++---- >>>> 1 file changed, 8 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>> index 168b9cc..1e99603 100644 >>>> --- a/hw/pci/pci.c >>>> +++ b/hw/pci/pci.c >>>> @@ -1924,13 +1924,15 @@ PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name) >>>> static uint8_t pci_find_space(PCIDevice *pdev, uint8_t size) >>>> { >>>> int offset = PCI_CONFIG_HEADER_SIZE; >>>> - int i; >>>> - for (i = PCI_CONFIG_HEADER_SIZE; i < PCI_CONFIG_SPACE_SIZE; ++i) { >>>> + int i = PCI_CONFIG_HEADER_SIZE;; >>>> + >>>> + for (; i < PCI_CONFIG_SPACE_SIZE; i = i + 4) { >>>> if (pdev->used[i]) >>>> - offset = i + 1; >>>> - else if (i - offset + 1 == size) >>>> + offset = i + 4; >>>> + else if (i - offset >= size) >>>> return offset; >>>> } >>>> + >>>> return 0; >>>> } >>>> >>>> @@ -2144,6 +2146,8 @@ int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id, >>>> uint8_t *config; >>>> int i, overlapping_cap; >>>> >>>> + assert(size > 0); >>>> + >>>> if (!offset) { >>>> offset = pci_find_space(pdev, size); >>>> if (!offset) { >>>> -- >>>> 2.1.0 >>> . >>> >> >> -- >> Yours Sincerely, >> >> Cao Jin > . >
On Fri, Nov 20, 2015 at 07:58:01PM +0800, Cao jin wrote: > > > On 11/20/2015 07:26 PM, Michael S. Tsirkin wrote: > >On Fri, Nov 20, 2015 at 07:04:07PM +0800, Cao jin wrote: > >> > >> > >>On 11/20/2015 06:45 PM, Michael S. Tsirkin wrote: > >>>On Fri, Nov 20, 2015 at 06:45:01PM +0800, Cao jin wrote: > >>> > >>>>2. As spec says, each capability must be DWORD aligned, so an optimization can > >>>> be done via Loop Unrolling. > >>> > >>>Why do we want to optimize it? > >>> > >> > >>For tiny performance improvement via less loop. take pcie express > >>capability(60 bytes at most) for example, it may loop 60 times, now we just > >>need 15 times, a quarter of before. > > > >But who cares? This is not a data path operation. > > It is tiny thing I found when browsing code. When found there are several > places looks like this, I think maybe it does good to qemu to do this and > CCed to you because it don`t look like a simple trivial patch. > > So, hey Michael, if you don`t like this kind of optimization, that`t ok, > forget it. But I think it make me little confused when determine which kind > of patch should be CCed to you. Optimization patches should normally include performance numbers if they are to be merged. Try to come up with a benchmark and you will realize that the speed of this function has no effect under even half way realistic conditions. > > > >>>> > >>>>Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> > >>>>--- > >>>> hw/pci/pci.c | 12 ++++++++---- > >>>> 1 file changed, 8 insertions(+), 4 deletions(-) > >>>> > >>>>diff --git a/hw/pci/pci.c b/hw/pci/pci.c > >>>>index 168b9cc..1e99603 100644 > >>>>--- a/hw/pci/pci.c > >>>>+++ b/hw/pci/pci.c > >>>>@@ -1924,13 +1924,15 @@ PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name) > >>>> static uint8_t pci_find_space(PCIDevice *pdev, uint8_t size) > >>>> { > >>>> int offset = PCI_CONFIG_HEADER_SIZE; > >>>>- int i; > >>>>- for (i = PCI_CONFIG_HEADER_SIZE; i < PCI_CONFIG_SPACE_SIZE; ++i) { > >>>>+ int i = PCI_CONFIG_HEADER_SIZE;; > >>>>+ > >>>>+ for (; i < PCI_CONFIG_SPACE_SIZE; i = i + 4) { > >>>> if (pdev->used[i]) > >>>>- offset = i + 1; > >>>>- else if (i - offset + 1 == size) > >>>>+ offset = i + 4; > >>>>+ else if (i - offset >= size) > >>>> return offset; > >>>> } > >>>>+ > >>>> return 0; > >>>> } > >>>> > >>>>@@ -2144,6 +2146,8 @@ int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id, > >>>> uint8_t *config; > >>>> int i, overlapping_cap; > >>>> > >>>>+ assert(size > 0); > >>>>+ > >>>> if (!offset) { > >>>> offset = pci_find_space(pdev, size); > >>>> if (!offset) { > >>>>-- > >>>>2.1.0 > >>>. > >>> > >> > >>-- > >>Yours Sincerely, > >> > >>Cao Jin > >. > > > > -- > Yours Sincerely, > > Cao Jin
On 11/20/2015 09:30 PM, Michael S. Tsirkin wrote: > On Fri, Nov 20, 2015 at 07:58:01PM +0800, Cao jin wrote: >> >> >> On 11/20/2015 07:26 PM, Michael S. Tsirkin wrote: >>> On Fri, Nov 20, 2015 at 07:04:07PM +0800, Cao jin wrote: >>>> >>>> >>>> On 11/20/2015 06:45 PM, Michael S. Tsirkin wrote: >>>>> On Fri, Nov 20, 2015 at 06:45:01PM +0800, Cao jin wrote: >>>>> >>>>>> 2. As spec says, each capability must be DWORD aligned, so an optimization can >>>>>> be done via Loop Unrolling. >>>>> >>>>> Why do we want to optimize it? >>>>> >>>> >>>> For tiny performance improvement via less loop. take pcie express >>>> capability(60 bytes at most) for example, it may loop 60 times, now we just >>>> need 15 times, a quarter of before. >>> >>> But who cares? This is not a data path operation. >> >> It is tiny thing I found when browsing code. When found there are several >> places looks like this, I think maybe it does good to qemu to do this and >> CCed to you because it don`t look like a simple trivial patch. >> >> So, hey Michael, if you don`t like this kind of optimization, that`t ok, >> forget it. But I think it make me little confused when determine which kind >> of patch should be CCed to you. > > Optimization patches should normally include performance numbers > if they are to be merged. > Try to come up with a benchmark and you will realize that the speed of > this function has no effect under even half way realistic conditions. > Maybe you are right. OK, will send the param check patch to the qemu-trivial >>>>> . >>>>> >>>> >>>> -- >>>> Yours Sincerely, >>>> >>>> Cao Jin >>> . >>> >> >> -- >> Yours Sincerely, >> >> Cao Jin > . >
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 168b9cc..1e99603 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -1924,13 +1924,15 @@ PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name) static uint8_t pci_find_space(PCIDevice *pdev, uint8_t size) { int offset = PCI_CONFIG_HEADER_SIZE; - int i; - for (i = PCI_CONFIG_HEADER_SIZE; i < PCI_CONFIG_SPACE_SIZE; ++i) { + int i = PCI_CONFIG_HEADER_SIZE;; + + for (; i < PCI_CONFIG_SPACE_SIZE; i = i + 4) { if (pdev->used[i]) - offset = i + 1; - else if (i - offset + 1 == size) + offset = i + 4; + else if (i - offset >= size) return offset; } + return 0; } @@ -2144,6 +2146,8 @@ int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id, uint8_t *config; int i, overlapping_cap; + assert(size > 0); + if (!offset) { offset = pci_find_space(pdev, size); if (!offset) {
1. Do param check in pci_add_capability2(), as it is a public API. 2. As spec says, each capability must be DWORD aligned, so an optimization can be done via Loop Unrolling. Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> --- hw/pci/pci.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)