Message ID | 1411028820-29933-4-git-send-email-mikey@neuling.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote: > From: Ian Munsie <imunsie@au1.ibm.com> > > Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests > to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a > lot of IRQs which can be a scarce resource. > > For cxl we can require multiple IRQs for every contexts that is attached to the > accelerator. For AFU directed accelerators, there may be 1000s of contexts > attached, hence we can easily run out of IRQs, especially if we are needlessly > wasting them. > > This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number > of IRQs, hence avoiding this wastage. > > Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> > Signed-off-by: Michael Neuling <mikey@neuling.org> > --- > arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) This conflicts with (and partially duplicates) http://patchwork.ozlabs.org/patch/381892/ which I have in my tree. How should we handle it? Laurentiu, from looking at the overlap between patches I see a problem with your existing patch, regarding the out-of-irqs path and msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed soon. -Scott > diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c > index 2ff6302..e001559 100644 > --- a/arch/powerpc/sysdev/msi_bitmap.c > +++ b/arch/powerpc/sysdev/msi_bitmap.c > @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num) > * This is fast, but stricter than we need. We might want to add > * a fallback routine which does a linear search with no alignment. > */ > - offset = bitmap_find_free_region(bmp->bitmap, bmp->irq_count, order); > + offset = bitmap_find_next_zero_area(bmp->bitmap, bmp->irq_count, 0, > + num, (1 << order) - 1); > + if (offset > bmp->irq_count) > + goto err; > + bitmap_set(bmp->bitmap, offset, num); > spin_unlock_irqrestore(&bmp->lock, flags); > > pr_debug("msi_bitmap: allocated 0x%x (2^%d) at offset 0x%x\n", > num, order, offset); > > return offset; > +err: > + spin_unlock_irqrestore(&bmp->lock, flags); > + return -ENOMEM; > } > +EXPORT_SYMBOL(msi_bitmap_alloc_hwirqs); > > void msi_bitmap_free_hwirqs(struct msi_bitmap *bmp, unsigned int offset, > unsigned int num) > { > unsigned long flags; > - int order = get_count_order(num); > > - pr_debug("msi_bitmap: freeing 0x%x (2^%d) at offset 0x%x\n", > - num, order, offset); > + pr_debug("msi_bitmap: freeing 0x%x at offset 0x%x\n", > + num, offset); > > spin_lock_irqsave(&bmp->lock, flags); > - bitmap_release_region(bmp->bitmap, offset, order); > + bitmap_clear(bmp->bitmap, offset, num); > spin_unlock_irqrestore(&bmp->lock, flags); > } > +EXPORT_SYMBOL(msi_bitmap_free_hwirqs); > > void msi_bitmap_reserve_hwirq(struct msi_bitmap *bmp, unsigned int hwirq) > {
On Fri, 2014-09-19 at 15:16 -0500, Scott Wood wrote: > On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote: > > From: Ian Munsie <imunsie@au1.ibm.com> > > > > Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests > > to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a > > lot of IRQs which can be a scarce resource. > > > > For cxl we can require multiple IRQs for every contexts that is attached to the > > accelerator. For AFU directed accelerators, there may be 1000s of contexts > > attached, hence we can easily run out of IRQs, especially if we are needlessly > > wasting them. > > > > This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number > > of IRQs, hence avoiding this wastage. > > > > Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> > > Signed-off-by: Michael Neuling <mikey@neuling.org> > > --- > > arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++----- > > 1 file changed, 13 insertions(+), 5 deletions(-) > > This conflicts with (and partially duplicates) > http://patchwork.ozlabs.org/patch/381892/ > which I have in my tree. How should we handle it? > > Laurentiu, from looking at the overlap between patches I see a problem > with your existing patch, regarding the out-of-irqs path and > msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed > soon. Given the problems with Laurentiu's patch, perhaps it'd be best for me to just revert that patch in my tree, and respin it after this patchset has been merged. -Scott
On 09/19/2014 11:16 PM, Scott Wood wrote: > On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote: >> From: Ian Munsie <imunsie@au1.ibm.com> >> >> Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests >> to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a >> lot of IRQs which can be a scarce resource. >> >> For cxl we can require multiple IRQs for every contexts that is attached to the >> accelerator. For AFU directed accelerators, there may be 1000s of contexts >> attached, hence we can easily run out of IRQs, especially if we are needlessly >> wasting them. >> >> This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number >> of IRQs, hence avoiding this wastage. >> >> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> >> Signed-off-by: Michael Neuling <mikey@neuling.org> >> --- >> arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++----- >> 1 file changed, 13 insertions(+), 5 deletions(-) > > This conflicts with (and partially duplicates) > http://patchwork.ozlabs.org/patch/381892/ > which I have in my tree. How should we handle it? > > Laurentiu, from looking at the overlap between patches I see a problem > with your existing patch, regarding the out-of-irqs path and > msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed > soon. > Agree. My patch lacks error checking so Michael's patch is better. --- Best Regards, Laurentiu > >> diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c >> index 2ff6302..e001559 100644 >> --- a/arch/powerpc/sysdev/msi_bitmap.c >> +++ b/arch/powerpc/sysdev/msi_bitmap.c >> @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num) >> * This is fast, but stricter than we need. We might want to add >> * a fallback routine which does a linear search with no alignment. >> */ >> - offset = bitmap_find_free_region(bmp->bitmap, bmp->irq_count, order); >> + offset = bitmap_find_next_zero_area(bmp->bitmap, bmp->irq_count, 0, >> + num, (1 << order) - 1); >> + if (offset > bmp->irq_count) >> + goto err; >> + bitmap_set(bmp->bitmap, offset, num); >> spin_unlock_irqrestore(&bmp->lock, flags); >> >> pr_debug("msi_bitmap: allocated 0x%x (2^%d) at offset 0x%x\n", >> num, order, offset); >> >> return offset; >> +err: >> + spin_unlock_irqrestore(&bmp->lock, flags); >> + return -ENOMEM; >> } >> +EXPORT_SYMBOL(msi_bitmap_alloc_hwirqs); >> >> void msi_bitmap_free_hwirqs(struct msi_bitmap *bmp, unsigned int offset, >> unsigned int num) >> { >> unsigned long flags; >> - int order = get_count_order(num); >> >> - pr_debug("msi_bitmap: freeing 0x%x (2^%d) at offset 0x%x\n", >> - num, order, offset); >> + pr_debug("msi_bitmap: freeing 0x%x at offset 0x%x\n", >> + num, offset); >> >> spin_lock_irqsave(&bmp->lock, flags); >> - bitmap_release_region(bmp->bitmap, offset, order); >> + bitmap_clear(bmp->bitmap, offset, num); >> spin_unlock_irqrestore(&bmp->lock, flags); >> } >> +EXPORT_SYMBOL(msi_bitmap_free_hwirqs); >> >> void msi_bitmap_reserve_hwirq(struct msi_bitmap *bmp, unsigned int hwirq) >> { > >
On 09/19/2014 11:19 PM, Scott Wood wrote: > On Fri, 2014-09-19 at 15:16 -0500, Scott Wood wrote: >> On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote: >>> From: Ian Munsie <imunsie@au1.ibm.com> >>> >>> Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests >>> to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a >>> lot of IRQs which can be a scarce resource. >>> >>> For cxl we can require multiple IRQs for every contexts that is attached to the >>> accelerator. For AFU directed accelerators, there may be 1000s of contexts >>> attached, hence we can easily run out of IRQs, especially if we are needlessly >>> wasting them. >>> >>> This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number >>> of IRQs, hence avoiding this wastage. >>> >>> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> >>> Signed-off-by: Michael Neuling <mikey@neuling.org> >>> --- >>> arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++----- >>> 1 file changed, 13 insertions(+), 5 deletions(-) >> >> This conflicts with (and partially duplicates) >> http://patchwork.ozlabs.org/patch/381892/ >> which I have in my tree. How should we handle it? >> >> Laurentiu, from looking at the overlap between patches I see a problem >> with your existing patch, regarding the out-of-irqs path and >> msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed >> soon. > > Given the problems with Laurentiu's patch, perhaps it'd be best for me > to just revert that patch in my tree, and respin it after this patchset > has been merged. Let me know if you want me to rebase my stuff on top of Michael's patch. --- Best Regards, Laurentiu
Hi Michael, Minor comment inline. On 09/18/2014 11:26 AM, Michael Neuling wrote: > From: Ian Munsie <imunsie@au1.ibm.com> > > Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests > to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a > lot of IRQs which can be a scarce resource. > > For cxl we can require multiple IRQs for every contexts that is attached to the > accelerator. For AFU directed accelerators, there may be 1000s of contexts > attached, hence we can easily run out of IRQs, especially if we are needlessly > wasting them. > > This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number > of IRQs, hence avoiding this wastage. > > Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> > Signed-off-by: Michael Neuling <mikey@neuling.org> > --- > arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) > > diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c > index 2ff6302..e001559 100644 > --- a/arch/powerpc/sysdev/msi_bitmap.c > +++ b/arch/powerpc/sysdev/msi_bitmap.c > @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num) > * This is fast, but stricter than we need. We might want to add > * a fallback routine which does a linear search with no alignment. > */ Is this comment still relevant (especially the part mentioning "fast")? --- Best Regards, Laurentiu
> > diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c > > index 2ff6302..e001559 100644 > > --- a/arch/powerpc/sysdev/msi_bitmap.c > > +++ b/arch/powerpc/sysdev/msi_bitmap.c > > @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num) > > * This is fast, but stricter than we need. We might want to add > > * a fallback routine which does a linear search with no alignment. > > */ > > Is this comment still relevant (especially the part mentioning "fast")? You're right, it's not really relevant any more. I'll remove. Thanks Mikey
On Mon, 2014-09-22 at 11:26 +0300, Laurentiu Tudor wrote: > On 09/19/2014 11:19 PM, Scott Wood wrote: > > On Fri, 2014-09-19 at 15:16 -0500, Scott Wood wrote: > >> On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote: > >>> From: Ian Munsie <imunsie@au1.ibm.com> > >>> > >>> Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests > >>> to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a > >>> lot of IRQs which can be a scarce resource. > >>> > >>> For cxl we can require multiple IRQs for every contexts that is attached to the > >>> accelerator. For AFU directed accelerators, there may be 1000s of contexts > >>> attached, hence we can easily run out of IRQs, especially if we are needlessly > >>> wasting them. > >>> > >>> This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number > >>> of IRQs, hence avoiding this wastage. > >>> > >>> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> > >>> Signed-off-by: Michael Neuling <mikey@neuling.org> > >>> --- > >>> arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++----- > >>> 1 file changed, 13 insertions(+), 5 deletions(-) > >> > >> This conflicts with (and partially duplicates) > >> http://patchwork.ozlabs.org/patch/381892/ > >> which I have in my tree. How should we handle it? > >> > >> Laurentiu, from looking at the overlap between patches I see a problem > >> with your existing patch, regarding the out-of-irqs path and > >> msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed > >> soon. > > > > Given the problems with Laurentiu's patch, perhaps it'd be best for me > > to just revert that patch in my tree, and respin it after this patchset > > has been merged. > > Let me know if you want me to rebase my stuff on top of Michael's patch. Yes, please resend it once Michael's patch gets merged. -Scott
diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c index 2ff6302..e001559 100644 --- a/arch/powerpc/sysdev/msi_bitmap.c +++ b/arch/powerpc/sysdev/msi_bitmap.c @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num) * This is fast, but stricter than we need. We might want to add * a fallback routine which does a linear search with no alignment. */ - offset = bitmap_find_free_region(bmp->bitmap, bmp->irq_count, order); + offset = bitmap_find_next_zero_area(bmp->bitmap, bmp->irq_count, 0, + num, (1 << order) - 1); + if (offset > bmp->irq_count) + goto err; + bitmap_set(bmp->bitmap, offset, num); spin_unlock_irqrestore(&bmp->lock, flags); pr_debug("msi_bitmap: allocated 0x%x (2^%d) at offset 0x%x\n", num, order, offset); return offset; +err: + spin_unlock_irqrestore(&bmp->lock, flags); + return -ENOMEM; } +EXPORT_SYMBOL(msi_bitmap_alloc_hwirqs); void msi_bitmap_free_hwirqs(struct msi_bitmap *bmp, unsigned int offset, unsigned int num) { unsigned long flags; - int order = get_count_order(num); - pr_debug("msi_bitmap: freeing 0x%x (2^%d) at offset 0x%x\n", - num, order, offset); + pr_debug("msi_bitmap: freeing 0x%x at offset 0x%x\n", + num, offset); spin_lock_irqsave(&bmp->lock, flags); - bitmap_release_region(bmp->bitmap, offset, order); + bitmap_clear(bmp->bitmap, offset, num); spin_unlock_irqrestore(&bmp->lock, flags); } +EXPORT_SYMBOL(msi_bitmap_free_hwirqs); void msi_bitmap_reserve_hwirq(struct msi_bitmap *bmp, unsigned int hwirq) {