diff mbox

[03/15] powerpc/msi: Improve IRQ bitmap allocator

Message ID 1411028820-29933-4-git-send-email-mikey@neuling.org (mailing list archive)
State Superseded
Headers show

Commit Message

Michael Neuling Sept. 18, 2014, 8:26 a.m. UTC
From: Ian Munsie <imunsie@au1.ibm.com>

Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
to the nearest power of 2.  eg. ask for 5 IRQs and you'll get 8.  This wastes a
lot of IRQs which can be a scarce resource.

For cxl we can require multiple IRQs for every contexts that is attached to the
accelerator.  For AFU directed accelerators, there may be 1000s of contexts
attached, hence we can easily run out of IRQs, especially if we are needlessly
wasting them.

This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
of IRQs, hence avoiding this wastage.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

Comments

Scott Wood Sept. 19, 2014, 8:16 p.m. UTC | #1
On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote:
> From: Ian Munsie <imunsie@au1.ibm.com>
> 
> Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
> to the nearest power of 2.  eg. ask for 5 IRQs and you'll get 8.  This wastes a
> lot of IRQs which can be a scarce resource.
> 
> For cxl we can require multiple IRQs for every contexts that is attached to the
> accelerator.  For AFU directed accelerators, there may be 1000s of contexts
> attached, hence we can easily run out of IRQs, especially if we are needlessly
> wasting them.
> 
> This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
> of IRQs, hence avoiding this wastage.
> 
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> ---
>  arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)

This conflicts with (and partially duplicates)
http://patchwork.ozlabs.org/patch/381892/
which I have in my tree.  How should we handle it?

Laurentiu, from looking at the overlap between patches I see a problem
with your existing patch, regarding the out-of-irqs path and
msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed
soon.

-Scott

> diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c
> index 2ff6302..e001559 100644
> --- a/arch/powerpc/sysdev/msi_bitmap.c
> +++ b/arch/powerpc/sysdev/msi_bitmap.c
> @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num)
>  	 * This is fast, but stricter than we need. We might want to add
>  	 * a fallback routine which does a linear search with no alignment.
>  	 */
> -	offset = bitmap_find_free_region(bmp->bitmap, bmp->irq_count, order);
> +	offset = bitmap_find_next_zero_area(bmp->bitmap, bmp->irq_count, 0,
> +					    num, (1 << order) - 1);
> +	if (offset > bmp->irq_count)
> +		goto err;
> +	bitmap_set(bmp->bitmap, offset, num);
>  	spin_unlock_irqrestore(&bmp->lock, flags);
>  
>  	pr_debug("msi_bitmap: allocated 0x%x (2^%d) at offset 0x%x\n",
>  		 num, order, offset);
>  
>  	return offset;
> +err:
> +	spin_unlock_irqrestore(&bmp->lock, flags);
> +	return -ENOMEM;
>  }
> +EXPORT_SYMBOL(msi_bitmap_alloc_hwirqs);
>  
>  void msi_bitmap_free_hwirqs(struct msi_bitmap *bmp, unsigned int offset,
>  			    unsigned int num)
>  {
>  	unsigned long flags;
> -	int order = get_count_order(num);
>  
> -	pr_debug("msi_bitmap: freeing 0x%x (2^%d) at offset 0x%x\n",
> -		 num, order, offset);
> +	pr_debug("msi_bitmap: freeing 0x%x at offset 0x%x\n",
> +		 num, offset);
>  
>  	spin_lock_irqsave(&bmp->lock, flags);
> -	bitmap_release_region(bmp->bitmap, offset, order);
> +	bitmap_clear(bmp->bitmap, offset, num);
>  	spin_unlock_irqrestore(&bmp->lock, flags);
>  }
> +EXPORT_SYMBOL(msi_bitmap_free_hwirqs);
>  
>  void msi_bitmap_reserve_hwirq(struct msi_bitmap *bmp, unsigned int hwirq)
>  {
Scott Wood Sept. 19, 2014, 8:19 p.m. UTC | #2
On Fri, 2014-09-19 at 15:16 -0500, Scott Wood wrote:
> On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote:
> > From: Ian Munsie <imunsie@au1.ibm.com>
> > 
> > Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
> > to the nearest power of 2.  eg. ask for 5 IRQs and you'll get 8.  This wastes a
> > lot of IRQs which can be a scarce resource.
> > 
> > For cxl we can require multiple IRQs for every contexts that is attached to the
> > accelerator.  For AFU directed accelerators, there may be 1000s of contexts
> > attached, hence we can easily run out of IRQs, especially if we are needlessly
> > wasting them.
> > 
> > This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
> > of IRQs, hence avoiding this wastage.
> > 
> > Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> > Signed-off-by: Michael Neuling <mikey@neuling.org>
> > ---
> >  arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++-----
> >  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> This conflicts with (and partially duplicates)
> http://patchwork.ozlabs.org/patch/381892/
> which I have in my tree.  How should we handle it?
> 
> Laurentiu, from looking at the overlap between patches I see a problem
> with your existing patch, regarding the out-of-irqs path and
> msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed
> soon.

Given the problems with Laurentiu's patch, perhaps it'd be best for me
to just revert that patch in my tree, and respin it after this patchset
has been merged.

-Scott
Tudor Laurentiu Sept. 22, 2014, 8:25 a.m. UTC | #3
On 09/19/2014 11:16 PM, Scott Wood wrote:
> On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote:
>> From: Ian Munsie <imunsie@au1.ibm.com>
>>
>> Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
>> to the nearest power of 2.  eg. ask for 5 IRQs and you'll get 8.  This wastes a
>> lot of IRQs which can be a scarce resource.
>>
>> For cxl we can require multiple IRQs for every contexts that is attached to the
>> accelerator.  For AFU directed accelerators, there may be 1000s of contexts
>> attached, hence we can easily run out of IRQs, especially if we are needlessly
>> wasting them.
>>
>> This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
>> of IRQs, hence avoiding this wastage.
>>
>> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
>> Signed-off-by: Michael Neuling <mikey@neuling.org>
>> ---
>>  arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++-----
>>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> This conflicts with (and partially duplicates)
> http://patchwork.ozlabs.org/patch/381892/
> which I have in my tree.  How should we handle it?
> 
> Laurentiu, from looking at the overlap between patches I see a problem
> with your existing patch, regarding the out-of-irqs path and
> msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed
> soon.
> 

Agree. My patch lacks error checking so Michael's patch is better.

---
Best Regards, Laurentiu

> 
>> diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c
>> index 2ff6302..e001559 100644
>> --- a/arch/powerpc/sysdev/msi_bitmap.c
>> +++ b/arch/powerpc/sysdev/msi_bitmap.c
>> @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num)
>>  	 * This is fast, but stricter than we need. We might want to add
>>  	 * a fallback routine which does a linear search with no alignment.
>>  	 */
>> -	offset = bitmap_find_free_region(bmp->bitmap, bmp->irq_count, order);
>> +	offset = bitmap_find_next_zero_area(bmp->bitmap, bmp->irq_count, 0,
>> +					    num, (1 << order) - 1);
>> +	if (offset > bmp->irq_count)
>> +		goto err;
>> +	bitmap_set(bmp->bitmap, offset, num);
>>  	spin_unlock_irqrestore(&bmp->lock, flags);
>>  
>>  	pr_debug("msi_bitmap: allocated 0x%x (2^%d) at offset 0x%x\n",
>>  		 num, order, offset);
>>  
>>  	return offset;
>> +err:
>> +	spin_unlock_irqrestore(&bmp->lock, flags);
>> +	return -ENOMEM;
>>  }
>> +EXPORT_SYMBOL(msi_bitmap_alloc_hwirqs);
>>  
>>  void msi_bitmap_free_hwirqs(struct msi_bitmap *bmp, unsigned int offset,
>>  			    unsigned int num)
>>  {
>>  	unsigned long flags;
>> -	int order = get_count_order(num);
>>  
>> -	pr_debug("msi_bitmap: freeing 0x%x (2^%d) at offset 0x%x\n",
>> -		 num, order, offset);
>> +	pr_debug("msi_bitmap: freeing 0x%x at offset 0x%x\n",
>> +		 num, offset);
>>  
>>  	spin_lock_irqsave(&bmp->lock, flags);
>> -	bitmap_release_region(bmp->bitmap, offset, order);
>> +	bitmap_clear(bmp->bitmap, offset, num);
>>  	spin_unlock_irqrestore(&bmp->lock, flags);
>>  }
>> +EXPORT_SYMBOL(msi_bitmap_free_hwirqs);
>>  
>>  void msi_bitmap_reserve_hwirq(struct msi_bitmap *bmp, unsigned int hwirq)
>>  {
> 
>
Tudor Laurentiu Sept. 22, 2014, 8:26 a.m. UTC | #4
On 09/19/2014 11:19 PM, Scott Wood wrote:
> On Fri, 2014-09-19 at 15:16 -0500, Scott Wood wrote:
>> On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote:
>>> From: Ian Munsie <imunsie@au1.ibm.com>
>>>
>>> Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
>>> to the nearest power of 2.  eg. ask for 5 IRQs and you'll get 8.  This wastes a
>>> lot of IRQs which can be a scarce resource.
>>>
>>> For cxl we can require multiple IRQs for every contexts that is attached to the
>>> accelerator.  For AFU directed accelerators, there may be 1000s of contexts
>>> attached, hence we can easily run out of IRQs, especially if we are needlessly
>>> wasting them.
>>>
>>> This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
>>> of IRQs, hence avoiding this wastage.
>>>
>>> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
>>> Signed-off-by: Michael Neuling <mikey@neuling.org>
>>> ---
>>>  arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++-----
>>>  1 file changed, 13 insertions(+), 5 deletions(-)
>>
>> This conflicts with (and partially duplicates)
>> http://patchwork.ozlabs.org/patch/381892/
>> which I have in my tree.  How should we handle it?
>>
>> Laurentiu, from looking at the overlap between patches I see a problem
>> with your existing patch, regarding the out-of-irqs path and
>> msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed
>> soon.
> 
> Given the problems with Laurentiu's patch, perhaps it'd be best for me
> to just revert that patch in my tree, and respin it after this patchset
> has been merged.

Let me know if you want me to rebase my stuff on top of Michael's patch.

---
Best Regards, Laurentiu
Tudor Laurentiu Sept. 22, 2014, 8:29 a.m. UTC | #5
Hi Michael,

Minor comment inline.

On 09/18/2014 11:26 AM, Michael Neuling wrote:
> From: Ian Munsie <imunsie@au1.ibm.com>
> 
> Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
> to the nearest power of 2.  eg. ask for 5 IRQs and you'll get 8.  This wastes a
> lot of IRQs which can be a scarce resource.
> 
> For cxl we can require multiple IRQs for every contexts that is attached to the
> accelerator.  For AFU directed accelerators, there may be 1000s of contexts
> attached, hence we can easily run out of IRQs, especially if we are needlessly
> wasting them.
> 
> This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
> of IRQs, hence avoiding this wastage.
> 
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> ---
>  arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c
> index 2ff6302..e001559 100644
> --- a/arch/powerpc/sysdev/msi_bitmap.c
> +++ b/arch/powerpc/sysdev/msi_bitmap.c
> @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num)
>  	 * This is fast, but stricter than we need. We might want to add
>  	 * a fallback routine which does a linear search with no alignment.
>  	 */

Is this comment still relevant (especially the part mentioning "fast")?

---
Best Regards, Laurentiu
Michael Neuling Sept. 22, 2014, 10:59 p.m. UTC | #6
> > diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c
> > index 2ff6302..e001559 100644
> > --- a/arch/powerpc/sysdev/msi_bitmap.c
> > +++ b/arch/powerpc/sysdev/msi_bitmap.c
> > @@ -24,28 +24,36 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num)
> >  	 * This is fast, but stricter than we need. We might want to add
> >  	 * a fallback routine which does a linear search with no alignment.
> >  	 */
> 
> Is this comment still relevant (especially the part mentioning "fast")?

You're right, it's not really relevant any more.  I'll remove.

Thanks
Mikey
Scott Wood Sept. 22, 2014, 11:50 p.m. UTC | #7
On Mon, 2014-09-22 at 11:26 +0300, Laurentiu Tudor wrote:
> On 09/19/2014 11:19 PM, Scott Wood wrote:
> > On Fri, 2014-09-19 at 15:16 -0500, Scott Wood wrote:
> >> On Thu, 2014-09-18 at 18:26 +1000, Michael Neuling wrote:
> >>> From: Ian Munsie <imunsie@au1.ibm.com>
> >>>
> >>> Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
> >>> to the nearest power of 2.  eg. ask for 5 IRQs and you'll get 8.  This wastes a
> >>> lot of IRQs which can be a scarce resource.
> >>>
> >>> For cxl we can require multiple IRQs for every contexts that is attached to the
> >>> accelerator.  For AFU directed accelerators, there may be 1000s of contexts
> >>> attached, hence we can easily run out of IRQs, especially if we are needlessly
> >>> wasting them.
> >>>
> >>> This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
> >>> of IRQs, hence avoiding this wastage.
> >>>
> >>> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> >>> Signed-off-by: Michael Neuling <mikey@neuling.org>
> >>> ---
> >>>  arch/powerpc/sysdev/msi_bitmap.c | 18 +++++++++++++-----
> >>>  1 file changed, 13 insertions(+), 5 deletions(-)
> >>
> >> This conflicts with (and partially duplicates)
> >> http://patchwork.ozlabs.org/patch/381892/
> >> which I have in my tree.  How should we handle it?
> >>
> >> Laurentiu, from looking at the overlap between patches I see a problem
> >> with your existing patch, regarding the out-of-irqs path and
> >> msi_bitmap_free_hwirqs(), so one way or another that needs to get fixed
> >> soon.
> > 
> > Given the problems with Laurentiu's patch, perhaps it'd be best for me
> > to just revert that patch in my tree, and respin it after this patchset
> > has been merged.
> 
> Let me know if you want me to rebase my stuff on top of Michael's patch.

Yes, please resend it once Michael's patch gets merged.

-Scott
diff mbox

Patch

diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c
index 2ff6302..e001559 100644
--- a/arch/powerpc/sysdev/msi_bitmap.c
+++ b/arch/powerpc/sysdev/msi_bitmap.c
@@ -24,28 +24,36 @@  int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num)
 	 * This is fast, but stricter than we need. We might want to add
 	 * a fallback routine which does a linear search with no alignment.
 	 */
-	offset = bitmap_find_free_region(bmp->bitmap, bmp->irq_count, order);
+	offset = bitmap_find_next_zero_area(bmp->bitmap, bmp->irq_count, 0,
+					    num, (1 << order) - 1);
+	if (offset > bmp->irq_count)
+		goto err;
+	bitmap_set(bmp->bitmap, offset, num);
 	spin_unlock_irqrestore(&bmp->lock, flags);
 
 	pr_debug("msi_bitmap: allocated 0x%x (2^%d) at offset 0x%x\n",
 		 num, order, offset);
 
 	return offset;
+err:
+	spin_unlock_irqrestore(&bmp->lock, flags);
+	return -ENOMEM;
 }
+EXPORT_SYMBOL(msi_bitmap_alloc_hwirqs);
 
 void msi_bitmap_free_hwirqs(struct msi_bitmap *bmp, unsigned int offset,
 			    unsigned int num)
 {
 	unsigned long flags;
-	int order = get_count_order(num);
 
-	pr_debug("msi_bitmap: freeing 0x%x (2^%d) at offset 0x%x\n",
-		 num, order, offset);
+	pr_debug("msi_bitmap: freeing 0x%x at offset 0x%x\n",
+		 num, offset);
 
 	spin_lock_irqsave(&bmp->lock, flags);
-	bitmap_release_region(bmp->bitmap, offset, order);
+	bitmap_clear(bmp->bitmap, offset, num);
 	spin_unlock_irqrestore(&bmp->lock, flags);
 }
+EXPORT_SYMBOL(msi_bitmap_free_hwirqs);
 
 void msi_bitmap_reserve_hwirq(struct msi_bitmap *bmp, unsigned int hwirq)
 {