Patchwork [2/3] irq: Add hw continuous IRQs map to virtual continuous IRQs support

Submitter Mike Qiu
Date Jan. 15, 2013, 7:38 a.m.
Message ID <1358235536-32741-3-git-send-email-qiudayu@linux.vnet.ibm.com>
Permalink /patch/212026/
State Changes Requested

Comments

Mike Qiu - Jan. 15, 2013, 7:38 a.m.
Add a function, irq_create_mapping_many(), which can associate
multiple MSIs with a continuous irq mapping.

This is needed to enable multiple MSI support for pSeries.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
---
 include/linux/irq.h       |    2 +
 include/linux/irqdomain.h |    3 ++
 kernel/irq/irqdomain.c    |   61 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 0 deletions(-)
Michael Ellerman - March 5, 2013, 2:23 a.m.
On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
> Adding a function irq_create_mapping_many() which can associate
> multiple MSIs to a continous irq mapping.
> 
> This is needed to enable multiple MSI support for pSeries.
> 
> Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
> ---
>  include/linux/irq.h       |    2 +
>  include/linux/irqdomain.h |    3 ++
>  kernel/irq/irqdomain.c    |   61 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 66 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 60ef45b..e00a7ec 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -592,6 +592,8 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>  #define irq_alloc_desc_from(from, node)		\
>  	irq_alloc_descs(-1, from, 1, node)
>  
> +#define irq_alloc_desc_n(nevc, node)		\
> +	irq_alloc_descs(-1, 0, nevc, node)

This has been superseded by irq_alloc_descs_from(), which is the right
way to do it.

> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index 0d5b17b..831dded 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -168,6 +168,9 @@ extern int irq_create_strict_mappings(struct irq_domain *domain,
>  				      unsigned int irq_base,
>  				      irq_hw_number_t hwirq_base, int count);
>  
> +extern int irq_create_mapping_many(struct irq_domain *domain,
> +					irq_hw_number_t hwirq_base, int count);
> +
>  static inline int irq_create_identity_mapping(struct irq_domain *host,
>  					      irq_hw_number_t hwirq)
>  {
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 96f3a1d..38648e6 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
>  }
>  EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
>  
> +/**
> + * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
> + * @domain: domain owning the interrupt range
> + * @hwirq_base: beginning of continuous hardware IRQ range
> + * @count: Number of interrupts to map

For multiple-MSI the allocated interrupt numbers must be a power-of-2,
and must be naturally aligned. I don't /think/ that's a requirement for
the virtual numbers, but it's probably best that we do it anyway.

So this API needs to specify that it will give you back a power-of-2
block that is naturally aligned - otherwise you can't use it for MSI.
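
The constraint described above can be sketched in plain userspace C. This is
only an illustration: the helper names below are hypothetical, not kernel API.
A block is "naturally aligned" when its base is a multiple of its size.

```c
#include <stdbool.h>

/* Hypothetical helper: true if n is a non-zero power of two. */
bool toy_is_power_of_2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: true if [base, base + count) is a power-of-2 sized
 * block whose base is a multiple of count (natural alignment).
 */
bool toy_is_msi_block(unsigned int base, unsigned int count)
{
	return toy_is_power_of_2(count) && (base & (count - 1)) == 0;
}
```

With these definitions, a block of 4 starting at 16 qualifies, while a block of
4 starting at 18 (misaligned) or a block of 3 (not a power of 2) does not.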

> + * This routine is used for allocating and mapping a range of hardware
> + * irqs to virtual IRQs where the virtual irq numbers are not at pre-defined
> + * locations.

This comment doesn't make sense to me.

> + *
> + * Greater than 0 is returned upon success, while any failure to establish a
> + * static mapping is treated as an error.
> + */
> +int irq_create_mapping_many(struct irq_domain *domain,
> +		irq_hw_number_t hwirq_base, int count)
> +{
> +	int ret, irq_base;
> +	int virq, i;
> +
> +	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq_base);


I'd like to see this whole function rewritten to reduce the duplication
vs irq_create_mapping(). I don't see any reason why this can't be the
core routine, and irq_create_mapping() becomes a caller of it, passing a
count of 1 ?
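
A rough sketch of that structure, using a toy userspace model rather than the
real irqdomain code. Everything here (struct toy_domain, the toy_* functions,
the table size) is made up for illustration, and checks for existing mappings
are deliberately left out.

```c
#define TOY_NR_HWIRQS 32

/* Toy stand-in for an irq_domain. */
struct toy_domain {
	unsigned int revmap[TOY_NR_HWIRQS];	/* hwirq -> virq, 0 = unmapped */
	unsigned int next_virq;			/* next unused virtual number */
};

/*
 * Core routine: map count consecutive hardware IRQs to count consecutive
 * virtual IRQs. Returns the first virtual IRQ, or 0 on failure.
 */
unsigned int toy_create_mappings(struct toy_domain *d,
				 unsigned int hwirq_base, unsigned int count)
{
	unsigned int i, base;

	if (count == 0 || hwirq_base >= TOY_NR_HWIRQS ||
	    count > TOY_NR_HWIRQS - hwirq_base)
		return 0;

	base = d->next_virq;
	for (i = 0; i < count; i++)
		d->revmap[hwirq_base + i] = base + i;
	d->next_virq = base + count;

	return base;
}

/* The single-interrupt entry point becomes a thin wrapper with count = 1. */
unsigned int toy_create_mapping(struct toy_domain *d, unsigned int hwirq)
{
	return toy_create_mappings(d, hwirq, 1);
}
```

The point of the shape is that validation and bookkeeping live in one place, so
the single and multiple cases cannot drift apart.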

> +	/* Look for default domain if nececssary */
> +	if (!domain)
> +		domain = irq_default_domain;
> +	if (!domain) {
> +		pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
> +			, hwirq_base);
> +		WARN_ON(1);
> +		return 0;
> +	}
> +	pr_debug("-> using domain @%p\n", domain);
> +
> +	/* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
> +	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
> +		return irq_domain_legacy_revmap(domain, hwirq_base);

The above doesn't work.

> +	/* Check if mapping already exists */
> +	for (i = 0; i < count; i++) {
> +		virq = irq_find_mapping(domain, hwirq_base+i);
> +		if (virq) {
> +			pr_debug("existing mapping on virq %d,"
> +					" now dispose it first\n", virq);
> +			irq_dispose_mapping(virq);

You might have just disposed of someone else's mapping; we shouldn't do
that. It should be returned as an error to the caller.
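
One way to picture the suggested behaviour is a two-pass loop: look first, and
only commit if nothing in the range is already mapped. The sketch below is a
userspace toy, not the irqdomain API. The function name, the table size and the
choice of -EEXIST as the error code are all assumptions made for illustration.

```c
#include <errno.h>

#define TOY_NR_HWIRQS 32

/*
 * revmap[hwirq] holds the virtual IRQ, or 0 when the hwirq is unmapped.
 * Returns 0 on success and a negative errno value on failure. On failure
 * the table is left untouched.
 */
int toy_map_range(unsigned int revmap[TOY_NR_HWIRQS],
		  unsigned int hwirq_base, unsigned int count,
		  unsigned int virq_base)
{
	unsigned int i;

	if (count == 0 || hwirq_base >= TOY_NR_HWIRQS ||
	    count > TOY_NR_HWIRQS - hwirq_base)
		return -EINVAL;

	/* First pass: look only. An existing mapping belongs to someone else. */
	for (i = 0; i < count; i++)
		if (revmap[hwirq_base + i] != 0)
			return -EEXIST;

	/* Second pass: the whole range is free, so commit it. */
	for (i = 0; i < count; i++)
		revmap[hwirq_base + i] = virq_base + i;

	return 0;
}
```

Because the check completes before anything is written, a failed call has no
side effects on mappings owned by other code.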

cheers
Paul Mundt - March 5, 2013, 2:41 a.m.
On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
> Adding a function irq_create_mapping_many() which can associate
> multiple MSIs to a continous irq mapping.
> 
> This is needed to enable multiple MSI support for pSeries.
> 
> +int irq_create_mapping_many(struct irq_domain *domain,
> +		irq_hw_number_t hwirq_base, int count)
> +{

Other than the other review comments already made, I think you can
simplify this considerably by simply doing what irq_create_strict_mappings() does,
and relaxing the irq_base requirements.

In any event, as you are creating a new interface, I don't think you want
to carry around half of the legacy crap that irq_create_mapping() has to
deal with. We made the decision to avoid this with irq_create_strict_mappings()
intentionally, too.
Mike Qiu - March 5, 2013, 7:19 a.m.
On 2013/3/5 10:23, Michael Ellerman wrote:
> On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
>> Adding a function irq_create_mapping_many() which can associate
>> multiple MSIs to a continous irq mapping.
>>
>> This is needed to enable multiple MSI support for pSeries.
>>
>> Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
>> ---
>>   include/linux/irq.h       |    2 +
>>   include/linux/irqdomain.h |    3 ++
>>   kernel/irq/irqdomain.c    |   61 +++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 66 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/irq.h b/include/linux/irq.h
>> index 60ef45b..e00a7ec 100644
>> --- a/include/linux/irq.h
>> +++ b/include/linux/irq.h
>> @@ -592,6 +592,8 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>>   #define irq_alloc_desc_from(from, node)		\
>>   	irq_alloc_descs(-1, from, 1, node)
>>   
>> +#define irq_alloc_desc_n(nevc, node)		\
>> +	irq_alloc_descs(-1, 0, nevc, node)
> This has been superseeded by irq_alloc_descs_from(), which is the right
> way to do it.
Yes, but irq_alloc_descs_from() is only for one irq, and if I change the
API, many places that call this function may be affected.
>
>> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
>> index 0d5b17b..831dded 100644
>> --- a/include/linux/irqdomain.h
>> +++ b/include/linux/irqdomain.h
>> @@ -168,6 +168,9 @@ extern int irq_create_strict_mappings(struct irq_domain *domain,
>>   				      unsigned int irq_base,
>>   				      irq_hw_number_t hwirq_base, int count);
>>   
>> +extern int irq_create_mapping_many(struct irq_domain *domain,
>> +					irq_hw_number_t hwirq_base, int count);
>> +
>>   static inline int irq_create_identity_mapping(struct irq_domain *host,
>>   					      irq_hw_number_t hwirq)
>>   {
>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>> index 96f3a1d..38648e6 100644
>> --- a/kernel/irq/irqdomain.c
>> +++ b/kernel/irq/irqdomain.c
>> @@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
>>   }
>>   EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
>>   
>> +/**
>> + * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
>> + * @domain: domain owning the interrupt range
>> + * @hwirq_base: beginning of continuous hardware IRQ range
>> + * @count: Number of interrupts to map
> For multiple-MSI the allocated interrupt numbers must be a power-of-2,
> and must be naturally aligned. I don't /think/ that's a requirement for
> the virtual numbers, but it's probably best that we do it anyway.
>
> So this API needs to specify that it will give you back a power-of-2
> block that is naturally aligned - otherwise you can't use it for MSI.
rtas_call will return the hardware interrupt numbers, and they should be
a power-of-2 block, so I don't think this needs to be specified.
>> + * This routine is used for allocating and mapping a range of hardware
>> + * irqs to virtual IRQs where the virtual irq numbers are not at pre-defined
>> + * locations.
> This comment doesn't make sense to me.
>
>> + *
>> + * Greater than 0 is returned upon success, while any failure to establish a
>> + * static mapping is treated as an error.
>> + */
>> +int irq_create_mapping_many(struct irq_domain *domain,
>> +		irq_hw_number_t hwirq_base, int count)
>> +{
>> +	int ret, irq_base;
>> +	int virq, i;
>> +
>> +	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq_base);
>
> I'd like to see this whole function rewritten to reduce the duplication
> vs irq_create_mapping(). I don't see any reason why this can't be the
> core routine, and irq_create_mapping() becomes a caller of it, passing a
> count of 1 ?
It's a good suggestion.
>> +	/* Look for default domain if nececssary */
>> +	if (!domain)
>> +		domain = irq_default_domain;
>> +	if (!domain) {
>> +		pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
>> +			, hwirq_base);
>> +		WARN_ON(1);
>> +		return 0;
>> +	}
>> +	pr_debug("-> using domain @%p\n", domain);
>> +
>> +	/* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
>> +	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
>> +		return irq_domain_legacy_revmap(domain, hwirq_base);
> The above doesn't work.
Why doesn't it work?
>> +	/* Check if mapping already exists */
>> +	for (i = 0; i < count; i++) {
>> +		virq = irq_find_mapping(domain, hwirq_base+i);
>> +		if (virq) {
>> +			pr_debug("existing mapping on virq %d,"
>> +					" now dispose it first\n", virq);
>> +			irq_dispose_mapping(virq);
> You might have just disposed of someone elses mapping, we shouldn't do
> that. It should be an error to the caller.
It's a good question. If the interrupt is in use by someone else, why can
I obtain it from the system?
So it may be that someone else forgot to dispose of the mapping, and I
think it will never be used by others, since I have been given the
interrupt.
> cheers
>
Mike Qiu - March 5, 2013, 7:44 a.m.
On 2013/3/5 10:41, Paul Mundt wrote:
> On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
>> Adding a function irq_create_mapping_many() which can associate
>> multiple MSIs to a continous irq mapping.
>>
>> This is needed to enable multiple MSI support for pSeries.
>>
>> +int irq_create_mapping_many(struct irq_domain *domain,
>> +		irq_hw_number_t hwirq_base, int count)
>> +{
> Other than the other review comments already made, I think you can
> simplify this considerably by simply doing what irq_create_strict_mappings() does,
> and relaxing the irq_base requirements.
>
> In any event, as you are creating a new interface, I don't think you want
> to carry around half of the legacy crap that irq_create_mapping() has to
> deal with. We made the decision to avoid this with irq_create_strict_mappings()
> intentionally, too.
>
Oh, yes, you are right. I will send out V2 of my patch to clean this up,
and I hope you can review it again.

Thanks

Mike
Michael Ellerman - March 6, 2013, 3:54 a.m.
On Tue, Mar 05, 2013 at 03:19:57PM +0800, Mike Qiu wrote:
> On 2013/3/5 10:23, Michael Ellerman wrote:
> >On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
> >>Adding a function irq_create_mapping_many() which can associate
> >>multiple MSIs to a continous irq mapping.
> >>
> >>This is needed to enable multiple MSI support for pSeries.
> >>
> >>Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
> >>---
> >>  include/linux/irq.h       |    2 +
> >>  include/linux/irqdomain.h |    3 ++
> >>  kernel/irq/irqdomain.c    |   61 +++++++++++++++++++++++++++++++++++++++++++++
> >>  3 files changed, 66 insertions(+), 0 deletions(-)
> >>
> >>diff --git a/include/linux/irq.h b/include/linux/irq.h
> >>index 60ef45b..e00a7ec 100644
> >>--- a/include/linux/irq.h
> >>+++ b/include/linux/irq.h
> >>@@ -592,6 +592,8 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
> >>  #define irq_alloc_desc_from(from, node)		\
> >>  	irq_alloc_descs(-1, from, 1, node)
> >>+#define irq_alloc_desc_n(nevc, node)		\
> >>+	irq_alloc_descs(-1, 0, nevc, node)
> >This has been superseeded by irq_alloc_descs_from(), which is the right
> >way to do it.

> Yes, but irq_alloc_descs_from() just for 1 irq

No it's not, look again.

#define irq_alloc_descs_from(from, cnt, node)   \
	irq_alloc_descs(-1, from, cnt, node)
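
To make the equivalence concrete, here is a small userspace sketch. The two
macros are copied from the thread; irq_alloc_descs() is replaced by a stub that
only records its arguments, and that stub and struct alloc_call are assumptions
for illustration, not the kernel implementation.

```c
/* Records the arguments of the most recent stub call. */
struct alloc_call {
	int irq;
	unsigned int from;
	unsigned int cnt;
	int node;
};

struct alloc_call last_call;

/* Userspace stub standing in for the kernel's irq_alloc_descs(). */
int irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node)
{
	last_call = (struct alloc_call){ irq, from, cnt, node };
	return (int)from;	/* pretend the first candidate was free */
}

/* Existing helper, as quoted above. */
#define irq_alloc_descs_from(from, cnt, node)	\
	irq_alloc_descs(-1, from, cnt, node)

/* Macro proposed by the patch. */
#define irq_alloc_desc_n(nevc, node)		\
	irq_alloc_descs(-1, 0, nevc, node)
```

With these definitions, irq_alloc_desc_n(nvec, node) and
irq_alloc_descs_from(0, nvec, node) expand to the same call, which is why the
new macro is redundant.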


> >>diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> >>index 96f3a1d..38648e6 100644
> >>--- a/kernel/irq/irqdomain.c
> >>+++ b/kernel/irq/irqdomain.c
> >>@@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
> >>  }
> >>  EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
> >>+/**
> >>+ * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
> >>+ * @domain: domain owning the interrupt range
> >>+ * @hwirq_base: beginning of continuous hardware IRQ range
> >>+ * @count: Number of interrupts to map

> >For multiple-MSI the allocated interrupt numbers must be a power-of-2,
> >and must be naturally aligned. I don't /think/ that's a requirement for
> >the virtual numbers, but it's probably best that we do it anyway.
> >
> >So this API needs to specify that it will give you back a power-of-2
> >block that is naturally aligned - otherwise you can't use it for MSI.

> rtas_call will return the numbers of hardware interrupt, and it
> should be power-of-2, as this I think do not need to specify

You're confusing hardware interrupt numbers and virtual interrupt
numbers. My comment is about irq_create_mapping_many(), which returns
virtual interrupt numbers.

As I said I don't think there is a requirement that the virtual
interrupt numbers are also a power-of-2 naturally aligned block, but we
should allocate them as one anyway, to avoid any issues in future.

And so this API, which returns virtual interrupt numbers, must satisfy
that specification.

> >>+	/* Look for default domain if nececssary */
> >>+	if (!domain)
> >>+		domain = irq_default_domain;
> >>+	if (!domain) {
> >>+		pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
> >>+			, hwirq_base);
> >>+		WARN_ON(1);
> >>+		return 0;
> >>+	}
> >>+	pr_debug("-> using domain @%p\n", domain);
> >>+
> >>+	/* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
> >>+	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
> >>+		return irq_domain_legacy_revmap(domain, hwirq_base);
> >The above doesn't work.
> Why it doesn't work ?

Because irq_domain_legacy_revmap() only allocates a single interrupt
number.

> >>+	/* Check if mapping already exists */
> >>+	for (i = 0; i < count; i++) {
> >>+		virq = irq_find_mapping(domain, hwirq_base+i);
> >>+		if (virq) {
> >>+			pr_debug("existing mapping on virq %d,"
> >>+					" now dispose it first\n", virq);
> >>+			irq_dispose_mapping(virq);

> >You might have just disposed of someone elses mapping, we shouldn't do
> >that. It should be an error to the caller.

> It's a good question. If the interrupt used for someone elses, why I
> can apply it from the system?

I agree, that would be a bug. But disposing of someone else's mapping is
not OK.

> So it may someone else forget to dispose mapping, and it never be
> used for others as I have got the interrupt I think.

Perhaps, but that is a bug that needs to be fixed in the code that
forgets to dispose of the mapping.

cheers
Mike Qiu - March 6, 2013, 5:34 a.m.
On 2013/3/6 11:54, Michael Ellerman wrote:
> On Tue, Mar 05, 2013 at 03:19:57PM +0800, Mike Qiu wrote:
>> On 2013/3/5 10:23, Michael Ellerman wrote:
>>> On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
>>>> Adding a function irq_create_mapping_many() which can associate
>>>> multiple MSIs to a continous irq mapping.
>>>>
>>>> This is needed to enable multiple MSI support for pSeries.
>>>>
>>>> Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
>>>> ---
>>>>   include/linux/irq.h       |    2 +
>>>>   include/linux/irqdomain.h |    3 ++
>>>>   kernel/irq/irqdomain.c    |   61 +++++++++++++++++++++++++++++++++++++++++++++
>>>>   3 files changed, 66 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/include/linux/irq.h b/include/linux/irq.h
>>>> index 60ef45b..e00a7ec 100644
>>>> --- a/include/linux/irq.h
>>>> +++ b/include/linux/irq.h
>>>> @@ -592,6 +592,8 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>>>>   #define irq_alloc_desc_from(from, node)		\
>>>>   	irq_alloc_descs(-1, from, 1, node)
>>>> +#define irq_alloc_desc_n(nevc, node)		\
>>>> +	irq_alloc_descs(-1, 0, nevc, node)
>>> This has been superseeded by irq_alloc_descs_from(), which is the right
>>> way to do it.
>> Yes, but irq_alloc_descs_from() just for 1 irq
> No it's not, look again.
>
> #define irq_alloc_descs_from(from, cnt, node)   \
> 	irq_alloc_descs(-1, from, cnt, node)
Sorry, I misread it as irq_alloc_desc_from(from, node).
You are right.
>
>
>>>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>>>> index 96f3a1d..38648e6 100644
>>>> --- a/kernel/irq/irqdomain.c
>>>> +++ b/kernel/irq/irqdomain.c
>>>> @@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
>>>>   }
>>>>   EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
>>>> +/**
>>>> + * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
>>>> + * @domain: domain owning the interrupt range
>>>> + * @hwirq_base: beginning of continuous hardware IRQ range
>>>> + * @count: Number of interrupts to map
>>> For multiple-MSI the allocated interrupt numbers must be a power-of-2,
>>> and must be naturally aligned. I don't /think/ that's a requirement for
>>> the virtual numbers, but it's probably best that we do it anyway.
>>>
>>> So this API needs to specify that it will give you back a power-of-2
>>> block that is naturally aligned - otherwise you can't use it for MSI.
>> rtas_call will return the numbers of hardware interrupt, and it
>> should be power-of-2, as this I think do not need to specify
> You're confusing hardware interrupt numbers and virtual interrupt
> numbers. My comment is about irq_create_mapping_many(), which returns
> virtual interrupt numbers.
>
> As I said I don't think there is a requirement that the virtual
> interrupt numbers are also a power-of-2 naturally aligned block, but we
> should allocate them as one anyway, to avoid any issues in future.
But the virtual interrupt numbers should be a power-of-2 naturally
aligned block, because they must be continuous, as MSI-HOWTO.txt says:

     4.2.2 pci_enable_msi_block
     int pci_enable_msi_block(struct pci_dev *dev, int count)
     This variation on the above call allows a device driver to request
     multiple MSIs.  The MSI specification only allows interrupts to be
     allocated in powers of two, up to a maximum of 2^5 (32).
     If this function returns 0, it has succeeded in allocating at least
     as many interrupts as the driver requested
     (it may have allocated more in order to satisfy the power-of-two
     requirement). In this case, the function enables MSI on this device
     and updates dev->irq to be the lowest of the new interrupts
     assigned to it. The other interrupts assigned to the device are in
     the range dev->irq to dev->irq + count - 1.

See the last line: it means the virtual interrupts must be a
continuous block.
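
The arithmetic in the quoted HOWTO text can be sketched as follows. This is a
userspace illustration only: the function name is hypothetical, and the limit
of 32 is taken from the quoted passage.

```c
/*
 * Hypothetical helper: round a requested vector count up to the next power
 * of two, as the quoted text says an allocation may do. Returns 0 when the
 * request is outside the range 1..32.
 */
unsigned int toy_msi_round_up(unsigned int requested)
{
	unsigned int n = 1;

	if (requested == 0 || requested > 32)
		return 0;

	while (n < requested)
		n <<= 1;

	return n;
}
```

For a block of count interrupts starting at dev->irq, the last interrupt in the
block is then dev->irq + count - 1, as the quoted text states.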
> And so this API, which returns virtual interrupt numbers, must satisfy
> that specification.
>
>>>> +	/* Look for default domain if nececssary */
>>>> +	if (!domain)
>>>> +		domain = irq_default_domain;
>>>> +	if (!domain) {
>>>> +		pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
>>>> +			, hwirq_base);
>>>> +		WARN_ON(1);
>>>> +		return 0;
>>>> +	}
>>>> +	pr_debug("-> using domain @%p\n", domain);
>>>> +
>>>> +	/* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
>>>> +	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
>>>> +		return irq_domain_legacy_revmap(domain, hwirq_base);
>>> The above doesn't work.
>> Why it doesn't work ?
> Because irq_domain_legacy_revmap() only allocates a single interrupt
> number.
OK, you're right.
>>>> +	/* Check if mapping already exists */
>>>> +	for (i = 0; i < count; i++) {
>>>> +		virq = irq_find_mapping(domain, hwirq_base+i);
>>>> +		if (virq) {
>>>> +			pr_debug("existing mapping on virq %d,"
>>>> +					" now dispose it first\n", virq);
>>>> +			irq_dispose_mapping(virq);
>>> You might have just disposed of someone elses mapping, we shouldn't do
>>> that. It should be an error to the caller.
>> It's a good question. If the interrupt used for someone elses, why I
>> can apply it from the system?
> I agree, that would be a bug. But disposing of someone elses mapping is
> not OK.
>
>> So it may someone else forget to dispose mapping, and it never be
>> used for others as I have got the interrupt I think.
> Perhaps, but that is a bug that needs to be fixed in the code that
> forgets to dispose of the mapping.
>
> cheers
>
Michael Ellerman - March 6, 2013, 5:42 a.m.
On Wed, Mar 06, 2013 at 01:34:58PM +0800, Mike Qiu wrote:
> On 2013/3/6 11:54, Michael Ellerman wrote:
> >On Tue, Mar 05, 2013 at 03:19:57PM +0800, Mike Qiu wrote:
> >>On 2013/3/5 10:23, Michael Ellerman wrote:
> >>>On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
> >>>>diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> >>>>index 96f3a1d..38648e6 100644
> >>>>--- a/kernel/irq/irqdomain.c
> >>>>+++ b/kernel/irq/irqdomain.c
> >>>>@@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
> >>>>  }
> >>>>  EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
> >>>>+/**
> >>>>+ * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
> >>>>+ * @domain: domain owning the interrupt range
> >>>>+ * @hwirq_base: beginning of continuous hardware IRQ range
> >>>>+ * @count: Number of interrupts to map
> >>>For multiple-MSI the allocated interrupt numbers must be a power-of-2,
> >>>and must be naturally aligned. I don't /think/ that's a requirement for
> >>>the virtual numbers, but it's probably best that we do it anyway.
> >>>
> >>>So this API needs to specify that it will give you back a power-of-2
> >>>block that is naturally aligned - otherwise you can't use it for MSI.
> >>rtas_call will return the numbers of hardware interrupt, and it
> >>should be power-of-2, as this I think do not need to specify
> >You're confusing hardware interrupt numbers and virtual interrupt
> >numbers. My comment is about irq_create_mapping_many(), which returns
> >virtual interrupt numbers.
> >
> >As I said I don't think there is a requirement that the virtual
> >interrupt numbers are also a power-of-2 naturally aligned block, but we
> >should allocate them as one anyway, to avoid any issues in future.

> But for virtual interrupt numbersit should be a power-of-2 naturally
> aligned block, because it must be continuous, as the MSI-HOWTO.txt says:
> 
>     4.2.2 pci_enable_msi_block
>     int pci_enable_msi_block(struct pci_dev *dev, int count)
>     This variation on the above call allows a device driver to request
>     multiple MSIs.  The MSI specification only allows interrupts to be
>     allocated in powers of two, up to a maximum of 2^5 (32).
>     If this function returns 0, it has succeeded in allocating at least
>     as many interrupts as the driver requested
>     (it may have allocated more in order to satisfy the power-of-two
>     requirement). In this case, the function enables MSI on this device
>     and updates dev->irq to be the lowest of the new interrupts
>     assigned to it. The other interrupts assigned to the device are in
>     the range dev->irq to dev->irq + count - 1.
> 
> See the last line, that means for the virtual interrupts must be a
> continuous block.

In practice I think things could work if we didn't, because we are not
using the mask routines that assume that layout.

But you're right, we must implement the API as it's specified, so the
virtual interrupt numbers must be a naturally aligned power-of-2.
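
A minimal sketch of what such an allocation could look like, using a toy
userspace table instead of the kernel's descriptor allocator. The function
name, the table size and the choice to skip the block containing virq 0 are
assumptions for illustration.

```c
#include <stdbool.h>

#define TOY_NR_VIRQS 64

/*
 * used[i] is true when virtual IRQ i is taken. Finds a free, naturally
 * aligned block of count entries, marks it used and returns its base,
 * or returns -1 if count is not a power of two or no block is free.
 */
int toy_alloc_aligned(bool used[TOY_NR_VIRQS], unsigned int count)
{
	unsigned int base, i;

	if (count == 0 || (count & (count - 1)) != 0 || count > TOY_NR_VIRQS)
		return -1;

	/*
	 * Stepping by count keeps every candidate base a multiple of count.
	 * Starting at count skips the block that would contain virq 0.
	 */
	for (base = count; base + count <= TOY_NR_VIRQS; base += count) {
		for (i = 0; i < count; i++)
			if (used[base + i])
				break;
		if (i == count) {
			for (i = 0; i < count; i++)
				used[base + i] = true;
			return (int)base;
		}
	}

	return -1;
}
```

Only multiples of count are ever considered as a base, so any block this
returns is naturally aligned by construction.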

cheers
Mike Qiu - March 6, 2013, 7:02 a.m.
On 2013/3/6 13:42, Michael Ellerman wrote:
> On Wed, Mar 06, 2013 at 01:34:58PM +0800, Mike Qiu wrote:
>> On 2013/3/6 11:54, Michael Ellerman wrote:
>>> On Tue, Mar 05, 2013 at 03:19:57PM +0800, Mike Qiu wrote:
>>>> On 2013/3/5 10:23, Michael Ellerman wrote:
>>>>> On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
>>>>>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>>>>>> index 96f3a1d..38648e6 100644
>>>>>> --- a/kernel/irq/irqdomain.c
>>>>>> +++ b/kernel/irq/irqdomain.c
>>>>>> @@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
>>>>>>   }
>>>>>>   EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
>>>>>> +/**
>>>>>> + * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
>>>>>> + * @domain: domain owning the interrupt range
>>>>>> + * @hwirq_base: beginning of continuous hardware IRQ range
>>>>>> + * @count: Number of interrupts to map
>>>>> For multiple-MSI the allocated interrupt numbers must be a power-of-2,
>>>>> and must be naturally aligned. I don't /think/ that's a requirement for
>>>>> the virtual numbers, but it's probably best that we do it anyway.
>>>>>
>>>>> So this API needs to specify that it will give you back a power-of-2
>>>>> block that is naturally aligned - otherwise you can't use it for MSI.
>>>> rtas_call will return the numbers of hardware interrupt, and it
>>>> should be power-of-2, as this I think do not need to specify
>>> You're confusing hardware interrupt numbers and virtual interrupt
>>> numbers. My comment is about irq_create_mapping_many(), which returns
>>> virtual interrupt numbers.
>>>
>>> As I said I don't think there is a requirement that the virtual
>>> interrupt numbers are also a power-of-2 naturally aligned block, but we
>>> should allocate them as one anyway, to avoid any issues in future.
>> But for virtual interrupt numbersit should be a power-of-2 naturally
>> aligned block, because it must be continuous, as the MSI-HOWTO.txt says:
>>
>>      4.2.2 pci_enable_msi_block
>>      int pci_enable_msi_block(struct pci_dev *dev, int count)
>>      This variation on the above call allows a device driver to request
>>      multiple MSIs.  The MSI specification only allows interrupts to be
>>      allocated in powers of two, up to a maximum of 2^5 (32).
>>      If this function returns 0, it has succeeded in allocating at least
>>      as many interrupts as the driver requested
>>      (it may have allocated more in order to satisfy the power-of-two
>>      requirement). In this case, the function enables MSI on this device
>>      and updates dev->irq to be the lowest of the new interrupts
>>      assigned to it. The other interrupts assigned to the device are in
>>      the range dev->irq to dev->irq + count - 1.
>>
>> See the last line, that means for the virtual interrupts must be a
>> continuous block.
> In practice I think things could work if we didn't, because we are not
> using the mask routines that assume that layout.
>
> But you're right, we must implement the API as it's specified, so the
> virtual interrupt numbers must be a naturally aligned power-of-2.
Yes, your point is also right; it is just because the API requires a
naturally aligned power-of-2 block of interrupt numbers that we need to
implement it like this.

cheers
>
> cheers
>

Patch

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 60ef45b..e00a7ec 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -592,6 +592,8 @@  int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
 #define irq_alloc_desc_from(from, node)		\
 	irq_alloc_descs(-1, from, 1, node)
 
+#define irq_alloc_desc_n(nevc, node)		\
+	irq_alloc_descs(-1, 0, nevc, node)
 void irq_free_descs(unsigned int irq, unsigned int cnt);
 int irq_reserve_irqs(unsigned int from, unsigned int cnt);
 
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 0d5b17b..831dded 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -168,6 +168,9 @@  extern int irq_create_strict_mappings(struct irq_domain *domain,
 				      unsigned int irq_base,
 				      irq_hw_number_t hwirq_base, int count);
 
+extern int irq_create_mapping_many(struct irq_domain *domain,
+					irq_hw_number_t hwirq_base, int count);
+
 static inline int irq_create_identity_mapping(struct irq_domain *host,
 					      irq_hw_number_t hwirq)
 {
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 96f3a1d..38648e6 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -636,6 +636,67 @@  int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
 }
 EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
 
+/**
+ * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
+ * @domain: domain owning the interrupt range
+ * @hwirq_base: beginning of continuous hardware IRQ range
+ * @count: Number of interrupts to map
+ *
+ * This routine is used for allocating and mapping a range of hardware
+ * irqs to virtual IRQs where the virtual irq numbers are not at pre-defined
+ * locations.
+ *
+ * Greater than 0 is returned upon success, while any failure to establish a
+ * static mapping is treated as an error.
+ */
+int irq_create_mapping_many(struct irq_domain *domain,
+		irq_hw_number_t hwirq_base, int count)
+{
+	int ret, irq_base;
+	int virq, i;
+
+	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq_base);
+
+	/* Look for default domain if nececssary */
+	if (!domain)
+		domain = irq_default_domain;
+	if (!domain) {
+		pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
+			, hwirq_base);
+		WARN_ON(1);
+		return 0;
+	}
+	pr_debug("-> using domain @%p\n", domain);
+
+	/* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
+	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
+		return irq_domain_legacy_revmap(domain, hwirq_base);
+
+	/* Check if mapping already exists */
+	for (i = 0; i < count; i++) {
+		virq = irq_find_mapping(domain, hwirq_base+i);
+		if (virq) {
+			pr_debug("existing mapping on virq %d,"
+					" now dispose it first\n", virq);
+			irq_dispose_mapping(virq);
+		}
+	}
+
+	/* Allocate the continuous virtual interrupt numbers */
+	irq_base = irq_alloc_desc_n(count, of_node_to_nid(domain->of_node));
+	if (unlikely(irq_base < 0))
+		return  irq_base;
+
+	ret = irq_domain_associate_many(domain, irq_base, hwirq_base, count);
+	if (unlikely(ret < 0)) {
+		irq_free_descs(irq_base, count);
+		return ret;
+	}
+
+	return irq_base;
+}
+EXPORT_SYMBOL_GPL(irq_create_mapping_many);
+
 unsigned int irq_create_of_mapping(struct device_node *controller,
 				   const u32 *intspec, unsigned int intsize)
 {