Patchwork [RFC,v2,12/29] PCI/MSI: Introduce pcim_enable_msi*() family helpers

Submitter Alexander Gordeev
Date Oct. 18, 2013, 5:12 p.m.
Message ID <6bc575621ef70f72b206e4aa944acd32f1a75718.1382103786.git.agordeev@redhat.com>
Permalink /patch/284844/
State Superseded

Comments

Alexander Gordeev - Oct. 18, 2013, 5:12 p.m.
Currently many device drivers need to repeatedly call
pci_enable_msix() for MSI-X or pci_enable_msi_block() for MSI
in a loop until success or failure. This update generalizes
this usage pattern and introduces the pcim_enable_msi*() family
of helpers.

As a result, device drivers do not have to deal directly with the
tri-state return values from pci_enable_msix() and
pci_enable_msi_block(), and are expected to have clearer and more
straightforward code.

For example, the request loop described in the documentation...

	int foo_driver_enable_msix(struct foo_adapter *adapter,
				   int nvec)
	{
		int rc;

		while (nvec >= FOO_DRIVER_MINIMUM_NVEC) {
			rc = pci_enable_msix(adapter->pdev,
					     adapter->msix_entries,
					     nvec);
			if (rc > 0)
				nvec = rc;
			else
				return rc;
		}

		return -ENOSPC;
	}

...would turn into a single helper call...

	rc = pcim_enable_msix_range(adapter->pdev,
				    adapter->msix_entries,
				    nvec,
				    FOO_DRIVER_MINIMUM_NVEC);

Device drivers with more specific requirements (e.g. a number of
MSI-Xs which is a multiple of a certain number within a specified
range) would still need to implement the loop using the two old
functions.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Suggested-by: Ben Hutchings <bhutchings@solarflare.com>
---
 Documentation/PCI/MSI-HOWTO.txt |  134 +++++++++++++++++++++++++++++++++++++--
 drivers/pci/msi.c               |   46 +++++++++++++
 include/linux/pci.h             |   59 +++++++++++++++++
 3 files changed, 234 insertions(+), 5 deletions(-)
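
For the "more specific requirements" case mentioned in the commit
message, a driver-side loop might look roughly like the sketch below.
It is a userspace illustration only: mock_enable_msix() is a
hypothetical stand-in that mimics the tri-state return of
pci_enable_msix() (0 on success, a positive count when fewer vectors
are available, a negative errno on error), and enable_msix_multiple()
and its avail parameter are made-up names, not part of the patch.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for pci_enable_msix(): "avail" models how many
 * vectors the platform can still hand out. Returns 0 on success, a
 * positive count if fewer than nvec are available, negative on error.
 */
static int mock_enable_msix(unsigned int avail, unsigned int nvec)
{
	if (nvec == 0)
		return -EINVAL;
	if (avail == 0)
		return -ENOSPC;
	if (nvec > avail)
		return (int)avail;
	return 0;
}

/*
 * Request the largest multiple of "step" within [minvec, maxvec].
 * Returns the number of vectors enabled or a negative errno.
 */
static int enable_msix_multiple(unsigned int avail, unsigned int maxvec,
				unsigned int minvec, unsigned int step)
{
	unsigned int nvec;
	int rc;

	if (step == 0 || maxvec < minvec)
		return -ERANGE;

	/* Start from the largest multiple of step not above maxvec. */
	nvec = maxvec - maxvec % step;

	while (nvec > 0 && nvec >= minvec) {
		rc = mock_enable_msix(avail, nvec);
		if (rc < 0)
			return rc;
		if (rc == 0)
			return (int)nvec;

		/* A positive return is always smaller than the request. */
		assert((unsigned int)rc < nvec);

		/* Round the offered count down to a multiple and retry. */
		nvec = (unsigned int)rc - (unsigned int)rc % step;
	}

	return -ENOSPC;
}
```

With 10 vectors available, a request for up to 16 in steps of 4 settles
on 8; with only 3 available and a minimum of 4 it fails with -ENOSPC.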
Tejun Heo - Oct. 24, 2013, 10:51 a.m.
Hello, Alexander.

On Fri, Oct 18, 2013 at 07:12:12PM +0200, Alexander Gordeev wrote:
> So i.e. the request loop described in the documentation...
> 
> 	int foo_driver_enable_msix(struct foo_adapter *adapter,
> 				   int nvec)
> 	{
> 		while (nvec >= FOO_DRIVER_MINIMUM_NVEC) {
> 			rc = pci_enable_msix(adapter->pdev,
> 					     adapter->msix_entries,
> 					     nvec);
> 			if (rc > 0)
> 				nvec = rc;
> 			else
> 				return rc;
> 		}
> 
> 		return -ENOSPC;
> 	}
> 
> ...would turn into a single helper call....
> 
> 	rc = pcim_enable_msix_range(adapter->pdev,
> 				    adapter->msix_entries,
> 				    nvec,
> 				    FOO_DRIVER_MINIMUM_NVEC);

I haven't looked into any details but, if the above works for most use
cases, it looks really good to me.

Thanks!
David Laight - Oct. 24, 2013, 10:57 a.m.
> > ...would turn into a single helper call....
> >
> > 	rc = pcim_enable_msix_range(adapter->pdev,
> > 				    adapter->msix_entries,
> > 				    nvec,
> > 				    FOO_DRIVER_MINIMUM_NVEC);
> 
> I haven't looked into any details but, if the above works for most use
> cases, it looks really good to me.

The one case it doesn't work is where the driver either
wants the full number or the minimum number - but not
a value in between.

Might be worth adding an extra parameter so that this
(and maybe other) odd requirements can be met.

Some static inline functions could be used for the common cases.

	David



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - Oct. 24, 2013, 11:01 a.m.
Hello,

On Thu, Oct 24, 2013 at 11:57:40AM +0100, David Laight wrote:
> > > ...would turn into a single helper call....
> > >
> > > 	rc = pcim_enable_msix_range(adapter->pdev,
> > > 				    adapter->msix_entries,
> > > 				    nvec,
> > > 				    FOO_DRIVER_MINIMUM_NVEC);
> > 
> > I haven't looked into any details but, if the above works for most use
> > cases, it looks really good to me.
> 
> The one case it doesn't work is where the driver either
> wants the full number or the minimum number - but not
> a value in between.
> 
> Might be worth adding an extra parameter so that this
> (and maybe other) odd requirements can be met.
> 
> Some static inline functions could be used for the common cases.

If those are edge cases, I don't think it's a big deal no matter what
we do.

Thanks.
Alexander Gordeev - Oct. 24, 2013, 11:41 a.m.
On Thu, Oct 24, 2013 at 11:57:40AM +0100, David Laight wrote:
> The one case it doesn't work is where the driver either
> wants the full number or the minimum number - but not
> a value in between.
> 
> Might be worth adding an extra parameter so that this
> (and maybe other) odd requirements can be met.

IMHO it's not worth it, since it is not possible to generalize
all the odd requirements out there. I do not think we should blow
up the API for this case.

Having said that, the min-or-max interface is probably the only one
worth considering. But again, I would prefer to put its semantics
into the function name rather than into extra parameters, e.g.

pcim_enable_msix_min_max(struct pci_dev *dev, struct msix_entry *entries,
			 unsigned int minvec, unsigned int maxvec);
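
The min-or-max semantics could be modelled in userspace roughly as
follows. This is only a sketch of the idea, not an implementation of
the prototype above: mock_enable_msix() is a hypothetical stand-in for
pci_enable_msix() with the same tri-state return convention, and
enable_msix_min_or_max() and its avail parameter are illustrative
names.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for pci_enable_msix(): returns 0 on success,
 * a positive count if fewer than nvec vectors are available, or a
 * negative errno on error.
 */
static int mock_enable_msix(unsigned int avail, unsigned int nvec)
{
	if (nvec == 0)
		return -EINVAL;
	if (avail == 0)
		return -ENOSPC;
	if (nvec > avail)
		return (int)avail;
	return 0;
}

/*
 * Enable either exactly maxvec or exactly minvec vectors, never a
 * value in between. Returns the count enabled or a negative errno.
 */
static int enable_msix_min_or_max(unsigned int avail,
				  unsigned int minvec, unsigned int maxvec)
{
	int rc;

	if (minvec == 0 || maxvec < minvec)
		return -ERANGE;

	/* First try the full number. */
	rc = mock_enable_msix(avail, maxvec);
	if (rc < 0)
		return rc;
	if (rc == 0)
		return (int)maxvec;

	/* Fewer than maxvec are on offer. */
	assert((unsigned int)rc < maxvec);
	if ((unsigned int)rc < minvec)
		return -ENOSPC;

	/* Fall back to the minimum, skipping everything in between. */
	rc = mock_enable_msix(avail, minvec);
	if (rc < 0)
		return rc;
	return rc == 0 ? (int)minvec : -ENOSPC;
}
```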

> Some static inline functions could be used for the common cases.
> 
> 	David
Alexander Gordeev - Oct. 24, 2013, 2:31 p.m.
On Thu, Oct 24, 2013 at 11:51:58AM +0100, Tejun Heo wrote:
> I haven't looked into any details but, if the above works for most use
> cases, it looks really good to me.

Well, if we reuse Michael's statistics:

 - 58 drivers call pci_enable_msix()
 - 24 try a single allocation and then fall back to MSI/LSI
 - 19 use the loop style allocation
 - 14 try an allocation, and if it fails retry once

...then I expect most of 19/58 (loop style) could be converted to
pcim_enable_msix() and pcim_enable_msix_range() and all of 14/58
(single fallback) should be converted to pcim_enable_msix() users.

> tejun
Mark Lord - Oct. 25, 2013, 2:22 a.m.
On 13-10-24 07:41 AM, Alexander Gordeev wrote:
> On Thu, Oct 24, 2013 at 11:57:40AM +0100, David Laight wrote:
>> The one case it doesn't work is where the driver either
>> wants the full number or the minimum number - but not
>> a value in between.
>>
>> Might be worth adding an extra parameter so that this
>> (and maybe other) odd requirements can be met.
> 
> IMHO its not worth it, since it is not possible to generalize
> all odd requirements out there. I do not think we should blow
> the API in this case.
> 
> Having said that, the min-or-max interface is probably the only
> worth considering. But again, I would prefer to put its semantics
> to function name rather than to extra parameters, i.e.
> 
> pcim_enable_msix_min_max(struct pci_dev *dev, struct msix_entry *entries,
> 			 unsigned int minvec, unsigned int maxvec);

The hardware I have in mind here works only for powers of two.
Eg. 16, 8, 4, 2, or 1 MSI-X vector.  Not the odd values in between.

But it appears I can still just use a loop for that case,
calling the new function above instead of the old functions.

Cheers
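
A loop of that kind for power-of-two hardware might be sketched as
below. This is a userspace model under stated assumptions:
mock_enable_msix() is a hypothetical stand-in for the tri-state
pci_enable_msix(), enable_msix_exact() only imitates the "exactly nvec
or fail" behaviour proposed in the patch, and enable_msix_pow2() is an
illustrative name.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for pci_enable_msix(): returns 0 on success,
 * a positive count if fewer than nvec vectors are available, or a
 * negative errno on error.
 */
static int mock_enable_msix(unsigned int avail, unsigned int nvec)
{
	if (nvec == 0)
		return -EINVAL;
	if (avail == 0)
		return -ENOSPC;
	if (nvec > avail)
		return (int)avail;
	return 0;
}

/* Imitates an "exact" helper: nvec on success, negative otherwise. */
static int enable_msix_exact(unsigned int avail, unsigned int nvec)
{
	int rc = mock_enable_msix(avail, nvec);

	if (rc < 0)
		return rc;
	if (rc == 0)
		return (int)nvec;

	/* A positive return means fewer vectors than requested. */
	assert((unsigned int)rc < nvec);
	return -ENOSPC;
}

/* Try 16, 8, 4, ... (starting at the largest power of two <= maxvec). */
static int enable_msix_pow2(unsigned int avail, unsigned int maxvec)
{
	unsigned int nvec = 1;
	int rc;

	if (maxvec == 0)
		return -ERANGE;

	while (nvec <= maxvec / 2)
		nvec *= 2;

	for (; nvec >= 1; nvec /= 2) {
		rc = enable_msix_exact(avail, nvec);
		if (rc != -ENOSPC)
			return rc;	/* success or a hard error */
	}

	return -ENOSPC;
}
```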

Mark Lord - Oct. 25, 2013, 2:23 a.m.
On 13-10-24 10:31 AM, Alexander Gordeev wrote:
> On Thu, Oct 24, 2013 at 11:51:58AM +0100, Tejun Heo wrote:
>> I haven't looked into any details but, if the above works for most use
>> cases, it looks really good to me.
> 
> Well, if we reuse Michael's statistics:
> 
>  - 58 drivers call pci_enable_msix()
>  - 24 try a single allocation and then fallback to MSI/LSI
>  - 19 use the loop style allocation
>  - 14 try an allocation, and if it fails retry once
> 
> ...then I expect most of 19/58 (loop style) could be converted to
> pcim_enable_msix() and pcim_enable_msix_range() and all of 14/58
> (single fallback) should be converted to pcim_enable_msix() users.

Those are just the in-kernel drivers.
There are many, many more out-of-kernel drivers for embedded platforms,
hardware in-development, etc..

Let's not be overly presumptive about the size of the user base.

Cheers
David Laight - Oct. 25, 2013, 9:10 a.m.
> > pcim_enable_msix_min_max(struct pci_dev *dev, struct msix_entry *entries,
> > 			 unsigned int minvec, unsigned int maxvec);
> 
> The hardware I have in mind here works only for powers of two.
> Eg. 16, 8, 4, 2, or 1 MSI-X vector.  Not the odd values in between.
> 
> But it appears I can still just use a loop for that case,
> calling the new function above instead of the old functions.


You'd either have to loop with min and max the same (starting at 16),
or do a single call with min=1 and max=16 and then release the
unwanted ones.

The latter might be preferred because it might stop an annoying
trace about the system not having enough MSI interrupts.

What this doesn't resolve is a driver requesting a lot of interrupts
early on and leaving none for later drivers.

Really the system needs to allocate the minimum number to all
drivers before giving out any extra ones - I've NFI how this
would be arranged!

	David
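
For the single-call variant, the bookkeeping after the call could be
sketched as follows. The code only works out how many of the granted
vectors a power-of-two device would keep and how many are surplus; how
the surplus is actually released is not shown, and struct vec_plan,
round_down_pow2() and plan_pow2() are illustrative names.

```c
#include <assert.h>

/* Largest power of two that does not exceed n (n must be non-zero). */
static unsigned int round_down_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n > 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/* How many vectors to keep and how many are left over. */
struct vec_plan {
	unsigned int keep;
	unsigned int surplus;
};

/* Split a granted vector count into a usable part and a surplus. */
static struct vec_plan plan_pow2(unsigned int granted)
{
	struct vec_plan plan;

	plan.keep = round_down_pow2(granted);
	plan.surplus = granted - plan.keep;
	return plan;
}
```

If a range call with min=1 and max=16 granted 11 vectors, the device
would keep 8 and have 3 left over.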
Alexander Gordeev - Oct. 25, 2013, 10:01 a.m.
On Fri, Oct 25, 2013 at 10:10:02AM +0100, David Laight wrote:
> What this doesn't resolve is a driver requesting a lot of interrupts
> early on and leaving none for later drivers.

Does this problem really exist anywhere besides pSeries?

I can imagine x86 running out of vectors in the interrupt table when
the number of CPUs reaches the hundreds, but do we have this problem now?

> Really the system needs to allocate the minimum number to all
> drivers before giving out any extra ones - I've NFI how this
> would be arranged!

I do not know. The pSeries quota approach seems more reasonable to me.

> 	David
Karicheri, Muralidharan - Oct. 25, 2013, 2:52 p.m.
All,

I am a first timer on this mailing list. I tried sending email to the list after subscribing successfully. I had sent an email before, but I can't see it in the mailing list archive as seen through the web browser, so I am re-sending. My apologies if multiple copies appear on the list.

I am working on an SMP kernel on an ARM A15 with the RT patch applied against v3.10.10. I have a setup that uses the e1000e PCI driver with the PCIe root complex driver that I have on this platform. It works well with the non-RT version of the kernel: I am able to run iperf with UDP and TCP traffic. But with the RT kernel, I am only able to ping. If I run iperf, a deadlock seems to happen. When I suspended the core, it seemed to be in the idle loop. Sometimes I am able to press ctrl-c and get back to the shell. When I try ping again, it times out. If anyone has seen this issue and knows how to fix it, please reply.

Thanks.

Murali Karicheri
Michael Ellerman - Oct. 27, 2013, 10:27 p.m.
On Fri, 2013-10-25 at 12:01 +0200, Alexander Gordeev wrote:
> On Fri, Oct 25, 2013 at 10:10:02AM +0100, David Laight wrote:
> > What this doesn't resolve is a driver requesting a lot of interrupts
> > early on and leaving none for later drivers.
> 
> If this problem really exists anywhere besides pSeries?
> 
> I can imagine x86 hitting lack of vectors in interrupt table when
> number of CPUs exceeds hundreds, but do we have this problem now?
> 
> > Really the system needs to allocate the minimum number to all
> > drivers before giving out any extra ones - I've NFI how this
> > would be arranged!
> 
> Do not know. The pSeries quota approach seems more reasonable to me.

When the system boots, each driver should get a fair share of the
available MSIs; the quota achieves this.

But ideally the sysadmin would then be able to override that and give
more MSIs to one device; the quota doesn't allow that.

Hopefully we'll see the number of available MSIs grow faster than the
number required by devices (usually driven by NR_CPUs), and so this will
become a non-problem.

cheers
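
The "minimum for everyone first, extras afterwards" idea raised earlier
in the thread could be modelled as a two-pass distribution. The sketch
below is purely illustrative and is not how the pSeries quota is
implemented: struct msi_req, distribute_msis() and the round-robin
policy for the leftover vectors are all assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* One device's request: it needs minvec and can use up to maxvec. */
struct msi_req {
	unsigned int minvec;
	unsigned int maxvec;
	unsigned int granted;
};

/*
 * Pass 1 gives every device its minimum; pass 2 hands out what is
 * left one vector at a time, round-robin, up to each maximum.
 * Returns 0 on success or a negative errno.
 */
static int distribute_msis(struct msi_req *req, size_t n, unsigned int total)
{
	unsigned int need = 0;
	int progress;
	size_t i;

	assert(req != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (req[i].maxvec < req[i].minvec)
			return -ERANGE;
		need += req[i].minvec;
	}
	if (need > total)
		return -ENOSPC;	/* not even the minimums fit */

	for (i = 0; i < n; i++)
		req[i].granted = req[i].minvec;
	total -= need;

	do {
		progress = 0;
		for (i = 0; i < n && total > 0; i++) {
			if (req[i].granted < req[i].maxvec) {
				req[i].granted++;
				total--;
				progress = 1;
			}
		}
	} while (progress && total > 0);

	return 0;
}
```

A sysadmin override of the kind mentioned above would amount to
changing minvec or maxvec for one device before the distribution runs.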

Mark Lord - Oct. 28, 2013, 4:30 p.m.
On 13-10-25 06:01 AM, Alexander Gordeev wrote:
>
> If this problem really exists anywhere besides pSeries?
> 
> I can imagine x86 hitting lack of vectors in interrupt table when
> number of CPUs exceeds hundreds, but do we have this problem now?

An awful lot of x86 hardware has a 256 (255?) vector limit for MSI/MSI-X.
Couple that with PCIe Virtual Functions, each wanting 16 vectors (for example),
and that limit is really simple to exceed today.

But this is more a problem for a sysadmin, and I am happy with the current
and the proposed methods.

Cheers
Alexander Gordeev - Nov. 4, 2013, 8:12 a.m.
On Thu, Oct 24, 2013 at 11:51:58AM +0100, Tejun Heo wrote:

Hi Tejun,

> I haven't looked into any details but, if the above works for most use
> cases, it looks really good to me.

Could you please take a look (at least) at patches 10 and 12?

> tejun
Tejun Heo - Nov. 20, 2013, 5:15 p.m.
Hello,

On Fri, Oct 18, 2013 at 07:12:12PM +0200, Alexander Gordeev wrote:
> Currently many device drivers need contiguously call functions
> pci_enable_msix() for MSI-X or pci_enable_msi_block() for MSI
> in a loop until success or failure. This update generalizes
> this usage pattern and introduces pcim_enable_msi*() family
> helpers.
> 
> As result, device drivers do not have to deal with tri-state
> return values from pci_enable_msix() and pci_enable_msi_block()
> functions directly and expected to have more clearer and straight
> code.
> 
> So i.e. the request loop described in the documentation...
> 
> 	int foo_driver_enable_msix(struct foo_adapter *adapter,
> 				   int nvec)
> 	{
> 		while (nvec >= FOO_DRIVER_MINIMUM_NVEC) {
> 			rc = pci_enable_msix(adapter->pdev,
> 					     adapter->msix_entries,
> 					     nvec);
> 			if (rc > 0)
> 				nvec = rc;
> 			else
> 				return rc;
> 		}
> 
> 		return -ENOSPC;
> 	}
> 
> ...would turn into a single helper call....
> 
> 	rc = pcim_enable_msix_range(adapter->pdev,
> 				    adapter->msix_entries,
> 				    nvec,
> 				    FOO_DRIVER_MINIMUM_NVEC);
> 
> Device drivers with more specific requirements (i.e. a number of
> MSI-Xs which is a multiple of a certain number within a specified
> range) would still need to implement the loop using the two old
> functions.
> 
> Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
> Suggested-by: Ben Hutchings <bhutchings@solarflare.com>

The use of @nvec and @maxvec is a bit inconsistent.  Maybe it'd be
better to make them uniform?  Also, can you please add function
comments to the new public functions?  People are much more likely to
check them than the documentation.  Other than that,

Reviewed-by: Tejun Heo <tj@kernel.org>

Thanks a lot!
Alexander Gordeev - Nov. 22, 2013, 6:44 p.m.
On Wed, Nov 20, 2013 at 12:15:26PM -0500, Tejun Heo wrote:
> The use of @nvec and @maxvec is a bit inconsistent.  Maybe it'd be
> better to make them uniform?

With @maxvec I tried to stress the implication that values less than
@maxvec are possible, while @nvec is more like an exact number.
That makes perfect sense to me, but this is personal :)

> tejun
Tejun Heo - Nov. 22, 2013, 6:44 p.m.
On Fri, Nov 22, 2013 at 07:44:30PM +0100, Alexander Gordeev wrote:
> On Wed, Nov 20, 2013 at 12:15:26PM -0500, Tejun Heo wrote:
> > The use of @nvec and @maxvec is a bit inconsistent.  Maybe it'd be
> > better to make them uniform?
> 
> With @maxvec I tried to stress an implication there could be values
> less than @maxvec. While @nvec is more like an exact number.
> Perfectly makes sense to me, but this is personal :)

Oh yeah, I agree but saw a place where @nvec is used for max.  Maybe I
was confused.  Looking again...

+int pcim_enable_msi_range(struct pci_dev *dev, struct msix_entry *entries,
+                         unsigned int nvec, unsigned int minvec)
+
+This variation on pci_enable_msi_block() call allows a device driver to
+request any number of MSIs within specified range minvec to nvec. Whenever
+possible device drivers are encouraged to use this function rather than
+explicit request loop calling pci_enable_msi_block().

e.g. shouldn't that @nvec be @maxvec?

Thanks.
Alexander Gordeev - Nov. 22, 2013, 6:54 p.m.
On Fri, Nov 22, 2013 at 01:44:47PM -0500, Tejun Heo wrote:
> +int pcim_enable_msi_range(struct pci_dev *dev, struct msix_entry *entries,
> +                         unsigned int nvec, unsigned int minvec)
> +
> +This variation on pci_enable_msi_block() call allows a device driver to
> +request any number of MSIs within specified range minvec to nvec. Whenever
> +possible device drivers are encouraged to use this function rather than
> +explicit request loop calling pci_enable_msi_block().
> 
> e.g. shouldn't that @nvec be @maxvec?

Right, it should. Maybe even in a different order.

> tejun

Patch

diff --git a/Documentation/PCI/MSI-HOWTO.txt b/Documentation/PCI/MSI-HOWTO.txt
index fdf3ae3..f348b6f 100644
--- a/Documentation/PCI/MSI-HOWTO.txt
+++ b/Documentation/PCI/MSI-HOWTO.txt
@@ -127,7 +127,62 @@  on the number of vectors that can be allocated; pci_enable_msi_block()
 returns as soon as it finds any constraint that doesn't allow the
 call to succeed.
 
-4.2.3 pci_disable_msi
+4.2.3 pcim_enable_msi_range
+
+int pcim_enable_msi_range(struct pci_dev *dev, struct msix_entry *entries,
+			  unsigned int nvec, unsigned int minvec)
+
+This variation on pci_enable_msi_block() call allows a device driver to
+request any number of MSIs within specified range minvec to nvec. Whenever
+possible device drivers are encouraged to use this function rather than
+explicit request loop calling pci_enable_msi_block().
+
+If this function returns a negative number, it indicates an error and
+the driver should not attempt to request any more MSI interrupts for
+this device.
+
+If this function returns a positive number it indicates at least the
+returned number of MSI interrupts have been successfully allocated (it may
+have allocated more in order to satisfy the power-of-two requirement).
+Device drivers can use this number to further initialize devices.
+
+4.2.4 pcim_enable_msi
+
+int pcim_enable_msi(struct pci_dev *dev,
+		    struct msix_entry *entries, unsigned int maxvec)
+
+This variation on pci_enable_msi_block() call allows a device driver to
+request any number of MSIs up to maxvec. Whenever possible device drivers
+are encouraged to use this function rather than explicit request loop
+calling pci_enable_msi_block().
+
+If this function returns a negative number, it indicates an error and
+the driver should not attempt to request any more MSI interrupts for
+this device.
+
+If this function returns a positive number it indicates at least the
+returned number of MSI interrupts have been successfully allocated (it may
+have allocated more in order to satisfy the power-of-two requirement).
+Device drivers can use this number to further initialize devices.
+
+4.2.5 pcim_enable_msi_exact
+
+int pcim_enable_msi_exact(struct pci_dev *dev,
+			  struct msix_entry *entries, unsigned int nvec)
+
+This variation on pci_enable_msi_block() call allows a device driver to
+request exactly nvec MSIs.
+
+If this function returns a negative number, it indicates an error and
+the driver should not attempt to request any more MSI interrupts for
+this device.
+
+If this function returns the value of nvec it indicates MSI interrupts
+have been successfully allocated. No other value in case of success could
+be returned. Device drivers can use this value to further allocate and
+initialize device resources.
+
+4.2.6 pci_disable_msi
 
 void pci_disable_msi(struct pci_dev *dev)
 
@@ -143,7 +198,7 @@  on any interrupt for which it previously called request_irq().
 Failure to do so results in a BUG_ON(), leaving the device with
 MSI enabled and thus leaking its vector.
 
-4.2.4 pci_get_msi_cap
+4.2.7 pci_get_msi_cap
 
 int pci_get_msi_cap(struct pci_dev *dev)
 
@@ -224,7 +279,76 @@  static int foo_driver_enable_msix(struct foo_adapter *adapter, int nvec)
 	return -ENOSPC;
 }
 
-4.3.2 pci_disable_msix
+4.3.2 pcim_enable_msix_range
+
+int pcim_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
+			   unsigned int nvec, unsigned int minvec)
+
+This variation on pci_enable_msix() call allows a device driver to request
+any number of MSI-Xs within specified range minvec to nvec. Whenever possible
+device drivers are encouraged to use this function rather than explicit
+request loop calling pci_enable_msix().
+
+If this function returns a negative number, it indicates an error and
+the driver should not attempt to allocate any more MSI-X interrupts for
+this device.
+
+If this function returns a positive number it indicates the number of
+MSI-X interrupts that have been successfully allocated. Device drivers
+can use this number to further allocate and initialize device resources.
+
+A modified function calling pci_enable_msix() in a loop might look like:
+
+static int foo_driver_enable_msix(struct foo_adapter *adapter, int nvec)
+{
+	rc = pcim_enable_msix_range(adapter->pdev, adapter->msix_entries,
+				    nvec, FOO_DRIVER_MINIMUM_NVEC);
+	if (rc < 0)
+		return rc;
+
+	rc = foo_driver_init_other(adapter, rc);
+	if (rc < 0)
+		pci_disable_msix(adapter->pdev);
+
+	return rc;
+}
+
+4.3.3 pcim_enable_msix
+
+int pcim_enable_msix(struct pci_dev *dev,
+		     struct msix_entry *entries, unsigned int maxvec)
+
+This variation on pci_enable_msix() call allows a device driver to request
+any number of MSI-Xs up to maxvec. Whenever possible device drivers are
+encouraged to use this function rather than explicit request loop calling
+pci_enable_msix().
+
+If this function returns a negative number, it indicates an error and
+the driver should not attempt to allocate any more MSI-X interrupts for
+this device.
+
+If this function returns a positive number it indicates the number of
+MSI-X interrupts that have been successfully allocated. Device drivers
+can use this number to further allocate and initialize device resources.
+
+4.3.4 pcim_enable_msix_exact
+
+int pcim_enable_msix_exact(struct pci_dev *dev,
+			   struct msix_entry *entries, unsigned int nvec)
+
+This variation on pci_enable_msix() call allows a device driver to request
+exactly nvec MSI-Xs.
+
+If this function returns a negative number, it indicates an error and
+the driver should not attempt to allocate any more MSI-X interrupts for
+this device.
+
+If this function returns the value of nvec it indicates MSI-X interrupts
+have been successfully allocated. No other value in case of success could
+be returned. Device drivers can use this value to further allocate and
+initialize device resources.
+
+4.3.5 pci_disable_msix
 
 void pci_disable_msix(struct pci_dev *dev)
 
@@ -238,14 +362,14 @@  on any interrupt for which it previously called request_irq().
 Failure to do so results in a BUG_ON(), leaving the device with
 MSI-X enabled and thus leaking its vector.
 
-4.3.3 The MSI-X Table
+4.3.6 The MSI-X Table
 
 The MSI-X capability specifies a BAR and offset within that BAR for the
 MSI-X Table.  This address is mapped by the PCI subsystem, and should not
 be accessed directly by the device driver.  If the driver wishes to
 mask or unmask an interrupt, it should call disable_irq() / enable_irq().
 
-4.3.4 pci_msix_table_size
+4.3.7 pci_msix_table_size
 
 int pci_msix_table_size(struct pci_dev *dev)
 
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 96f51d0..91acd8a 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1042,3 +1042,49 @@  void pci_msi_init_pci_dev(struct pci_dev *dev)
 	if (dev->msix_cap)
 		msix_set_enable(dev, 0);
 }
+
+int pcim_enable_msi_range(struct pci_dev *dev,
+			  unsigned int nvec, unsigned int minvec)
+{
+	int rc;
+
+	if (nvec < minvec)
+		return -ERANGE;
+
+	do {
+		rc = pci_enable_msi_block(dev, nvec);
+		if (rc < 0) {
+			return rc;
+		} else if (rc > 0) {
+			if (rc < minvec)
+				return -ENOSPC;
+			nvec = rc;
+		}
+	} while (rc);
+
+	return nvec;
+}
+EXPORT_SYMBOL(pcim_enable_msi_range);
+
+int pcim_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
+			   unsigned int nvec, unsigned int minvec)
+{
+	int rc;
+
+	if (nvec < minvec)
+		return -ERANGE;
+
+	do {
+		rc = pci_enable_msix(dev, entries, nvec);
+		if (rc < 0) {
+			return rc;
+		} else if (rc > 0) {
+			if (rc < minvec)
+				return -ENOSPC;
+			nvec = rc;
+		}
+	} while (rc);
+
+	return nvec;
+}
+EXPORT_SYMBOL(pcim_enable_msix_range);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index bef5775..3c18a8f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1185,6 +1185,39 @@  static inline int pci_msi_enabled(void)
 {
 	return 0;
 }
+
+static inline int pcim_enable_msi_range(struct pci_dev *dev,
+			  unsigned int nvec, unsigned int minvec)
+{
+	return -ENOSYS;
+}
+static inline int pcim_enable_msi(struct pci_dev *dev, unsigned int maxvec)
+{
+	return -ENOSYS;
+}
+static inline int pcim_enable_msi_exact(struct pci_dev *dev, unsigned int nvec)
+{
+	return -ENOSYS;
+}
+
+static inline int
+pcim_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
+		       unsigned int nvec, unsigned int minvec)
+{
+	return -ENOSYS;
+}
+static inline int
+pcim_enable_msix(struct pci_dev *dev,
+		 struct msix_entry *entries, unsigned int maxvec)
+{
+	return -ENOSYS;
+}
+static inline int
+pcim_enable_msix_exact(struct pci_dev *dev,
+		       struct msix_entry *entries, unsigned int nvec)
+{
+	return -ENOSYS;
+}
 #else
 int pci_get_msi_cap(struct pci_dev *dev);
 int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec);
@@ -1198,6 +1231,32 @@  void pci_disable_msix(struct pci_dev *dev);
 void msi_remove_pci_irq_vectors(struct pci_dev *dev);
 void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
+
+int pcim_enable_msi_range(struct pci_dev *dev,
+			  unsigned int nvec, unsigned int minvec);
+static inline int pcim_enable_msi(struct pci_dev *dev, unsigned int maxvec)
+{
+	return pcim_enable_msi_range(dev, maxvec, 1);
+}
+static inline int pcim_enable_msi_exact(struct pci_dev *dev, unsigned int nvec)
+{
+	return pcim_enable_msi_range(dev, nvec, nvec);
+}
+
+int pcim_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
+			   unsigned int nvec, unsigned int minvec);
+static inline int
+pcim_enable_msix(struct pci_dev *dev,
+		 struct msix_entry *entries, unsigned int maxvec)
+{
+	return pcim_enable_msix_range(dev, entries, maxvec, 1);
+}
+static inline int
+pcim_enable_msix_exact(struct pci_dev *dev,
+		       struct msix_entry *entries, unsigned int nvec)
+{
+	return pcim_enable_msix_range(dev, entries, nvec, nvec);
+}
 #endif
 
 #ifdef CONFIG_PCIEPORTBUS