Patchwork [2/2] PCI/MSI: Phase out pci_enable_msi_block()

login
register
mail settings
Submitter Alexander Gordeev
Date April 14, 2014, 1:28 p.m.
Message ID <20140414132834.GA9164@dhcp-26-207.brq.redhat.com>
Download mbox | patch
Permalink /patch/338952/
State Accepted
Headers show

Comments

Alexander Gordeev - April 14, 2014, 1:28 p.m.
There are no users of pci_enable_msi_block() function have
left. Obsolete it in favor of pci_enable_msi_range() and
pci_enable_msi_exact() functions.

Up until now, when enabling MSI mode for a device a single
successful call to arch_msi_check_device() was followed by
a single call to arch_setup_msi_irqs() function.

Yet, if arch_msi_check_device() returned success we should be
able to call arch_setup_msi_irqs() multiple times - while it
returns a number of MSI vectors that could have been allocated
(a third state).

This update makes use of the assumption described above. It
could have broke things had the architectures done any pre-
allocations or switch to some state in a call to function
arch_msi_check_device(). But because arch_msi_check_device()
is expected stateless and MSI resources are allocated in a
follow-up call to arch_setup_msi_irqs() we should be fine.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/msi.c   |   79 +++++++++++++++++++++-----------------------------
 include/linux/pci.h |    5 +--
 2 files changed, 34 insertions(+), 50 deletions(-)
Alexander Gordeev - April 30, 2014, 9:19 p.m.
On Mon, Apr 14, 2014 at 03:28:34PM +0200, Alexander Gordeev wrote:
> There are no users of pci_enable_msi_block() function have
> left. Obsolete it in favor of pci_enable_msi_range() and
> pci_enable_msi_exact() functions.

Hi Bjorn,

How about this one?

Thanks!
Bjorn Helgaas - April 30, 2014, 11:49 p.m.
On Mon, Apr 14, 2014 at 03:28:35PM +0200, Alexander Gordeev wrote:
> There are no users of pci_enable_msi_block() function have
> left. Obsolete it in favor of pci_enable_msi_range() and
> pci_enable_msi_exact() functions.

I mistakenly assumed this would have to wait because I thought there were
other pci_enable_msi_block() users that wouldn't be removed until the v3.16
merge window.  But I think I was wrong: I put your GenWQE patch in my tree,
and I think that was the last use, so I can just add this patch on top.

But I need a little help understanding the changelog:

> Up until now, when enabling MSI mode for a device a single
> successful call to arch_msi_check_device() was followed by
> a single call to arch_setup_msi_irqs() function.

I understand this part; the following two paths call
arch_msi_check_device() once and then arch_setup_msi_irqs() once:

  pci_enable_msi_block
    pci_msi_check_device
      arch_msi_check_device
    msi_capability_init
      arch_setup_msi_irqs

  pci_enable_msix
    pci_msi_check_device
      arch_msi_check_device
    msix_capability_init
      arch_setup_msi_irqs

> Yet, if arch_msi_check_device() returned success we should be
> able to call arch_setup_msi_irqs() multiple times - while it
> returns a number of MSI vectors that could have been allocated
> (a third state).

I don't know what you mean by "a third state."

Previously we only called arch_msi_check_device() once.  After your patch,
pci_enable_msi_range() can call it several times.  The only non-trivial
implementation of arch_msi_check_device() is in powerpc, and all the
ppc_md.msi_check_device() possibilities look safe to call multiple times.

After your patch, the pci_enable_msi_range() path can also call
arch_setup_msi_irqs() several times.  I don't see a problem with that --
even if the first call succeeds and allocates something, then a subsequent
call fails, I assume the allocations will be cleaned up when
msi_capability_init() calls free_msi_irqs().

> This update makes use of the assumption described above. It
> could have broke things had the architectures done any pre-
> allocations or switch to some state in a call to function
> arch_msi_check_device(). But because arch_msi_check_device()
> is expected stateless and MSI resources are allocated in a
> follow-up call to arch_setup_msi_irqs() we should be fine.

I guess you mean that your patch assumes arch_msi_check_device() is
stateless.  That looks like a safe assumption to me.

arch_setup_msi_irqs() is clearly not stateless, but I assume
free_msi_irqs() is enough to clean up any state if we fail.

Bjorn

> Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
> Cc: linux-mips@linux-mips.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-s390@vger.kernel.org
> Cc: x86@kernel.org
> Cc: linux-pci@vger.kernel.org
> ---
>  drivers/pci/msi.c   |   79 +++++++++++++++++++++-----------------------------
>  include/linux/pci.h |    5 +--
>  2 files changed, 34 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 955ab79..fc669ab 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -883,50 +883,6 @@ int pci_msi_vec_count(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL(pci_msi_vec_count);
>  
> -/**
> - * pci_enable_msi_block - configure device's MSI capability structure
> - * @dev: device to configure
> - * @nvec: number of interrupts to configure
> - *
> - * Allocate IRQs for a device with the MSI capability.
> - * This function returns a negative errno if an error occurs.  If it
> - * is unable to allocate the number of interrupts requested, it returns
> - * the number of interrupts it might be able to allocate.  If it successfully
> - * allocates at least the number of interrupts requested, it returns 0 and
> - * updates the @dev's irq member to the lowest new interrupt number; the
> - * other interrupt numbers allocated to this device are consecutive.
> - */
> -int pci_enable_msi_block(struct pci_dev *dev, int nvec)
> -{
> -	int status, maxvec;
> -
> -	if (dev->current_state != PCI_D0)
> -		return -EINVAL;
> -
> -	maxvec = pci_msi_vec_count(dev);
> -	if (maxvec < 0)
> -		return maxvec;
> -	if (nvec > maxvec)
> -		return maxvec;
> -
> -	status = pci_msi_check_device(dev, nvec, PCI_CAP_ID_MSI);
> -	if (status)
> -		return status;
> -
> -	WARN_ON(!!dev->msi_enabled);
> -
> -	/* Check whether driver already requested MSI-X irqs */
> -	if (dev->msix_enabled) {
> -		dev_info(&dev->dev, "can't enable MSI "
> -			 "(MSI-X already enabled)\n");
> -		return -EINVAL;
> -	}
> -
> -	status = msi_capability_init(dev, nvec);
> -	return status;
> -}
> -EXPORT_SYMBOL(pci_enable_msi_block);
> -
>  void pci_msi_shutdown(struct pci_dev *dev)
>  {
>  	struct msi_desc *desc;
> @@ -1132,14 +1088,45 @@ void pci_msi_init_pci_dev(struct pci_dev *dev)
>   **/
>  int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
>  {
> -	int nvec = maxvec;
> +	int nvec;
>  	int rc;
>  
> +	if (dev->current_state != PCI_D0)
> +		return -EINVAL;
> +
> +	WARN_ON(!!dev->msi_enabled);
> +
> +	/* Check whether driver already requested MSI-X irqs */
> +	if (dev->msix_enabled) {
> +		dev_info(&dev->dev,
> +			 "can't enable MSI (MSI-X already enabled)\n");
> +		return -EINVAL;
> +	}
> +
>  	if (maxvec < minvec)
>  		return -ERANGE;
>  
> +	nvec = pci_msi_vec_count(dev);
> +	if (nvec < 0)
> +		return nvec;
> +	else if (nvec < minvec)
> +		return -EINVAL;
> +	else if (nvec > maxvec)
> +		nvec = maxvec;
> +
> +	do {
> +		rc = pci_msi_check_device(dev, nvec, PCI_CAP_ID_MSI);
> +		if (rc < 0) {
> +			return rc;
> +		} else if (rc > 0) {
> +			if (rc < minvec)
> +				return -ENOSPC;
> +			nvec = rc;
> +		}
> +	} while (rc);
> +
>  	do {
> -		rc = pci_enable_msi_block(dev, nvec);
> +		rc = msi_capability_init(dev, nvec);
>  		if (rc < 0) {
>  			return rc;
>  		} else if (rc > 0) {
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index aab57b4..499755e 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1158,7 +1158,6 @@ struct msix_entry {
>  
>  #ifdef CONFIG_PCI_MSI
>  int pci_msi_vec_count(struct pci_dev *dev);
> -int pci_enable_msi_block(struct pci_dev *dev, int nvec);
>  void pci_msi_shutdown(struct pci_dev *dev);
>  void pci_disable_msi(struct pci_dev *dev);
>  int pci_msix_vec_count(struct pci_dev *dev);
> @@ -1188,8 +1187,6 @@ static inline int pci_enable_msix_exact(struct pci_dev *dev,
>  }
>  #else
>  static inline int pci_msi_vec_count(struct pci_dev *dev) { return -ENOSYS; }
> -static inline int pci_enable_msi_block(struct pci_dev *dev, int nvec)
> -{ return -ENOSYS; }
>  static inline void pci_msi_shutdown(struct pci_dev *dev) { }
>  static inline void pci_disable_msi(struct pci_dev *dev) { }
>  static inline int pci_msix_vec_count(struct pci_dev *dev) { return -ENOSYS; }
> @@ -1244,7 +1241,7 @@ static inline void pcie_set_ecrc_checking(struct pci_dev *dev) { }
>  static inline void pcie_ecrc_get_policy(char *str) { }
>  #endif
>  
> -#define pci_enable_msi(pdev)	pci_enable_msi_block(pdev, 1)
> +#define pci_enable_msi(pdev)	pci_enable_msi_exact(pdev, 1)
>  
>  #ifdef CONFIG_HT_IRQ
>  /* The functions a driver should call */
> -- 
> 1.7.7.6
> 
> -- 
> Regards,
> Alexander Gordeev
> agordeev@redhat.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Gordeev - May 1, 2014, 4:07 p.m.
On Wed, Apr 30, 2014 at 05:49:33PM -0600, Bjorn Helgaas wrote:
> I mistakenly assumed this would have to wait because I thought there were
> other pci_enable_msi_block() users that wouldn't be removed until the v3.16
> merge window.  But I think I was wrong: I put your GenWQE patch in my tree,
> and I think that was the last use, so I can just add this patch on top.

That is right, GenWQE is the last one.

> But I need a little help understanding the changelog:
> 
> > Up until now, when enabling MSI mode for a device a single
> > successful call to arch_msi_check_device() was followed by
> > a single call to arch_setup_msi_irqs() function.
> 
> I understand this part; the following two paths call
> arch_msi_check_device() once and then arch_setup_msi_irqs() once:
> 
>   pci_enable_msi_block
>     pci_msi_check_device
>       arch_msi_check_device
>     msi_capability_init
>       arch_setup_msi_irqs
> 
>   pci_enable_msix
>     pci_msi_check_device
>       arch_msi_check_device
>     msix_capability_init
>       arch_setup_msi_irqs
> 
> > Yet, if arch_msi_check_device() returned success we should be
> > able to call arch_setup_msi_irqs() multiple times - while it
> > returns a number of MSI vectors that could have been allocated
> > (a third state).
> 
> I don't know what you mean by "a third state."

That is "a number of MSI vectors that could have been allocated", which
is neither success nor failure. In previous conversations someone branded
it this as "third state".

> Previously we only called arch_msi_check_device() once.  After your patch,
> pci_enable_msi_range() can call it several times.  The only non-trivial
> implementation of arch_msi_check_device() is in powerpc, and all the
> ppc_md.msi_check_device() possibilities look safe to call multiple times.

Yep, I see it the same way.

> After your patch, the pci_enable_msi_range() path can also call
> arch_setup_msi_irqs() several times.  I don't see a problem with that --
> even if the first call succeeds and allocates something, then a subsequent
> call fails, I assume the allocations will be cleaned up when
> msi_capability_init() calls free_msi_irqs().

Well, the potential problem related to the fact arch_msi_check_device()
could be called with 'nvec1' while arch_setup_msi_irqs() could be called
with 'nvec2', where 'nvec1' > 'nvec2'.

While it is not a problem with current implementations, in therory it is
possible free_msi_irqs() could be called with 'nvec2' and fail to clean
up 'nvec1' - 'nvec2' number of resources.

The only assumption that makes the above scenario impossible is if
arch_msi_check_device() is stateless.

Again, that is purely theoretical with no particular architecture in mind.

> > This update makes use of the assumption described above. It
> > could have broke things had the architectures done any pre-
> > allocations or switch to some state in a call to function
> > arch_msi_check_device(). But because arch_msi_check_device()
> > is expected stateless and MSI resources are allocated in a
> > follow-up call to arch_setup_msi_irqs() we should be fine.
> 
> I guess you mean that your patch assumes arch_msi_check_device() is
> stateless.  That looks like a safe assumption to me.

Moreover, if arch_msi_check_device() was not stateless then it would
be superfluous, since all state switches and allocations are done in
arch_setup_msi_irqs() anyway.

In fact, I think arch_msi_check_device() could be eliminated, but I
do not want to engage with PPC folks at this stage ;)

> arch_setup_msi_irqs() is clearly not stateless, but I assume
> free_msi_irqs() is enough to clean up any state if we fail.
> 
> Bjorn

Patch

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 955ab79..fc669ab 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -883,50 +883,6 @@  int pci_msi_vec_count(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pci_msi_vec_count);
 
-/**
- * pci_enable_msi_block - configure device's MSI capability structure
- * @dev: device to configure
- * @nvec: number of interrupts to configure
- *
- * Allocate IRQs for a device with the MSI capability.
- * This function returns a negative errno if an error occurs.  If it
- * is unable to allocate the number of interrupts requested, it returns
- * the number of interrupts it might be able to allocate.  If it successfully
- * allocates at least the number of interrupts requested, it returns 0 and
- * updates the @dev's irq member to the lowest new interrupt number; the
- * other interrupt numbers allocated to this device are consecutive.
- */
-int pci_enable_msi_block(struct pci_dev *dev, int nvec)
-{
-	int status, maxvec;
-
-	if (dev->current_state != PCI_D0)
-		return -EINVAL;
-
-	maxvec = pci_msi_vec_count(dev);
-	if (maxvec < 0)
-		return maxvec;
-	if (nvec > maxvec)
-		return maxvec;
-
-	status = pci_msi_check_device(dev, nvec, PCI_CAP_ID_MSI);
-	if (status)
-		return status;
-
-	WARN_ON(!!dev->msi_enabled);
-
-	/* Check whether driver already requested MSI-X irqs */
-	if (dev->msix_enabled) {
-		dev_info(&dev->dev, "can't enable MSI "
-			 "(MSI-X already enabled)\n");
-		return -EINVAL;
-	}
-
-	status = msi_capability_init(dev, nvec);
-	return status;
-}
-EXPORT_SYMBOL(pci_enable_msi_block);
-
 void pci_msi_shutdown(struct pci_dev *dev)
 {
 	struct msi_desc *desc;
@@ -1132,14 +1088,45 @@  void pci_msi_init_pci_dev(struct pci_dev *dev)
  **/
 int pci_enable_msi_range(struct pci_dev *dev, int minvec, int maxvec)
 {
-	int nvec = maxvec;
+	int nvec;
 	int rc;
 
+	if (dev->current_state != PCI_D0)
+		return -EINVAL;
+
+	WARN_ON(!!dev->msi_enabled);
+
+	/* Check whether driver already requested MSI-X irqs */
+	if (dev->msix_enabled) {
+		dev_info(&dev->dev,
+			 "can't enable MSI (MSI-X already enabled)\n");
+		return -EINVAL;
+	}
+
 	if (maxvec < minvec)
 		return -ERANGE;
 
+	nvec = pci_msi_vec_count(dev);
+	if (nvec < 0)
+		return nvec;
+	else if (nvec < minvec)
+		return -EINVAL;
+	else if (nvec > maxvec)
+		nvec = maxvec;
+
+	do {
+		rc = pci_msi_check_device(dev, nvec, PCI_CAP_ID_MSI);
+		if (rc < 0) {
+			return rc;
+		} else if (rc > 0) {
+			if (rc < minvec)
+				return -ENOSPC;
+			nvec = rc;
+		}
+	} while (rc);
+
 	do {
-		rc = pci_enable_msi_block(dev, nvec);
+		rc = msi_capability_init(dev, nvec);
 		if (rc < 0) {
 			return rc;
 		} else if (rc > 0) {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index aab57b4..499755e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1158,7 +1158,6 @@  struct msix_entry {
 
 #ifdef CONFIG_PCI_MSI
 int pci_msi_vec_count(struct pci_dev *dev);
-int pci_enable_msi_block(struct pci_dev *dev, int nvec);
 void pci_msi_shutdown(struct pci_dev *dev);
 void pci_disable_msi(struct pci_dev *dev);
 int pci_msix_vec_count(struct pci_dev *dev);
@@ -1188,8 +1187,6 @@  static inline int pci_enable_msix_exact(struct pci_dev *dev,
 }
 #else
 static inline int pci_msi_vec_count(struct pci_dev *dev) { return -ENOSYS; }
-static inline int pci_enable_msi_block(struct pci_dev *dev, int nvec)
-{ return -ENOSYS; }
 static inline void pci_msi_shutdown(struct pci_dev *dev) { }
 static inline void pci_disable_msi(struct pci_dev *dev) { }
 static inline int pci_msix_vec_count(struct pci_dev *dev) { return -ENOSYS; }
@@ -1244,7 +1241,7 @@  static inline void pcie_set_ecrc_checking(struct pci_dev *dev) { }
 static inline void pcie_ecrc_get_policy(char *str) { }
 #endif
 
-#define pci_enable_msi(pdev)	pci_enable_msi_block(pdev, 1)
+#define pci_enable_msi(pdev)	pci_enable_msi_exact(pdev, 1)
 
 #ifdef CONFIG_HT_IRQ
 /* The functions a driver should call */