diff mbox

genirq/msi: Make sure PCI MSIs are activated early

Message ID 1468426713-31431-1-git-send-email-marc.zyngier@arm.com
State Not Applicable
Headers show

Commit Message

Marc Zyngier July 13, 2016, 4:18 p.m. UTC
Bharat Kumar Gogada reported issues with the generic MSI code,
where the end-point ended up with garbage in its MSI configuration
(both for the vector and the message).

It turns out that the two MSI paths in the kernel are doing slightly
different things:

generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI

and it turns out that end-points are allowed to latch the content
of the MSI configuration registers as soon as MSIs are enabled.
In Bharat's case, the end-point ends up using whatever was there
already, which is not what you want.

In order to make things converge, we introduce a new MSI domain
flag (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for
PCI/MSI. When set, this flag forces the programming of the end-point
as soon as the MSIs are allocated.

A consequence of this is that we have an extra activate in
irq_startup, but that should be without much consequence.

Reported-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/pci/msi.c   | 2 ++
 include/linux/msi.h | 2 ++
 kernel/irq/msi.c    | 7 +++++++
 3 files changed, 11 insertions(+)

Comments

Bjorn Helgaas July 22, 2016, 10:04 p.m. UTC | #1
On Wed, Jul 13, 2016 at 05:18:33PM +0100, Marc Zyngier wrote:
> Bharat Kumar Gogada reported issues with the generic MSI code,
> where the end-point ended up with garbage in its MSI configuration
> (both for the vector and the message).
> 
> It turns out that the two MSI paths in the kernel are doing slightly
> different things:
> 
> generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
> PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI
> 
> and it turns out that end-points are allowed to latch the content
> of the MSI configuration registers as soon as MSIs are enabled.
> In Bharat's case, the end-point ends up using whatever was there
> already, which is not what you want.
> 
> In order to make things converge, we introduce a new MSI domain
> flag (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for
> PCI/MSI. When set, this flag forces the programming of the end-point
> as soon as the MSIs are allocated.
> 
> A consequence of this is that we have an extra activate in
> irq_startup, but that should be without much consequence.
> 
> Reported-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> Tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Thomas, let me know if you'd like me to take this.  It looks like the
real smarts here are in kernel/irq, so I assume you'll take it unless
I hear otherwise.

> ---
>  drivers/pci/msi.c   | 2 ++
>  include/linux/msi.h | 2 ++
>  kernel/irq/msi.c    | 7 +++++++
>  3 files changed, 11 insertions(+)
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index a080f44..565e2a4 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1277,6 +1277,8 @@ struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
>  	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
>  		pci_msi_domain_update_chip_ops(info);
>  
> +	info->flags |= MSI_FLAG_ACTIVATE_EARLY;
> +
>  	domain = msi_create_irq_domain(fwnode, info, parent);
>  	if (!domain)
>  		return NULL;
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 8b425c6..513b7c7 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -270,6 +270,8 @@ enum {
>  	MSI_FLAG_MULTI_PCI_MSI		= (1 << 3),
>  	/* Support PCI MSIX interrupts */
>  	MSI_FLAG_PCI_MSIX		= (1 << 4),
> +	/* Needs early activate, required for PCI */
> +	MSI_FLAG_ACTIVATE_EARLY		= (1 << 5),
>  };
>  
>  int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
> diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
> index 38e89ce..4ed2cca 100644
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -361,6 +361,13 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
>  		else
>  			dev_dbg(dev, "irq [%d-%d] for MSI\n",
>  				virq, virq + desc->nvec_used - 1);
> +
> +		if (info->flags & MSI_FLAG_ACTIVATE_EARLY) {
> +			struct irq_data *irq_data;
> +
> +			irq_data = irq_domain_get_irq_data(domain, desc->irq);
> +			irq_domain_activate_irq(irq_data);
> +		}
>  	}
>  
>  	return 0;
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner July 25, 2016, 7:45 a.m. UTC | #2
On Fri, 22 Jul 2016, Bjorn Helgaas wrote:
> On Wed, Jul 13, 2016 at 05:18:33PM +0100, Marc Zyngier wrote:
> > and it turns out that end-points are allowed to latch the content
> > of the MSI configuration registers as soon as MSIs are enabled.
> > In Bharat's case, the end-point ends up using whatever was there
> > already, which is not what you want.
> > 
> > In order to make things converge, we introduce a new MSI domain
> > flag (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for
> > PCI/MSI. When set, this flag forces the programming of the end-point
> > as soon as the MSIs are allocated.
> > 
> > A consequence of this is that we have an extra activate in
> > irq_startup, but that should be without much consequence.
> > 
> > Reported-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> > Tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> 
> Thomas, let me know if you'd like me to take this.  It looks like the
> real smarts here are in kernel/irq, so I assume you'll take it unless
> I hear otherwise.

I'll take it. Though I have second thoughts about the whole issue.

We deliberately made the allocation sequence of interrupts in a way that we
can easily rollback in case of failure.

We achieved that by activating the interrupts only at request time and not
somewhere in the middle of the allocation sequence. That makes the whole
hierarchical allocation more robust and avoids complex rollbacks.

Now that new flag is basically torpedoing that approach.

What I really wonder is why that is only an issue with that particular xilinx
hardware/IP block. I'm aware that up to PCI 2.3 the mask bit for MSI
interrupts is optional or in really old versions not even specified. So only
if that mask bit is missing the above described issue can happen.

If not, then we might have a general issue that we don't mask the entry before
we call pci_msi_set_enable().

Thoughts?

	tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas July 25, 2016, 2:47 p.m. UTC | #3
On Mon, Jul 25, 2016 at 09:45:13AM +0200, Thomas Gleixner wrote:
> On Fri, 22 Jul 2016, Bjorn Helgaas wrote:
> > On Wed, Jul 13, 2016 at 05:18:33PM +0100, Marc Zyngier wrote:
> > > and it turns out that end-points are allowed to latch the content
> > > of the MSI configuration registers as soon as MSIs are enabled.
> > > In Bharat's case, the end-point ends up using whatever was there
> > > already, which is not what you want.
> > > 
> > > In order to make things converge, we introduce a new MSI domain
> > > flag (MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for
> > > PCI/MSI. When set, this flag forces the programming of the end-point
> > > as soon as the MSIs are allocated.
> > > 
> > > A consequence of this is that we have an extra activate in
> > > irq_startup, but that should be without much consequence.
> > > 
> > > Reported-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> > > Tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
> > > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > 
> > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> > 
> > Thomas, let me know if you'd like me to take this.  It looks like the
> > real smarts here are in kernel/irq, so I assume you'll take it unless
> > I hear otherwise.
> 
> I'll take it. Though I have second thoughts about the whole issue.
> 
> We deliberately made the allocation sequence of interrupts in a way that we
> can easily rollback in case of failure.
> 
> We achieved that by activating the interrupts only at request time and not
> somewhere in the middle of the allocation sequence. That makes the whole
> hierarchical allocation more robust and avoids complex rollbacks.
> 
> Now that new flag is basically torpedoing that approach.
> 
> What I really wonder is why that is only an issue with that particular xilinx
> hardware/IP block. I'm aware that up to PCI 2.3 the mask bit for MSI
> interrupts is optional or in really old versions not even specified. So only
> if that mask bit is missing the above described issue can happen.
> 
> If not, then we might have a general issue that we don't mask the entry before
> we call pci_msi_set_enable().

Good question.  I haven't followed this thread in detail, so my ack
meant "I'm OK with this if you are," not "I've reviewed this and
think it's great."

I thought the original issue [1] was that PCI_MSI_FLAGS_ENABLE was being
written before PCI_MSI_ADDRESS_LO.  That doesn't sound like a good
idea to me.

I don't understand the whole flow.  Here's what I've gleaned so far:

  pci_enable_msi_range
    msi_capability_init
      pci_msi_setup_msi_irqs
        domain = pci_msi_get_domain(dev)
        if (domain)
          # this seems like the problem case
          pci_msi_domain_alloc_irqs(domain, dev, nvec)
            msi_domain_alloc_irqs
              ...
        else
          # this case apparently works fine
          arch_setup_msi_irqs
            for_each_pci_msi_entry(entry, dev)
              arch_setup_msi_irq
                chip->setup_irq
                  xilinx_pcie_msi_setup_irq  # xilinx_pcie_msi_chip.setup_irq
                    pci_write_msi_msg
                      __pci_write_msi_msg
                        pci_write_config_dword(PCI_MSI_ADDRESS_LO)
      pci_msi_set_enable(dev, 1)
        pci_write_config_word(PCI_MSI_FLAGS, PCI_MSI_FLAGS_ENABLE)

I assume the problem is that in the MSI domain case, we don't call the
chip->setup_irq method until later.  I gave up trying to figure out
where that happens.  Is it something like the following?

  request_irq
    request_threaded_irq
      __setup_irq
        ...
          ?? chip->setup_irq ??

That does seem like a problem.  Maybe it would be better to delay
setting PCI_MSI_FLAGS_ENABLE until after the MSI address & data bits
have been set?

[1] http://lkml.kernel.org/r/8520D5D51A55D047800579B094147198258B80DE@XAP-PVEXMBX01.xlnx.xilinx.com
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a080f44..565e2a4 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1277,6 +1277,8 @@  struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		pci_msi_domain_update_chip_ops(info);
 
+	info->flags |= MSI_FLAG_ACTIVATE_EARLY;
+
 	domain = msi_create_irq_domain(fwnode, info, parent);
 	if (!domain)
 		return NULL;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 8b425c6..513b7c7 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -270,6 +270,8 @@  enum {
 	MSI_FLAG_MULTI_PCI_MSI		= (1 << 3),
 	/* Support PCI MSIX interrupts */
 	MSI_FLAG_PCI_MSIX		= (1 << 4),
+	/* Needs early activate, required for PCI */
+	MSI_FLAG_ACTIVATE_EARLY		= (1 << 5),
 };
 
 int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 38e89ce..4ed2cca 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -361,6 +361,13 @@  int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 		else
 			dev_dbg(dev, "irq [%d-%d] for MSI\n",
 				virq, virq + desc->nvec_used - 1);
+
+		if (info->flags & MSI_FLAG_ACTIVATE_EARLY) {
+			struct irq_data *irq_data;
+
+			irq_data = irq_domain_get_irq_data(domain, desc->irq);
+			irq_domain_activate_irq(irq_data);
+		}
 	}
 
 	return 0;