mbox series

[0/3] PCI: designware: Fixing MSI handling flow

Message ID 20181113225734.8026-1-marc.zyngier@arm.com
Headers show
Series PCI: designware: Fixing MSI handling flow | expand

Message

Marc Zyngier Nov. 13, 2018, 10:57 p.m. UTC
It recently came to light that the Designware PCIe driver is rather
broken in the way it handles MSI[1]:

- It masks interrupt by disabling them, meaning that MSIs generated
  during the masked window are simply lost. Oops.

- Acking of the currently pending MSI is done outside of the interrupt
  flow, getting moved around randomly and ultimately breaking the
  driver. Not great.

This series attempts to address this by switching to using the MASK
register for masking interrupts (!), and move the ack into the
appropriate callback, giving it a fixed place in the MSI handling
flow.

Note that this is only compile-tested on my arm64 laptop, as I'm
travelling and do not have the required HW to test it anyway. I'd
welcome both review and testing by the interested parties (dwc
maintainer and users affected by existing bugs).

Thanks,

	M.

[1] https://patchwork.kernel.org/patch/10657987/

Marc Zyngier (3):
  PCI: designware: Use interrupt masking instead of disabling
  PCI: designware: Take lock when ACKing an interrupt
  PCI: designware: Move interrupt acking into the proper callback

 .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
 1 file changed, 14 insertions(+), 8 deletions(-)

Comments

Gustavo Pimentel Nov. 13, 2018, 11:16 p.m. UTC | #1
On 13/11/2018 22:57, Marc Zyngier wrote:
> It recently came to light that the Designware PCIe driver is rather
> broken in the way it handles MSI[1]:
> 
> - It masks interrupt by disabling them, meaning that MSIs generated
>   during the masked window are simply lost. Oops.
> 
> - Acking of the currently pending MSI is done outside of the interrupt
>   flow, getting moved around randomly and ultimately breaking the
>   driver. Not great.
> 
> This series attempts to address this by switching to using the MASK
> register for masking interrupts (!), and move the ack into the
> appropriate callback, giving it a fixed place in the MSI handling
> flow.
> 
> Note that this is only compile-tested on my arm64 laptop, as I'm
> travelling and do not have the required HW to test it anyway. I'd
> welcome both review and testing by the interested parties (dwc
> maintainer and users affected by existing bugs).

As we spoke on the conference, as soon as I get back and I've the necessary
conditions I'll test the discussed modifications on my HW.

> 
> Thanks,
> 
> 	M.
> 
> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_patch_10657987_&d=DwIDAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=bkWxpLoW-f-E3EdiDCCa0_h0PicsViasSlvIpzZvPxs&m=ksl4uVn2tYUOeqywcyeYHziG6ejYHFeq-Cm4p8q7amU&s=WILOPbhOsYjM3XNhZwxGb2T6069LtLL1aqf9hbS-ajg&e=
> 
> Marc Zyngier (3):
>   PCI: designware: Use interrupt masking instead of disabling
>   PCI: designware: Take lock when ACKing an interrupt
>   PCI: designware: Move interrupt acking into the proper callback
> 
>  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
>  1 file changed, 14 insertions(+), 8 deletions(-)
>
Marc Zyngier Nov. 14, 2018, 9:54 a.m. UTC | #2
On Tue, 13 Nov 2018 23:16:33 +0000,
Gustavo Pimentel <gustavo.pimentel@synopsys.com> wrote:
> 
> On 13/11/2018 22:57, Marc Zyngier wrote:
> > It recently came to light that the Designware PCIe driver is rather
> > broken in the way it handles MSI[1]:
> > 
> > - It masks interrupt by disabling them, meaning that MSIs generated
> >   during the masked window are simply lost. Oops.
> > 
> > - Acking of the currently pending MSI is done outside of the interrupt
> >   flow, getting moved around randomly and ultimately breaking the
> >   driver. Not great.
> > 
> > This series attempts to address this by switching to using the MASK
> > register for masking interrupts (!), and move the ack into the
> > appropriate callback, giving it a fixed place in the MSI handling
> > flow.
> > 
> > Note that this is only compile-tested on my arm64 laptop, as I'm
> > travelling and do not have the required HW to test it anyway. I'd
> > welcome both review and testing by the interested parties (dwc
> > maintainer and users affected by existing bugs).
> 
> As we spoke on the conference, as soon as I get back and I've the necessary
> conditions I'll test the discussed modifications on my HW.

I just realised (at 1am!) that the first patch is awfully buggy:
- ENABLE and MASK have opposite polarities
- There is nothing initialising the ENABLE and MASK registers

I've stashed the following fix-up on top of the existing series:

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index f06e67c60593..0fa9e8fdce66 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -166,7 +166,7 @@ static void dw_pci_bottom_mask(struct irq_data *data)
 
 		pp->irq_status[ctrl] &= ~(1 << bit);
 		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
-				    pp->irq_status[ctrl]);
+				    ~pp->irq_status[ctrl]);
 	}
 
 	raw_spin_unlock_irqrestore(&pp->lock, flags);
@@ -189,7 +189,7 @@ static void dw_pci_bottom_unmask(struct irq_data *data)
 
 		pp->irq_status[ctrl] |= 1 << bit;
 		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
-				    pp->irq_status[ctrl]);
+				    ~pp->irq_status[ctrl]);
 	}
 
 	raw_spin_unlock_irqrestore(&pp->lock, flags);
@@ -664,10 +664,15 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
 	num_ctrls = pp->num_vectors / MAX_MSI_IRQS_PER_CTRL;
 
 	/* Initialize IRQ Status array */
-	for (ctrl = 0; ctrl < num_ctrls; ctrl++)
-		dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
+	for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
+		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
 					(ctrl * MSI_REG_CTRL_BLOCK_SIZE),
-				    4, &pp->irq_status[ctrl]);
+				    4, ~0);
+		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
+					(ctrl * MSI_REG_CTRL_BLOCK_SIZE),
+				    4, ~0);
+		pp->irq_status[ctrl] = 0;
+	}
 
 	/* Setup RC BARs */
 	dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_0, 0x00000004);

Please let me know if this helps.

	M.
Trent Piepho Nov. 14, 2018, 6:28 p.m. UTC | #3
On Tue, 2018-11-13 at 22:57 +0000, Marc Zyngier wrote:
> It recently came to light that the Designware PCIe driver is rather
> broken in the way it handles MSI[1]:
> 
> - It masks interrupt by disabling them, meaning that MSIs generated
>   during the masked window are simply lost. Oops.
> 
> - Acking of the currently pending MSI is done outside of the
> interrupt
>   flow, getting moved around randomly and ultimately breaking the
>   driver. Not great.
> 
> This series attempts to address this by switching to using the MASK
> register for masking interrupts (!), and move the ack into the
> appropriate callback, giving it a fixed place in the MSI handling
> flow.
> 
> Note that this is only compile-tested on my arm64 laptop, as I'm
> travelling and do not have the required HW to test it anyway. I'd
> welcome both review and testing by the interested parties (dwc
> maintainer and users affected by existing bugs).
> 

I've started to test this series after porting all the patches needed
to make IMX7d work from 4.16.8 to 4.20.0-rc2.

Took a little while to figure out that the pcieport driver has a new
config entry to enable, or one gets no interrupts.  I'm not sure if
this is entirely correct behavior.

The new domain stuff does not appear to integrate into the existing irq
framework perfectly.  My interrupt has changed from MSI #1 to MSI
#524288.  Not the most user friendly number.

292:          0          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
293:          1          0   PCI-MSI 524288 Edge      impinj-rfid-modem

Previously the dwc controller would show up as the owner of GPCv2 IRQ
122.  It doesn't any more.  Seems like the kernel info for it is wrong.

/sys/kernel/irq/65/actions:(null)
/sys/kernel/irq/65/chip_name:GPCv2
/sys/kernel/irq/65/hwirq:122
/sys/kernel/irq/65/per_cpu_count:0,0
/sys/kernel/irq/65/type:edge

Should be level and the count should be 1,0.  The debugfs interface is
more accurate:

# cat /sys/kernel/debug/irq/irqs/65
handler:  dw_chained_msi_isr
device:   (null)
status:   0x00010c00
            _IRQ_NOPROBE
            _IRQ_NOREQUEST
            _IRQ_NOTHREAD
dstate:   0x03400204
            IRQ_TYPE_LEVEL_HIGH
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET

Still doesn't know what device it's for.

Now I can finally test it!

Confirmed interrupt race is still there in stock kernel.

Confirmed after my patch I didn't see the race.  Didn't check why the
broken enable/disable as mask didn't appear cause a new race, but
something must be wrong somewhere.

Tried your 1st patch.  As I mentioned before in a reply to Gustavo,
just changing the enable to mask results in the MSI never getting
enabled in the first place.  Nothing else writes to the enable
register...

As a workaround, I added an irq_enable method to dw_pcie_msi_irq_chip
that just chains to the parent, and then a hacky irq_enable in
dw_pci_msi_bottom_irq_chip that manipulates the enable register.

Now it works again.  Race still present.  I don't see the
dw_pci_msi_bottom_(un)mask methods ever get called.  I seem to recall
that they are called as a substitute if enable/disable are not present,
but haven't confirmed that, which would explain why they are not called
after I added enable.

Next tried your next two patches.  No longer see lost interrupts, as
the status is cleared before the handler is called.

From what I see the clear of the status bit is effectively at the same
point in the irq path as the way I cleared it in my patch.  There's
just a longer call chain to get to it in the ack method.  Not that it's
not a better place for it (which isn't there in 4.16), but I don't
think it changes anything.  Is there some reason dw_pci_bottom_ack
would not be called?

Since I don't see the un(mask) methods ever get called, I'm not sure if
they are correct or not.  I also had some unanswered details of
behavior on unmask.  I can see possible flaws, depending on how this
works.
Trent Piepho Nov. 14, 2018, 7:19 p.m. UTC | #4
On Wed, 2018-11-14 at 09:54 +0000, Marc Zyngier wrote:
>         /* Initialize IRQ Status array */
> -       for (ctrl = 0; ctrl < num_ctrls; ctrl++)
> -               dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
> +       for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
>                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> -                                   4, &pp->irq_status[ctrl]);
> +                                   4, ~0);
> +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
> +                                       (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> +                                   4, ~0);
> +               pp->irq_status[ctrl] = 0;
> +       }
>  

I tested yesterday before this patch was sent and fixed this issue
another way.  I pretty sure this would work as well, though it's not
clear to me it's more correct.

It looks to me that the previous code was using whatever MSIs were
already enabled when the driver is initialized as the set to leave
enabled.

This looks like it changing that behavior, and instead enabling all
MSIs on initialization.

I would think the default value for a MSI should be disabled until
something enables it.

I speculated that the previous behavior was trying to work with an MSI
enabled by the bootloader, ACPI firmware, etc. that should be left
alone.  Or perhaps there was no good reason not to disable everything
on initialization and that code just got copied from somewhere else and
no one thought about it.  There's certainly evidence of that in this
driver.
Marc Zyngier Nov. 14, 2018, 10:01 p.m. UTC | #5
On Wed, 14 Nov 2018 19:19:27 +0000,
Trent Piepho <tpiepho@impinj.com> wrote:
> 
> On Wed, 2018-11-14 at 09:54 +0000, Marc Zyngier wrote:
> >         /* Initialize IRQ Status array */
> > -       for (ctrl = 0; ctrl < num_ctrls; ctrl++)
> > -               dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
> > +       for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> > +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
> >                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> > -                                   4, &pp->irq_status[ctrl]);
> > +                                   4, ~0);
> > +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
> > +                                       (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> > +                                   4, ~0);
> > +               pp->irq_status[ctrl] = 0;
> > +       }
> >  
> 
> I tested yesterday before this patch was sent and fixed this issue
> another way.  I pretty sure this would work as well, though it's not
> clear to me it's more correct.

Given that we don't have a spec or any form of useful documentation
(except for the information that Gustavo gave us), I don't think
*anything* we'll write here has a remote chance of being
correct. We're simply poking in the dark.

> It looks to me that the previous code was using whatever MSIs were
> already enabled when the driver is initialized as the set to leave
> enabled.

And I claim that this is a gross bug. We don't want to inherit
anything at all, rather start from a fresh start.

> This looks like it changing that behavior, and instead enabling all
> MSIs on initialization.
> 
> I would think the default value for a MSI should be disabled until
> something enables it.

Sure, that's no big deal. We can plug in the enable/disable callbacks
to that effect, although we'll end-up with similar result, I'd expect.

> I speculated that the previous behavior was trying to work with an MSI
> enabled by the bootloader, ACPI firmware, etc. that should be left
> alone.  Or perhaps there was no good reason not to disable everything
> on initialization and that code just got copied from somewhere else and
> no one thought about it.  There's certainly evidence of that in this
> driver.

As you said, you're speculating. Nonetheless, there is no reason to
start with anything enabled the first place.

	M.
Marc Zyngier Nov. 14, 2018, 10:07 p.m. UTC | #6
On Wed, 14 Nov 2018 18:28:05 +0000,
Trent Piepho <tpiepho@impinj.com> wrote:
> 
> On Tue, 2018-11-13 at 22:57 +0000, Marc Zyngier wrote:
> > It recently came to light that the Designware PCIe driver is rather
> > broken in the way it handles MSI[1]:
> > 
> > - It masks interrupt by disabling them, meaning that MSIs generated
> >   during the masked window are simply lost. Oops.
> > 
> > - Acking of the currently pending MSI is done outside of the
> > interrupt
> >   flow, getting moved around randomly and ultimately breaking the
> >   driver. Not great.
> > 
> > This series attempts to address this by switching to using the MASK
> > register for masking interrupts (!), and move the ack into the
> > appropriate callback, giving it a fixed place in the MSI handling
> > flow.
> > 
> > Note that this is only compile-tested on my arm64 laptop, as I'm
> > travelling and do not have the required HW to test it anyway. I'd
> > welcome both review and testing by the interested parties (dwc
> > maintainer and users affected by existing bugs).
> > 
> 
> I've started to test this series after porting all the patches needed
> to make IMX7d work from 4.16.8 to 4.20.0-rc2.
> 
> Took a little while to figure out that the pcieport driver has a new
> config entry to enable, or one gets no interrupts.  I'm not sure if
> this is entirely correct behavior.
> 
> The new domain stuff does not appear to integrate into the existing irq
> framework perfectly.  My interrupt has changed from MSI #1 to MSI
> #524288.  Not the most user friendly number.

It is not supposed to be user friendly. It is not even supposed to be
interpreted by anyone. And if you print it in hex, you'll find that it
*is* actually useful.

	M.
Trent Piepho Nov. 14, 2018, 10:25 p.m. UTC | #7
On Wed, 2018-11-14 at 22:01 +0000, Marc Zyngier wrote:
> On Wed, 14 Nov 2018 19:19:27 +0000,
> Trent Piepho <tpiepho@impinj.com> wrote:
> > 
> > On Wed, 2018-11-14 at 09:54 +0000, Marc Zyngier wrote:
> > >         /* Initialize IRQ Status array */
> > > -       for (ctrl = 0; ctrl < num_ctrls; ctrl++)
> > > -               dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
> > > +       for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> > > +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
> > >                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> > > -                                   4, &pp->irq_status[ctrl]);
> > > +                                   4, ~0);
> > > +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
> > > +                                       (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> > > +                                   4, ~0);
> > > +               pp->irq_status[ctrl] = 0;
> > > +       }
> > >  
> > 
> > I tested yesterday before this patch was sent and fixed this issue
> > another way.  I pretty sure this would work as well, though it's not
> > clear to me it's more correct.
> 
> Given that we don't have a spec or any form of useful documentation
> (except for the information that Gustavo gave us), I don't think
> *anything* we'll write here has a remote chance of being
> correct. We're simply poking in the dark.

Should all MSIs start enabled, or should they start disabled and be
enabled via the irq_enable method of the irq_chip, seems like a Linux
design decision to me.  Decide that, then try to figure out how to make
the hardware do what Linux expects it to do.

Starting disabled seems like the right design to me.  So here's my
attempt to make the driver do this.  Works in my tests.  I've not
tracked down all uses of irq_status outside the driver to determine how
it's supposed to work.

From dfc015f9821f5105cbcf9686d360105ffbac4ffb Mon Sep 17 00:00:00 2001
From: Trent Piepho <tpiepho@impinj.com>
Date: Wed, 14 Nov 2018 14:12:47 -0800
Subject: [PATCH] PCI: dwc: Allow enabling MSIs and start with disabled

Add irq_enable callbacks to let MSIs be enabled.

Previously the driver would leave any MSIs enabled when it initialized
that way.  Rather than that, disable them all.

Signed-off-by: Trent Piepho <tpiepho@impinj.com>
---
 drivers/pci/controller/dwc/pcie-designware-host.c | 37
++++++++++++++++++++---
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
b/drivers/pci/controller/dwc/pcie-designware-host.c
index f06e67c60593..e7770fb1ced8 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -61,11 +61,17 @@ static void dw_msi_unmask_irq(struct irq_data *d)
 	irq_chip_unmask_parent(d);
 }
 
+static void dw_msi_enable_irq(struct irq_data *d)
+{
+	irq_chip_enable_parent(d);
+}
+
 static struct irq_chip dw_pcie_msi_irq_chip = {
 	.name = "PCI-MSI",
 	.irq_ack = dw_msi_ack_irq,
 	.irq_mask = dw_msi_mask_irq,
 	.irq_unmask = dw_msi_unmask_irq,
+	.irq_enable = dw_msi_enable_irq,
 };
 
 static struct msi_domain_info dw_pcie_msi_domain_info = {
@@ -215,6 +221,26 @@ static void dw_pci_bottom_ack(struct irq_data *d)
 	raw_spin_unlock_irqrestore(&pp->lock, flags);
 }
 
+static void dw_pci_bottom_enable(struct irq_data *data)
+{
+	struct pcie_port *pp = irq_data_get_irq_chip_data(data);
+	unsigned int res, bit, ctrl;
+	unsigned long flags;
+	u32 enable;
+
+	ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
+	res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
+	bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
+
+	raw_spin_lock_irqsave(&pp->lock, flags);
+
+	dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4,
&enable);
+	enable |= BIT(bit);
+	dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4,
enable);
+
+	raw_spin_unlock_irqrestore(&pp->lock, flags);
+}
+
 static struct irq_chip dw_pci_msi_bottom_irq_chip = {
 	.name = "DWPCI-MSI",
 	.irq_ack = dw_pci_bottom_ack,
@@ -222,6 +248,7 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip =
{
 	.irq_set_affinity = dw_pci_msi_set_affinity,
 	.irq_mask = dw_pci_bottom_mask,
 	.irq_unmask = dw_pci_bottom_unmask,
+	.irq_enable = dw_pci_bottom_enable,
 };
 
 static int dw_pcie_irq_domain_alloc(struct irq_domain *domain,
@@ -663,11 +690,13 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
 
 	num_ctrls = pp->num_vectors / MAX_MSI_IRQS_PER_CTRL;
 
-	/* Initialize IRQ Status array */
-	for (ctrl = 0; ctrl < num_ctrls; ctrl++)
-		dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
+	/* Disable all MSIs and initialize IRQ Status array */
+	for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
+		dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
 					(ctrl *
MSI_REG_CTRL_BLOCK_SIZE),
-				    4, &pp->irq_status[ctrl]);
+				    4, 0);
+		pp->irq_status[ctrl] = 0;
+	}
 
 	/* Setup RC BARs */
 	dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_0, 0x00000004);
Marc Zyngier Nov. 14, 2018, 10:44 p.m. UTC | #8
On Wed, 14 Nov 2018 22:25:37 +0000,
Trent Piepho <tpiepho@impinj.com> wrote:
> 
> On Wed, 2018-11-14 at 22:01 +0000, Marc Zyngier wrote:
> > On Wed, 14 Nov 2018 19:19:27 +0000,
> > Trent Piepho <tpiepho@impinj.com> wrote:
> > > 
> > > On Wed, 2018-11-14 at 09:54 +0000, Marc Zyngier wrote:
> > > >         /* Initialize IRQ Status array */
> > > > -       for (ctrl = 0; ctrl < num_ctrls; ctrl++)
> > > > -               dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
> > > > +       for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> > > > +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
> > > >                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> > > > -                                   4, &pp->irq_status[ctrl]);
> > > > +                                   4, ~0);
> > > > +               dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
> > > > +                                       (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> > > > +                                   4, ~0);
> > > > +               pp->irq_status[ctrl] = 0;
> > > > +       }
> > > >  
> > > 
> > > I tested yesterday before this patch was sent and fixed this issue
> > > another way.  I pretty sure this would work as well, though it's not
> > > clear to me it's more correct.
> > 
> > Given that we don't have a spec or any form of useful documentation
> > (except for the information that Gustavo gave us), I don't think
> > *anything* we'll write here has a remote chance of being
> > correct. We're simply poking in the dark.
> 
> Should all MSIs start enabled, or should they start disabled and be
> enabled via the irq_enable method of the irq_chip, seems like a Linux
> design decision to me.  Decide that, then try to figure out how to make
> the hardware do what Linux expects it to do.
> 
> Starting disabled seems like the right design to me.  So here's my
> attempt to make the driver do this.  Works in my tests.  I've not
> tracked down all uses of irq_status outside the driver to determine how
> it's supposed to work.
> 
> From dfc015f9821f5105cbcf9686d360105ffbac4ffb Mon Sep 17 00:00:00 2001
> From: Trent Piepho <tpiepho@impinj.com>
> Date: Wed, 14 Nov 2018 14:12:47 -0800
> Subject: [PATCH] PCI: dwc: Allow enabling MSIs and start with disabled
> 
> Add irq_enable callbacks to let MSIs be enabled.
> 
> Previously the driver would leave any MSIs enabled when it initialized
> that way.  Rather than that, disable them all.
> 
> Signed-off-by: Trent Piepho <tpiepho@impinj.com>
> ---
>  drivers/pci/controller/dwc/pcie-designware-host.c | 37
> ++++++++++++++++++++---
>  1 file changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index f06e67c60593..e7770fb1ced8 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -61,11 +61,17 @@ static void dw_msi_unmask_irq(struct irq_data *d)
>  	irq_chip_unmask_parent(d);
>  }
>  
> +static void dw_msi_enable_irq(struct irq_data *d)
> +{
> +	irq_chip_enable_parent(d);
> +}
> +
>  static struct irq_chip dw_pcie_msi_irq_chip = {
>  	.name = "PCI-MSI",
>  	.irq_ack = dw_msi_ack_irq,
>  	.irq_mask = dw_msi_mask_irq,
>  	.irq_unmask = dw_msi_unmask_irq,
> +	.irq_enable = dw_msi_enable_irq,

If you're doing that, please implement both enable *and* disable.

>  };
>  
>  static struct msi_domain_info dw_pcie_msi_domain_info = {
> @@ -215,6 +221,26 @@ static void dw_pci_bottom_ack(struct irq_data *d)
>  	raw_spin_unlock_irqrestore(&pp->lock, flags);
>  }
>  
> +static void dw_pci_bottom_enable(struct irq_data *data)
> +{
> +	struct pcie_port *pp = irq_data_get_irq_chip_data(data);
> +	unsigned int res, bit, ctrl;
> +	unsigned long flags;
> +	u32 enable;
> +
> +	ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
> +	res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
> +	bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
> +
> +	raw_spin_lock_irqsave(&pp->lock, flags);
> +
> +	dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4,
> &enable);
> +	enable |= BIT(bit);
> +	dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4,
> enable);
> +
> +	raw_spin_unlock_irqrestore(&pp->lock, flags);
> +}

How does it work for drivers that use the callbacks stuff?

	M.
Trent Piepho Nov. 14, 2018, 10:50 p.m. UTC | #9
On Wed, 2018-11-14 at 22:07 +0000, Marc Zyngier wrote:
> On Wed, 14 Nov 2018 18:28:05 +0000,
> Trent Piepho <tpiepho@impinj.com> wrote:
> > 
> > 
> > The new domain stuff does not appear to integrate into the existing irq
> > framework perfectly.  My interrupt has changed from MSI #1 to MSI
> > #524288.  Not the most user friendly number.
> 
> It is not supposed to be user friendly. It is not even supposed to be
> interpreted by anyone. And if you print it in hex, you'll find that it
> *is* actually useful.

The GPCv2 values match those in the datasheet.  This is very handy!

domain:  :soc:aips-bus@30800000:pcie@33800000-3
 hwirq:   0x80000
 chip:    PCI-MSI
  flags:   0x20
             IRQCHIP_ONESHOT_SAFE
 parent:
    domain:  :soc:aips-bus@30800000:pcie@33800000
     hwirq:   0x1
     chip:    DWPCI-MSI
      flags:   0x0

It's not clear to me what these two domains are for.  Perhaps if I had
multiple busses or multiple ports it would be.  I'm assuming the offset
is based on 2048 MSI-Xs per function, 8 functions per device, and 32
devices per bus.  So perhaps this means this is MSI 0 on bus 1 of the
controller.
Trent Piepho Nov. 14, 2018, 11:23 p.m. UTC | #10
On Wed, 2018-11-14 at 22:44 +0000, Marc Zyngier wrote:
>  static struct irq_chip dw_pcie_msi_irq_chip = {
> >  	.name = "PCI-MSI",
> >  	.irq_ack = dw_msi_ack_irq,
> >  	.irq_mask = dw_msi_mask_irq,
> >  	.irq_unmask = dw_msi_unmask_irq,
> > +	.irq_enable = dw_msi_enable_irq,
> 
> If you're doing that, please implement both enable *and* disable.

Yes, certainly.  I'm not yet convinced I've should put this here or if
my enable method has not missed something.  Just putting this out there
as a way to test the code.

> 
> > +static void dw_pci_bottom_enable(struct irq_data *data)
> > +{
> > +	struct pcie_port *pp = irq_data_get_irq_chip_data(data);
> > +	unsigned int res, bit, ctrl;
> > +	unsigned long flags;
> > +	u32 enable;
> > +
> > +	ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
> > +	res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
> > +	bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
> > +
> > +	raw_spin_lock_irqsave(&pp->lock, flags);
> > +
> > +	dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4,
> > &enable);
> > +	enable |= BIT(bit);
> > +	dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4,
> > enable);

I wonder if there are ENABLE_SET and ENABLE_CLR registers to allow
setting/clearing a bit atomically with one write, as is possible with
the status register.

> > +
> > +	raw_spin_unlock_irqrestore(&pp->lock, flags);
> > +}
> 
> How does it work for drivers that use the callbacks stuff?

Do you mean the ops callbacks in the pcie_port?  There isn't a callback
in dw_pcie_host_ops for enable/disable.  But there is
msi_set/clear_irq, undocumented.  Looks like it's the maybe the same
thing using different terms.  Only user is keystone.  Only callsite is
from the dw_pci_bottom_un/**mask**() methods.

It seems someone else was unclear on the distinction between disabling
and masking an IRQ.

So my patch should play ok with the only affected user, keystone, in
the sense it causes no new problems.

But there might be another flaw here, fixed by:

1. Decide msi_set/clear_irq should enable/disable the irq, remove it
from the un/mask methods and move it into the en/disable methods.

2. Decide msi_set/clear_irq should un/mask the irq and keystone should
not be using it with the keystone "ENABLE" register.

3. Like 2, but decide the keystone enable register really is a mask and
everything is fine.

My bet is on 1.
Gustavo Pimentel Nov. 15, 2018, 3:22 p.m. UTC | #11
On 14/11/2018 18:28, Trent Piepho wrote:
> On Tue, 2018-11-13 at 22:57 +0000, Marc Zyngier wrote:
>> It recently came to light that the Designware PCIe driver is rather
>> broken in the way it handles MSI[1]:
>>
>> - It masks interrupt by disabling them, meaning that MSIs generated
>>   during the masked window are simply lost. Oops.
>>
>> - Acking of the currently pending MSI is done outside of the
>> interrupt
>>   flow, getting moved around randomly and ultimately breaking the
>>   driver. Not great.
>>
>> This series attempts to address this by switching to using the MASK
>> register for masking interrupts (!), and move the ack into the
>> appropriate callback, giving it a fixed place in the MSI handling
>> flow.
>>
>> Note that this is only compile-tested on my arm64 laptop, as I'm
>> travelling and do not have the required HW to test it anyway. I'd
>> welcome both review and testing by the interested parties (dwc
>> maintainer and users affected by existing bugs).
>>
> 
> I've started to test this series after porting all the patches needed
> to make IMX7d work from 4.16.8 to 4.20.0-rc2.
> 
> Took a little while to figure out that the pcieport driver has a new
> config entry to enable, or one gets no interrupts.  I'm not sure if
> this is entirely correct behavior.
> 
> The new domain stuff does not appear to integrate into the existing irq
> framework perfectly.  My interrupt has changed from MSI #1 to MSI
> #524288.  Not the most user friendly number.
> 
> 292:          0          0   PCI-MSI   0 Edge      PCIe PME, aerdrv
> 293:          1          0   PCI-MSI 524288 Edge      impinj-rfid-modem
> 
> Previously the dwc controller would show up as the owner of GPCv2 IRQ
> 122.  It doesn't any more.  Seems like the kernel info for it is wrong.
> 
> /sys/kernel/irq/65/actions:(null)
> /sys/kernel/irq/65/chip_name:GPCv2
> /sys/kernel/irq/65/hwirq:122
> /sys/kernel/irq/65/per_cpu_count:0,0
> /sys/kernel/irq/65/type:edge
> 
> Should be level and the count should be 1,0.  The debugfs interface is
> more accurate:
> 
> # cat /sys/kernel/debug/irq/irqs/65
> handler:  dw_chained_msi_isr
> device:   (null)
> status:   0x00010c00
>             _IRQ_NOPROBE
>             _IRQ_NOREQUEST
>             _IRQ_NOTHREAD
> dstate:   0x03400204
>             IRQ_TYPE_LEVEL_HIGH
>             IRQD_ACTIVATED
>             IRQD_IRQ_STARTED
>             IRQD_SINGLE_TARGET
> 
> Still doesn't know what device it's for.
> 
> Now I can finally test it!
> 
> Confirmed interrupt race is still there in stock kernel.
> 
> Confirmed after my patch I didn't see the race.  Didn't check why the
> broken enable/disable as mask didn't appear cause a new race, but
> something must be wrong somewhere.
> 
> Tried your 1st patch.  As I mentioned before in a reply to Gustavo,
> just changing the enable to mask results in the MSI never getting
> enabled in the first place.  Nothing else writes to the enable
> register...
> 
> As a workaround, I added an irq_enable method to dw_pcie_msi_irq_chip
> that just chains to the parent, and then a hacky irq_enable in
> dw_pci_msi_bottom_irq_chip that manipulates the enable register.
> 
> Now it works again.  Race still present.  I don't see the
> dw_pci_msi_bottom_(un)mask methods ever get called.  I seem to recall
> that they are called as a substitute if enable/disable are not present,
> but haven't confirmed that, which would explain why they are not called
> after I added enable.

Hum, this probably is correlated with [1] where on the describition the this
enumerator says that "One shot does not require mask/unmask" see [2]

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/msi.c?h=v4.20-rc2#n1453

[2]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/irq.h?h=v4.20-rc2#n504

I'm still waiting for an internal confirmation from the IP team about the good
procedure to take on this matter.

As soon I get something I'll post here.

Regards,
Gustavo

> 
> Next tried your next two patches.  No longer see lost interrupts, as
> the status is cleared before the handler is called.
> 
> From what I see the clear of the status bit is effectively at the same
> point in the irq path as the way I cleared it in my patch.  There's
> just a longer call chain to get to it in the ack method.  Not that it's
> not a better place for it (which isn't there in 4.16), but I don't
> think it changes anything.  Is there some reason dw_pci_bottom_ack
> would not be called?
> 
> Since I don't see the un(mask) methods ever get called, I'm not sure if
> they are correct or not.  I also had some unanswered details of
> behavior on unmask.  I can see possible flaws, depending on how this
> works.
>
Trent Piepho Nov. 15, 2018, 6:37 p.m. UTC | #12
On Thu, 2018-11-15 at 15:22 +0000, Gustavo Pimentel wrote:
> On 14/11/2018 18:28, Trent Piepho wrote:
> > On Tue, 2018-11-13 at 22:57 +0000, Marc Zyngier wrote:
> > > 
> > Now it works again.  Race still present.  I don't see the
> > dw_pci_msi_bottom_(un)mask methods ever get called.  I seem to recall
> > that they are called as a substitute if enable/disable are not present,
> > but haven't confirmed that, which would explain why they are not called
> > after I added enable.
> 
> Hum, this probably is correlated with [1] where on the describition the this
> enumerator says that "One shot does not require mask/unmask" see [2]

That seems reasonable then.  I've said before that while I think
masking could be added in the interrupt flow without breaking anything,
I think it's redundant to do it at this layer.

I suspect it might be possible mask/unmask the interrupt "manually",
e.g. UIO does this to allow handling a level interrupt from userspace,
and that would be a path to reach the mask methods.
Marc Zyngier Nov. 15, 2018, 7:29 p.m. UTC | #13
On Thu, 15 Nov 2018 18:37:33 +0000,
Trent Piepho <tpiepho@impinj.com> wrote:
> 
> On Thu, 2018-11-15 at 15:22 +0000, Gustavo Pimentel wrote:
> > On 14/11/2018 18:28, Trent Piepho wrote:
> > > On Tue, 2018-11-13 at 22:57 +0000, Marc Zyngier wrote:
> > > > 
> > > Now it works again.  Race still present.  I don't see the
> > > dw_pci_msi_bottom_(un)mask methods ever get called.  I seem to recall
> > > that they are called as a substitute if enable/disable are not present,
> > > but haven't confirmed that, which would explain why they are not called
> > > after I added enable.
> > 
> > Hum, this probably is correlated with [1] where on the describition the this
> > enumerator says that "One shot does not require mask/unmask" see [2]
> 
> That seems reasonable then.  I've said before that while I think
> masking could be added in the interrupt flow without breaking anything,
> I think it's redundant to do it at this layer.
> 
> I suspect it might be possible mask/unmask the interrupt "manually",
> e.g. UIO does this to allow handling a level interrupt from userspace,
> and that would be a path to reach the mask methods.

The real use case is that a driver can perfectly disable (mask) an
interrupt while being in the interrupt handler, and enable it again at
a later time.

	M.
Trent Piepho Nov. 19, 2018, 8:14 p.m. UTC | #14
On Thu, 2018-11-15 at 19:29 +0000, Marc Zyngier wrote:
> On Thu, 15 Nov 2018 18:37:33 +0000,
> Trent Piepho <tpiepho@impinj.com> wrote:
> > 
> > On Thu, 2018-11-15 at 15:22 +0000, Gustavo Pimentel wrote:
> > > On 14/11/2018 18:28, Trent Piepho wrote:
> > > > On Tue, 2018-11-13 at 22:57 +0000, Marc Zyngier wrote:
> > > > > 
> > > > 
> > > > Now it works again.  Race still present.  I don't see the
> > > > dw_pci_msi_bottom_(un)mask methods ever get called.  I seem to recall
> > > > that they are called as a substitute if enable/disable are not present,
> > > > but haven't confirmed that, which would explain why they are not called
> > > > after I added enable.
> > > 
> > > Hum, this probably is correlated with [1] where on the describition the this
> > > enumerator says that "One shot does not require mask/unmask" see [2]
> > 
> > That seems reasonable then.  I've said before that while I think
> > masking could be added in the interrupt flow without breaking anything,
> > I think it's redundant to do it at this layer.
> > 
> > I suspect it might be possible mask/unmask the interrupt "manually",
> > e.g. UIO does this to allow handling a level interrupt from userspace,
> > and that would be a path to reach the mask methods.
> 
> The real use case is that a driver can perfectly disable (mask) an
> interrupt while being in the interrupt handler, and enable it again at
> a later time.

That's what I was thinking of with UIO, but uio_dmem_genirq_handler()
actually calls disable_irq_nosync().  I had thought it masked the irq
rather than disabled it.

I wonder if this is a UIO bug.  The distinction between disable and
mask is usually that the masked interrupt is not lost while the
disabled interrupt is.  But there are irq chips (like dwc!) that do not
 implement enable/disable methods and use mask methods instead.
Trent Piepho Nov. 19, 2018, 8:37 p.m. UTC | #15
On Wed, 2018-11-14 at 22:01 +0000, Marc Zyngier wrote:
> On Wed, 14 Nov 2018 19:19:27 +0000,
> 
> > It looks to me that the previous code was using whatever MSIs were
> > already enabled when the driver is initialized as the set to leave
> > enabled.
> 
> And I claim that this is a gross bug. We don't want to inherit
> anything at all, rather start from a fresh start.
> 
> > This looks like it changing that behavior, and instead enabling all
> > MSIs on initialization.
> > 
> > I would think the default value for a MSI should be disabled until
> > something enables it.
> 
> Sure, that's no big deal. We can plug in the enable/disable callbacks
> to that effect, although we'll end-up with similar result, I'd expect.
> 
> > I speculated that the previous behavior was trying to work with an MSI
> > enabled by the bootloader, ACPI firmware, etc. that should be left
> > alone.  Or perhaps there was no good reason not to disable everything
> > on initialization and that code just got copied from somewhere else and
> > no one thought about it.  There's certainly evidence of that in this
> > driver.
> 
> As you said, you're speculating. Nonetheless, there is no reason to
> start with anything enabled the first place.

First introduction of this concept explicitly appears to be by Gustavo
in commit 7c5925afbc.  It added irq_status to the driver state and
initialized it from the existing value of the enable register.  Prior
to this the bits were always set in a RMW operation I don't think
initialization was considered.

I don't see any note about why.  All disabled makes far more sense to
me.  You get hard to track down bugs with unexpected interrupts during
kernel boot, but only a soft reboot, because the irq is enabled before
the drivers expected it to be.
Stanimir Varbanov Nov. 21, 2018, 5:24 p.m. UTC | #16
Hi,

On 11/14/18 12:57 AM, Marc Zyngier wrote:
> It recently came to light that the Designware PCIe driver is rather
> broken in the way it handles MSI[1]:
> 
> - It masks interrupt by disabling them, meaning that MSIs generated
>   during the masked window are simply lost. Oops.
> 
> - Acking of the currently pending MSI is done outside of the interrupt
>   flow, getting moved around randomly and ultimately breaking the
>   driver. Not great.
> 
> This series attempts to address this by switching to using the MASK
> register for masking interrupts (!), and move the ack into the
> appropriate callback, giving it a fixed place in the MSI handling
> flow.
> 
> Note that this is only compile-tested on my arm64 laptop, as I'm
> travelling and do not have the required HW to test it anyway. I'd
> welcome both review and testing by the interested parties (dwc
> maintainer and users affected by existing bugs).
> 
> Thanks,
> 
> 	M.
> 
> [1] https://patchwork.kernel.org/patch/10657987/
> 
> Marc Zyngier (3):
>   PCI: designware: Use interrupt masking instead of disabling
>   PCI: designware: Take lock when ACKing an interrupt
>   PCI: designware: Move interrupt acking into the proper callback
> 
>  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 

for pcie-qcom:

Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Gustavo Pimentel Nov. 22, 2018, 12:03 p.m. UTC | #17
Hi all,


> 
> I just realised (at 1am!) that the first patch is awfully buggy:
> - ENABLE and MASK have opposite polarities
> - There is nothing initialising the ENABLE and MASK registers
> 
> I've stashed the following fix-up on top of the existing series:
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index f06e67c60593..0fa9e8fdce66 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -166,7 +166,7 @@ static void dw_pci_bottom_mask(struct irq_data *data)

(snip)

I used your patch and made it more perceptible in my opinion, (basically I
transformed the variable irq_mask (previous irq_status) into a mirror of MASK
registers, which removed the need for the *NOT* operation and forced the mask
operation swap in the callbacks)

Lorenzo can you apply this fix-up in your branch test/pci-dwc-msi? Maybe merging
with eca44651920c ("PCI: designware: Move interrupt acking into the proper
callback"), perhaps.

I've tested Marc's patch series (with and without my fix-up and both) with a
NVme EP that generates a burst of 19 MSI-X interrupts. Both approaches are
working fine.

Just a couple of suggestions Lorenzo, maybe you could exchange the *designware*
by *dwc* on all patch series titles and on eca44651920c("PCI: designware: Move
interrupt acking into the proper callback") replace *acking* by *ACKing* like
previous patch has.

Marc thanks for this patch fix! :)

Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
b/drivers/pci/controller/dwc/pcie-designware-host.c
index 0fa9e8f..a5132b3 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -164,9 +164,9 @@ static void dw_pci_bottom_mask(struct irq_data *data)
                res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
                bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;

-               pp->irq_status[ctrl] &= ~(1 << bit);
+               pp->irq_mask[ctrl] |= BIT(bit);
                dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
-                                   ~pp->irq_status[ctrl]);
+                                   pp->irq_mask[ctrl]);
        }

        raw_spin_unlock_irqrestore(&pp->lock, flags);
@@ -187,30 +187,30 @@ static void dw_pci_bottom_unmask(struct irq_data *data)
                res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
                bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;

-               pp->irq_status[ctrl] |= 1 << bit;
+               pp->irq_mask[ctrl] &= ~BIT(bit);
                dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
-                                   ~pp->irq_status[ctrl]);
+                                   pp->irq_mask[ctrl]);
        }

        raw_spin_unlock_irqrestore(&pp->lock, flags);
 }

-static void dw_pci_bottom_ack(struct irq_data *d)
+static void dw_pci_bottom_ack(struct irq_data *data)
 {
-       struct pcie_port *pp  = irq_data_get_irq_chip_data(d);
+       struct pcie_port *pp = irq_data_get_irq_chip_data(data);
        unsigned int res, bit, ctrl;
        unsigned long flags;

-       ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
+       ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
        res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
-       bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
+       bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;

        raw_spin_lock_irqsave(&pp->lock, flags);

-       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, 1 << bit);
+       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, BIT(bit));

        if (pp->ops->msi_irq_ack)
-               pp->ops->msi_irq_ack(d->hwirq, pp);
+               pp->ops->msi_irq_ack(data->hwirq, pp);

        raw_spin_unlock_irqrestore(&pp->lock, flags);
 }
@@ -665,13 +665,13 @@ void dw_pcie_setup_rc(struct pcie_port *pp)

        /* Initialize IRQ Status array */
        for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
+               pp->irq_mask[ctrl] = ~0;
                dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
                                        (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
-                                   4, ~0);
+                                   4, pp->irq_mask[ctrl]);
                dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
                                        (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
                                    4, ~0);
-               pp->irq_status[ctrl] = 0;
        }

        /* Setup RC BARs */
diff --git a/drivers/pci/controller/dwc/pcie-designware.h
b/drivers/pci/controller/dwc/pcie-designware.h
index 0989d88..9d1614a 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -169,7 +169,7 @@ struct pcie_port {
        struct irq_domain       *msi_domain;
        dma_addr_t              msi_data;
        u32                     num_vectors;
-       u32                     irq_status[MAX_MSI_CTRLS];
+       u32                     irq_mask[MAX_MSI_CTRLS];
        raw_spinlock_t          lock;
        DECLARE_BITMAP(msi_irq_in_use, MAX_MSI_IRQS);
 };
Gustavo Pimentel Nov. 22, 2018, 4:07 p.m. UTC | #18
Hi Lorenzo,

Just to ease the job, please also include this:

Fixes: 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
hierarchical API")

On 22/11/2018 12:03, Gustavo Pimentel wrote:
> 
> Hi all,
> 
> 
>>
>> I just realised (at 1am!) that the first patch is awfully buggy:
>> - ENABLE and MASK have opposite polarities
>> - There is nothing initialising the ENABLE and MASK registers
>>
>> I've stashed the following fix-up on top of the existing series:
>>
>> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
>> index f06e67c60593..0fa9e8fdce66 100644
>> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
>> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
>> @@ -166,7 +166,7 @@ static void dw_pci_bottom_mask(struct irq_data *data)
> 
> (snip)
> 
> I used your patch and made it more perceptible in my opinion, (basically I
> transformed the variable irq_mask (previous irq_status) into a mirror of MASK
> registers, which removed the need for the *NOT* operation and forced the mask
> operation swap in the callbacks)
> 
> Lorenzo can you apply this fix-up in your branch test/pci-dwc-msi? Maybe merging
> with eca44651920c ("PCI: designware: Move interrupt acking into the proper
> callback"), perhaps.
> 
> I've tested Marc's patch series (with and without my fix-up and both) with a
> NVme EP that generates a burst of 19 MSI-X interrupts. Both approaches are
> working fine.
> 
> Just a couple of suggestions Lorenzo, maybe you could exchange the *designware*
> by *dwc* on all patch series titles and on eca44651920c("PCI: designware: Move
> interrupt acking into the proper callback") replace *acking* by *ACKing* like
> previous patch has.
> 
> Marc thanks for this patch fix! :)
> 
> Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 0fa9e8f..a5132b3 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -164,9 +164,9 @@ static void dw_pci_bottom_mask(struct irq_data *data)
>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
> 
> -               pp->irq_status[ctrl] &= ~(1 << bit);
> +               pp->irq_mask[ctrl] |= BIT(bit);
>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
> -                                   ~pp->irq_status[ctrl]);
> +                                   pp->irq_mask[ctrl]);
>         }
> 
>         raw_spin_unlock_irqrestore(&pp->lock, flags);
> @@ -187,30 +187,30 @@ static void dw_pci_bottom_unmask(struct irq_data *data)
>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
> 
> -               pp->irq_status[ctrl] |= 1 << bit;
> +               pp->irq_mask[ctrl] &= ~BIT(bit);
>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
> -                                   ~pp->irq_status[ctrl]);
> +                                   pp->irq_mask[ctrl]);
>         }
> 
>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>  }
> 
> -static void dw_pci_bottom_ack(struct irq_data *d)
> +static void dw_pci_bottom_ack(struct irq_data *data)
>  {
> -       struct pcie_port *pp  = irq_data_get_irq_chip_data(d);
> +       struct pcie_port *pp = irq_data_get_irq_chip_data(data);
>         unsigned int res, bit, ctrl;
>         unsigned long flags;
> 
> -       ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
> +       ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
>         res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
> -       bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
> +       bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
> 
>         raw_spin_lock_irqsave(&pp->lock, flags);
> 
> -       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, 1 << bit);
> +       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, BIT(bit));
> 
>         if (pp->ops->msi_irq_ack)
> -               pp->ops->msi_irq_ack(d->hwirq, pp);
> +               pp->ops->msi_irq_ack(data->hwirq, pp);
> 
>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>  }
> @@ -665,13 +665,13 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
> 
>         /* Initialize IRQ Status array */
>         for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> +               pp->irq_mask[ctrl] = ~0;
>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
>                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> -                                   4, ~0);
> +                                   4, pp->irq_mask[ctrl]);
>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
>                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
>                                     4, ~0);
> -               pp->irq_status[ctrl] = 0;
>         }
> 
>         /* Setup RC BARs */
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h
> b/drivers/pci/controller/dwc/pcie-designware.h
> index 0989d88..9d1614a 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -169,7 +169,7 @@ struct pcie_port {
>         struct irq_domain       *msi_domain;
>         dma_addr_t              msi_data;
>         u32                     num_vectors;
> -       u32                     irq_status[MAX_MSI_CTRLS];
> +       u32                     irq_mask[MAX_MSI_CTRLS];
>         raw_spinlock_t          lock;
>         DECLARE_BITMAP(msi_irq_in_use, MAX_MSI_IRQS);
>  };
>
Lorenzo Pieralisi Nov. 22, 2018, 4:26 p.m. UTC | #19
On Thu, Nov 22, 2018 at 12:03:25PM +0000, Gustavo Pimentel wrote:
> 
> Hi all,
> 
> 
> > 
> > I just realised (at 1am!) that the first patch is awfully buggy:
> > - ENABLE and MASK have opposite polarities
> > - There is nothing initialising the ENABLE and MASK registers
> > 
> > I've stashed the following fix-up on top of the existing series:
> > 
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index f06e67c60593..0fa9e8fdce66 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -166,7 +166,7 @@ static void dw_pci_bottom_mask(struct irq_data *data)
> 
> (snip)
> 
> I used your patch and made it more perceptible in my opinion, (basically I
> transformed the variable irq_mask (previous irq_status) into a mirror of MASK
> registers, which removed the need for the *NOT* operation and forced the mask
> operation swap in the callbacks)
> 
> Lorenzo can you apply this fix-up in your branch test/pci-dwc-msi? Maybe merging
> with eca44651920c ("PCI: designware: Move interrupt acking into the proper
> callback"), perhaps.
> 
> I've tested Marc's patch series (with and without my fix-up and both) with a
> NVme EP that generates a burst of 19 MSI-X interrupts. Both approaches are
> working fine.

I want to see more testing before getting this series merged upstream,
especially for the platform configurations on which this bug was
reported.

I am a bit annoyed with DWC platform maintainers in this regard.

> Just a couple of suggestions Lorenzo, maybe you could exchange the *designware*
> by *dwc* on all patch series titles and on eca44651920c("PCI: designware: Move
> interrupt acking into the proper callback") replace *acking* by *ACKing* like
> previous patch has.
> 
> Marc thanks for this patch fix! :)
> 
> Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 0fa9e8f..a5132b3 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -164,9 +164,9 @@ static void dw_pci_bottom_mask(struct irq_data *data)
>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
> 
> -               pp->irq_status[ctrl] &= ~(1 << bit);
> +               pp->irq_mask[ctrl] |= BIT(bit);
>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
> -                                   ~pp->irq_status[ctrl]);
> +                                   pp->irq_mask[ctrl]);
>         }
> 
>         raw_spin_unlock_irqrestore(&pp->lock, flags);
> @@ -187,30 +187,30 @@ static void dw_pci_bottom_unmask(struct irq_data *data)
>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
> 
> -               pp->irq_status[ctrl] |= 1 << bit;
> +               pp->irq_mask[ctrl] &= ~BIT(bit);
>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
> -                                   ~pp->irq_status[ctrl]);
> +                                   pp->irq_mask[ctrl]);
>         }
> 
>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>  }
> 
> -static void dw_pci_bottom_ack(struct irq_data *d)
> +static void dw_pci_bottom_ack(struct irq_data *data)
>  {
> -       struct pcie_port *pp  = irq_data_get_irq_chip_data(d);
> +       struct pcie_port *pp = irq_data_get_irq_chip_data(data);
>         unsigned int res, bit, ctrl;
>         unsigned long flags;
> 
> -       ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
> +       ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
>         res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
> -       bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
> +       bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
> 
>         raw_spin_lock_irqsave(&pp->lock, flags);
> 
> -       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, 1 << bit);
> +       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, BIT(bit));
> 
>         if (pp->ops->msi_irq_ack)
> -               pp->ops->msi_irq_ack(d->hwirq, pp);
> +               pp->ops->msi_irq_ack(data->hwirq, pp);

Changes in this hunk are unrelated, I won't squash them in.

Lorenzo

> 
>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>  }
> @@ -665,13 +665,13 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
> 
>         /* Initialize IRQ Status array */
>         for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
> +               pp->irq_mask[ctrl] = ~0;
>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
>                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
> -                                   4, ~0);
> +                                   4, pp->irq_mask[ctrl]);
>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
>                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
>                                     4, ~0);
> -               pp->irq_status[ctrl] = 0;
>         }
> 
>         /* Setup RC BARs */
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h
> b/drivers/pci/controller/dwc/pcie-designware.h
> index 0989d88..9d1614a 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -169,7 +169,7 @@ struct pcie_port {
>         struct irq_domain       *msi_domain;
>         dma_addr_t              msi_data;
>         u32                     num_vectors;
> -       u32                     irq_status[MAX_MSI_CTRLS];
> +       u32                     irq_mask[MAX_MSI_CTRLS];
>         raw_spinlock_t          lock;
>         DECLARE_BITMAP(msi_irq_in_use, MAX_MSI_IRQS);
>  };
>
Marc Zyngier Nov. 22, 2018, 4:38 p.m. UTC | #20
On 22/11/2018 16:26, Lorenzo Pieralisi wrote:
> On Thu, Nov 22, 2018 at 12:03:25PM +0000, Gustavo Pimentel wrote:

[...]

>> Just a couple of suggestions Lorenzo, maybe you could exchange the *designware*
>> by *dwc* on all patch series titles and on eca44651920c("PCI: designware: Move
>> interrupt acking into the proper callback") replace *acking* by *ACKing* like
>> previous patch has.
>>
>> Marc thanks for this patch fix! :)
>>
>> Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
>>
>> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
>> b/drivers/pci/controller/dwc/pcie-designware-host.c
>> index 0fa9e8f..a5132b3 100644
>> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
>> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
>> @@ -164,9 +164,9 @@ static void dw_pci_bottom_mask(struct irq_data *data)
>>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>
>> -               pp->irq_status[ctrl] &= ~(1 << bit);
>> +               pp->irq_mask[ctrl] |= BIT(bit);
>>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
>> -                                   ~pp->irq_status[ctrl]);
>> +                                   pp->irq_mask[ctrl]);
>>         }
>>
>>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>> @@ -187,30 +187,30 @@ static void dw_pci_bottom_unmask(struct irq_data *data)
>>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>
>> -               pp->irq_status[ctrl] |= 1 << bit;
>> +               pp->irq_mask[ctrl] &= ~BIT(bit);
>>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
>> -                                   ~pp->irq_status[ctrl]);
>> +                                   pp->irq_mask[ctrl]);
>>         }
>>
>>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>>  }
>>
>> -static void dw_pci_bottom_ack(struct irq_data *d)
>> +static void dw_pci_bottom_ack(struct irq_data *data)
>>  {
>> -       struct pcie_port *pp  = irq_data_get_irq_chip_data(d);
>> +       struct pcie_port *pp = irq_data_get_irq_chip_data(data);
>>         unsigned int res, bit, ctrl;
>>         unsigned long flags;
>>
>> -       ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
>> +       ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
>>         res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>> -       bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
>> +       bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>
>>         raw_spin_lock_irqsave(&pp->lock, flags);
>>
>> -       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, 1 << bit);
>> +       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, BIT(bit));
>>
>>         if (pp->ops->msi_irq_ack)
>> -               pp->ops->msi_irq_ack(d->hwirq, pp);
>> +               pp->ops->msi_irq_ack(data->hwirq, pp);
> 
> Changes in this hunk are unrelated, I won't squash them in.

To add to Lorenzo's comment, we're trying hard to have a *minimal* fix
that can be easily backported. Changing variable and field names as well
as flipping the semantic of other bits of the driver makes it harder to
review, and certainly doesn't help getting things backported to stable
(see the stable kernel rules).

I'd suggest this kind of repainting is better kept as a separate patch
and merged separately.

Thanks,

	M.
Gustavo Pimentel Nov. 22, 2018, 5:40 p.m. UTC | #21
On 22/11/2018 16:38, Marc Zyngier wrote:
> On 22/11/2018 16:26, Lorenzo Pieralisi wrote:
>> On Thu, Nov 22, 2018 at 12:03:25PM +0000, Gustavo Pimentel wrote:
> 
> [...]
> 
>>> Just a couple of suggestions Lorenzo, maybe you could exchange the *designware*
>>> by *dwc* on all patch series titles and on eca44651920c("PCI: designware: Move
>>> interrupt acking into the proper callback") replace *acking* by *ACKing* like
>>> previous patch has.
>>>
>>> Marc thanks for this patch fix! :)
>>>
>>> Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
>>>
>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
>>> b/drivers/pci/controller/dwc/pcie-designware-host.c
>>> index 0fa9e8f..a5132b3 100644
>>> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
>>> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
>>> @@ -164,9 +164,9 @@ static void dw_pci_bottom_mask(struct irq_data *data)
>>>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>>>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>>
>>> -               pp->irq_status[ctrl] &= ~(1 << bit);
>>> +               pp->irq_mask[ctrl] |= BIT(bit);
>>>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
>>> -                                   ~pp->irq_status[ctrl]);
>>> +                                   pp->irq_mask[ctrl]);
>>>         }
>>>
>>>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>>> @@ -187,30 +187,30 @@ static void dw_pci_bottom_unmask(struct irq_data *data)
>>>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>>>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>>
>>> -               pp->irq_status[ctrl] |= 1 << bit;
>>> +               pp->irq_mask[ctrl] &= ~BIT(bit);
>>>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
>>> -                                   ~pp->irq_status[ctrl]);
>>> +                                   pp->irq_mask[ctrl]);
>>>         }
>>>
>>>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>>>  }
>>>
>>> -static void dw_pci_bottom_ack(struct irq_data *d)
>>> +static void dw_pci_bottom_ack(struct irq_data *data)
>>>  {
>>> -       struct pcie_port *pp  = irq_data_get_irq_chip_data(d);
>>> +       struct pcie_port *pp = irq_data_get_irq_chip_data(data);
>>>         unsigned int res, bit, ctrl;
>>>         unsigned long flags;
>>>
>>> -       ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
>>> +       ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
>>>         res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>>> -       bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>> +       bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>>
>>>         raw_spin_lock_irqsave(&pp->lock, flags);
>>>
>>> -       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, 1 << bit);
>>> +       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, BIT(bit));
>>>
>>>         if (pp->ops->msi_irq_ack)
>>> -               pp->ops->msi_irq_ack(d->hwirq, pp);
>>> +               pp->ops->msi_irq_ack(data->hwirq, pp);
>>
>> Changes in this hunk are unrelated, I won't squash them in.
> 
> To add to Lorenzo's comment, we're trying hard to have a *minimal* fix
> that can be easily backported. Changing variable and field names as well
> as flipping the semantic of other bits of the driver makes it harder to
> review, and certainly doesn't help getting things backported to stable
> (see the stable kernel rules).
> 
> I'd suggest this kind of repainting is better kept as a separate patch
> and merged separately.

Makes sense.

Gustavo

> 
> Thanks,
> 
> 	M.
>
Gustavo Pimentel Nov. 22, 2018, 5:49 p.m. UTC | #22
On 22/11/2018 16:26, Lorenzo Pieralisi wrote:
> On Thu, Nov 22, 2018 at 12:03:25PM +0000, Gustavo Pimentel wrote:
>>
>> Hi all,
>>
>>
>>>
>>> I just realised (at 1am!) that the first patch is awfully buggy:
>>> - ENABLE and MASK have opposite polarities
>>> - There is nothing initialising the ENABLE and MASK registers
>>>
>>> I've stashed the following fix-up on top of the existing series:
>>>
>>> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
>>> index f06e67c60593..0fa9e8fdce66 100644
>>> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
>>> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
>>> @@ -166,7 +166,7 @@ static void dw_pci_bottom_mask(struct irq_data *data)
>>
>> (snip)
>>
>> I used your patch and made it more perceptible in my opinion, (basically I
>> transformed the variable irq_mask (previous irq_status) into a mirror of MASK
>> registers, which removed the need for the *NOT* operation and forced the mask
>> operation swap in the callbacks)
>>
>> Lorenzo can you apply this fix-up in your branch test/pci-dwc-msi? Maybe merging
>> with eca44651920c ("PCI: designware: Move interrupt acking into the proper
>> callback"), perhaps.
>>
>> I've tested Marc's patch series (with and without my fix-up and both) with a
>> NVme EP that generates a burst of 19 MSI-X interrupts. Both approaches are
>> working fine.
> 
> I want to see more testing before getting this series merged upstream,
> especially for the platform configurations on which this bug was
> reported.

Yes, of course.

> 
> I am a bit annoyed with DWC platform maintainers in this regard.
> 
>> Just a couple of suggestions Lorenzo, maybe you could exchange the *designware*
>> by *dwc* on all patch series titles and on eca44651920c("PCI: designware: Move
>> interrupt acking into the proper callback") replace *acking* by *ACKing* like
>> previous patch has.
>>
>> Marc thanks for this patch fix! :)
>>
>> Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
>>
>> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
>> b/drivers/pci/controller/dwc/pcie-designware-host.c
>> index 0fa9e8f..a5132b3 100644
>> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
>> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
>> @@ -164,9 +164,9 @@ static void dw_pci_bottom_mask(struct irq_data *data)
>>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>
>> -               pp->irq_status[ctrl] &= ~(1 << bit);
>> +               pp->irq_mask[ctrl] |= BIT(bit);
>>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
>> -                                   ~pp->irq_status[ctrl]);
>> +                                   pp->irq_mask[ctrl]);
>>         }
>>
>>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>> @@ -187,30 +187,30 @@ static void dw_pci_bottom_unmask(struct irq_data *data)
>>                 res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>>                 bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>
>> -               pp->irq_status[ctrl] |= 1 << bit;
>> +               pp->irq_mask[ctrl] &= ~BIT(bit);
>>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK + res, 4,
>> -                                   ~pp->irq_status[ctrl]);
>> +                                   pp->irq_mask[ctrl]);
>>         }
>>
>>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>>  }
>>
>> -static void dw_pci_bottom_ack(struct irq_data *d)
>> +static void dw_pci_bottom_ack(struct irq_data *data)
>>  {
>> -       struct pcie_port *pp  = irq_data_get_irq_chip_data(d);
>> +       struct pcie_port *pp = irq_data_get_irq_chip_data(data);
>>         unsigned int res, bit, ctrl;
>>         unsigned long flags;
>>
>> -       ctrl = d->hwirq / MAX_MSI_IRQS_PER_CTRL;
>> +       ctrl = data->hwirq / MAX_MSI_IRQS_PER_CTRL;
>>         res = ctrl * MSI_REG_CTRL_BLOCK_SIZE;
>> -       bit = d->hwirq % MAX_MSI_IRQS_PER_CTRL;
>> +       bit = data->hwirq % MAX_MSI_IRQS_PER_CTRL;
>>
>>         raw_spin_lock_irqsave(&pp->lock, flags);
>>
>> -       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, 1 << bit);
>> +       dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS + res, 4, BIT(bit));
>>
>>         if (pp->ops->msi_irq_ack)
>> -               pp->ops->msi_irq_ack(d->hwirq, pp);
>> +               pp->ops->msi_irq_ack(data->hwirq, pp);
> 
> Changes in this hunk are unrelated, I won't squash them in.

Following Marc's suggestion, this can go in a separate patch.

> 
> Lorenzo
> 
>>
>>         raw_spin_unlock_irqrestore(&pp->lock, flags);
>>  }
>> @@ -665,13 +665,13 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
>>
>>         /* Initialize IRQ Status array */
>>         for (ctrl = 0; ctrl < num_ctrls; ctrl++) {
>> +               pp->irq_mask[ctrl] = ~0;
>>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_MASK +
>>                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
>> -                                   4, ~0);
>> +                                   4, pp->irq_mask[ctrl]);
>>                 dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE +
>>                                         (ctrl * MSI_REG_CTRL_BLOCK_SIZE),
>>                                     4, ~0);
>> -               pp->irq_status[ctrl] = 0;
>>         }
>>
>>         /* Setup RC BARs */
>> diff --git a/drivers/pci/controller/dwc/pcie-designware.h
>> b/drivers/pci/controller/dwc/pcie-designware.h
>> index 0989d88..9d1614a 100644
>> --- a/drivers/pci/controller/dwc/pcie-designware.h
>> +++ b/drivers/pci/controller/dwc/pcie-designware.h
>> @@ -169,7 +169,7 @@ struct pcie_port {
>>         struct irq_domain       *msi_domain;
>>         dma_addr_t              msi_data;
>>         u32                     num_vectors;
>> -       u32                     irq_status[MAX_MSI_CTRLS];
>> +       u32                     irq_mask[MAX_MSI_CTRLS];
>>         raw_spinlock_t          lock;
>>         DECLARE_BITMAP(msi_irq_in_use, MAX_MSI_IRQS);
>>  };
>>
Trent Piepho Nov. 26, 2018, 3:52 p.m. UTC | #23
On Thu, 2018-11-22 at 12:03 +0000, Gustavo Pimentel wrote:
> 
> > I just realised (at 1am!) that the first patch is awfully buggy:
> > - ENABLE and MASK have opposite polarities
> > - There is nothing initialising the ENABLE and MASK registers
> > 
> > I've stashed the following fix-up on top of the existing series:
> > 
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
> > b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index f06e67c60593..0fa9e8fdce66 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -166,7 +166,7 @@ static void dw_pci_bottom_mask(struct irq_data
> > *data)
> 
> (snip)
> 
> I used your patch and made it more perceptible in my opinion, (basically I
> transformed the variable irq_mask (previous irq_status) into a mirror of MASK
> registers, which removed the need for the *NOT* operation and forced the mask
> operation swap in the callbacks)

This would be the patch that enables all MSI interrupts on driver
initialization?

I don't think that's a good idea.  I was under the impression Marc
thought that as well.  It would be better to enable them when they are
enabled, via enable and disable methods.

I've looked into the interplay with enable and mask some more. 
Previous versions of the driver used the mask registers to both
en/disable and un/mask MSIs.  Both enable and mask methods were
provided, but they were the same mask function in the driver.

Functions like irq_percpu_enable() will call the irq_enable method, but
fall-back to irq_unmask if the chip does not provide irq_enable.  So a
driver can be sloppy about the distinction between masking and disable
an irq and it isn't necessarily apparent.

Which is how keystone has ks_pcie_msi_set_irq() that is part of the
unmask method (now), but certainly looks like it should be part of
enable.

I think the right way to fix this driver would be to:
Ack the irq in the irq_ack method.
Write enable and disable methods that use the enable register
Write mask/unmask methods that use the mask register
Rename the msi_(set|clr)_irq methods to msi_(en|dis)able_irq
Call ops->msi_enable_irq() from dw_pci_bottom_enable()
irq_status should get renamed to irq_mask, since it's just a driver 
cache of the mask register.  Could also cache the enable register, but
that reg is used less often.  Could also use regcache to cache these
registers.
Trent Piepho Nov. 26, 2018, 4:06 p.m. UTC | #24
On Thu, 2018-11-22 at 16:38 +0000, Marc Zyngier wrote:
> 
> To add to Lorenzo's comment, we're trying hard to have a *minimal* fix
> that can be easily backported. Changing variable and field names as well
> as flipping the semantic of other bits of the driver makes it harder to
> review, and certainly doesn't help getting things backported to stable
> (see the stable kernel rules).
> 
> I'd suggest this kind of repainting is better kept as a separate patch
> and merged separately.

If you want a minimal fix to backport, you should just use my original
patch for 4.15-4.19 stable series.  It's a one line change to fix the
real bug.

It'll be far easier to backport to < 4.17,  where the hierarchical irq
domain stuff wasn't used in this part of the driver.

If you follow the control flow of an irq, you'll see that it ACKs the
irq at the same point as putting the ack in a new irq_ack method does.

While the driver shouldn't enable or disable on mask, that's also a new
bug from 4.17 that doesn't need to be fixed in earlier versions.

It's unlikely to come up, in 4.17+, since the mask methods don't get
used in normal irq flow.  Still a bug.
Marc Zyngier Nov. 27, 2018, 7:50 a.m. UTC | #25
On Mon, 26 Nov 2018 15:52:42 +0000,
Trent Piepho <tpiepho@impinj.com> wrote:
> 
> On Thu, 2018-11-22 at 12:03 +0000, Gustavo Pimentel wrote:
> > 
> > > I just realised (at 1am!) that the first patch is awfully buggy:
> > > - ENABLE and MASK have opposite polarities
> > > - There is nothing initialising the ENABLE and MASK registers
> > > 
> > > I've stashed the following fix-up on top of the existing series:
> > > 
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > index f06e67c60593..0fa9e8fdce66 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > @@ -166,7 +166,7 @@ static void dw_pci_bottom_mask(struct irq_data
> > > *data)
> > 
> > (snip)
> > 
> > I used your patch and made it more perceptible in my opinion, (basically I
> > transformed the variable irq_mask (previous irq_status) into a mirror of MASK
> > registers, which removed the need for the *NOT* operation and forced the mask
> > operation swap in the callbacks)
> 
> This would be the patch that enables all MSI interrupts on driver
> initialization?
> 
> I don't think that's a good idea.  I was under the impression Marc
> thought that as well.  It would be better to enable them when they are
> enabled, via enable and disable methods.

What gain does this bring? Frankly, I see exactly zero advantage of
doing so. It may look cosmetically appealing in the sense that it
makes use of of a register that the IP offers, but for Linux the
advantage is basically null.

Thanks,

	M.
Marc Zyngier Nov. 27, 2018, 7:51 a.m. UTC | #26
On Mon, 26 Nov 2018 16:06:54 +0000,
Trent Piepho <tpiepho@impinj.com> wrote:
> 
> On Thu, 2018-11-22 at 16:38 +0000, Marc Zyngier wrote:
> > 
> > To add to Lorenzo's comment, we're trying hard to have a *minimal* fix
> > that can be easily backported. Changing variable and field names as well
> > as flipping the semantic of other bits of the driver makes it harder to
> > review, and certainly doesn't help getting things backported to stable
> > (see the stable kernel rules).
> > 
> > I'd suggest this kind of repainting is better kept as a separate patch
> > and merged separately.
> 
> If you want a minimal fix to backport, you should just use my original
> patch for 4.15-4.19 stable series.  It's a one line change to fix the
> real bug.
> 
> It'll be far easier to backport to < 4.17,  where the hierarchical irq
> domain stuff wasn't used in this part of the driver.

The right fix is to create an irq_ack method and put the relevant code
there.

Thanks,

	M.
Trent Piepho Nov. 27, 2018, 5:23 p.m. UTC | #27
On Tue, 2018-11-27 at 07:51 +0000, Marc Zyngier wrote:
> On Mon, 26 Nov 2018 16:06:54 +0000,
> Trent Piepho <tpiepho@impinj.com> wrote:
> > 
> > On Thu, 2018-11-22 at 16:38 +0000, Marc Zyngier wrote:
> > > 
> > > To add to Lorenzo's comment, we're trying hard to have a
> > > *minimal* fix
> > > that can be easily backported. Changing variable and field names
> > > as well
> > > as flipping the semantic of other bits of the driver makes it
> > > harder to
> > > review, and certainly doesn't help getting things backported to
> > > stable
> > > (see the stable kernel rules).
> > > 
> > > I'd suggest this kind of repainting is better kept as a separate
> > > patch
> > > and merged separately.
> > 
> > If you want a minimal fix to backport, you should just use my original
> > patch for 4.15-4.19 stable series.  It's a one line change to fix the
> > real bug.
> > 
> > It'll be far easier to backport to < 4.17,  where the hierarchical irq
> > domain stuff wasn't used in this part of the driver.
> 
> The right fix is to create an irq_ack method and put the relevant code
> there.

Fine.

But the irq domain stuff should go into the stable 4.15 and 4.16
kernels.  It's not a fix and it changes behavior in how the interrupts
look in /proc significantly.
Trent Piepho Nov. 27, 2018, 6:12 p.m. UTC | #28
On Tue, 2018-11-27 at 07:50 +0000, Marc Zyngier wrote:
> On Mon, 26 Nov 2018 15:52:42 +0000,
> Trent Piepho <tpiepho@impinj.com> wrote:
> > 
> > > I used your patch and made it more perceptible in my opinion, (basically I
> > > transformed the variable irq_mask (previous irq_status) into a mirror of MASK
> > > registers, which removed the need for the *NOT* operation and forced the mask
> > > operation swap in the callbacks)
> > 
> > This would be the patch that enables all MSI interrupts on driver
> > initialization?
> > 
> > I don't think that's a good idea.  I was under the impression Marc
> > thought that as well.  It would be better to enable them when they are
> > enabled, via enable and disable methods.
> 
> What gain does this bring? Frankly, I see exactly zero advantage of
> doing so. It may look cosmetically appealing in the sense that it
> makes use of of a register that the IP offers, but for Linux the
> advantage is basically null.

Here's why:

1. It's a big change in driver behavior to enable all MSIs.  There will
certainly be hardware that writes to an MSI-X address, perhaps to
generate a MSI, perhaps not, where that MSI is disabled.  Now that
hardware will start generating interrupts.  That could be a big deal. 
The MSI might well have been not enabled very intentionally!  No reason
to create that change in behavior and also very much not a good idea to
backport to stable kernels.

2. Existing code is not clear about the difference between mask and
disable, getting it wrong in some places and causing bugs.  Creating
both mask and disable will make it clear.  It's the same reasoning you
use to reject my simple patch to put the irq ack at the correct time
and instead also put it in the semantically correct location.

3. A platform, keystone, has provided enable/disable methods for the
dwc driver.  But they are (now) called from the mask/unmask routines?!
That's not right; it'll drop irqs if it's really an enable/disable
feature in keystone.  Without dwc enable/disable methods, where will
the platform enable/disable methods be called from?  These methods are
getting randomly moved from mask to disable and back, like the ack
getting moved around, and clear distinction between disable and mask
will help stop that.
Niklas Cassel Dec. 1, 2018, 11:50 p.m. UTC | #29
On Wed, Nov 21, 2018 at 07:24:53PM +0200, Stanimir Varbanov wrote:
> Hi,
> 
> On 11/14/18 12:57 AM, Marc Zyngier wrote:
> > It recently came to light that the Designware PCIe driver is rather
> > broken in the way it handles MSI[1]:
> > 
> > - It masks interrupt by disabling them, meaning that MSIs generated
> >   during the masked window are simply lost. Oops.
> > 
> > - Acking of the currently pending MSI is done outside of the interrupt
> >   flow, getting moved around randomly and ultimately breaking the
> >   driver. Not great.
> > 
> > This series attempts to address this by switching to using the MASK
> > register for masking interrupts (!), and move the ack into the
> > appropriate callback, giving it a fixed place in the MSI handling
> > flow.
> > 
> > Note that this is only compile-tested on my arm64 laptop, as I'm
> > travelling and do not have the required HW to test it anyway. I'd
> > welcome both review and testing by the interested parties (dwc
> > maintainer and users affected by existing bugs).
> > 
> > Thanks,
> > 
> > 	M.
> > 
> > [1] https://patchwork.kernel.org/patch/10657987/
> > 
> > Marc Zyngier (3):
> >   PCI: designware: Use interrupt masking instead of disabling
> >   PCI: designware: Take lock when ACKing an interrupt
> >   PCI: designware: Move interrupt acking into the proper callback
> > 
> >  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
> >  1 file changed, 14 insertions(+), 8 deletions(-)
> > 
> 
> for pcie-qcom:
> 
> Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>

Hello PCI folks,

Since this is a real bug, we should try get a couple of Tested-by tags
before it's too late.
It would be nice if v4.20 was released without broken MSIs in this driver.

Personally I get confused just by looking at this mail thread.

I see 3 patches from Marc and a fix-up from Marc, but I also see
a patch from Gustavo, and another patch from Trent.

Is seems like Lorenzo has a branch with Marc's 3 patches + Marc's fix-up
folded in here:
https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi

Perhaps it would be a good idea to send a V2, with proper Fixes-tags,
just so that people would know what to test, so that we can start getting
those Tested-by tags.

Stan, may I ask exactly what patches you tested?


Kind regards,
Niklas
Stanimir Varbanov Dec. 2, 2018, 11:28 a.m. UTC | #30
Hi Niklas,

On 2.12.18 г. 1:50 ч., Niklas Cassel wrote:
> On Wed, Nov 21, 2018 at 07:24:53PM +0200, Stanimir Varbanov wrote:
>> Hi,
>>
>> On 11/14/18 12:57 AM, Marc Zyngier wrote:
>>> It recently came to light that the Designware PCIe driver is rather
>>> broken in the way it handles MSI[1]:
>>>
>>> - It masks interrupt by disabling them, meaning that MSIs generated
>>>    during the masked window are simply lost. Oops.
>>>
>>> - Acking of the currently pending MSI is done outside of the interrupt
>>>    flow, getting moved around randomly and ultimately breaking the
>>>    driver. Not great.
>>>
>>> This series attempts to address this by switching to using the MASK
>>> register for masking interrupts (!), and move the ack into the
>>> appropriate callback, giving it a fixed place in the MSI handling
>>> flow.
>>>
>>> Note that this is only compile-tested on my arm64 laptop, as I'm
>>> travelling and do not have the required HW to test it anyway. I'd
>>> welcome both review and testing by the interested parties (dwc
>>> maintainer and users affected by existing bugs).
>>>
>>> Thanks,
>>>
>>> 	M.
>>>
>>> [1] https://patchwork.kernel.org/patch/10657987/
>>>
>>> Marc Zyngier (3):
>>>    PCI: designware: Use interrupt masking instead of disabling
>>>    PCI: designware: Take lock when ACKing an interrupt
>>>    PCI: designware: Move interrupt acking into the proper callback
>>>
>>>   .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
>>>   1 file changed, 14 insertions(+), 8 deletions(-)
>>>
>>
>> for pcie-qcom:
>>
>> Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
> 
> Hello PCI folks,
> 
> Since this is a real bug, we should try get a couple of Tested-by tags
> before it's too late.
> It would be nice if v4.20 was released without broken MSIs in this driver.
> 
> Personally I get confused just by looking at this mail thread.
> 
> I see 3 patches from Marc and a fix-up from Marc, but I also see
> a patch from Gustavo, and another patch from Trent.
> 
> Is seems like Lorenzo has a branch with Marc's 3 patches + Marc's fix-up
> folded in here:
> https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
> 
> Perhaps it would be a good idea to send a V2, with proper Fixes-tags,
> just so that people would know what to test, so that we can start getting
> those Tested-by tags.
> 
> Stan, may I ask exactly what patches you tested?

I tested Lorenzo's branch above ^^^.

regards,
Stan
Lorenzo Pieralisi Dec. 3, 2018, 10:42 a.m. UTC | #31
On Sun, Dec 02, 2018 at 12:50:58AM +0100, Niklas Cassel wrote:
> On Wed, Nov 21, 2018 at 07:24:53PM +0200, Stanimir Varbanov wrote:
> > Hi,
> > 
> > On 11/14/18 12:57 AM, Marc Zyngier wrote:
> > > It recently came to light that the Designware PCIe driver is rather
> > > broken in the way it handles MSI[1]:
> > > 
> > > - It masks interrupt by disabling them, meaning that MSIs generated
> > >   during the masked window are simply lost. Oops.
> > > 
> > > - Acking of the currently pending MSI is done outside of the interrupt
> > >   flow, getting moved around randomly and ultimately breaking the
> > >   driver. Not great.
> > > 
> > > This series attempts to address this by switching to using the MASK
> > > register for masking interrupts (!), and move the ack into the
> > > appropriate callback, giving it a fixed place in the MSI handling
> > > flow.
> > > 
> > > Note that this is only compile-tested on my arm64 laptop, as I'm
> > > travelling and do not have the required HW to test it anyway. I'd
> > > welcome both review and testing by the interested parties (dwc
> > > maintainer and users affected by existing bugs).
> > > 
> > > Thanks,
> > > 
> > > 	M.
> > > 
> > > [1] https://patchwork.kernel.org/patch/10657987/
> > > 
> > > Marc Zyngier (3):
> > >   PCI: designware: Use interrupt masking instead of disabling
> > >   PCI: designware: Take lock when ACKing an interrupt
> > >   PCI: designware: Move interrupt acking into the proper callback
> > > 
> > >  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
> > >  1 file changed, 14 insertions(+), 8 deletions(-)
> > > 
> > 
> > for pcie-qcom:
> > 
> > Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
> 
> Hello PCI folks,
> 
> Since this is a real bug, we should try get a couple of Tested-by tags
> before it's too late.
> It would be nice if v4.20 was released without broken MSIs in this driver.
> 
> Personally I get confused just by looking at this mail thread.
> 
> I see 3 patches from Marc and a fix-up from Marc, but I also see
> a patch from Gustavo, and another patch from Trent.
> 
> Is seems like Lorenzo has a branch with Marc's 3 patches + Marc's fix-up
> folded in here:
> https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
> 
> Perhaps it would be a good idea to send a V2, with proper Fixes-tags,
> just so that people would know what to test, so that we can start getting
> those Tested-by tags.

Perhaps it would be a good idea to pull the branch above and test it
after I have sent three reminders to all DWC host bridge maintainers through
this email thread.

I have no problem reposting those patches but it is time you started
testing them, I have already explained what's the issue they are fixing
in this thread, I do not think a Fixes: tag will add any further degree
of urgency.

Thanks,
Lorenzo
Niklas Cassel Dec. 3, 2018, 1:09 p.m. UTC | #32
On Mon, Dec 03, 2018 at 10:42:19AM +0000, Lorenzo Pieralisi wrote:
> On Sun, Dec 02, 2018 at 12:50:58AM +0100, Niklas Cassel wrote:
> > On Wed, Nov 21, 2018 at 07:24:53PM +0200, Stanimir Varbanov wrote:
> > > Hi,
> > > 
> > > On 11/14/18 12:57 AM, Marc Zyngier wrote:
> > > > It recently came to light that the Designware PCIe driver is rather
> > > > broken in the way it handles MSI[1]:
> > > > 
> > > > - It masks interrupt by disabling them, meaning that MSIs generated
> > > >   during the masked window are simply lost. Oops.
> > > > 
> > > > - Acking of the currently pending MSI is done outside of the interrupt
> > > >   flow, getting moved around randomly and ultimately breaking the
> > > >   driver. Not great.
> > > > 
> > > > This series attempts to address this by switching to using the MASK
> > > > register for masking interrupts (!), and move the ack into the
> > > > appropriate callback, giving it a fixed place in the MSI handling
> > > > flow.
> > > > 
> > > > Note that this is only compile-tested on my arm64 laptop, as I'm
> > > > travelling and do not have the required HW to test it anyway. I'd
> > > > welcome both review and testing by the interested parties (dwc
> > > > maintainer and users affected by existing bugs).
> > > > 
> > > > Thanks,
> > > > 
> > > > 	M.
> > > > 
> > > > [1] https://patchwork.kernel.org/patch/10657987/
> > > > 
> > > > Marc Zyngier (3):
> > > >   PCI: designware: Use interrupt masking instead of disabling
> > > >   PCI: designware: Take lock when ACKing an interrupt
> > > >   PCI: designware: Move interrupt acking into the proper callback
> > > > 
> > > >  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
> > > >  1 file changed, 14 insertions(+), 8 deletions(-)
> > > > 
> > > 
> > > for pcie-qcom:
> > > 
> > > Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
> > 
> > Hello PCI folks,
> > 
> > Since this is a real bug, we should try get a couple of Tested-by tags
> > before it's too late.
> > It would be nice if v4.20 was released without broken MSIs in this driver.
> > 
> > Personally I get confused just by looking at this mail thread.
> > 
> > I see 3 patches from Marc and a fix-up from Marc, but I also see
> > a patch from Gustavo, and another patch from Trent.
> > 
> > Is seems like Lorenzo has a branch with Marc's 3 patches + Marc's fix-up
> > folded in here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
> > 
> > Perhaps it would be a good idea to send a V2, with proper Fixes-tags,
> > just so that people would know what to test, so that we can start getting
> > those Tested-by tags.
> 
> Perhaps it would be a good idea to pull the branch above and test it
> after I have sent three reminders to all DWC host bridge maintainers through
> this email thread.
> 
> I have no problem reposting those patches but it is time you started
> testing them, I have already explained what's the issue they are fixing
> in this thread, I do not think a Fixes: tag will add any further degree
> of urgency.
> 

I tested Lorenzo's
https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
branch with drivers/pci/controller/dwc/pcie-qcom.c.

Without this branch, when having an ath10k PCIe endpoint connected,
and simply running the ath10k as a host access point (running hostapd):
watch cat /proc/interrupts | grep ath10k_pci
I consistenly stop getting interrupts in less than a minute.

With this branch, I've been able to run the same test case successfully
for 30+ minutes.

Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Lorenzo Pieralisi Dec. 3, 2018, 5:42 p.m. UTC | #33
On Mon, Dec 03, 2018 at 02:09:24PM +0100, Niklas Cassel wrote:
> On Mon, Dec 03, 2018 at 10:42:19AM +0000, Lorenzo Pieralisi wrote:
> > On Sun, Dec 02, 2018 at 12:50:58AM +0100, Niklas Cassel wrote:
> > > On Wed, Nov 21, 2018 at 07:24:53PM +0200, Stanimir Varbanov wrote:
> > > > Hi,
> > > > 
> > > > On 11/14/18 12:57 AM, Marc Zyngier wrote:
> > > > > It recently came to light that the Designware PCIe driver is rather
> > > > > broken in the way it handles MSI[1]:
> > > > > 
> > > > > - It masks interrupt by disabling them, meaning that MSIs generated
> > > > >   during the masked window are simply lost. Oops.
> > > > > 
> > > > > - Acking of the currently pending MSI is done outside of the interrupt
> > > > >   flow, getting moved around randomly and ultimately breaking the
> > > > >   driver. Not great.
> > > > > 
> > > > > This series attempts to address this by switching to using the MASK
> > > > > register for masking interrupts (!), and move the ack into the
> > > > > appropriate callback, giving it a fixed place in the MSI handling
> > > > > flow.
> > > > > 
> > > > > Note that this is only compile-tested on my arm64 laptop, as I'm
> > > > > travelling and do not have the required HW to test it anyway. I'd
> > > > > welcome both review and testing by the interested parties (dwc
> > > > > maintainer and users affected by existing bugs).
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > 	M.
> > > > > 
> > > > > [1] https://patchwork.kernel.org/patch/10657987/
> > > > > 
> > > > > Marc Zyngier (3):
> > > > >   PCI: designware: Use interrupt masking instead of disabling
> > > > >   PCI: designware: Take lock when ACKing an interrupt
> > > > >   PCI: designware: Move interrupt acking into the proper callback
> > > > > 
> > > > >  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
> > > > >  1 file changed, 14 insertions(+), 8 deletions(-)
> > > > > 
> > > > 
> > > > for pcie-qcom:
> > > > 
> > > > Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
> > > 
> > > Hello PCI folks,
> > > 
> > > Since this is a real bug, we should try get a couple of Tested-by tags
> > > before it's too late.
> > > It would be nice if v4.20 was released without broken MSIs in this driver.
> > > 
> > > Personally I get confused just by looking at this mail thread.
> > > 
> > > I see 3 patches from Marc and a fix-up from Marc, but I also see
> > > a patch from Gustavo, and another patch from Trent.
> > > 
> > > Is seems like Lorenzo has a branch with Marc's 3 patches + Marc's fix-up
> > > folded in here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
> > > 
> > > Perhaps it would be a good idea to send a V2, with proper Fixes-tags,
> > > just so that people would know what to test, so that we can start getting
> > > those Tested-by tags.
> > 
> > Perhaps it would be a good idea to pull the branch above and test it
> > after I have sent three reminders to all DWC host bridge maintainers through
> > this email thread.
> > 
> > I have no problem reposting those patches but it is time you started
> > testing them, I have already explained what's the issue they are fixing
> > in this thread, I do not think a Fixes: tag will add any further degree
> > of urgency.
> > 
> 
> I tested Lorenzo's
> https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/log/?h=test%2Fpci-dwc-msi
> branch with drivers/pci/controller/dwc/pcie-qcom.c.
> 
> Without this branch, when having an ath10k PCIe endpoint connected,
> and simply running the ath10k as a host access point (running hostapd):
> watch cat /proc/interrupts | grep ath10k_pci
> I consistenly stop getting interrupts in less than a minute.
> 
> With this branch, I've been able to run the same test case successfully
> for 30+ minutes.
> 
> Tested-by: Niklas Cassel <niklas.cassel@linaro.org>

Thank you very much, I need this tag on linux-pci@vger.kernel.org in
reply to this series:

https://patchwork.ozlabs.org/project/linux-pci/list/?series=75827

I need more testing done, I encourage other DWC maintainers to test my
branch above. I am mulling over it but I may consider this v4.21
material if I do not get enough testing done this week.

Thanks,
Lorenzo
Trent Piepho Dec. 3, 2018, 8:31 p.m. UTC | #34
On Mon, 2018-12-03 at 17:42 +0000, Lorenzo Pieralisi wrote:
> 
> 
> I need more testing done, I encourage other DWC maintainers to test my
> branch above. I am mulling over it but I may consider this v4.21
> material if I do not get enough testing done this week.

There's a far simpler patch with fewer side effects.  I still fail to
see why reverting the bug in < 4.21 and doing a proper series in 4.21
is so bad, vs keeping < 4.21 broken and doing a proper series in 4.21.

I think this series has flaws.

1. It doesn't address the keystone platform apparently disabling the
interrupt in what should be a mask.  Just running the ath10k driver
will not show this.  It will require something to actually mask the
interrupt at the dwc msi layer and observe that an MSI that arrives
while masked is lost on unmask.  And only keystone has a platform
method which does this.

2. It enables all MSIs on startup.  This will change the behavior if
something generates a MSI which has not been enabled.  That would not
be normal, but possibly an edge case with certain hardware coming out
of an soft-reset or power save.  Just running the ath10k driver is not
going to test this at all.

3. I am unconvinced the new lock in the irq ack fast path does anything
other than add additional irq latency.
Gustavo Pimentel Dec. 7, 2018, 4:16 p.m. UTC | #35
Hi all,

(snip)

I was able to reproduce Trent's problem in my configuration while developing the
eDMA driver, always having lost the second interrupt when an interrupt burst
occurs (observed in 50 trials).

After applying Marc's patch series, I stopped losing the second interrupt when
an interrupt burst occurs (I ran about 50 trials).

I can say with greater certainty that this fix solves the reported problem.

Regards,
Gustavo
Lorenzo Pieralisi Dec. 10, 2018, 4:17 p.m. UTC | #36
On Tue, Nov 13, 2018 at 10:57:31PM +0000, Marc Zyngier wrote:
> It recently came to light that the Designware PCIe driver is rather
> broken in the way it handles MSI[1]:
> 
> - It masks interrupt by disabling them, meaning that MSIs generated
>   during the masked window are simply lost. Oops.
> 
> - Acking of the currently pending MSI is done outside of the interrupt
>   flow, getting moved around randomly and ultimately breaking the
>   driver. Not great.
> 
> This series attempts to address this by switching to using the MASK
> register for masking interrupts (!), and move the ack into the
> appropriate callback, giving it a fixed place in the MSI handling
> flow.
> 
> Note that this is only compile-tested on my arm64 laptop, as I'm
> travelling and do not have the required HW to test it anyway. I'd
> welcome both review and testing by the interested parties (dwc
> maintainer and users affected by existing bugs).
> 
> Thanks,
> 
> 	M.
> 
> [1] https://patchwork.kernel.org/patch/10657987/
> 
> Marc Zyngier (3):
>   PCI: designware: Use interrupt masking instead of disabling
>   PCI: designware: Take lock when ACKing an interrupt
>   PCI: designware: Move interrupt acking into the proper callback
> 
>  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
>  1 file changed, 14 insertions(+), 8 deletions(-)

Marc, Gustavo,

I have decided to queue this series - fixed-up as per this thread,
available at:

git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git test/pci-dwc-msi

We allowed enough time for people to test it, we can't leave mainline
broken for the, apparently few, people who care.

I *think* that this is the Fixes: tag to be added to all patches in this
series, @Gustavo please countercheck:

7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
hierarchical API")

I will mark them for stable too and we will work on backports to be
sent in due course.

Thanks,
Lorenzo
Marc Zyngier Dec. 10, 2018, 4:30 p.m. UTC | #37
Hi Lorenzo,

On 10/12/2018 16:17, Lorenzo Pieralisi wrote:
> On Tue, Nov 13, 2018 at 10:57:31PM +0000, Marc Zyngier wrote:
>> It recently came to light that the Designware PCIe driver is rather
>> broken in the way it handles MSI[1]:
>>
>> - It masks interrupt by disabling them, meaning that MSIs generated
>>   during the masked window are simply lost. Oops.
>>
>> - Acking of the currently pending MSI is done outside of the interrupt
>>   flow, getting moved around randomly and ultimately breaking the
>>   driver. Not great.
>>
>> This series attempts to address this by switching to using the MASK
>> register for masking interrupts (!), and move the ack into the
>> appropriate callback, giving it a fixed place in the MSI handling
>> flow.
>>
>> Note that this is only compile-tested on my arm64 laptop, as I'm
>> travelling and do not have the required HW to test it anyway. I'd
>> welcome both review and testing by the interested parties (dwc
>> maintainer and users affected by existing bugs).
>>
>> Thanks,
>>
>> 	M.
>>
>> [1] https://patchwork.kernel.org/patch/10657987/
>>
>> Marc Zyngier (3):
>>   PCI: designware: Use interrupt masking instead of disabling
>>   PCI: designware: Take lock when ACKing an interrupt
>>   PCI: designware: Move interrupt acking into the proper callback
>>
>>  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
>>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> Marc, Gustavo,
> 
> I have decided to queue this series - fixed-up as per this thread,
> available at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git test/pci-dwc-msi
> 
> We allowed enough time for people to test it, we can't leave mainline
> broken for the, apparently few, people who care.
> 
> I *think* that this is the Fixes: tag to be added to all patches in this
> series, @Gustavo please countercheck:
> 
> 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
> hierarchical API")
> 
> I will mark them for stable too and we will work on backports to be
> sent in due course.

Thanks for doing that Lorenzo. I'll work on a backport of the last patch
as soon as this hits mainline.

Cheers,

	M.
Trent Piepho Dec. 10, 2018, 6:15 p.m. UTC | #38
On Mon, 2018-12-10 at 16:17 +0000, Lorenzo Pieralisi wrote:
> On Tue, Nov 13, 2018 at 10:57:31PM +0000, Marc Zyngier wrote:
> > It recently came to light that the Designware PCIe driver is rather
> > broken in the way it handles MSI[1]:
> > 
> > - It masks interrupt by disabling them, meaning that MSIs generated
> >   during the masked window are simply lost. Oops.
> > 
> > - Acking of the currently pending MSI is done outside of the
> > interrupt
> >   flow, getting moved around randomly and ultimately breaking the
> >   driver. Not great.
> > 
> > 
> I have decided to queue this series - fixed-up as per this thread,
> available at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git
> test/pci-dwc-msi
> 
> We allowed enough time for people to test it, we can't leave mainline
> broken for the, apparently few, people who care.
> 
> I *think* that this is the Fixes: tag to be added to all patches in
> this
> series, @Gustavo please countercheck:
> 
> 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
> hierarchical API")

That's not the correct Fixes.  It should be:

8c934095fa2f ("PCI: dwc: Clear MSI interrupt status after it is
handled, not before")

I had concerns about what appears to be an unnecessary extra lock taken
before handling an interrupt and enabling all MSIs even if nothing has
tried to enable them.
Marc Zyngier Dec. 10, 2018, 6:31 p.m. UTC | #39
On 10/12/2018 18:15, Trent Piepho wrote:
> On Mon, 2018-12-10 at 16:17 +0000, Lorenzo Pieralisi wrote:
>> On Tue, Nov 13, 2018 at 10:57:31PM +0000, Marc Zyngier wrote:
>>> It recently came to light that the Designware PCIe driver is rather
>>> broken in the way it handles MSI[1]:
>>>
>>> - It masks interrupt by disabling them, meaning that MSIs generated
>>>   during the masked window are simply lost. Oops.
>>>
>>> - Acking of the currently pending MSI is done outside of the
>>> interrupt
>>>   flow, getting moved around randomly and ultimately breaking the
>>>   driver. Not great.
>>>
>>>
>> I have decided to queue this series - fixed-up as per this thread,
>> available at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git
>> test/pci-dwc-msi
>>
>> We allowed enough time for people to test it, we can't leave mainline
>> broken for the, apparently few, people who care.
>>
>> I *think* that this is the Fixes: tag to be added to all patches in
>> this
>> series, @Gustavo please countercheck:
>>
>> 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
>> hierarchical API")
> 
> That's not the correct Fixes.  It should be:
> 
> 8c934095fa2f ("PCI: dwc: Clear MSI interrupt status after it is
> handled, not before")

This only applies to the last one, which should carry both tags.

> I had concerns about what appears to be an unnecessary extra lock taken
> before handling an interrupt and enabling all MSIs even if nothing has
> tried to enable them.
Regarding the lock: I'm quite puzzled that you consider it
"unnecessary", given that all the DWC callbacks expect such a locking. I
suspect you are considering from a pure performance angle, and I'd
suggest that you post numbers showing the unacceptable overhead of an
otherwise uncontended lock.

As for enabling all MSIs upfront, same thing. Please demonstrate how
harmful it is, given that they are all masked by default, consistently
with what other interrupt controllers are doing.

Once you've posted some of the above, we'll queue some additional fixes
on top. In the meantime, we'll fix it for everyone else.

Thanks,

	M.
Trent Piepho Dec. 10, 2018, 8:34 p.m. UTC | #40
On Mon, 2018-12-10 at 18:31 +0000, Marc Zyngier wrote:
> 
> > I had concerns about what appears to be an unnecessary extra lock taken
> > before handling an interrupt and enabling all MSIs even if nothing has
> > tried to enable them.
> 
> Regarding the lock: I'm quite puzzled that you consider it
> "unnecessary", given that all the DWC callbacks expect such a locking. I
> suspect you are considering from a pure performance angle, and I'd
> suggest that you post numbers showing the unacceptable overhead of an
> otherwise uncontended lock.

The difference is that the other callbacks can be called outside the
path of handling an interrupt.  Any thread could possibly make a call
to mask the interrupt at any time, so the lock is needed.

But the ack is only called in the path of handling an irq.  That's why
its different.  There's already a system insuring that callback is not
called reentrantly.  If the lock was ever taken when it was attempted
to be locked, then we'd already be broken.

> As for enabling all MSIs upfront, same thing. Please demonstrate how
> harmful it is, given that they are all masked by default, consistently
> with what other interrupt controllers are doing.

Without any information on the controller, who knows what the
differences between a masked and enabled vs a disabled msi are?  I
don't know there is a difference, but on the other hand, you don't know
there isn't.  It's like a dozen lines of code to not change the
behavior.
Lorenzo Pieralisi Dec. 11, 2018, 11:43 a.m. UTC | #41
On Tue, Nov 13, 2018 at 10:57:31PM +0000, Marc Zyngier wrote:
> It recently came to light that the Designware PCIe driver is rather
> broken in the way it handles MSI[1]:
> 
> - It masks interrupt by disabling them, meaning that MSIs generated
>   during the masked window are simply lost. Oops.
> 
> - Acking of the currently pending MSI is done outside of the interrupt
>   flow, getting moved around randomly and ultimately breaking the
>   driver. Not great.
> 
> This series attempts to address this by switching to using the MASK
> register for masking interrupts (!), and move the ack into the
> appropriate callback, giving it a fixed place in the MSI handling
> flow.
> 
> Note that this is only compile-tested on my arm64 laptop, as I'm
> travelling and do not have the required HW to test it anyway. I'd
> welcome both review and testing by the interested parties (dwc
> maintainer and users affected by existing bugs).
> 
> Thanks,
> 
> 	M.
> 
> [1] https://patchwork.kernel.org/patch/10657987/
> 
> Marc Zyngier (3):
>   PCI: designware: Use interrupt masking instead of disabling
>   PCI: designware: Take lock when ACKing an interrupt
>   PCI: designware: Move interrupt acking into the proper callback
> 
>  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
>  1 file changed, 14 insertions(+), 8 deletions(-)

Applied to pci/dwc-msi for v4.21 with tags and minor log updates.

All, please have a look and possibly keep testing it, I will ask
Bjorn to move it into -next shortly.

Thanks,
Lorenzo
Gustavo Pimentel Dec. 12, 2018, 8:55 a.m. UTC | #42
Hi,

On 10/12/2018 16:17, Lorenzo Pieralisi wrote:
> On Tue, Nov 13, 2018 at 10:57:31PM +0000, Marc Zyngier wrote:
>> It recently came to light that the Designware PCIe driver is rather
>> broken in the way it handles MSI[1]:
>>
>> - It masks interrupt by disabling them, meaning that MSIs generated
>>   during the masked window are simply lost. Oops.
>>
>> - Acking of the currently pending MSI is done outside of the interrupt
>>   flow, getting moved around randomly and ultimately breaking the
>>   driver. Not great.
>>
>> This series attempts to address this by switching to using the MASK
>> register for masking interrupts (!), and move the ack into the
>> appropriate callback, giving it a fixed place in the MSI handling
>> flow.
>>
>> Note that this is only compile-tested on my arm64 laptop, as I'm
>> travelling and do not have the required HW to test it anyway. I'd
>> welcome both review and testing by the interested parties (dwc
>> maintainer and users affected by existing bugs).
>>
>> Thanks,
>>
>> 	M.
>>
>> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.kernel.org_patch_10657987_&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=bkWxpLoW-f-E3EdiDCCa0_h0PicsViasSlvIpzZvPxs&m=SrPrRrHtYpjpRa6GKChsIMueIxK1EJVXMRMX4-JhuSE&s=wlMpkjgZDQ_a2lgGHhHLH6OPLkM2RKuYfSFPHbwGEBg&e=
>>
>> Marc Zyngier (3):
>>   PCI: designware: Use interrupt masking instead of disabling
>>   PCI: designware: Take lock when ACKing an interrupt
>>   PCI: designware: Move interrupt acking into the proper callback
>>
>>  .../pci/controller/dwc/pcie-designware-host.c | 22 ++++++++++++-------
>>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> Marc, Gustavo,
> 
> I have decided to queue this series - fixed-up as per this thread,
> available at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git test/pci-dwc-msi
> 
> We allowed enough time for people to test it, we can't leave mainline
> broken for the, apparently few, people who care.
> 
> I *think* that this is the Fixes: tag to be added to all patches in this
> series, @Gustavo please countercheck:
> 
> 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
> hierarchical API")

Yes, I confirm, that is the patch to fix.

Regards,
Gustavo

> 
> I will mark them for stable too and we will work on backports to be
> sent in due course.
> 
> Thanks,
> Lorenzo
>
Gustavo Pimentel Dec. 12, 2018, 9:10 a.m. UTC | #43
Hi Trent,

On 10/12/2018 20:34, Trent Piepho wrote:
> On Mon, 2018-12-10 at 18:31 +0000, Marc Zyngier wrote:
>>
>>> I had concerns about what appears to be an unnecessary extra lock taken
>>> before handling an interrupt and enabling all MSIs even if nothing has
>>> tried to enable them.
>>
>> Regarding the lock: I'm quite puzzled that you consider it
>> "unnecessary", given that all the DWC callbacks expect such a locking. I
>> suspect you are considering from a pure performance angle, and I'd
>> suggest that you post numbers showing the unacceptable overhead of an
>> otherwise uncontended lock.
> 
> The difference is that the other callbacks can be called outside the
> path of handling an interrupt.  Any thread could possibly make a call
> to mask the interrupt at any time, so the lock is needed.
> 
> But the ack is only called in the path of handling an irq.  That's why
> its different.  There's already a system insuring that callback is not
> called reentrantly.  If the lock was ever taken when it was attempted
> to be locked, then we'd already be broken.
> 
>> As for enabling all MSIs upfront, same thing. Please demonstrate how
>> harmful it is, given that they are all masked by default, consistently
>> with what other interrupt controllers are doing.
> 
> Without any information on the controller, who knows what the
> differences between a masked and enabled vs a disabled msi are?  I
> don't know there is a difference, but on the other hand, you don't know
> there isn't.  It's like a dozen lines of code to not change the
> behavior.

That is not true, I've replied on 13/18/2018 with that information.

I leave here again the information provided.

*PCIE_MSI_INTRx_MASK*
When an MSI is received for a masked interrupt, the corresponding status bit
gets set in the interrupt status register but the controller will not signal it.
As soon as the masked interrupt is unmasked and assuming the status bit is still
set the controller will signal it.

*PCIE_MSI_INTRx_ENABLE*
Enables/disables every interrupt. When an MSI is received from a disabled
interrupt, no status bit gets set in MSI controller interrupt status register,
in another word, the interruption will be lost.

Regards,
Gustavo

>