diff mbox series

[v2] net: dsa: mv88e6xxx: Revise irq setup ordering

Message ID 824d011b-3692-69c3-5e2c-58e950a80abf@bell.net
State Changes Requested
Delegated to: David Miller
Headers show
Series [v2] net: dsa: mv88e6xxx: Revise irq setup ordering | expand

Commit Message

John David Anglin Feb. 4, 2019, 9:59 p.m. UTC
This change fixes a race condition in the setup of hardware irqs and the
code enabling PHY link detection in the mv88e6xxx driver.

This race was observed on the espressobin board where the GPIO interrupt
controller only supports edge interrupts.  If the INTn output pin goes low
before the GPIO interrupt is enabled, PHY link interrupts are not detected.

With this change, we
1) force INTn high by clearing all interrupt enables in global 1 control 1,
2) setup the hardware irq, and then
3) perform the remaining common setup.

This simplifies the setup and allows some unnecessary code to be removed.

     chip->g1_irq.domain = irq_domain_add_simple(
@@ -392,43 +411,14 @@ static int mv88e6xxx_g1_irq_setup_common(struct
mv88e6xxx_chip *chip)
     chip->g1_irq.chip = mv88e6xxx_g1_irq_chip;
     chip->g1_irq.masked = ~0;
 
-    err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_CTL1, &mask);
-    if (err)
-        goto out_mapping;
-
-    mask &= ~GENMASK(chip->g1_irq.nirqs, 0);
-
-    err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_CTL1, mask);
-    if (err)
-        goto out_disable;
-
-    /* Reading the interrupt status clears (most of) them */
-    err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_STS, &reg);
-    if (err)
-        goto out_disable;
-
     return 0;
-
-out_disable:
-    mask &= ~GENMASK(chip->g1_irq.nirqs, 0);
-    mv88e6xxx_g1_write(chip, MV88E6XXX_G1_CTL1, mask);
-
-out_mapping:
-    for (irq = 0; irq < 16; irq++) {
-        virq = irq_find_mapping(chip->g1_irq.domain, irq);
-        irq_dispose_mapping(virq);
-    }
-
-    irq_domain_remove(chip->g1_irq.domain);
-
-    return err;
 }
 
 static int mv88e6xxx_g1_irq_setup(struct mv88e6xxx_chip *chip)
 {
     int err;
 
-    err = mv88e6xxx_g1_irq_setup_common(chip);
+    err = mv88e6xxx_g1_irq_setup_masks(chip);
     if (err)
         return err;
 
@@ -437,9 +427,9 @@ static int mv88e6xxx_g1_irq_setup(struct
mv88e6xxx_chip *chip)
                    IRQF_ONESHOT | IRQF_SHARED,
                    dev_name(chip->dev), chip);
     if (err)
-        mv88e6xxx_g1_irq_free_common(chip);
+        return err;
 
-    return err;
+    return mv88e6xxx_g1_irq_setup_common(chip);
 }
 
 static void mv88e6xxx_irq_poll(struct kthread_work *work)
@@ -457,6 +447,10 @@ static int mv88e6xxx_irq_poll_setup(struct
mv88e6xxx_chip *chip)
 {
     int err;
 
+    err = mv88e6xxx_g1_irq_setup_masks(chip);
+    if (err)
+        return err;
+
     err = mv88e6xxx_g1_irq_setup_common(chip);
     if (err)
         return err;

Signed-off-by: John David Anglin <dave.anglin@bell.net>

Comments

Andrew Lunn Feb. 4, 2019, 11:14 p.m. UTC | #1
On Mon, Feb 04, 2019 at 04:59:13PM -0500, John David Anglin wrote:
> This change fixes a race condition in the setup of hardware irqs and the
> code enabling PHY link detection in the mv88e6xxx driver.
> 
> This race was observed on the espressobin board where the GPIO interrupt
> controller only supports edge interrupts.  If the INTn output pin goes low
> before the GPIO interrupt is enabled, PHY link interrupts are not detected.
> 
> With this change, we
> 1) force INTn high by clearing all interrupt enables in global 1 control 1,
> 2) setup the hardware irq, and then
> 3) perform the remaining common setup.
> 
> This simplifies the setup and allows some unnecessary code to be removed.

Hi Dave

I took a closer look now. I don't actually see why the current code is
wrong.

mv88e6xxx_g1_irq_setup() calls mv88e6xxx_g1_irq_setup_common() and
then registers the interrupt handler.

mv88e6xxx_g1_irq_setup_common() does what you want, it masks all
interrupts in the hardware and clears any pending interrupts which can
be cleared.

The change you made is actually dangerous. As soon as you request the
interrupt, it is live, it can fire, and call
mv88e6xxx_g1_irq_thread_work(). That needs the irq domain. But the
change you made defers the creating of the domain until after the
interrupt is registered. So we can de-refernece a NULL pointer in the
interrupt handler.

	  Andrew
John David Anglin Feb. 5, 2019, 12:38 a.m. UTC | #2
On 2019-02-04 6:14 p.m., Andrew Lunn wrote:
> On Mon, Feb 04, 2019 at 04:59:13PM -0500, John David Anglin wrote:
>> This change fixes a race condition in the setup of hardware irqs and the
>> code enabling PHY link detection in the mv88e6xxx driver.
>>
>> This race was observed on the espressobin board where the GPIO interrupt
>> controller only supports edge interrupts.  If the INTn output pin goes low
>> before the GPIO interrupt is enabled, PHY link interrupts are not detected.
>>
>> With this change, we
>> 1) force INTn high by clearing all interrupt enables in global 1 control 1,
>> 2) setup the hardware irq, and then
>> 3) perform the remaining common setup.
>>
>> This simplifies the setup and allows some unnecessary code to be removed.
> Hi Dave
>
> I took a closer look now. I don't actually see why the current code is
> wrong.
The problem is INTn can go low before the interrupt handler for it is
registered and enabled.
As a result, interrupts never occur if link happens to come up before
the interrupt handler
completes being enabled.
>
> mv88e6xxx_g1_irq_setup() calls mv88e6xxx_g1_irq_setup_common() and
> then registers the interrupt handler.
>
> mv88e6xxx_g1_irq_setup_common() does what you want, it masks all
> interrupts in the hardware and clears any pending interrupts which can
> be cleared.
>
> The change you made is actually dangerous. As soon as you request the
> interrupt, it is live, it can fire, and call
> mv88e6xxx_g1_irq_thread_work(). That needs the irq domain. But the
> change you made defers the creating of the domain until after the
> interrupt is registered. So we can de-refernece a NULL pointer in the
> interrupt handler.
This can't happen.  The domain is setup immediately after registering
the GPIO interrupt.
The interrupt can't fire until one of the enables is set.  These are set
by mv88e6xxx_g2_irq_setup(),
mv88e6xxx_g1_atu_prob_irq_setup() and
mv88e6xxx_g1_vtu_prob_irq_setup().  These irqs
are setup after mv88e6xxx_g1_irq_setup()/mv88e6xxx_irq_poll_setup() is
called.  Thus, the
irq domain is setup before the GPIO interrupt can fire.

I have tested both hardware and polled interrupts using espressobin with
v4.20.6 and networking
starts correctly.

Dave
Andrew Lunn Feb. 5, 2019, 2:21 a.m. UTC | #3
> The problem is INTn can go low before the interrupt handler for it is
> registered and enabled.

> This can't happen.  The domain is setup immediately after registering
> the GPIO interrupt.
> The interrupt can't fire until one of the enables is set.

These two statement seem to contradict each other?

> These are set
> by mv88e6xxx_g2_irq_setup(),
> mv88e6xxx_g1_atu_prob_irq_setup() and
> mv88e6xxx_g1_vtu_prob_irq_setup().  These irqs
> are setup after mv88e6xxx_g1_irq_setup()/mv88e6xxx_irq_poll_setup() is
> called.  Thus, the
> irq domain is setup before the GPIO interrupt can fire.

At what point is INTn going low, causing you all these problems? I've
yet to see a real description of the race. Please give us a blow by
blow of how the race happens. And then explain how your fix actually
fixes this.

Also, i'm not yet convinced this hardware can actually work correctly
with edge interrupts. You can probably reduce the size of the race
window, but i don't think you can eliminate it. And if you cannot
eliminate it, at some point it is going to hit you.

     Andrew
David Miller Feb. 5, 2019, 6:37 p.m. UTC | #4
Something is really wrong with how your patches are submitted.

The patch shows up in between the commit message and your signoff
tags, also the patch appears to be mangled by your email client.

Please fix this before submitting any future work.

Thank you.
John David Anglin Feb. 5, 2019, 7:20 p.m. UTC | #5
On 2019-02-04 9:21 p.m., Andrew Lunn wrote:
>> The problem is INTn can go low before the interrupt handler for it is
>> registered and enabled.
>> This can't happen.  The domain is setup immediately after registering
>> the GPIO interrupt.
>> The interrupt can't fire until one of the enables is set.
> These two statement seem to contradict each other?
I don't think so.  In the first, I am referring to the enabling of the
GPIO pin interrupt in the
Armada 3720 CPU.  Specifically, this would be the setting of the enable
in the South Bridge
GPIO2 Interrupt Enable Register (RD0018C00).  In the second, I'm
referring to the enables
in the switch's Global Control Register.  Setting these enables to zero
forces the the switch's
INTn output high (assuming there isn't a short).  This shouldn't cause a
CPU interrupt if the
South Bridge GPIO2 Polarity Control is set correctly at the time.  INTn
goes low after a
switch reset because EEIntEn is set on reset, so clearing the enables
will causes a rising
edge on INTn.
>
>> These are set
>> by mv88e6xxx_g2_irq_setup(),
>> mv88e6xxx_g1_atu_prob_irq_setup() and
>> mv88e6xxx_g1_vtu_prob_irq_setup().  These irqs
>> are setup after mv88e6xxx_g1_irq_setup()/mv88e6xxx_irq_poll_setup() is
>> called.  Thus, the
>> irq domain is setup before the GPIO interrupt can fire.
> At what point is INTn going low, causing you all these problems? I've
> yet to see a real description of the race. Please give us a blow by
> blow of how the race happens. And then explain how your fix actually
> fixes this.
I'm going to withdraw my proposed change.  I had thought that calling
request_threaded_irq()
earlier fixed the issue at boot.  But given your mail, I reverted the
change in my 4.20.6 build
and I wasn't able to reproduce the problem.  Yet, my 4.20.4 build fails
every time doing a power
on boot.

We have the following interrupt status when it fails:

 44:          2          0     GPIO2  23 Edge      d0032004.mdio-mii:01
 48:          0          0  mv88e6xxx-g1   3 Edge      mv88e6xxx-g1-atu-prob
 50:          0          0  mv88e6xxx-g1   5 Edge      mv88e6xxx-g1-vtu-prob
 52:          0          1  mv88e6xxx-g1   7 Edge      mv88e6xxx-g2
 55:          0          1  mv88e6xxx-g2   1 Edge     
!soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:11
 56:          0          0  mv88e6xxx-g2   2 Edge     
!soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:12
 57:          0          0  mv88e6xxx-g2   3 Edge     
!soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:13
 69:          0          0  mv88e6xxx-g2  15 Edge      mv88e6xxx-watchdog

We have the following values in the Global1 registers:

MCLI> sys dumpGlobal1

------------------------------------------------------
Switch Global Status(0x0)                       c881
ATU FID Register(0x1)                           0000
VTU FID Register(0x2)                           0000
VTU SID Register(0x3)                           0000
Switch Global Control Register(0x4)             40a8

The DeviceInt bit is set in the Global Status register.  It would appear
we have missed an edge on GPIO2 23.
Yet, we have handled 2 interrupts on it.  So, this isn't the setup issue
that I thought it was.

>
> Also, i'm not yet convinced this hardware can actually work correctly
> with edge interrupts. You can probably reduce the size of the race
> window, but i don't think you can eliminate it. And if you cannot
> eliminate it, at some point it is going to hit you.
You could be right but I don't want to give up just yet.  I need to go
back and rebuild v4.20.4 and retest.
My hunch is the second hunk of the original patch will fix this.

Dave
Andrew Lunn Feb. 5, 2019, 7:54 p.m. UTC | #6
> My hunch is the second hunk of the original patch will fix this.

O.K.

But please fully explain how the race happens and how the fix fully
fixes it. I don't really want to add code which just makes the race
window smaller, but the race still happens. That helps nobody.

In the end, i think you will end up polling. If so, feel free to
submit a patch which makes the poll interval build time configurable.

       Andrew
diff mbox series

Patch

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c
b/drivers/net/dsa/mv88e6xxx/chip.c
index b2a0e59b6252..9f5c416a3223 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -374,10 +374,29 @@  static void mv88e6xxx_g1_irq_free(struct
mv88e6xxx_chip *chip)
     mutex_unlock(&chip->reg_lock);
 }
 
+static int mv88e6xxx_g1_irq_setup_masks(struct mv88e6xxx_chip *chip)
+{
+    int err;
+    u16 reg;
+
+    /* The INTn output must be high when hardware interrupts are setup.
+       The EEPROM done interrupt enable is set on reset, so clear all
+       interrupt enable bits to ensure INTn is not driven low */
+    err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_CTL1, &reg);
+    if (err)
+        return err;
+    reg &= ~GENMASK(chip->info->g1_irqs, 0);
+    err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_CTL1, reg);
+    if (err)
+        return err;
+
+    /* Reading the interrupt status clears (most of) them */
+    return mv88e6xxx_g1_read(chip, MV88E6XXX_G1_STS, &reg);
+}
+
 static int mv88e6xxx_g1_irq_setup_common(struct mv88e6xxx_chip *chip)
 {
-    int err, irq, virq;
-    u16 reg, mask;
+    int irq;
 
     chip->g1_irq.nirqs = chip->info->g1_irqs;