Patchwork solution to printk() blocking interrupts?

login
register
mail settings
Submitter Jon Smirl
Date Sept. 21, 2008, 11:34 p.m.
Message ID <9e4733910809211634u5b37e297pe8503f08de4f19a8@mail.gmail.com>
Download mbox | patch
Permalink /patch/811/
State Superseded, archived
Delegated to: Grant Likely
Headers show

Comments

Jon Smirl - Sept. 21, 2008, 11:34 p.m.
On Sun, Sep 21, 2008 at 6:21 PM, Sergei Shtylyov
<sshtylyov@ru.mvista.com> wrote:
>  Implementing the poll_controller() method in the network driver is usually
> staightforward.

Good tip, the simple implementation worked.

What controls this? "carrier detect appears untrustworthy, waiting 4 seconds"
Get that fixed and this patch could be useful,

root@phyCORE-MPC5200B-tiny:~ dmesg | grep netconsole
Kernel command line: console=ttyPSC0,115200 rw debug root=/dev/nfs
ip=dhcp nfsroot=192.168.1.4:/home/OSELAS.BSP-Phytec-phyCORE-MPC5200B-tiny-6/root,v3,tcp
netconsole=6666@192.168.1.11/eth0,514@192.168.1.4/00:19:d1:e4:0f:8d
netconsole: local port 6666
netconsole: local IP 192.168.1.11
netconsole: interface eth0
netconsole: remote port 514
netconsole: remote IP 192.168.1.4
netconsole: remote ethernet address 00:19:d1:e4:0f:8d
netconsole: device eth0 not up yet, forcing it
netconsole: carrier detect appears untrustworthy, waiting 4 seconds
netconsole: network logging started
root@phyCORE-MPC5200B-tiny:~



 static irqreturn_t mpc52xx_fec_tx_interrupt(int irq, void *dev_id)
@@ -926,6 +936,9 @@ mpc52xx_fec_probe(struct of_device *op, const
struct of_device_id *match)
        ndev->tx_timeout        = mpc52xx_fec_tx_timeout;
        ndev->watchdog_timeo    = FEC_WATCHDOG_TIMEOUT;
        ndev->base_addr         = mem.start;
+#ifdef CONFIG_NET_POLL_CONTROLLER
+       ndev->poll_controller = mpc52xx_fec_poll_controller;
+#endif

        priv->t_irq = priv->r_irq = ndev->irq = NO_IRQ; /* IRQ are
free for now */
Benjamin Herrenschmidt - Sept. 22, 2008, 5:17 a.m.
On Sun, 2008-09-21 at 19:34 -0400, Jon Smirl wrote:
> On Sun, Sep 21, 2008 at 6:21 PM, Sergei Shtylyov
> <sshtylyov@ru.mvista.com> wrote:
> >  Implementing the poll_controller() method in the network driver is usually
> > staightforward.
> 
> Good tip, the simple implementation worked.
> 
> What controls this? "carrier detect appears untrustworthy, waiting 4 seconds"
> Get that fixed and this patch could be useful,

Does the driver properly uses netif_carrier_on/off to signal the
system when the link is up/down ?

Cheers,
Ben.
Sergei Shtylyov - Sept. 22, 2008, 10:03 a.m.
Hello.

Jon Smirl wrote:

>>  Implementing the poll_controller() method in the network driver is usually
>> staightforward.
>>     
>
> Good tip, the simple implementation worked.
>
> What controls this? "carrier detect appears untrustworthy, waiting 4 seconds"
>   

   IIRC, it happens if netpoll code sees the carrier OK right ast the 
first time it checks it. Look for this message in net/core/netpoll.c...

> Get that fixed and this patch could be useful,
>   

    How that hinders using netconsole?

> root@phyCORE-MPC5200B-tiny:~ dmesg | grep netconsole
> Kernel command line: console=ttyPSC0,115200 rw debug root=/dev/nfs
> ip=dhcp nfsroot=192.168.1.4:/home/OSELAS.BSP-Phytec-phyCORE-MPC5200B-tiny-6/root,v3,tcp
> netconsole=6666@192.168.1.11/eth0,514@192.168.1.4/00:19:d1:e4:0f:8d
> netconsole: local port 6666
> netconsole: local IP 192.168.1.11
> netconsole: interface eth0
> netconsole: remote port 514
> netconsole: remote IP 192.168.1.4
> netconsole: remote ethernet address 00:19:d1:e4:0f:8d
> netconsole: device eth0 not up yet, forcing it
> netconsole: carrier detect appears untrustworthy, waiting 4 seconds
> netconsole: network logging started
> root@phyCORE-MPC5200B-tiny:~
>
>
>
> diff --git a/drivers/net/fec_mpc52xx.c b/drivers/net/fec_mpc52xx.c
> index 4e4f683..72541b5 100644
> --- a/drivers/net/fec_mpc52xx.c
> +++ b/drivers/net/fec_mpc52xx.c
> @@ -401,6 +401,16 @@ static int mpc52xx_fec_hard_start_xmit(struct
> sk_buff *skb, struct net_device *d
>         return 0;
>  }
>
> +#ifdef CONFIG_NET_POLL_CONTROLLER
> +static void mpc52xx_fec_poll_controller(struct net_device *dev)
> +{
> +        disable_irq(dev->irq);
> +        mpc52xx_fec_tx_interrupt(dev->irq, dev);
>   

   The interrupt nu,ber seems wrong, although the handler doesn't care 
anyway...

> +        enable_irq(dev->irq);
> +}
> +#endif
>   

   Hey, how about the Rx interrupt? And I'm seeing 
mcp52xx_fec_interrupt() on separate IRQ too -- looks like for for this 
driver this is not as straightforward as it usually is.
You need to call all the handlers -- netconsole is not the only user of 
the netpoll API, so not only Tx should be handled.

WBR, Sergei
Sergei Shtylyov - Sept. 22, 2008, 11:01 a.m.
Hello, I wrote:

>> diff --git a/drivers/net/fec_mpc52xx.c b/drivers/net/fec_mpc52xx.c
>> index 4e4f683..72541b5 100644
>> --- a/drivers/net/fec_mpc52xx.c
>> +++ b/drivers/net/fec_mpc52xx.c
>> @@ -401,6 +401,16 @@ static int mpc52xx_fec_hard_start_xmit(struct
>> sk_buff *skb, struct net_device *d
>>         return 0;
>>  }
>>
>> +#ifdef CONFIG_NET_POLL_CONTROLLER
>> +static void mpc52xx_fec_poll_controller(struct net_device *dev)
>> +{
>> +        disable_irq(dev->irq);
>> +        mpc52xx_fec_tx_interrupt(dev->irq, dev);
>>   
>
>   The interrupt nu,ber seems wrong, although the handler doesn't care 
> anyway...
>
>> +        enable_irq(dev->irq);

   Ah, you're also dis/enabling the wrong IRQ -- it should be 
priv->t_irq. No, this patch won't do.

WBR, Sergei
Sergei Shtylyov - Sept. 22, 2008, 11:04 a.m.
Hello.

Benjamin Herrenschmidt wrote:

>>>  Implementing the poll_controller() method in the network driver is usually
>>> staightforward.
>>>       
>> Good tip, the simple implementation worked.
>>
>> What controls this? "carrier detect appears untrustworthy, waiting 4 seconds"
>> Get that fixed and this patch could be useful,
>>     
>
> Does the driver properly uses netif_carrier_on/off to signal the
> system when the link is up/down ?
>   

   Looks like the answer is "no"...

WBR, Sergei
Sergei Shtylyov - Sept. 22, 2008, 11:11 a.m.
Hello, I wrote:

>>> What controls this? "carrier detect appears untrustworthy, waiting 4 
>>> seconds"
>>> Get that fixed and this patch could be useful,
>>>     
>>
>> Does the driver properly uses netif_carrier_on/off to signal the
>> system when the link is up/down ?
>>   
>  Implementing the poll_controller() method in the network driver is 
> usually
>
>   Looks like the answer is "no"...

   Hm... it uses phylib, so probably it does that. That message and a 4 
s pause appears if the carrier is seen in <10 ms after opening the 
device. Maybe this threshold needs to be changed?

WBR, Sergei
Sergei Shtylyov - Sept. 22, 2008, 2:39 p.m.
Hello, I wrote:

   Not sure if Stephen was the right person to CC, including Matt now...

>>>> What controls this? "carrier detect appears untrustworthy, waiting 4 
>>>> seconds"
>>>> Get that fixed and this patch could be useful,

>>> Does the driver properly uses netif_carrier_on/off to signal the
>>> system when the link is up/down ?

>>  Implementing the poll_controller() method in the network driver is 
>> usually

>>   Looks like the answer is "no"...

>   Hm... it uses phylib, so probably it does that. That message and a 4 s 
> pause appears if the carrier is seen in <10 ms after opening the device. 

    Oops, not 10 -- 100 ms (HZ / 10).

> Maybe this threshold needs to be changed?

    Seems like too much indeed.

WBR, Sergei
Jon Smirl - Sept. 22, 2008, 2:45 p.m.
On Mon, Sep 22, 2008 at 10:39 AM, Sergei Shtylyov
<sshtylyov@ru.mvista.com> wrote:
> Hello, I wrote:
>
>  Not sure if Stephen was the right person to CC, including Matt now...
>
>>>>> What controls this? "carrier detect appears untrustworthy, waiting 4
>>>>> seconds"
>>>>> Get that fixed and this patch could be useful,
>
>>>> Does the driver properly uses netif_carrier_on/off to signal the
>>>> system when the link is up/down ?
>
>>>  Implementing the poll_controller() method in the network driver is
>>> usually
>
>>>  Looks like the answer is "no"...
>
>>  Hm... it uses phylib, so probably it does that. That message and a 4 s
>> pause appears if the carrier is seen in <10 ms after opening the device.
>
>   Oops, not 10 -- 100 ms (HZ / 10).
>
>> Maybe this threshold needs to be changed?
>
>   Seems like too much indeed.

The link is coming up, but it appears that netconsole has already
decided to wait 4s.

mpc52xx MII bus: probed
net eth0: Using PHY at MDIO address 0
netconsole: local port 6666
netconsole: local IP 192.168.1.11
netconsole: interface eth0
netconsole: remote port 514
netconsole: remote IP 192.168.1.4
netconsole: remote ethernet address 00:19:d1:e4:0f:8d
netconsole: device eth0 not up yet, forcing it
net eth0: attached phy 0 to driver Generic PHY
netconsole: carrier detect appears untrustworthy, waiting 4 seconds
PHY: f0003000:00 - Link is Up - 100/Full
console [netcon0] enabled
netconsole: network logging started



>
> WBR, Sergei
>
Sergei Shtylyov - Sept. 22, 2008, 2:59 p.m.
Jon Smirl wrote:

>> Not sure if Stephen was the right person to CC, including Matt now...

>>>>>>What controls this? "carrier detect appears untrustworthy, waiting 4
>>>>>>seconds"
>>>>>>Get that fixed and this patch could be useful,

>>>>>Does the driver properly uses netif_carrier_on/off to signal the
>>>>>system when the link is up/down ?

>>>> Implementing the poll_controller() method in the network driver is
>>>>usually

>>>> Looks like the answer is "no"...

>>> Hm... it uses phylib, so probably it does that. That message and a 4 s
>>>pause appears if the carrier is seen in <10 ms after opening the device.

>>  Oops, not 10 -- 100 ms (HZ / 10).

>>>Maybe this threshold needs to be changed?

>>  Seems like too much indeed.

> The link is coming up, but it appears that netconsole has already
> decided to wait 4s.
> mpc52xx MII bus: probed
> net eth0: Using PHY at MDIO address 0
> netconsole: local port 6666
> netconsole: local IP 192.168.1.11
> netconsole: interface eth0
> netconsole: remote port 514
> netconsole: remote IP 192.168.1.4
> netconsole: remote ethernet address 00:19:d1:e4:0f:8d
> netconsole: device eth0 not up yet, forcing it
> net eth0: attached phy 0 to driver Generic PHY
> netconsole: carrier detect appears untrustworthy, waiting 4 seconds

    Netpoll code decides to wait 4 secs after seeing that carrier is OK before 
100 ms passed...

> PHY: f0003000:00 - Link is Up - 100/Full

    ...but it's only reported to be OK later. So indeed, the carrier reported 
by the driver appears untrustworthy. ;-)

WBR, Sergei
Benjamin Herrenschmidt - Sept. 22, 2008, 9:27 p.m.
On Mon, 2008-09-22 at 15:11 +0400, Sergei Shtylyov wrote:
> Hello, I wrote:
> 
> >>> What controls this? "carrier detect appears untrustworthy, waiting 4 
> >>> seconds"
> >>> Get that fixed and this patch could be useful,
> >>>     
> >>
> >> Does the driver properly uses netif_carrier_on/off to signal the
> >> system when the link is up/down ?
> >>   
> >  Implementing the poll_controller() method in the network driver is 
> > usually
> >
> >   Looks like the answer is "no"...
> 
>    Hm... it uses phylib, so probably it does that. That message and a 4 
> s pause appears if the carrier is seen in <10 ms after opening the 
> device. Maybe this threshold needs to be changed?

This sounds fishy, some devices can setup the carrier before open...

Ben.

Patch

diff --git a/drivers/net/fec_mpc52xx.c b/drivers/net/fec_mpc52xx.c
index 4e4f683..72541b5 100644
--- a/drivers/net/fec_mpc52xx.c
+++ b/drivers/net/fec_mpc52xx.c
@@ -401,6 +401,16 @@  static int mpc52xx_fec_hard_start_xmit(struct
sk_buff *skb, struct net_device *d
        return 0;
 }

+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void mpc52xx_fec_poll_controller(struct net_device *dev)
+{
+        disable_irq(dev->irq);
+        mpc52xx_fec_tx_interrupt(dev->irq, dev);
+        enable_irq(dev->irq);
+}
+#endif
+
+
 /* This handles BestComm transmit task interrupts
  */