diff mbox

netdev: enc28j60 kernel panic fix.

Message ID 20160506203546.GA10067@electric-eye.fr.zoreil.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Francois Romieu May 6, 2016, 8:35 p.m. UTC
(please don't top post)

David Russell <david@aprsworld.com> :
> I kind of thought my patch was at best incomplete.  When you state
> this change silences the bug but does not fix it, what are the
> implications of systems running this patch?  We have some production
> systems using this patch.  They reboot daily, but have been solid.

If my assumption is right it should drop an extra packet here and there.
No leak.

However transmit errors + transmit packets should still match the number
of times the driver calls enc28j60_send_packet (you would have to cook
your own stat to check the latter though).

> In addition, if we sent you a pi and the ethernet controller and a
> small but reasonable sum of money for your labor, would you be able to
> properly fix it ?

I'd rather see you testing my crap. :o)

Pi as multi-core (the expected race needs several cores or a netconsole
style transmit from an irq/bh context) ?

> Short of that, do you have any recommendations on quick overviews of
> the networking stack in the kernel and then documentation on the
> various flags and such?

A tad bit too high-level a question... Plain ctags + printk for a start ?

Does the patch below make a difference ?

Takes longer to crash counts as a difference.

Comments

David Russell May 9, 2016, 4:59 p.m. UTC | #1
On Fri, May 6, 2016 at 3:35 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
> (please don't top post)

Sorry about that.

> David Russell <david@aprsworld.com> :
>> I kind of thought my patch was at best incomplete.  When you state
>> this change silences the bug but does not fix it, what are the
>> implications of systems running this patch?  We have some production
>> systems using this patch.  They reboot daily, but have been solid.
>
> If my assumption is right it should drop an extra packet here and there.
> No leak.

Lost packets are "acceptable" in this case as much as I hate "Good Enough".

> However transmit errors + transmit packets should still match the number
> of times the driver calls enc28j60_send_packet (you would have to cook
> your own stat to check the latter though).
>
>> In addition, if we sent you a pi and the ethernet controller and a
>> small but reasonable sum of money for your labor, would you be able to
>> properly fix it ?
>
> I'd rather see you testing my crap. :o)
>
> Pi as multi-core (the expected race needs several cores or a netconsole
> style transmit from an irq/bh context) ?

Yes, multi-core ARM.

>> Short of that, do you have any recommendations on quick overviews of
>> the networking stack in the kernel and then documentation on the
>> various flags and such?
>
> A tad bit too high-level a question... Plain ctags + printk for a start ?

I was afraid of that.  I prefer proper documentation/specifications in
addition to source.

> Does the patch below make a difference ?
>
> Takes longer to crash counts as a difference.

As far as I can tell it solves the problem of the kernel panicing.

> diff --git a/drivers/net/ethernet/microchip/enc28j60.c b/drivers/net/ethernet/microchip/enc28j60.c
> index 7066954..405fe3f 100644
> --- a/drivers/net/ethernet/microchip/enc28j60.c
> +++ b/drivers/net/ethernet/microchip/enc28j60.c
> @@ -1170,7 +1170,8 @@ static void enc28j60_irq_work_handler(struct work_struct *work)
>                                 enc28j60_dump_tsv(priv, "Tx Done", tsv);
>                         }
>                         enc28j60_tx_clear(ndev, err);
> -                       locked_reg_bfclr(priv, EIR, EIR_TXIF);
> +                       locked_reg_bfclr(priv, EIR, EIR_TXIF | EIR_TXERIF);
> +                       intflags &= ~EIR_TXERIF;
>                 }
>                 /* TX Error handler */
>                 if ((intflags & EIR_TXERIF) != 0) {
> @@ -1190,6 +1191,7 @@ static void enc28j60_irq_work_handler(struct work_struct *work)
>                         nolock_reg_bfclr(priv, ECON1, ECON1_TXRST);
>                         nolock_txfifo_init(priv, TXSTART_INIT, TXEND_INIT);
>                         mutex_unlock(&priv->lock);
> +                       locked_reg_bfclr(priv, EIR, EIR_TXIF | EIR_TXERIF);
>                         /* Transmit Late collision check for retransmit */
>                         if (TSV_GETBIT(tsv, TSV_TXLATECOLLISION)) {
>                                 if (netif_msg_tx_err(priv))
> @@ -1203,7 +1205,6 @@ static void enc28j60_irq_work_handler(struct work_struct *work)
>                                         enc28j60_tx_clear(ndev, true);
>                         } else
>                                 enc28j60_tx_clear(ndev, true);
> -                       locked_reg_bfclr(priv, EIR, EIR_TXERIF);
>                 }
>                 /* RX Error handler */
>                 if ((intflags & EIR_RXERIF) != 0) {
diff mbox

Patch

diff --git a/drivers/net/ethernet/microchip/enc28j60.c b/drivers/net/ethernet/microchip/enc28j60.c
index 7066954..405fe3f 100644
--- a/drivers/net/ethernet/microchip/enc28j60.c
+++ b/drivers/net/ethernet/microchip/enc28j60.c
@@ -1170,7 +1170,8 @@  static void enc28j60_irq_work_handler(struct work_struct *work)
 				enc28j60_dump_tsv(priv, "Tx Done", tsv);
 			}
 			enc28j60_tx_clear(ndev, err);
-			locked_reg_bfclr(priv, EIR, EIR_TXIF);
+			locked_reg_bfclr(priv, EIR, EIR_TXIF | EIR_TXERIF);
+			intflags &= ~EIR_TXERIF;
 		}
 		/* TX Error handler */
 		if ((intflags & EIR_TXERIF) != 0) {
@@ -1190,6 +1191,7 @@  static void enc28j60_irq_work_handler(struct work_struct *work)
 			nolock_reg_bfclr(priv, ECON1, ECON1_TXRST);
 			nolock_txfifo_init(priv, TXSTART_INIT, TXEND_INIT);
 			mutex_unlock(&priv->lock);
+			locked_reg_bfclr(priv, EIR, EIR_TXIF | EIR_TXERIF);
 			/* Transmit Late collision check for retransmit */
 			if (TSV_GETBIT(tsv, TSV_TXLATECOLLISION)) {
 				if (netif_msg_tx_err(priv))
@@ -1203,7 +1205,6 @@  static void enc28j60_irq_work_handler(struct work_struct *work)
 					enc28j60_tx_clear(ndev, true);
 			} else
 				enc28j60_tx_clear(ndev, true);
-			locked_reg_bfclr(priv, EIR, EIR_TXERIF);
 		}
 		/* RX Error handler */
 		if ((intflags & EIR_RXERIF) != 0) {