Message ID | 9AA65D849A88EB44B5D9B6A8BA098E23040A60D6EE6E@Exchange1.lawo.de |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Headers | show |
On 12-11-23 10:58 AM, Keitel, Tino (ALC NetworX GmbH) wrote: > Hi, > > commit d8a0f1b0af67679bba886784de10d8c21acc4e0e causes the following > trace on a Freescale RDB8313 board: Thanks for the report. > > NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue 0 timed out > ------------[ cut here ]------------ > WARNING: > at /home/keitelt1/src/git/linux-stable/net/sched/sch_generic.c:255 > Modules linked in: > NIP: c02448b0 LR: c02448b0 CTR: c01c19b8 > REGS: c7ffbe40 TRAP: 0700 Not tainted (3.7.0-rc6-rt18) ^^^^^^^^^^^^^^^ I almost overlooked the above. It would have been nice to see more explicit information on what kernel you are running. I say that because the above concerns me. For several reasons. 1) it looks to be not mainline, but preempt_rt 2) There is no RT on 3.7 yet, so I'm assuming this is a custom forward port of the 250 odd RT patches. (The RT is 3.6.7-rt18, i.e. based on the 3.6 gregKH stable tree.) > MSR: 00029032 <EE,ME,IR,DR,RI> CR: 24002044 XER: 20000000 > TASK = c03dd370[0] 'swapper' THREAD: c03fe000 > GPR00: c02448b0 c7ffbef0 c03dd370 0000003f 00000001 c001aea8 00000000 > 00000001 > GPR08: 00000001 c03e0000 00000000 0000009d 24002084 1008eb5c 07ffb000 > ffffffff > GPR16: 00000004 c0362c7c c03dfbf8 00200000 c0411ed0 c0411cd0 c0411ad0 > ffffffff > GPR24: 00000000 c749e1d8 00000004 c783d1b0 c0400000 c03e0000 c749e000 > 00000000 > NIP [c02448b0] dev_watchdog+0x288/0x298 > LR [c02448b0] dev_watchdog+0x288/0x298 > Call Trace: > [c7ffbef0] [c02448b0] dev_watchdog+0x288/0x298 (unreliable) > [c7ffbf20] [c00267f8] call_timer_fn+0x6c/0xd8 > [c7ffbf50] [c00269e4] run_timer_softirq+0x180/0x1f8 > [c7ffbfa0] [c0021144] __do_softirq+0xc4/0x160 > [c7ffbff0] [c000d0b8] call_do_softirq+0x14/0x24 > [c03ffe90] [c00058e8] do_softirq+0x8c/0xb8 > [c03ffeb0] [c0021358] irq_exit+0x98/0xb4 > [c03ffec0] [c0009fb0] timer_interrupt+0x158/0x170 > [c03ffee0] [c000f02c] ret_from_except+0x0/0x14 > --- Exception: 901 at cpu_idle+0x94/0x100 > LR = cpu_idle+0x94/0x100 > [c03fffa0] [c00088ec] cpu_idle+0x5c/0x100 (unreliable) > [c03fffc0] [c03b37b0] start_kernel+0x2dc/0x2f0 > [c03ffff0] [00003438] 0x3438 > Instruction dump: > 7d2903a6 4e800421 80fe01fc 4bffff74 7fc3f378 4bfecb7d 7fc4f378 7fe6fb78 > 7c651b78 3c60c038 38637280 48090e69 <0fe00000> 39200001 993cc7c9 > 4bffffb8 > ---[ end trace 32125455035c2f70 ]--- > > With this commit reverted, it works fine. v3.3 is ok, v3.4 contains the > bad commit. The commit doesn't revert in a clean way in 3.7-rc6. I > attached diff without the tqi changes. Have you reproduced this on a mainline kernel, i.e. vanilla 3.4 or vanilla 3.7-rc6? And then done a revert on that baseline? The patch was relatively straightforward and reviewed by Eric who knows this stuff inside out; it isn't immediately clear to me why it would cause problems for you. Paul. -- > > The above trace happens while a ptp client (for IEEE1588) is running, so > there is some locally generated network traffic. The client stops to > work after this, but maybe this is due to bad error handling. > > Regards, > Tino > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Paul Gortmaker <paul.gortmaker@windriver.com> : > On 12-11-23 10:58 AM, Keitel, Tino (ALC NetworX GmbH) wrote: [...] > > With this commit reverted, it works fine. v3.3 is ok, v3.4 contains the > > bad commit. The commit doesn't revert in a clean way in 3.7-rc6. I > > attached diff without the tqi changes. Pre v3.4.5 stable kernel will miss some bql fixes. Namely: 4f4bdaeb40df95499c1ee7ea3fbca9d76174a59e (upstream 914bec1011a25f65cdc) 1414a53d956340ca8b1b27e05ab94ba63e82ed97 (upstream 25426b794efdc70dde7)
Paul Gortmaker <paul.gortmaker <at> windriver.com> writes: > > On 12-11-23 10:58 AM, Keitel, Tino (ALC NetworX GmbH) wrote: > > Hi, > > > > commit d8a0f1b0af67679bba886784de10d8c21acc4e0e causes the following > > trace on a Freescale RDB8313 board: > > Thanks for the report. > > > > > NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue 0 timed out > > ------------[ cut here ]------------ > > WARNING: > > at /home/keitelt1/src/git/linux-stable/net/sched/sch_generic.c:255 > > Modules linked in: > > NIP: c02448b0 LR: c02448b0 CTR: c01c19b8 > > REGS: c7ffbe40 TRAP: 0700 Not tainted (3.7.0-rc6-rt18) > ^^^^^^^^^^^^^^^ > I almost overlooked the above. It would have been nice to > see more explicit information on what kernel you are running. > I say that because the above concerns me. For several reasons. > > 1) it looks to be not mainline, but preempt_rt > 2) There is no RT on 3.7 yet, so I'm assuming this is a custom > forward port of the 250 odd RT patches. (The RT is 3.6.7-rt18, > i.e. based on the 3.6 gregKH stable tree.) Sorry for the confusion. This was a 3.7.0-rc6 tree, and I forgot git clean after trying the rt-patches and git reset --hard v3.7.0-rc6, so the localversion file for -rt was still present, and the kernel was named 3.7.0-rc6-rt18. If I got this right, this should be a normal kernel with just the version file modified. I tried kernel 3.3, which doesnt have the issue. I tried 3.4, 3.6.7 and 3.7-rc6, which all show the kernel trace and ptp client misbehaviour. I tried 3.4, 3.6.7, 3.7-rc6 and 3.6.5-rt18 with the patch I posted, and they were ok. The patch I posted is for 3.7-rc6. Regards, Tino -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c index 19ac096..e314c6f 100644 --- a/drivers/net/ethernet/freescale/gianfar.c +++ b/drivers/net/ethernet/freescale/gianfar.c @@ -1748,13 +1748,9 @@ static void free_skb_resources(struct gfar_private *priv) /* Go through all the buffer descriptors and free their data buffers */ for (i = 0; i < priv->num_tx_queues; i++) { - struct netdev_queue *txq; - tx_queue = priv->tx_queue[i]; - txq = netdev_get_tx_queue(tx_queue->dev, tx_queue->qindex); if (tx_queue->tx_skbuff) free_skb_tx_queue(tx_queue); - netdev_tx_reset_queue(txq); } for (i = 0; i < priv->num_rx_queues; i++) { @@ -2210,8 +2206,6 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev) lstatus |= BD_LFLAG(TXBD_CRC | TXBD_READY) | skb_headlen(skb); } - netdev_tx_sent_queue(txq, skb->len); - /* We can work in parallel with gfar_clean_tx_ring(), except * when modifying num_txbdfree. Note that we didn't grab the lock * when we were reading the num_txbdfree and checking for available @@ -2456,7 +2450,6 @@ static void gfar_align_skb(struct sk_buff *skb) static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue) { struct net_device *dev = tx_queue->dev; - struct netdev_queue *txq; struct gfar_private *priv = netdev_priv(dev); struct gfar_priv_rx_q *rx_queue = NULL; struct txbd8 *bdp, *next = NULL; @@ -2469,12 +2462,10 @@ static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue) int i; int howmany = 0; int tqi = tx_queue->qindex; - unsigned int bytes_sent = 0; u32 lstatus; size_t buflen; rx_queue = priv->rx_queue[tqi]; - txq = netdev_get_tx_queue(dev, tqi); bdp = tx_queue->dirty_tx; skb_dirtytx = tx_queue->skb_dirtytx; @@ -2531,8 +2522,6 @@ static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue) bdp = next_txbd(bdp, base, tx_ring_size); } - bytes_sent += skb->len; - dev_kfree_skb_any(skb); tx_queue->tx_skbuff[skb_dirtytx] = NULL; @@ -2547,15 +2536,13 @@ static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue) } /* If we freed a buffer, we can restart transmission, if necessary */ - if (netif_tx_queue_stopped(txq) && tx_queue->num_txbdfree) + if (__netif_subqueue_stopped(dev, tx_queue->qindex) && tx_queue->num_txbdfree) netif_wake_subqueue(dev, tqi); /* Update dirty indicators */ tx_queue->skb_dirtytx = skb_dirtytx; tx_queue->dirty_tx = bdp; - netdev_tx_completed_queue(txq, howmany, bytes_sent); - return howmany; }