Message ID | 1385631882-12827-2-git-send-email-puzman@gmail.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: Ondrej Puzman <puzman@gmail.com> Date: Thu, 28 Nov 2013 10:44:39 +0100 > According to Documentation/networking/driver.txt the ndo_start_xmit method should not return NETDEV_TX_BUSY under normal circumstances. > Stop the transmit queue when tx_ring is full. Please format your commit message text to 80 columns. > > Signed-off-by: Ondrej Puzman <puzman@gmail.com> You should be instead preventing the transmit method from being invoked when it might be possible that a request cannot be satisfied. This means that at the end of a transmit request, you must stop the queue if a packet with the maximum number of possible descriptors cannot be satisfied. Then it is impossible for the transmit function to be invoked in a situation where it would need to fail for lack of available transmit descriptors. This is why drivers decided whether to stop their TX queues based upon calculations involving MAX_SKB_FRAGS. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please format your commit message text to 80 columns. Ok, no problem. I did not know that 80 columns limitation applies also for commit messages. > You should be instead preventing the transmit method from being > invoked when it might be possible that a request cannot be > satisfied. This means that at the end of a transmit request, you must > stop the queue if a packet with the maximum number of possible > descriptors cannot be satisfied. Then it is impossible for the > transmit function to be invoked in a situation where it would need to > fail for lack of available transmit descriptors. This is why drivers > decided whether to stop their TX queues based upon calculations > involving MAX_SKB_FRAGS. Hmm, correct me if I'm wrong ... but the driver and hardware does not support scatter-gather DMA. So fragmented skbs can't be passed to ndo_start_xmit function and each skb passed to the function represents one packet and one descriptor in tx_ring. I do not think that in this caseit makes sense to to involve MAX_SKB_FRAGS in any calculation and the queue should be stopped in the moment when it gets full. But maybe I am missing something, I am no expert in this kind of stuff. -- Ondrej Puzman -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> From: David Miller > > > According to Documentation/networking/driver.txt the ndo_start_xmit > > method should not return NETDEV_TX_BUSY under normal circumstances. > > Stop the transmit queue when tx_ring is full. ... > You should be instead preventing the transmit method from being invoked > when it might be possible that a request cannot be satisfied. > > This means that at the end of a transmit request, you must stop the > queue if a packet with the maximum number of possible descriptors > cannot be satisfied. Then it is impossible for the transmit function > to be invoked in a situation where it would need to fail for lack of > available transmit descriptors. I know you like ethernet drivers to work this way, but requiring enough descriptors for a maximally fragmented packet requires a difficult calculation and will cause the tx queue to be stopped unnecessarily. IIRC a normal skb can have a maximum of 17 fragments, if these end up being transmitted over USB using xhci they might need 34 descriptors (because descriptors can't cross 64k boundaries). However a typical 64k TCP packet just needs 4. xhci has a further constraint that the ring end can only be at specific alignments, in effect thus means that the descriptors for a single packet cannot cross the end of a ring segment. So if the available space in the xhci ring was used to stop the ethernet tx queue (which it isn't at the moment) it would have to be stopped very early. Isn't there also a new skb structure that allows lists of fragments? That might need even more descriptors for a worst case transmit. I would have thought it would make sense for the ethernet driver to stop the queue when there is a reasonable expectation that there won't be enough descriptors for the next packet, but have a reasonable code path when a packet with a large number of fragments won't fit in the tx ring. This either requires the driver itself to hold the skb, or to return NETDEV_TX_BUSY. It might make sense to monitor recent traffic to find the 'expected maximum number of fragments'. David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: "David Laight" <David.Laight@ACULAB.COM> Date: Mon, 2 Dec 2013 10:04:49 -0000 > I know you like ethernet drivers to work this way, but requiring enough > descriptors for a maximally fragmented packet requires a difficult > calculation and will cause the tx queue to be stopped unnecessarily. ... > Isn't there also a new skb structure that allows lists of fragments? > That might need even more descriptors for a worst case transmit. Fragment lists are not to be found in the transmit function of the driver. Any such SKBs will be linearized or similar first. We do want to support them in some capacity in the future, but right now we don't. And here we're not talking about a USB driver, so using the rather extreme example of the USB segmentation restrictions is not really appropriate. This is a pretty standard ethernet driver, which can use a one to one correspondance between the segmentation count of the SKB and the number of TX descriptors that will be used. There is a large precedence for doing things this way and in this particular case it's straightforward. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2013-12-02 at 11:16 -0500, David Miller wrote: > Fragment lists are not to be found in the transmit function of the > driver. Any such SKBs will be linearized or similar first. We do > want to support them in some capacity in the future, but right now > we don't. Right. Such skb are segmented before reaching the driver, unless driver asserts NETIF_F_FRAGLIST. So far, only virtual drivers and 2 or 3 others advertise NETIF_F_FRAGLIST 'support'. I suspect that the '2 or 3 others' are plain bugs, I'll check this. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2013-12-02 at 08:33 -0800, Eric Dumazet wrote: > So far, only virtual drivers and 2 or 3 others advertise > NETIF_F_FRAGLIST 'support'. > > I suspect that the '2 or 3 others' are plain bugs, I'll check this. > Yep, these all are plain bugs, I submitted a fix. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c index 27ffe0e..6764e59 100644 --- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c +++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c @@ -2157,6 +2157,10 @@ static int pch_gbe_xmit_frame(struct sk_buff *skb, struct net_device *netdev) /* CRC,ITAG no support */ pch_gbe_tx_queue(adapter, tx_ring, skb); + + if (unlikely(!PCH_GBE_DESC_UNUSED(tx_ring))) + netif_stop_queue(netdev); + spin_unlock_irqrestore(&tx_ring->tx_lock, flags); return NETDEV_TX_OK; }
According to Documentation/networking/driver.txt the ndo_start_xmit method should not return NETDEV_TX_BUSY under normal circumstances. Stop the transmit queue when tx_ring is full. Signed-off-by: Ondrej Puzman <puzman@gmail.com> --- .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-)