diff mbox

[1/4] pch_gbe: Fix transmit queue management

Message ID 1385631882-12827-2-git-send-email-puzman@gmail.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Ondrej Puzman Nov. 28, 2013, 9:44 a.m. UTC
According to Documentation/networking/driver.txt the ndo_start_xmit method should not return NETDEV_TX_BUSY under normal circumstances.
Stop the transmit queue when tx_ring is full.

Signed-off-by: Ondrej Puzman <puzman@gmail.com>
---
 .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

Comments

David Miller Nov. 29, 2013, 9:42 p.m. UTC | #1
From: Ondrej Puzman <puzman@gmail.com>
Date: Thu, 28 Nov 2013 10:44:39 +0100

> According to Documentation/networking/driver.txt the ndo_start_xmit method should not return NETDEV_TX_BUSY under normal circumstances.
> Stop the transmit queue when tx_ring is full.

Please format your commit message text to 80 columns.

> 
> Signed-off-by: Ondrej Puzman <puzman@gmail.com>

You should be instead preventing the transmit method from being invoked
when it might be possible that a request cannot be satisfied.

This means that at the end of a transmit request, you must stop the
queue if a packet with the maximum number of possible descriptors
cannot be satisfied.  Then it is impossible for the transmit function
to be invoked in a situation where it would need to fail for lack of
available transmit descriptors.

This is why drivers decided whether to stop their TX queues based upon
calculations involving MAX_SKB_FRAGS.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ondrej Puzman Nov. 30, 2013, 10:41 p.m. UTC | #2
> Please format your commit  message text to 80 columns.

Ok, no problem. I did not know that 80 columns limitation applies also
for commit messages.

> You should be instead  preventing the transmit method from being
> invoked when it might be possible that a request cannot be
> satisfied. This means that at the end of a transmit request, you must
> stop the queue if a packet with the maximum number of possible
> descriptors cannot be satisfied. Then it is impossible for the
> transmit function to be invoked in a situation where it would need to
> fail for lack of available transmit descriptors. This is why drivers
> decided whether to stop their TX queues based upon calculations
> involving MAX_SKB_FRAGS.

Hmm, correct me if I'm wrong ... but the driver and hardware does not
support scatter-gather DMA. So fragmented skbs can't be passed to
ndo_start_xmit function and each skb passed to the function represents
one packet and one descriptor in tx_ring.
I do not think that in this caseit makes sense to to involve
MAX_SKB_FRAGS in any calculation and the queue should be stopped in the
moment when it gets full.
But maybe I am missing something, I am no expert in this kind of stuff.

--
Ondrej Puzman

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Laight Dec. 2, 2013, 10:04 a.m. UTC | #3
> From: David Miller
> 
> > According to Documentation/networking/driver.txt the ndo_start_xmit
> > method should not return NETDEV_TX_BUSY under normal circumstances.
> > Stop the transmit queue when tx_ring is full.
... 
> You should be instead preventing the transmit method from being invoked
> when it might be possible that a request cannot be satisfied.
> 
> This means that at the end of a transmit request, you must stop the
> queue if a packet with the maximum number of possible descriptors
> cannot be satisfied.  Then it is impossible for the transmit function
> to be invoked in a situation where it would need to fail for lack of
> available transmit descriptors.

I know you like ethernet drivers to work this way, but requiring enough
descriptors for a maximally fragmented packet requires a difficult
calculation and will cause the tx queue to be stopped unnecessarily.

IIRC a normal skb can have a maximum of 17 fragments, if these end up
being transmitted over USB using xhci they might need 34 descriptors
(because descriptors can't cross 64k boundaries).
However a typical 64k TCP packet just needs 4.
xhci has a further constraint that the ring end can only be at
specific alignments, in effect thus means that the descriptors for
a single packet cannot cross the end of a ring segment.
So if the available space in the xhci ring was used to stop the
ethernet tx queue (which it isn't at the moment) it would have to
be stopped very early.

Isn't there also a new skb structure that allows lists of fragments?
That might need even more descriptors for a worst case transmit.

I would have thought it would make sense for the ethernet driver
to stop the queue when there is a reasonable expectation that there
won't be enough descriptors for the next packet, but have a reasonable
code path when a packet with a large number of fragments won't fit
in the tx ring. This either requires the driver itself to hold the
skb, or to return NETDEV_TX_BUSY.
It might make sense to monitor recent traffic to find the 'expected
maximum number of fragments'.

	David




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Dec. 2, 2013, 4:16 p.m. UTC | #4
From: "David Laight" <David.Laight@ACULAB.COM>
Date: Mon, 2 Dec 2013 10:04:49 -0000

> I know you like ethernet drivers to work this way, but requiring enough
> descriptors for a maximally fragmented packet requires a difficult
> calculation and will cause the tx queue to be stopped unnecessarily.
 ...
> Isn't there also a new skb structure that allows lists of fragments?
> That might need even more descriptors for a worst case transmit.

Fragment lists are not to be found in the transmit function of the
driver.  Any such SKBs will be linearized or similar first.  We do
want to support them in some capacity in the future, but right now
we don't.

And here we're not talking about a USB driver, so using the rather
extreme example of the USB segmentation restrictions is not really
appropriate.

This is a pretty standard ethernet driver, which can use a one to
one correspondance between the segmentation count of the SKB and
the number of TX descriptors that will be used.

There is a large precedence for doing things this way and in this
particular case it's straightforward.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Dec. 2, 2013, 4:33 p.m. UTC | #5
On Mon, 2013-12-02 at 11:16 -0500, David Miller wrote:

> Fragment lists are not to be found in the transmit function of the
> driver.  Any such SKBs will be linearized or similar first.  We do
> want to support them in some capacity in the future, but right now
> we don't.

Right. Such skb are segmented before reaching the driver, unless driver
asserts NETIF_F_FRAGLIST.

So far, only virtual drivers and 2 or 3 others advertise
NETIF_F_FRAGLIST 'support'.

I suspect that the '2 or 3 others' are plain bugs, I'll check this.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Dec. 2, 2013, 4:51 p.m. UTC | #6
On Mon, 2013-12-02 at 08:33 -0800, Eric Dumazet wrote:

> So far, only virtual drivers and 2 or 3 others advertise
> NETIF_F_FRAGLIST 'support'.
> 
> I suspect that the '2 or 3 others' are plain bugs, I'll check this.
> 

Yep, these all are plain bugs, I submitted a fix.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 27ffe0e..6764e59 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -2157,6 +2157,10 @@  static int pch_gbe_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 
 	/* CRC,ITAG no support */
 	pch_gbe_tx_queue(adapter, tx_ring, skb);
+
+	if (unlikely(!PCH_GBE_DESC_UNUSED(tx_ring)))
+		netif_stop_queue(netdev);
+
 	spin_unlock_irqrestore(&tx_ring->tx_lock, flags);
 	return NETDEV_TX_OK;
 }