diff mbox

[net,4/5] be2net: wait longer to reap TX compls in be_close()

Message ID 57bc5399-f8f6-4c65-8cae-131c83d548ad@CMEXHTCAS1.ad.emulex.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Somnath Kotur Feb. 24, 2014, 6:50 a.m. UTC
From: Vasundhara Volam <vasundhara.volam@emulex.com>

be_close() currently waits for a max of 200ms to receive all pending
TX compls. This timeout value was roughly calcuated based on 10G
transmission speeds and the TX queue depth. This timeout may not be
enough when the link is operating at lower speeds or in
multi-channel/SR-IOV configs with TX-rate limiting setting.
Increase the timeout to 2000ms and also bail-out if there is
a HW-error.

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
---
 drivers/net/ethernet/emulex/benet/be_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

David Laight Feb. 24, 2014, 9:31 a.m. UTC | #1
From: Of Somnath Kotur
> be_close() currently waits for a max of 200ms to receive all pending
> TX compls. This timeout value was roughly calcuated based on 10G
> transmission speeds and the TX queue depth. This timeout may not be
> enough when the link is operating at lower speeds or in
> multi-channel/SR-IOV configs with TX-rate limiting setting.
> Increase the timeout to 2000ms and also bail-out if there is
> a HW-error.

What about monitoring whether transmits are actually completing,
and giving up if none are completed within a suitable time period?

	David



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sathya Perla Feb. 24, 2014, 10:58 a.m. UTC | #2
> -----Original Message-----
> From: David Laight [mailto:David.Laight@ACULAB.COM]
> 
> From: Of Somnath Kotur
> > be_close() currently waits for a max of 200ms to receive all pending
> > TX compls. This timeout value was roughly calcuated based on 10G
> > transmission speeds and the TX queue depth. This timeout may not be
> > enough when the link is operating at lower speeds or in
> > multi-channel/SR-IOV configs with TX-rate limiting setting.
> > Increase the timeout to 2000ms and also bail-out if there is
> > a HW-error.
> 
> What about monitoring whether transmits are actually completing,
> and giving up if none are completed within a suitable time period?
> 

David, the code already does that. All the queues are checked for
any new TX compls each 1ms in the be_close()->be_tx_compl_clean()
path. This patch is not changing that logic.

thanks,
-Sathya
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Laight Feb. 24, 2014, 11:30 a.m. UTC | #3
> From: Sathya Perla 
> > From: David Laight [mailto:David.Laight@ACULAB.COM]
> >
> > From: Of Somnath Kotur
> > > be_close() currently waits for a max of 200ms to receive all pending
> > > TX compls. This timeout value was roughly calcuated based on 10G
> > > transmission speeds and the TX queue depth. This timeout may not be
> > > enough when the link is operating at lower speeds or in
> > > multi-channel/SR-IOV configs with TX-rate limiting setting.
> > > Increase the timeout to 2000ms and also bail-out if there is
> > > a HW-error.
> >
> > What about monitoring whether transmits are actually completing,
> > and giving up if none are completed within a suitable time period?
> >
> 
> David, the code already does that. All the queues are checked for
> any new TX compls each 1ms in the be_close()->be_tx_compl_clean()
> path. This patch is not changing that logic.

Yes, but it might be better to modify the logic rather than just
increase the timeout.

The code is waiting for any pending transmits to complete.
The purpose of the timeout is to give up if the transmits aren't going
to complete.
So instead of assuming that a specific interval is long enough, abort
if no transmits completed in (say) a 10ms period.
Move the sleep() to the top of the loop and abort if nothing is found
on any queue.
Oh and move the zeroing of cmpl and num_wrbs to a sensible place.

	David





--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sathya Perla Feb. 24, 2014, 12:08 p.m. UTC | #4
> -----Original Message-----
> From: David Laight [mailto:David.Laight@ACULAB.COM]
> 
> > From: Sathya Perla
> > > From: David Laight [mailto:David.Laight@ACULAB.COM]
> > >
> > > From: Of Somnath Kotur
> > > > be_close() currently waits for a max of 200ms to receive all pending
> > > > TX compls. This timeout value was roughly calcuated based on 10G
> > > > transmission speeds and the TX queue depth. This timeout may not be
> > > > enough when the link is operating at lower speeds or in
> > > > multi-channel/SR-IOV configs with TX-rate limiting setting.
> > > > Increase the timeout to 2000ms and also bail-out if there is
> > > > a HW-error.
> > >
> > > What about monitoring whether transmits are actually completing,
> > > and giving up if none are completed within a suitable time period?
> > >
> >
> > David, the code already does that. All the queues are checked for
> > any new TX compls each 1ms in the be_close()->be_tx_compl_clean()
> > path. This patch is not changing that logic.
> 
> Yes, but it might be better to modify the logic rather than just
> increase the timeout.
> 
> The code is waiting for any pending transmits to complete.
> The purpose of the timeout is to give up if the transmits aren't going
> to complete.
> So instead of assuming that a specific interval is long enough, abort
> if no transmits completed in (say) a 10ms period.
> Move the sleep() to the top of the loop and abort if nothing is found
> on any queue.
> Oh and move the zeroing of cmpl and num_wrbs to a sensible place.

So you're saying, keep polling on completions till the HW has been
completely silent for a particular (you mention 10ms) period of time.

This does sound like a nicer scheme. Thanks!
Will implement this and re-post the patch.

-Sathya


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index a9da6f9..f6a4481 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1985,7 +1985,7 @@  static void be_tx_compl_clean(struct be_adapter *adapter)
 				pending_txqs--;
 		}
 
-		if (pending_txqs == 0 || ++timeo > 200)
+		if (pending_txqs == 0 || ++timeo > 2000 || be_hw_error(adapter))
 			break;
 
 		mdelay(1);