
[net-next] bnx2x: Disable LRO on FCoE or iSCSI boot device

Message ID 1318563481-19631-1-git-send-email-mchan@broadcom.com
State Rejected, archived
Delegated to: David Miller

Commit Message

Michael Chan Oct. 14, 2011, 3:38 a.m. UTC
From: Dmitry Kravkov <dmitry@broadcom.com>

For an FCoE or iSCSI boot device, the networking side must stay "up" all
the time.  Otherwise, the FCoE/iSCSI interface driven by bnx2i/bnx2fc
will be reset and we'll lose the root file system.

If LRO is enabled, scripts that enable IP forwarding or bridging will
disable LRO and cause the device to be reset.  Disabling LRO on these
boot devices will prevent the reset.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

Comments

Rick Jones Oct. 14, 2011, 3:31 p.m. UTC | #1
On 10/13/2011 08:38 PM, Michael Chan wrote:
> From: Dmitry Kravkov<dmitry@broadcom.com>
>
> For an FCoE or iSCSI boot device, the networking side must stay "up" all
> the time.  Otherwise, the FCoE/iSCSI interface driven by bnx2i/bnx2fc
> will be reset and we'll lose the root file system.
>
> If LRO is enabled, scripts that enable IP forwarding or bridging will
> disable LRO and cause the device to be reset.  Disabling LRO on these
> boot devices will prevent the reset.

Is this perhaps saying that a bnx2x-driven device being used for FCoE or 
iSCSI boot must not permit *any* run-time configuration change which 
leads to a NIC reset?

rick jones
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Chan Oct. 14, 2011, 3:53 p.m. UTC | #2
Rick Jones wrote:

> Is this perhaps saying that a bnx2x-driven device being used for FCoE
> or iSCSI boot must not permit *any* run-time configuration change
> which leads to a NIC reset?
> 

That is right.  Unless you have a multipath configuration with multiple
ports, then you can reset one port at a time.

Rick Jones Oct. 14, 2011, 4:06 p.m. UTC | #3
On 10/14/2011 08:53 AM, Michael Chan wrote:
> Rick Jones wrote:
>
>> Is this perhaps saying that a bnx2x-driven device being used for
>> FCoE or iSCSI boot must not permit *any* run-time configuration
>> change which leads to a NIC reset?
>>
>
> That is right.  Unless you have a multipath configuration with multiple
> ports, then you can reset one port at a time.

So, should there also be a "cnic_boot_device" check in many of the 
"capital letter" ethtool paths?

rick
Michael Chan Oct. 14, 2011, 4:15 p.m. UTC | #4
Rick Jones wrote:

> On 10/14/2011 08:53 AM, Michael Chan wrote:
> > Rick Jones wrote:
> >
> >> Is this perhaps saying that a bnx2x-driven device being used for
> >> FCoE or iSCSI boot must not permit *any* run-time configuration
> >> change which leads to a NIC reset?
> >>
> >
> > That is right.  Unless you have a multipath configuration with
> > multiple ports, then you can reset one port at a time.
> 
> So, should there also be a "cnic_boot_device" check in many of the
> "capital letter" ethtool paths?
> 

If the user is doing ethtool configuration changes or device shutdown,
it is more obvious what the consequence will be.  The user may also be
careful to do it on a multipath setup.

The reset caused by the auto turn-off of LRO when you enable
ip_forward or bridging will not be obvious to the user.  In addition,
all devices with LRO turned on will be reset at the same time so even
multipath will not survive.
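
The difference between a manual per-port reset and the automatic LRO turn-off can be sketched as a small model. This is only an illustration under stated assumptions, not kernel or bnx2x code: `struct model_port` and the `model_*` helpers are hypothetical names, and the model simply assumes that turning LRO off costs the port a reset that takes its storage path down for a moment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one port in a multipath setup. */
struct model_port {
    bool lro_enabled;
    bool path_up;
};

/* Assumption: disabling LRO forces a reset, so the path drops briefly.
 * A port that already has LRO off is left alone. */
static void model_disable_lro(struct model_port *p)
{
    assert(p != NULL);
    if (p->lro_enabled) {
        p->lro_enabled = false;
        p->path_up = false;
    }
}

/* Models forwarding or bridging being enabled: LRO is turned off on
 * every port in the same pass, with no staggering between ports. */
static void model_enable_forwarding(struct model_port *ports, size_t n)
{
    for (size_t i = 0; i < n; i++)
        model_disable_lro(&ports[i]);
}

/* Counts the paths that are still usable. */
static size_t model_paths_up(const struct model_port *ports, size_t n)
{
    size_t up = 0;

    for (size_t i = 0; i < n; i++)
        if (ports[i].path_up)
            up++;
    return up;
}
```

With two ports that both have LRO on, calling `model_disable_lro()` on one port leaves one path up, while `model_enable_forwarding()` leaves none. A port that starts with LRO off is never touched, which is the effect the patch aims for on boot devices.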

John Fastabend Oct. 14, 2011, 8:17 p.m. UTC | #5
On 10/14/2011 9:15 AM, Michael Chan wrote:
> Rick Jones wrote:
> 
>> On 10/14/2011 08:53 AM, Michael Chan wrote:
>>> Rick Jones wrote:
>>>
>>>> Is this perhaps saying that a bnx2x-driven device being used for
>>>> FCoE or iSCSI boot must not permit *any* run-time configuration
>>>> change which leads to a NIC reset?
>>>>
>>>
>>> That is right.  Unless you have a multipath configuration with
>>> multiple ports, then you can reset one port at a time.
>>
>> So, should there also be a "cnic_boot_device" check in many of the
>> "capital letter" ethtool paths?
>>
> 
> If the user is doing ethtool configuration changes or device shutdown,
> it is more obvious what the consequence will be.  The user may also be
> careful to do it on a multipath setup.
> 
> The reset caused by the auto turn-off of LRO when you enable
> ip_forward or bridging will not be obvious to the user.  In addition,
> all devices with LRO turned on will be reset at the same time so even
> multipath will not survive.
>

But after the reset the device should log in and the SCSI layer should
handle retries. So I don't see why this is a problem. Why do we
need to handle this any differently from any other link event?

.John
Michael Chan Oct. 14, 2011, 8:59 p.m. UTC | #6
On Fri, 2011-10-14 at 13:17 -0700, John Fastabend wrote:
> On 10/14/2011 9:15 AM, Michael Chan wrote:
> > Rick Jones wrote:
> > 
> >> On 10/14/2011 08:53 AM, Michael Chan wrote:
> >>> Rick Jones wrote:
> >>>
> >>>> Is this perhaps saying that a bnx2x-driven device being used for
> >>>> FCoE or iSCSI boot must not permit *any* run-time configuration
> >>>> change which leads to a NIC reset?
> >>>>
> >>>
> >>> That is right.  Unless you have a multipath configuration with
> >>> multiple ports, then you can reset one port at a time.
> >>
> >> So, should there also be a "cnic_boot_device" check in many of the
> >> "capital letter" ethtool paths?
> >>
> > 
> > If the user is doing ethtool configuration changes or device shutdown,
> > it is more obvious what the consequence will be.  The user may also be
> > careful to do it on a multipath setup.
> > 
> > The reset caused by the auto turn-off of LRO when you enable
> > ip_forward or bridging will not be obvious to the user.  In addition,
> > all devices with LRO turned on will be reset at the same time so even
> > multipath will not survive.
> >
> 
> But after the reset the device should log in and the SCSI layer should
> handle retries. So I don't see why this is a problem. Why do we
> need to handle this any differently from any other link event?
> 

During a link down event, the iSCSI state does not get reset.  When link
comes back up quickly enough, there should be just some retransmissions
and everything should recover.  The root file system won't tolerate a
chip reset that will reset the iSCSI state.


David Miller Oct. 19, 2011, 8:06 p.m. UTC | #7
From: "Michael Chan" <mchan@broadcom.com>
Date: Thu, 13 Oct 2011 20:38:01 -0700

> From: Dmitry Kravkov <dmitry@broadcom.com>
> 
> For an FCoE or iSCSI boot device, the networking side must stay "up" all
> the time.  Otherwise, the FCoE/iSCSI interface driven by bnx2i/bnx2fc
> will be reset and we'll lose the root file system.
> 
> If LRO is enabled, scripts that enable IP forwarding or bridging will
> disable LRO and cause the device to be reset.  Disabling LRO on these
> boot devices will prevent the reset.
> 
> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>

You're still going to have bugs after this.

What if you get a FIFO overflow or other error condition which requires
a chip reset?  You'll lose the root filesystem.  Why bother resetting
the chip at all if it's going to be useless afterwards?

The bug is in the fact that iSCSI context state isn't preserved or
reloaded across a chip reset.

Please fix that instead, and that way you won't have to add special
hacks for the root filesystem.  Everything will "just work".

Michael Chan Oct. 19, 2011, 8:12 p.m. UTC | #8
On Wed, 2011-10-19 at 13:06 -0700, David Miller wrote:
> From: "Michael Chan" <mchan@broadcom.com>
> Date: Thu, 13 Oct 2011 20:38:01 -0700
> 
> > From: Dmitry Kravkov <dmitry@broadcom.com>
> > 
> > For an FCoE or iSCSI boot device, the networking side must stay "up" all
> > the time.  Otherwise, the FCoE/iSCSI interface driven by bnx2i/bnx2fc
> > will be reset and we'll lose the root file system.
> > 
> > If LRO is enabled, scripts that enable IP forwarding or bridging will
> > disable LRO and cause the device to be reset.  Disabling LRO on these
> > boot devices will prevent the reset.
> > 
> > Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
> > Signed-off-by: Michael Chan <mchan@broadcom.com>
> 
> You're still going to have bugs after this.
> 
> What if you get a FIFO overflow or other error condition which requires
> a chip reset?  You'll lose the root filesystem.

That would be no different than a scsi driver experiencing fatal errors,
wouldn't it?

>   Why bother resetting
> the chip at all if it's going to be useless afterwards?

If the user has configured multipath to the storage target, we can still
reset each port separately.

What we want to prevent is any hidden reset during normal operations.

> 
> The bug is in the fact that iSCSI context state isn't preserved or
> reloaded across a chip reset.
> 
> Please fix that instead, and that way you won't have to add special
> hacks for the root filesystem.  Everything will "just work".
> 
> 


David Miller Oct. 19, 2011, 8:47 p.m. UTC | #9
From: "Michael Chan" <mchan@broadcom.com>
Date: Wed, 19 Oct 2011 13:12:52 -0700

> 
> On Wed, 2011-10-19 at 13:06 -0700, David Miller wrote:
>> From: "Michael Chan" <mchan@broadcom.com>
>> Date: Thu, 13 Oct 2011 20:38:01 -0700
>> 
>> > From: Dmitry Kravkov <dmitry@broadcom.com>
>> > 
>> > For an FCoE or iSCSI boot device, the networking side must stay "up" all
>> > the time.  Otherwise, the FCoE/iSCSI interface driven by bnx2i/bnx2fc
>> > will be reset and we'll lose the root file system.
>> > 
>> > If LRO is enabled, scripts that enable IP forwarding or bridging will
>> > disable LRO and cause the device to be reset.  Disabling LRO on these
>> > boot devices will prevent the reset.
>> > 
>> > Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
>> > Signed-off-by: Michael Chan <mchan@broadcom.com>
>> 
>> You're still going to have bugs after this.
>> 
>> What if you get a FIFO overflow or other error condition which requires
>> a chip reset?  You'll lose the root filesystem.
> 
> That would be no different than a scsi driver experiencing fatal errors,
> wouldn't it?

It's not fatal if you can bring the chip back up after the reset
because this is networking.

These things are protocols, built on top of networking technology,
with retransmits, handshakes, and all sorts of features designed
to provide reliability.

Things like a LRO change ought to be completely transparent.
John Fastabend Oct. 19, 2011, 8:53 p.m. UTC | #10
On 10/19/2011 1:47 PM, David Miller wrote:
> From: "Michael Chan" <mchan@broadcom.com>
> Date: Wed, 19 Oct 2011 13:12:52 -0700
> 
>>
>> On Wed, 2011-10-19 at 13:06 -0700, David Miller wrote:
>>> From: "Michael Chan" <mchan@broadcom.com>
>>> Date: Thu, 13 Oct 2011 20:38:01 -0700
>>>
>>>> From: Dmitry Kravkov <dmitry@broadcom.com>
>>>>
>>>> For an FCoE or iSCSI boot device, the networking side must stay "up" all
>>>> the time.  Otherwise, the FCoE/iSCSI interface driven by bnx2i/bnx2fc
>>>> will be reset and we'll lose the root file system.
>>>>
>>>> If LRO is enabled, scripts that enable IP forwarding or bridging will
>>>> disable LRO and cause the device to be reset.  Disabling LRO on these
>>>> boot devices will prevent the reset.
>>>>
>>>> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
>>>> Signed-off-by: Michael Chan <mchan@broadcom.com>
>>>
>>> You're still going to have bugs after this.
>>>
>>> What if you get a FIFO overflow or other error condition which requires
>>> a chip reset?  You'll lose the root filesystem.
>>
>> That would be no different than a scsi driver experiencing fatal errors,
>> wouldn't it?
> 
> It's not fatal if you can bring the chip back up after the reset
> because this is networking.
> 
> These things are protocols, built on top of networking technology,
> with retransmits, handshakes, and all sorts of features designed
> to provide reliability.
> 
> Things like a LRO change ought to be completely transparent.

As a reference point, this works fine in both FCoE and iSCSI stacks
today. The device is reset or link is lost for whatever reason;
when the link comes back up, the stack logs back in, enumerates
the LUNs, and the SCSI stack recovers as expected.

Firmware should do the equivalent login, lun enumeration, etc as
needed.

.John
David Miller Oct. 19, 2011, 9:03 p.m. UTC | #11
From: John Fastabend <john.r.fastabend@intel.com>
Date: Wed, 19 Oct 2011 13:53:42 -0700

> As a reference point, this works fine in both FCoE and iSCSI stacks
> today. The device is reset or link is lost for whatever reason;
> when the link comes back up, the stack logs back in, enumerates
> the LUNs, and the SCSI stack recovers as expected.
> 
> Firmware should do the equivalent login, lun enumeration, etc as
> needed.

I rest my case :-)
Michael Chan Oct. 27, 2011, 11:30 p.m. UTC | #12
On Wed, 2011-10-19 at 13:53 -0700, John Fastabend wrote:
> As a reference point, this works fine in both FCoE and iSCSI stacks
> today. The device is reset or link is lost for whatever reason;
> when the link comes back up, the stack logs back in, enumerates
> the LUNs, and the SCSI stack recovers as expected.
> 
> Firmware should do the equivalent login, lun enumeration, etc as
> needed.

Just a quick follow-up on this issue.  Our firmware actually performs
the same logout before the reset and login after the reset.  For iSCSI,
the problem on our device was actually caused by our userspace daemon
logging events to a log file in the root fs.  The file I/O was blocked
and the daemon could not proceed to do the important operations during
the reset, and this caused filesystem I/O errors.  We have now fixed the
problem in the userspace daemon.

For FCoE, there is no logging issue and the root fs failure seems to
happen only in a multipath configuration with all paths going down for a
short time (caused by reset in this case).  We believe this also affects
other devices and not just ours.  We are now working with the multipath
maintainer to understand this issue.

So this confirms that the original patch for bnx2x is not needed.

Thanks.



Patch

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 6486ab8..4960048 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -9794,6 +9794,7 @@  static int __devinit bnx2x_init_bp(struct bnx2x *bp)
 	int func;
 	int timer_interval;
 	int rc;
+	u32 cnic_boot_device;
 
 	mutex_init(&bp->port.phy_mutex);
 	mutex_init(&bp->fw_mb_mutex);
@@ -9840,8 +9841,11 @@  static int __devinit bnx2x_init_bp(struct bnx2x *bp)
 
 	bp->multi_mode = multi_mode;
 
+	cnic_boot_device =
+		!!SHMEM_RD(bp, func_mb[BP_FW_MB_IDX(bp)].iscsi_boot_signature);
+
 	/* Set TPA flags */
-	if (disable_tpa) {
+	if (disable_tpa || cnic_boot_device) {
 		bp->flags &= ~TPA_ENABLE_FLAG;
 		bp->dev->features &= ~NETIF_F_LRO;
 	} else {
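
The flag decision made by the hunk above can be modelled outside the kernel roughly as follows. This is a minimal sketch, not driver code: `struct model_dev`, `model_set_tpa_flags()` and the constant values are hypothetical stand-ins for `bp->flags`, `bp->dev->features`, `TPA_ENABLE_FLAG` and `NETIF_F_LRO`, and the boot signature is passed in as a plain integer instead of being read with `SHMEM_RD()`. The hunk is cut off at the `else`, so the body of that branch here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real driver and kernel define their own. */
#define MODEL_TPA_ENABLE_FLAG 0x1u  /* stands in for TPA_ENABLE_FLAG */
#define MODEL_FEATURE_LRO     0x2u  /* stands in for NETIF_F_LRO */

struct model_dev {
    uint32_t flags;     /* stands in for bp->flags */
    uint32_t features;  /* stands in for bp->dev->features */
};

/* Mirrors the patched condition: TPA/LRO is switched off either when
 * the disable_tpa parameter is set or when a non-zero boot signature
 * marks the function as an FCoE/iSCSI boot device. */
static void model_set_tpa_flags(struct model_dev *d, bool disable_tpa,
                                uint32_t boot_signature)
{
    bool cnic_boot_device = !!boot_signature;

    assert(d != NULL);
    if (disable_tpa || cnic_boot_device) {
        d->flags &= ~MODEL_TPA_ENABLE_FLAG;
        d->features &= ~MODEL_FEATURE_LRO;
    } else {
        /* Assumed: the truncated else branch enables both bits. */
        d->flags |= MODEL_TPA_ENABLE_FLAG;
        d->features |= MODEL_FEATURE_LRO;
    }
}
```

Calling `model_set_tpa_flags(&d, false, 0)` leaves both bits set, while any non-zero signature clears them even when `disable_tpa` is false. The `!!` matches the patch and collapses the 32-bit signature to 0 or 1.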