Message ID | 1477005092-20632-4-git-send-email-gwshan@linux.vnet.ibm.com |
---|---|
State | Accepted, archived |
Headers | show |
On Fri, Oct 21, 2016 at 9:41 AM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote: > The issue was found on BCM5718 which has two NCSI channels in one > package: C0 and C1. C0 is in link-up state while C1 is in link-down > state. C0 is chosen as active channel until unplugging and plugging > C0's cable: On unplugging C0's cable, LSC (Link State Change) AEN > packet received on C0 to report link-down event. After that, C1 is > chosen as active channel. LSC AEN for link-up event is lost on C0 > when plugging C0's cable back. We lose the network even C0 is usable. > > This resolves the issue by recording the (hot) channel that was ever > chosen as active one. The hot channel is chosen to be active one > if none of available channels in link-up state. With this, C0 is still > the active one after unplugging C0's cable. LSC AEN packet received > on C0 when plugging its cable back. > > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> > --- > net/ncsi/internal.h | 1 + > net/ncsi/ncsi-manage.c | 22 +++++++++++++++++++--- > 2 files changed, 20 insertions(+), 3 deletions(-) > > diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h > index eac4858..1308a56 100644 > --- a/net/ncsi/internal.h > +++ b/net/ncsi/internal.h > @@ -265,6 +265,7 @@ struct ncsi_dev_priv { > #endif > unsigned int package_num; /* Number of packages */ > struct list_head packages; /* List of packages */ > + struct ncsi_channel *hot_channel; /* Channel was ever active */ > struct ncsi_request requests[256]; /* Request table */ > unsigned int request_id; /* Last used request ID */ > #define NCSI_REQ_START_IDX 1 > diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c > index 4789f86..a3bd5fa 100644 > --- a/net/ncsi/ncsi-manage.c > +++ b/net/ncsi/ncsi-manage.c > @@ -645,6 +645,7 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp) > struct net_device *dev = nd->dev; > struct ncsi_package *np = ndp->active_package; > struct ncsi_channel *nc = ndp->active_channel; > + struct ncsi_channel *hot_nc = NULL; > struct ncsi_cmd_arg nca; > unsigned char index; > unsigned long flags; > @@ -750,12 +751,20 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp) > break; > case ncsi_dev_state_config_done: > spin_lock_irqsave(&nc->lock, flags); > - if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1) > + if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1) { > + hot_nc = nc; > nc->state = NCSI_CHANNEL_ACTIVE; > - else > + } else { > + hot_nc = NULL; > nc->state = NCSI_CHANNEL_INACTIVE; > + } > spin_unlock_irqrestore(&nc->lock, flags); > > + /* Update the hot channel */ > + spin_lock_irqsave(&ndp->lock, flags); > + ndp->hot_channel = hot_nc; > + spin_unlock_irqrestore(&ndp->lock, flags); > + > ncsi_start_channel_monitor(nc); > ncsi_process_next_channel(ndp); > break; > @@ -773,10 +782,14 @@ error: > static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp) > { > struct ncsi_package *np; > - struct ncsi_channel *nc, *found; > + struct ncsi_channel *nc, *found, *hot_nc; > struct ncsi_channel_mode *ncm; > unsigned long flags; > > + spin_lock_irqsave(&ndp->lock, flags); > + hot_nc = ndp->hot_channel; > + spin_unlock_irqrestore(&ndp->lock, flags); > + Can you explain why we need to take a lock here? > /* The search is done once an inactive channel with up > * link is found. > */ > @@ -794,6 +807,9 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp) > if (!found) > found = nc; > > + if (nc == hot_nc) > + found = nc; > + > ncm = &nc->modes[NCSI_MODE_LINK]; > if (ncm->data[2] & 0x1) { > spin_unlock_irqrestore(&nc->lock, flags); > -- > 2.1.0 >
On Fri, Oct 21, 2016 at 12:50:39PM +1030, Joel Stanley wrote: >On Fri, Oct 21, 2016 at 9:41 AM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote: >> The issue was found on BCM5718 which has two NCSI channels in one >> package: C0 and C1. C0 is in link-up state while C1 is in link-down >> state. C0 is chosen as active channel until unplugging and plugging >> C0's cable: On unplugging C0's cable, LSC (Link State Change) AEN >> packet received on C0 to report link-down event. After that, C1 is >> chosen as active channel. LSC AEN for link-up event is lost on C0 >> when plugging C0's cable back. We lose the network even C0 is usable. >> >> This resolves the issue by recording the (hot) channel that was ever >> chosen as active one. The hot channel is chosen to be active one >> if none of available channels in link-up state. With this, C0 is still >> the active one after unplugging C0's cable. LSC AEN packet received >> on C0 when plugging its cable back. >> >> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> >> --- >> net/ncsi/internal.h | 1 + >> net/ncsi/ncsi-manage.c | 22 +++++++++++++++++++--- >> 2 files changed, 20 insertions(+), 3 deletions(-) >> >> diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h >> index eac4858..1308a56 100644 >> --- a/net/ncsi/internal.h >> +++ b/net/ncsi/internal.h >> @@ -265,6 +265,7 @@ struct ncsi_dev_priv { >> #endif >> unsigned int package_num; /* Number of packages */ >> struct list_head packages; /* List of packages */ >> + struct ncsi_channel *hot_channel; /* Channel was ever active */ >> struct ncsi_request requests[256]; /* Request table */ >> unsigned int request_id; /* Last used request ID */ >> #define NCSI_REQ_START_IDX 1 >> diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c >> index 4789f86..a3bd5fa 100644 >> --- a/net/ncsi/ncsi-manage.c >> +++ b/net/ncsi/ncsi-manage.c >> @@ -645,6 +645,7 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp) >> struct net_device *dev = nd->dev; >> struct ncsi_package *np = ndp->active_package; >> struct ncsi_channel *nc = ndp->active_channel; >> + struct ncsi_channel *hot_nc = NULL; >> struct ncsi_cmd_arg nca; >> unsigned char index; >> unsigned long flags; >> @@ -750,12 +751,20 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp) >> break; >> case ncsi_dev_state_config_done: >> spin_lock_irqsave(&nc->lock, flags); >> - if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1) >> + if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1) { >> + hot_nc = nc; >> nc->state = NCSI_CHANNEL_ACTIVE; >> - else >> + } else { >> + hot_nc = NULL; >> nc->state = NCSI_CHANNEL_INACTIVE; >> + } >> spin_unlock_irqrestore(&nc->lock, flags); >> >> + /* Update the hot channel */ >> + spin_lock_irqsave(&ndp->lock, flags); >> + ndp->hot_channel = hot_nc; >> + spin_unlock_irqrestore(&ndp->lock, flags); >> + >> ncsi_start_channel_monitor(nc); >> ncsi_process_next_channel(ndp); >> break; >> @@ -773,10 +782,14 @@ error: >> static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp) >> { >> struct ncsi_package *np; >> - struct ncsi_channel *nc, *found; >> + struct ncsi_channel *nc, *found, *hot_nc; >> struct ncsi_channel_mode *ncm; >> unsigned long flags; >> >> + spin_lock_irqsave(&ndp->lock, flags); >> + hot_nc = ndp->hot_channel; >> + spin_unlock_irqrestore(&ndp->lock, flags); >> + > >Can you explain why we need to take a lock here? > I don't think it's too useful for now as AST2x00 are UP systems and HWA isn't enabled so far. In extreme situation when NCSI stack runs on SMP system and HWA (Hardware arbitration) is enabled. There might be multiple accessors and updaters. It's why we need a spinlock to ensure consistent access/update. Thanks, Gavin > >> /* The search is done once an inactive channel with up >> * link is found. >> */ >> @@ -794,6 +807,9 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp) >> if (!found) >> found = nc; >> >> + if (nc == hot_nc) >> + found = nc; >> + >> ncm = &nc->modes[NCSI_MODE_LINK]; >> if (ncm->data[2] & 0x1) { >> spin_unlock_irqrestore(&nc->lock, flags); >> -- >> 2.1.0 >> >
diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h index eac4858..1308a56 100644 --- a/net/ncsi/internal.h +++ b/net/ncsi/internal.h @@ -265,6 +265,7 @@ struct ncsi_dev_priv { #endif unsigned int package_num; /* Number of packages */ struct list_head packages; /* List of packages */ + struct ncsi_channel *hot_channel; /* Channel was ever active */ struct ncsi_request requests[256]; /* Request table */ unsigned int request_id; /* Last used request ID */ #define NCSI_REQ_START_IDX 1 diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c index 4789f86..a3bd5fa 100644 --- a/net/ncsi/ncsi-manage.c +++ b/net/ncsi/ncsi-manage.c @@ -645,6 +645,7 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp) struct net_device *dev = nd->dev; struct ncsi_package *np = ndp->active_package; struct ncsi_channel *nc = ndp->active_channel; + struct ncsi_channel *hot_nc = NULL; struct ncsi_cmd_arg nca; unsigned char index; unsigned long flags; @@ -750,12 +751,20 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp) break; case ncsi_dev_state_config_done: spin_lock_irqsave(&nc->lock, flags); - if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1) + if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1) { + hot_nc = nc; nc->state = NCSI_CHANNEL_ACTIVE; - else + } else { + hot_nc = NULL; nc->state = NCSI_CHANNEL_INACTIVE; + } spin_unlock_irqrestore(&nc->lock, flags); + /* Update the hot channel */ + spin_lock_irqsave(&ndp->lock, flags); + ndp->hot_channel = hot_nc; + spin_unlock_irqrestore(&ndp->lock, flags); + ncsi_start_channel_monitor(nc); ncsi_process_next_channel(ndp); break; @@ -773,10 +782,14 @@ error: static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp) { struct ncsi_package *np; - struct ncsi_channel *nc, *found; + struct ncsi_channel *nc, *found, *hot_nc; struct ncsi_channel_mode *ncm; unsigned long flags; + spin_lock_irqsave(&ndp->lock, flags); + hot_nc = ndp->hot_channel; + spin_unlock_irqrestore(&ndp->lock, flags); + /* The search is done once an inactive channel with up * link is found. */ @@ -794,6 +807,9 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp) if (!found) found = nc; + if (nc == hot_nc) + found = nc; + ncm = &nc->modes[NCSI_MODE_LINK]; if (ncm->data[2] & 0x1) { spin_unlock_irqrestore(&nc->lock, flags);
The issue was found on BCM5718 which has two NCSI channels in one package: C0 and C1. C0 is in link-up state while C1 is in link-down state. C0 is chosen as active channel until unplugging and plugging C0's cable: On unplugging C0's cable, LSC (Link State Change) AEN packet received on C0 to report link-down event. After that, C1 is chosen as active channel. LSC AEN for link-up event is lost on C0 when plugging C0's cable back. We lose the network even C0 is usable. This resolves the issue by recording the (hot) channel that was ever chosen as active one. The hot channel is chosen to be active one if none of available channels in link-up state. With this, C0 is still the active one after unplugging C0's cable. LSC AEN packet received on C0 when plugging its cable back. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> --- net/ncsi/internal.h | 1 + net/ncsi/ncsi-manage.c | 22 +++++++++++++++++++--- 2 files changed, 20 insertions(+), 3 deletions(-)