[ovs-dev,v4,2/2] Avoid packet drop on LACP bond after link up
diff mbox series

Message ID 1551391035-18844-2-git-send-email-nitin.katiyar@ericsson.com
State Superseded
Headers show
Series
  • [ovs-dev,v4,1/2] Avoid packet drop on LACP bond after link up
Related show

Commit Message

Nitin Katiyar Feb. 28, 2019, 1:56 p.m. UTC
Problem:

Comments

Ben Pfaff March 1, 2019, 6:45 p.m. UTC | #1
I don't entirely understand the problem.  It seems like a driver bug.
Why isn't the bug being fixed?

Patches 1 and 2 have the same title.  It would be better if they were
different.

[more below]

On Thu, Feb 28, 2019 at 01:56:20PM +0000, Nitin Katiyar wrote:
> +    /*
> +     * On some NICs L1 state reporting is slow. In case LACP packets are
> +     * received while carrier (L1) state is still down, drop the LACPDU and
> +     * trigger re-checking of L1 state.
> +     */
> +    carrier_up = bond_slave_get_carrier(bond, slave->aux);
> +    if (!carrier_up) {
> +        seq_change(connectivity_seq_get());
> +
> +        VLOG_INFO_RL(&rl, "carrier still DOWN - conn seq changed for %s, "
> +                  "dropping packet\n", slave->name);

It's going to be hard for a user to understand what this means,
especially the "conn seq changed" part.  The user also doesn't know the
context, so it's not clear what "carrier still DOWN" means--still down
after what?  Can you please rephrase the log message so that it's clear
what happened and why it's unusual?

Please put the name of the lacp object at the beginning of the message.

Please don't add \n at the end of a log message; the logger does that
itself.

In bond_slave_get_carrier(), please use CONST_CAST for casting away
const.
Nitin Katiyar March 2, 2019, 12:14 p.m. UTC | #2
Hi,

> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org]
> Sent: Saturday, March 02, 2019 12:15 AM
> To: Nitin Katiyar <nitin.katiyar@ericsson.com>
> Cc: ovs-dev@openvswitch.org; Manohar Krishnappa Chidambaraswamy
> <manukc@gmail.com>
> Subject: Re: [ovs-dev] [PATCH v4 2/2] Avoid packet drop on LACP bond after
> link up
> 
> I don't entirely understand the problem.  It seems like a driver bug.
> Why isn't the bug being fixed?
I agree it is driver/firmware bug but OVS implementation also doesn't account for link state which causes traffic drop in case of bond.
> 
> Patches 1 and 2 have the same title.  It would be better if they were different.
Both issues had surfaced in same scenario so we decided to have 2 patches. But we can send it as 2 different patches next time.
> 
> [more below]
> 
> On Thu, Feb 28, 2019 at 01:56:20PM +0000, Nitin Katiyar wrote:
> > +    /*
> > +     * On some NICs L1 state reporting is slow. In case LACP packets are
> > +     * received while carrier (L1) state is still down, drop the LACPDU and
> > +     * trigger re-checking of L1 state.
> > +     */
> > +    carrier_up = bond_slave_get_carrier(bond, slave->aux);
> > +    if (!carrier_up) {
> > +        seq_change(connectivity_seq_get());
> > +
> > +        VLOG_INFO_RL(&rl, "carrier still DOWN - conn seq changed for %s, "
> > +                  "dropping packet\n", slave->name);
> 
> It's going to be hard for a user to understand what this means, especially the
> "conn seq changed" part.  The user also doesn't know the context, so it's not
> clear what "carrier still DOWN" means--still down after what?  Can you please
> rephrase the log message so that it's clear what happened and why it's
> unusual?
Sure, I will rephrase it.
> 
> Please put the name of the lacp object at the beginning of the message.
> 
> Please don't add \n at the end of a log message; the logger does that itself.
> 
> In bond_slave_get_carrier(), please use CONST_CAST for casting away const.
I will update all these three in next patch.

Thanks,
Nitin
Ben Pfaff March 4, 2019, 5:47 p.m. UTC | #3
On Sat, Mar 02, 2019 at 12:14:02PM +0000, Nitin Katiyar wrote:
> Hi,
> 
> > -----Original Message-----
> > From: Ben Pfaff [mailto:blp@ovn.org]
> > Sent: Saturday, March 02, 2019 12:15 AM
> > To: Nitin Katiyar <nitin.katiyar@ericsson.com>
> > Cc: ovs-dev@openvswitch.org; Manohar Krishnappa Chidambaraswamy
> > <manukc@gmail.com>
> > Subject: Re: [ovs-dev] [PATCH v4 2/2] Avoid packet drop on LACP bond after
> > link up
> > 
> > I don't entirely understand the problem.  It seems like a driver bug.
> > Why isn't the bug being fixed?
> I agree it is driver/firmware bug but OVS implementation also doesn't account for link state which causes traffic drop in case of bond.

OK.

Could you update the commit message to make it clear that there's an OVS
problem instead of just a NIC driver problem?  When I look at it as a
NIC driver problem only, I'm less inclined to take the fix, but if it's
a problem in OVS then of course we'll fix it.

> > Patches 1 and 2 have the same title.  It would be better if they were different.
> Both issues had surfaced in same scenario so we decided to have 2 patches. But we can send it as 2 different patches next time.

It's fine as two patches, I just want their subjects to be different.
Otherwise references to them post-commit will be ambiguous unless the
git commit number matches (which can't always be the case).

Patch
diff mbox series

========
On certain Fortville NICs it has been observed that PHY UP detection can
get delayed (sometimes up to 4-5 secs). When the driver fails to fetch
PHY status as UP even though its actually UP,  LACP packets can get
exchanged and LACP slave brought UP. In such a case, the remote end
would start sending traffic on that slave, but OVS drops it, as the
bond-slave is not yet enabled due to PHY DOWN.

Fix:
====
The main intention here is delay LACP negotiation until carrier (PHY)
status is read as UP.

1. In port_run()/bundle_run(), cache the carrier status in
  "struct ofport_dpif"/"struct bond_slave".

2. When carrier state is DOWN, do not send any LACPDUs and
  drop any received LACPDUs.

3. When LACP state changes or LACPDU is received, trigger re-checking
  of carrier-state (in port_run()) by incrementing the connectivity
  sequence number to find out the true carrier state as fast as
  possible.

Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
---
 lib/lacp.c                   | 44 +++++++++++++++++++++++++++++++++++---------
 lib/lacp.h                   |  8 ++++----
 ofproto/bond.c               | 30 ++++++++++++++++++++++++++++++
 ofproto/bond.h               |  3 +++
 ofproto/ofproto-dpif-xlate.c |  1 +
 ofproto/ofproto-dpif.c       | 10 +++++++---
 6 files changed, 80 insertions(+), 16 deletions(-)

diff --git a/lib/lacp.c b/lib/lacp.c
index e768012..2bfa267 100644
--- a/lib/lacp.c
+++ b/lib/lacp.c
@@ -33,6 +33,7 @@ 
 #include "unixctl.h"
 #include "openvswitch/vlog.h"
 #include "util.h"
+#include "ofproto/bond.h"
 
 VLOG_DEFINE_THIS_MODULE(lacp);
 
@@ -148,7 +149,7 @@  static void slave_get_actor(struct slave *, struct lacp_info *actor)
     OVS_REQUIRES(mutex);
 static void slave_get_priority(struct slave *, struct lacp_info *priority)
     OVS_REQUIRES(mutex);
-static bool slave_may_tx(const struct slave *)
+static bool slave_may_tx(const void *bond, const struct slave *)
     OVS_REQUIRES(mutex);
 static struct slave *slave_lookup(const struct lacp *, const void *slave)
     OVS_REQUIRES(mutex);
@@ -326,14 +327,15 @@  lacp_is_active(const struct lacp *lacp) OVS_EXCLUDED(mutex)
  * called on all packets received on 'slave_' with Ethernet Type ETH_TYPE_LACP.
  */
 bool
-lacp_process_packet(struct lacp *lacp, const void *slave_,
-                    const struct dp_packet *packet)
+lacp_process_packet(struct lacp *lacp, const void *bond,
+                    const void *slave_, const struct dp_packet *packet)
     OVS_EXCLUDED(mutex)
 {
     static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
     const struct lacp_pdu *pdu;
     long long int tx_rate;
     struct slave *slave;
+    bool carrier_up = false;
     bool lacp_may_enable = false;
 
     lacp_lock();
@@ -350,6 +352,21 @@  lacp_process_packet(struct lacp *lacp, const void *slave_,
         goto out;
     }
 
+    /*
+     * On some NICs L1 state reporting is slow. In case LACP packets are
+     * received while carrier (L1) state is still down, drop the LACPDU and
+     * trigger re-checking of L1 state.
+     */
+    carrier_up = bond_slave_get_carrier(bond, slave->aux);
+    if (!carrier_up) {
+        seq_change(connectivity_seq_get());
+
+        VLOG_INFO_RL(&rl, "carrier still DOWN - conn seq changed for %s, "
+                  "dropping packet\n", slave->name);
+
+        goto out;
+    }
+
     slave->status = LACP_CURRENT;
     tx_rate = lacp->fast ? LACP_FAST_TIME_TX : LACP_SLOW_TIME_TX;
     timer_set_duration(&slave->rx, LACP_RX_MULTIPLIER * tx_rate);
@@ -529,7 +546,8 @@  lacp_slave_is_current(const struct lacp *lacp, const void *slave_)
 
 /* This function should be called periodically to update 'lacp'. */
 void
-lacp_run(struct lacp *lacp, lacp_send_pdu *send_pdu) OVS_EXCLUDED(mutex)
+lacp_run(struct lacp *lacp, const void *bond,
+         lacp_send_pdu *send_pdu) OVS_EXCLUDED(mutex)
 {
     struct slave *slave;
 
@@ -559,7 +577,7 @@  lacp_run(struct lacp *lacp, lacp_send_pdu *send_pdu) OVS_EXCLUDED(mutex)
     HMAP_FOR_EACH (slave, node, &lacp->slaves) {
         struct lacp_info actor;
 
-        if (!slave_may_tx(slave)) {
+        if (!slave_may_tx(bond, slave)) {
             continue;
         }
 
@@ -588,13 +606,13 @@  lacp_run(struct lacp *lacp, lacp_send_pdu *send_pdu) OVS_EXCLUDED(mutex)
 
 /* Causes poll_block() to wake up when lacp_run() needs to be called again. */
 void
-lacp_wait(struct lacp *lacp) OVS_EXCLUDED(mutex)
+lacp_wait(struct lacp *lacp, const void *bond) OVS_EXCLUDED(mutex)
 {
     struct slave *slave;
 
     lacp_lock();
     HMAP_FOR_EACH (slave, node, &lacp->slaves) {
-        if (slave_may_tx(slave)) {
+        if (slave_may_tx(bond, slave)) {
             timer_wait(&slave->tx);
         }
 
@@ -818,9 +836,17 @@  slave_get_priority(struct slave *slave, struct lacp_info *priority)
 }
 
 static bool
-slave_may_tx(const struct slave *slave) OVS_REQUIRES(mutex)
+slave_may_tx(const void *bond, const struct slave *slave)
+    OVS_REQUIRES(mutex)
 {
-    return slave->lacp->active || slave->status != LACP_DEFAULTED;
+    bool carrier_up = true;
+
+    if (bond) {
+        carrier_up = bond_slave_get_carrier(bond, slave->aux);
+    }
+
+    return carrier_up &&
+        (slave->lacp->active || slave->status != LACP_DEFAULTED);
 }
 
 static struct slave *
diff --git a/lib/lacp.h b/lib/lacp.h
index 0dfaef0..808d671 100644
--- a/lib/lacp.h
+++ b/lib/lacp.h
@@ -46,8 +46,8 @@  struct lacp *lacp_ref(const struct lacp *);
 void lacp_configure(struct lacp *, const struct lacp_settings *);
 bool lacp_is_active(const struct lacp *);
 
-bool lacp_process_packet(struct lacp *, const void *slave,
-                         const struct dp_packet *packet);
+bool lacp_process_packet(struct lacp *, const void *bond,
+                         const void *slave, const struct dp_packet *packet);
 enum lacp_status lacp_status(const struct lacp *);
 
 struct lacp_slave_settings {
@@ -67,8 +67,8 @@  bool lacp_slave_is_current(const struct lacp *, const void *slave_);
 /* Callback function for lacp_run() for sending a LACP PDU. */
 typedef void lacp_send_pdu(void *slave, const void *pdu, size_t pdu_size);
 
-void lacp_run(struct lacp *, lacp_send_pdu *);
-void lacp_wait(struct lacp *);
+void lacp_run(struct lacp *, const void *, lacp_send_pdu *);
+void lacp_wait(struct lacp *, const void *);
 
 struct lacp_slave_stats {
     /* id */
diff --git a/ofproto/bond.c b/ofproto/bond.c
index cd6fd33..68c3acb 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -93,6 +93,7 @@  struct bond_slave {
     /* Link status. */
     bool enabled;               /* May be chosen for flows? */
     bool may_enable;            /* Client considers this slave bondable. */
+    bool carrier_up;            /* Carrier state from NIC port */
     long long delay_expires;    /* Time after which 'enabled' may change. */
 
     /* Rebalancing info.  Used only by bond_rebalance(). */
@@ -634,6 +635,35 @@  bond_slave_set_may_enable(struct bond *bond, void *slave_, bool may_enable)
     ovs_rwlock_unlock(&rwlock);
 }
 
+void
+bond_slave_set_carrier(struct bond *bond, void *slave_, bool carrier_up)
+{
+    struct bond_slave *slave = NULL;
+
+    ovs_rwlock_wrlock(&rwlock);
+    slave = bond_slave_lookup(bond, slave_);
+    if (slave) {
+        slave->carrier_up = carrier_up;
+    }
+    ovs_rwlock_unlock(&rwlock);
+}
+
+bool
+bond_slave_get_carrier(const struct bond *bond, void *slave_)
+{
+    bool carrier_up = false;
+    struct bond_slave *slave;
+
+    ovs_rwlock_rdlock(&rwlock);
+    slave = bond_slave_lookup((struct bond *)bond, slave_);
+    if (slave) {
+        carrier_up = slave->carrier_up;
+    }
+    ovs_rwlock_unlock(&rwlock);
+
+    return carrier_up;
+}
+
 /* Performs periodic maintenance on 'bond'.
  *
  * Returns true if the caller should revalidate its flows.
diff --git a/ofproto/bond.h b/ofproto/bond.h
index e7c3d9b..743e2b2 100644
--- a/ofproto/bond.h
+++ b/ofproto/bond.h
@@ -79,6 +79,9 @@  void bond_wait(struct bond *);
 
 void bond_slave_set_may_enable(struct bond *, void *slave_, bool may_enable);
 
+void bond_slave_set_carrier(struct bond *bond, void *slave_, bool carrier_up);
+bool bond_slave_get_carrier(const struct bond *bond, void *slave_);
+
 /* Special MAC learning support for SLB bonding. */
 bool bond_should_send_learning_packets(struct bond *);
 struct dp_packet *bond_compose_learning_packet(struct bond *,
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 8f44566..18fdb82 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3329,6 +3329,7 @@  process_special(struct xlate_ctx *ctx, const struct xport *xport)
                && flow->dl_type == htons(ETH_TYPE_LACP)) {
         if (packet) {
             lacp_may_enable = lacp_process_packet(xport->xbundle->lacp,
+                                                  xport->xbundle->bond,
                                                   xport->ofport, packet);
             /* Update LACP status in bond-slave to avoid packet-drops until
              * LACP state machine is run by the main thread. */
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 21dd54b..f494a36 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -137,6 +137,7 @@  struct ofport_dpif {
     struct cfm *cfm;            /* Connectivity Fault Management, if any. */
     struct bfd *bfd;            /* BFD, if any. */
     struct lldp *lldp;          /* lldp, if any. */
+    bool carrier_up;            /* Carrier state from NIC */
     bool is_tunnel;             /* This port is a tunnel. */
     long long int carrier_seq;  /* Carrier status changes. */
     struct ofport_dpif *peer;   /* Peer if patch port. */
@@ -3324,12 +3325,13 @@  static void
 bundle_run(struct ofbundle *bundle)
 {
     if (bundle->lacp) {
-        lacp_run(bundle->lacp, send_pdu_cb);
+        lacp_run(bundle->lacp, bundle->bond, send_pdu_cb);
     }
     if (bundle->bond) {
         struct ofport_dpif *port;
 
         LIST_FOR_EACH (port, bundle_node, &bundle->ports) {
+            bond_slave_set_carrier(bundle->bond, port, port->carrier_up);
             bond_slave_set_may_enable(bundle->bond, port, port->up.may_enable);
         }
 
@@ -3347,7 +3349,7 @@  static void
 bundle_wait(struct ofbundle *bundle)
 {
     if (bundle->lacp) {
-        lacp_wait(bundle->lacp);
+        lacp_wait(bundle->lacp, bundle->bond);
     }
     if (bundle->bond) {
         bond_wait(bundle->bond);
@@ -3563,8 +3565,10 @@  ofport_update_peer(struct ofport_dpif *ofport)
 static bool
 may_enable_port(struct ofport_dpif *ofport)
 {
+    ofport->carrier_up = netdev_get_carrier(ofport->up.netdev);
+
     /* Carrier must be up. */
-    if (!netdev_get_carrier(ofport->up.netdev)) {
+    if (!ofport->carrier_up) {
         return false;
     }