diff mbox series

[ovs-dev,v3] bond: Reset stats when deleting post recirc rule.

Message ID 20240222155415.273552-1-amorenoz@redhat.com
State Accepted
Commit 436aba68d52891fb5775ec7651282ccf9d04176b
Delegated to: Ilya Maximets
Headers show
Series [ovs-dev,v3] bond: Reset stats when deleting post recirc rule. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Adrian Moreno Feb. 22, 2024, 3:54 p.m. UTC
In order to properly balance bond traffic, ofproto/bond periodically
reads usage statistics of the post-recirculation rules (which are added
to a hidden internal table).

To do that, each "struct bond_entry" (which represents a hash within a
bond) stores the last seen statistics for its rule. When a hash is moved
to another member (due to a bond rebalance or the previous member going
down), the rule is typically just modified, i.e: same match different
actions. In this case, statistics are preserved and accounting continues
to work.

However, if the rule gets completely deleted (e.g: when all bond members
go down) and then re-created, the new rule will have 0 tx_bytes but its
associated entry will still store a non-zero last-seen value.
This situation leads to an overflow of the delta calculation (computed
as [current_stats_value - last_seen_value]), which can affect traffic
as the hash will be considered to carry a lot of traffic and rebalancing
will kick in.

In order to fix this situation, reset the value of last seen statistics
on rule deletion.

Implementation notes:
Modifying pr_tx_bytes requires write-locking the global rwlock but a
lockless version of update_recirc_rules was being maintained to avoid locking
on bon_unref().
Considering the small impact of locking during bond removal, removing the
lockless version and relying on clang's thread safety analysis is preferred.

Also, folding Ilya's [1], i.e: fixing thread safety annotation in
update_recirc_rules() to require holding write-lock.

[1]
https://patchwork.ozlabs.org/project/openvswitch/patch/20240209161718.1149494-1-i.maximets@ovn.org/

Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
Co-authored-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 ofproto/bond.c        | 33 +++++++++++++++------------------
 tests/ofproto-dpif.at | 16 ++++++++++++++++
 2 files changed, 31 insertions(+), 18 deletions(-)

Comments

Ilya Maximets March 1, 2024, 11:02 p.m. UTC | #1
On 2/22/24 16:54, Adrian Moreno wrote:
> In order to properly balance bond traffic, ofproto/bond periodically
> reads usage statistics of the post-recirculation rules (which are added
> to a hidden internal table).
> 
> To do that, each "struct bond_entry" (which represents a hash within a
> bond) stores the last seen statistics for its rule. When a hash is moved
> to another member (due to a bond rebalance or the previous member going
> down), the rule is typically just modified, i.e: same match different
> actions. In this case, statistics are preserved and accounting continues
> to work.
> 
> However, if the rule gets completely deleted (e.g: when all bond members
> go down) and then re-created, the new rule will have 0 tx_bytes but its
> associated entry will still store a non-zero last-seen value.
> This situation leads to an overflow of the delta calculation (computed
> as [current_stats_value - last_seen_value]), which can affect traffic
> as the hash will be considered to carry a lot of traffic and rebalancing
> will kick in.
> 
> In order to fix this situation, reset the value of last seen statistics
> on rule deletion.
> 
> Implementation notes:
> Modifying pr_tx_bytes requires write-locking the global rwlock but a
> lockless version of update_recirc_rules was being maintained to avoid locking
> on bon_unref().
> Considering the small impact of locking during bond removal, removing the
> lockless version and relying on clang's thread safety analysis is preferred.
> 
> Also, folding Ilya's [1], i.e: fixing thread safety annotation in
> update_recirc_rules() to require holding write-lock.
> 
> [1]
> https://patchwork.ozlabs.org/project/openvswitch/patch/20240209161718.1149494-1-i.maximets@ovn.org/
> 
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
> Co-authored-by: Ilya Maximets <i.maximets@ovn.org>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---

Thanks!  Applied and backported down to 2.17.

Best regards, Ilya Maximets.
diff mbox series

Patch

diff --git a/ofproto/bond.c b/ofproto/bond.c
index cfdf44f85..84f59335f 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -186,7 +186,7 @@  static struct bond_member *choose_output_member(const struct bond *,
                                                 struct flow_wildcards *,
                                                 uint16_t vlan)
     OVS_REQ_RDLOCK(rwlock);
-static void update_recirc_rules__(struct bond *);
+static void update_recirc_rules(struct bond *) OVS_REQ_WRLOCK(rwlock);
 static bool bond_may_recirc(const struct bond *);
 static void bond_update_post_recirc_rules__(struct bond *, bool force)
     OVS_REQ_WRLOCK(rwlock);
@@ -299,7 +299,10 @@  bond_unref(struct bond *bond)
     }
     free(bond->hash);
     bond->hash = NULL;
-    update_recirc_rules__(bond);
+
+    ovs_rwlock_wrlock(&rwlock);
+    update_recirc_rules(bond);
+    ovs_rwlock_unlock(&rwlock);
 
     hmap_destroy(&bond->pr_rule_ops);
     free(bond->primary);
@@ -331,17 +334,8 @@  add_pr_rule(struct bond *bond, const struct match *match,
     hmap_insert(&bond->pr_rule_ops, &pr_op->hmap_node, hash);
 }
 
-/* This function should almost never be called directly.
- * 'update_recirc_rules()' should be called instead.  Since
- * this function modifies 'bond->pr_rule_ops', it is only
- * safe when 'rwlock' is held.
- *
- * However, when the 'bond' is the only reference in the system,
- * calling this function avoid acquiring lock only to satisfy
- * lock annotation. Currently, only 'bond_unref()' calls
- * this function directly.  */
 static void
-update_recirc_rules__(struct bond *bond)
+update_recirc_rules(struct bond *bond) OVS_REQ_WRLOCK(rwlock)
 {
     struct match match;
     struct bond_pr_rule_op *pr_op;
@@ -407,6 +401,15 @@  update_recirc_rules__(struct bond *bond)
 
                 VLOG_ERR("failed to remove post recirculation flow %s", err_s);
                 free(err_s);
+            } else if (bond->hash) {
+                /* If the flow deletion failed, a subsecuent call to
+                 * ofproto_dpif_add_internal_flow() would just modify the flow
+                 * preserving its statistics. Therefore, only reset the entry's
+                 * byte counter if it succeeds. */
+                uint32_t hash = pr_op->match.flow.dp_hash & BOND_MASK;
+                struct bond_entry *entry = &bond->hash[hash];
+
+                entry->pr_tx_bytes = 0;
             }
 
             hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node);
@@ -421,12 +424,6 @@  update_recirc_rules__(struct bond *bond)
     ofpbuf_uninit(&ofpacts);
 }
 
-static void
-update_recirc_rules(struct bond *bond)
-    OVS_REQ_RDLOCK(rwlock)
-{
-    update_recirc_rules__(bond);
-}
 
 /* Updates 'bond''s overall configuration to 's'.
  *
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index e305e7b9c..71737db8c 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -547,6 +547,22 @@  ovs-appctl time/warp 1000 100
 ovs-appctl bond/show > bond3.txt
 AT_CHECK([sed -n '/member p2/,/^$/p' bond3.txt | grep 'hash'], [0], [ignore])
 
+AT_CHECK([ovs-appctl netdev-dummy/set-admin-state p1 down], 0, [OK
+])
+AT_CHECK([ovs-appctl netdev-dummy/set-admin-state p2 down], 0, [OK
+])
+ovs-appctl time/warp 1000 100
+AT_CHECK([ovs-appctl netdev-dummy/set-admin-state p1 up], 0, [OK
+])
+AT_CHECK([ovs-appctl netdev-dummy/set-admin-state p2 up], 0, [OK
+])
+ovs-appctl time/warp 1000 100
+
+AT_CHECK([SEND_TCP_BOND_PKTS([p5], [5], [65500])])
+# We sent 49125 KB of data total in 3 batches.  No hash should have more
+# than that amount of load. Just checking that it is within 5 digits.
+AT_CHECK([ovs-appctl bond/show | grep -E '[[0-9]]{6}'], [1])
+
 OVS_VSWITCHD_STOP()
 AT_CLEANUP