diff mbox

[ovs-dev] ofproto/bond: simplify rebalancing logic

Message ID 1442351399-27850-1-git-send-email-azhou@nicira.com
State Accepted
Headers show

Commit Message

Andy Zhou Sept. 15, 2015, 9:09 p.m. UTC
The current bond relancing logic is more complicated than necessary.
When considering a bucket for rebalancing, we just need to make sure
post rebalancing traffic will be closer to the ideal traffic split
than before. This patch implements the simplification.

There is a bug is current algorithm that causes a heavyly loaded bucket
to ping-pong for each reblancing interval. The simplied loigc also fixes
this bug.

Though not the main motivation for the change, computations are now
done with integer math rather than floating math.

Reported-by: Gregory Smith <gasmith@nutanix.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
---
 ofproto/bond.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

Comments

Ben Pfaff Sept. 18, 2015, 9:56 p.m. UTC | #1
On Tue, Sep 15, 2015 at 02:09:59PM -0700, Andy Zhou wrote:
> The current bond relancing logic is more complicated than necessary.
> When considering a bucket for rebalancing, we just need to make sure
> post rebalancing traffic will be closer to the ideal traffic split
> than before. This patch implements the simplification.
> 
> There is a bug is current algorithm that causes a heavyly loaded bucket
> to ping-pong for each reblancing interval. The simplied loigc also fixes
> this bug.
> 
> Though not the main motivation for the change, computations are now
> done with integer math rather than floating math.
> 
> Reported-by: Gregory Smith <gasmith@nutanix.com>
> Signed-off-by: Andy Zhou <azhou@nicira.com>

It seems reasonable at first glance, but I wonder whether you went
through the history of this logic to make sure that we're not regressing
on any of the improvements we've made over time?

If you think so, then
Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou Sept. 18, 2015, 10:32 p.m. UTC | #2
>
> It seems reasonable at first glance, but I wonder whether you went
> through the history of this logic to make sure that we're not regressing
> on any of the improvements we've made over time?

This algorithm did not change much. As far as I can tell, It has been
the same since 2.0. I believe
the particular bug Gregory reported exists even before 2.0,  although
I am not sure if it will be worth while to
to back port to branches older than 2.0.

>
> If you think so, then
> Acked-by: Ben Pfaff <blp@nicira.com>

Thanks.
Andy Zhou Sept. 23, 2015, 1:15 a.m. UTC | #3
On Fri, Sep 18, 2015 at 3:32 PM, Andy Zhou <azhou@nicira.com> wrote:
>>
>> It seems reasonable at first glance, but I wonder whether you went
>> through the history of this logic to make sure that we're not regressing
>> on any of the improvements we've made over time?
>
> This algorithm did not change much. As far as I can tell, It has been
> the same since 2.0. I believe
> the particular bug Gregory reported exists even before 2.0,  although
> I am not sure if it will be worth while to
> to back port to branches older than 2.0.
>
>>
>> If you think so, then
>> Acked-by: Ben Pfaff <blp@nicira.com>
>
> Thanks.

Pushed to master and branch-2.3.
diff mbox

Patch

diff --git a/ofproto/bond.c b/ofproto/bond.c
index d7a7d30..1dbf8f1 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -1,5 +1,5 @@ 
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -1065,23 +1065,24 @@  choose_entry_to_migrate(const struct bond_slave *from, uint64_t to_tx_bytes)
     }
 
     LIST_FOR_EACH (e, list_node, &from->entries) {
-        double old_ratio, new_ratio;
-        uint64_t delta;
-
-        if (to_tx_bytes == 0) {
-            /* Nothing on the new slave, move it. */
-            return e;
-        }
-
-        delta = e->tx_bytes;
-        old_ratio = (double)from->tx_bytes / to_tx_bytes;
-        new_ratio = (double)(from->tx_bytes - delta) / (to_tx_bytes + delta);
-        if (old_ratio - new_ratio > 0.1
-            && fabs(new_ratio - 1.0) < fabs(old_ratio - 1.0)) {
-            /* We're aiming for an ideal ratio of 1, meaning both the 'from'
-               and 'to' slave have the same load.  Therefore, we only move an
-               entry if it decreases the load on 'from', and brings us closer
-               to equal traffic load. */
+        uint64_t delta = e->tx_bytes;  /* The amount to rebalance.  */
+        uint64_t ideal_tx_bytes = (from->tx_bytes + to_tx_bytes)/2;
+                             /* Note, the ideal traffic is the mid point
+                              * between 'from' and 'to'. This value does
+                              * not change by rebalancing.  */
+        uint64_t new_low;    /* The lower bandwidth between 'to' and 'from'
+                                after rebalancing. */
+
+        new_low = MIN(from->tx_bytes - delta, to_tx_bytes + delta);
+
+        if ((new_low > to_tx_bytes) &&
+            (new_low - to_tx_bytes >= (ideal_tx_bytes - to_tx_bytes) / 10)) {
+            /* Only rebalance if the new 'low' is closer to to the mid point,
+             * and the improvement exceeds 10% of current traffic
+             * deviation from the ideal split.
+             *
+             * The improvement on the 'high' side is always the same as the
+             * 'low' side. Thus consider 'low' side is sufficient.  */
             return e;
         }
     }