Message ID | 20240209070642.2412417-1-amorenoz@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] bond: Reset stats when deleting post recirc rule. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
ovsrobot/intel-ovs-compilation | success | test: success |
On 2/9/24 08:06, Adrian Moreno wrote: > In order to properly balance bond traffic, ofproto/bond periodically > reads usage statistics of the post-recirculation rules (which are added > to a hidden internal table). > > To do that, each "struct bond_entry" (which represents a hash within a > bond) stores the last seen statistics for its rule. When a hash is moved > to another member (due to a bond rebalance or the previous member going > down), the rule is typically just modified, i.e: same match different > actions. In this case, statistics are preserved and accounting continues > to work. > > However, if the rule gets completely deleted (e.g: when all bond members > go down) and then re-created, the new rule will have 0 tx_bytes but its > associated entry will still store a non-zero last-seen value. > This situation leads to an overflow of the delta calculation (computed > as [current_stats_value - last_seen_value]), which can affect traffic > as the hash will be considered to carry a lot of traffic and rebalancing > will kick in. > > In order to fix this situation, reset the value of last seen statistics > on rule deletion. > > Implementation note: > The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock > but, as the comment above the function indicates, it's only called > without having locked it when the bond is being deleted. Given that > bond->hash is free()ed before deleting the rules (which makes writing to > the entries a use-after-free anyway), we can use that to avoid > double-locking. clang is not happy about this. All the clang jobs failed in CI. We have to either remove the guarding annotation or just take the lock in the unref() function. The latter probably makes more sense. Is there a good reason to not take the lock there? > > Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 > Signed-off-by: Adrian Moreno <amorenoz@redhat.com> > --- > ofproto/bond.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/ofproto/bond.c b/ofproto/bond.c > index cfdf44f85..ad56bb96b 100644 > --- a/ofproto/bond.c > +++ b/ofproto/bond.c > @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond) > > VLOG_ERR("failed to remove post recirculation flow %s", err_s); > free(err_s); > + } else if (bond->hash) { > + uint32_t hash = pr_op->hmap_node.hash & BOND_MASK; > + struct bond_entry *entry = &bond->hash[hash]; > + entry->pr_tx_bytes = 0; A few lines below we will be leaning up the rule pointer regardless of the error. Should this code be moved there? Potentially re-ordered with hmap_remove to be sure that the hash is kept intact? > } > > hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node);
On 2/9/24 17:15, Ilya Maximets wrote: > On 2/9/24 08:06, Adrian Moreno wrote: >> In order to properly balance bond traffic, ofproto/bond periodically >> reads usage statistics of the post-recirculation rules (which are added >> to a hidden internal table). >> >> To do that, each "struct bond_entry" (which represents a hash within a >> bond) stores the last seen statistics for its rule. When a hash is moved >> to another member (due to a bond rebalance or the previous member going >> down), the rule is typically just modified, i.e: same match different >> actions. In this case, statistics are preserved and accounting continues >> to work. >> >> However, if the rule gets completely deleted (e.g: when all bond members >> go down) and then re-created, the new rule will have 0 tx_bytes but its >> associated entry will still store a non-zero last-seen value. >> This situation leads to an overflow of the delta calculation (computed >> as [current_stats_value - last_seen_value]), which can affect traffic >> as the hash will be considered to carry a lot of traffic and rebalancing >> will kick in. >> >> In order to fix this situation, reset the value of last seen statistics >> on rule deletion. >> >> Implementation note: >> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock >> but, as the comment above the function indicates, it's only called >> without having locked it when the bond is being deleted. Given that >> bond->hash is free()ed before deleting the rules (which makes writing to >> the entries a use-after-free anyway), we can use that to avoid >> double-locking. > > clang is not happy about this. All the clang jobs failed in CI. > We have to either remove the guarding annotation or just take the > lock in the unref() function. The latter probably makes more sense. > Is there a good reason to not take the lock there? > Yes, I was kind of expecting that one. TBH, I assumed there was a good reason based on the comment above the function. Given it's a function that frees resources, I assumed, it was handing some kind of corner-case. If you don't remember any context, then I'll take some more time and look at all possible implications of locking in bond_unref(). >> >> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 >> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> >> --- >> ofproto/bond.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/ofproto/bond.c b/ofproto/bond.c >> index cfdf44f85..ad56bb96b 100644 >> --- a/ofproto/bond.c >> +++ b/ofproto/bond.c >> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond) >> >> VLOG_ERR("failed to remove post recirculation flow %s", err_s); >> free(err_s); >> + } else if (bond->hash) { >> + uint32_t hash = pr_op->hmap_node.hash & BOND_MASK; >> + struct bond_entry *entry = &bond->hash[hash]; >> + entry->pr_tx_bytes = 0; > > A few lines below we will be leaning up the rule pointer regardless > of the error. Should this code be moved there? Potentially re-ordered > with hmap_remove to be sure that the hash is kept intact? > I figured if there is an error deleting the internal flow, we should not reset the byte counter. Not that I can imagine a situation where this would happen but if we fail to delete the rule and we then account for it, we'll report a potentially big rate increase that could lead to an unnecessary rebalancing. >> } >> >> hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node); >
On 2/12/24 09:43, Adrian Moreno wrote: > > > On 2/9/24 17:15, Ilya Maximets wrote: >> On 2/9/24 08:06, Adrian Moreno wrote: >>> In order to properly balance bond traffic, ofproto/bond periodically >>> reads usage statistics of the post-recirculation rules (which are added >>> to a hidden internal table). >>> >>> To do that, each "struct bond_entry" (which represents a hash within a >>> bond) stores the last seen statistics for its rule. When a hash is moved >>> to another member (due to a bond rebalance or the previous member going >>> down), the rule is typically just modified, i.e: same match different >>> actions. In this case, statistics are preserved and accounting continues >>> to work. >>> >>> However, if the rule gets completely deleted (e.g: when all bond members >>> go down) and then re-created, the new rule will have 0 tx_bytes but its >>> associated entry will still store a non-zero last-seen value. >>> This situation leads to an overflow of the delta calculation (computed >>> as [current_stats_value - last_seen_value]), which can affect traffic >>> as the hash will be considered to carry a lot of traffic and rebalancing >>> will kick in. >>> >>> In order to fix this situation, reset the value of last seen statistics >>> on rule deletion. >>> >>> Implementation note: >>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock >>> but, as the comment above the function indicates, it's only called >>> without having locked it when the bond is being deleted. Given that >>> bond->hash is free()ed before deleting the rules (which makes writing to >>> the entries a use-after-free anyway), we can use that to avoid >>> double-locking. >> >> clang is not happy about this. All the clang jobs failed in CI. >> We have to either remove the guarding annotation or just take the >> lock in the unref() function. The latter probably makes more sense. >> Is there a good reason to not take the lock there? >> > > Yes, I was kind of expecting that one. TBH, I assumed there was a good reason > based on the comment above the function. Given it's a function that frees > resources, I assumed, it was handing some kind of corner-case. > > If you don't remember any context, then I'll take some more time and look at all > possible implications of locking in bond_unref(). I wasn't part of the discussion when this code was introduced, and I don't really see any discussion about that part in the list archives. So, I'm just assuming that the logic was "there is only one reference to an object while it is being unrefed, so there is no need to take a mutex, i.e. we can avoid blocking some other threads on a global lock." But I don't have any evidence supporting this. At the same time, I don't see a clear reason to not lock, except for potentially locking some other threads while the destruction is ongoing. > >>> >>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 >>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> >>> --- >>> ofproto/bond.c | 4 ++++ >>> 1 file changed, 4 insertions(+) >>> >>> diff --git a/ofproto/bond.c b/ofproto/bond.c >>> index cfdf44f85..ad56bb96b 100644 >>> --- a/ofproto/bond.c >>> +++ b/ofproto/bond.c >>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond) >>> >>> VLOG_ERR("failed to remove post recirculation flow %s", err_s); >>> free(err_s); >>> + } else if (bond->hash) { >>> + uint32_t hash = pr_op->hmap_node.hash & BOND_MASK; >>> + struct bond_entry *entry = &bond->hash[hash]; >>> + entry->pr_tx_bytes = 0; >> >> A few lines below we will be leaning up the rule pointer regardless >> of the error. Should this code be moved there? Potentially re-ordered >> with hmap_remove to be sure that the hash is kept intact? >> > > I figured if there is an error deleting the internal flow, we should not reset > the byte counter. Not that I can imagine a situation where this would happen but > if we fail to delete the rule and we then account for it, we'll report a > potentially big rate increase that could lead to an unnecessary rebalancing. But the rule reference will be cleared from the hash entry anyway, so we will not get the stats from this rule again. Or am I missing something here? > >>> } >>> >>> hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node); >> >
On 2/12/24 18:49, Ilya Maximets wrote: > On 2/12/24 09:43, Adrian Moreno wrote: >> >> >> On 2/9/24 17:15, Ilya Maximets wrote: >>> On 2/9/24 08:06, Adrian Moreno wrote: >>>> In order to properly balance bond traffic, ofproto/bond periodically >>>> reads usage statistics of the post-recirculation rules (which are added >>>> to a hidden internal table). >>>> >>>> To do that, each "struct bond_entry" (which represents a hash within a >>>> bond) stores the last seen statistics for its rule. When a hash is moved >>>> to another member (due to a bond rebalance or the previous member going >>>> down), the rule is typically just modified, i.e: same match different >>>> actions. In this case, statistics are preserved and accounting continues >>>> to work. >>>> >>>> However, if the rule gets completely deleted (e.g: when all bond members >>>> go down) and then re-created, the new rule will have 0 tx_bytes but its >>>> associated entry will still store a non-zero last-seen value. >>>> This situation leads to an overflow of the delta calculation (computed >>>> as [current_stats_value - last_seen_value]), which can affect traffic >>>> as the hash will be considered to carry a lot of traffic and rebalancing >>>> will kick in. >>>> >>>> In order to fix this situation, reset the value of last seen statistics >>>> on rule deletion. >>>> >>>> Implementation note: >>>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock >>>> but, as the comment above the function indicates, it's only called >>>> without having locked it when the bond is being deleted. Given that >>>> bond->hash is free()ed before deleting the rules (which makes writing to >>>> the entries a use-after-free anyway), we can use that to avoid >>>> double-locking. >>> >>> clang is not happy about this. All the clang jobs failed in CI. >>> We have to either remove the guarding annotation or just take the >>> lock in the unref() function. The latter probably makes more sense. >>> Is there a good reason to not take the lock there? >>> >> >> Yes, I was kind of expecting that one. TBH, I assumed there was a good reason >> based on the comment above the function. Given it's a function that frees >> resources, I assumed, it was handing some kind of corner-case. >> >> If you don't remember any context, then I'll take some more time and look at all >> possible implications of locking in bond_unref(). > > I wasn't part of the discussion when this code was introduced, and > I don't really see any discussion about that part in the list archives. > So, I'm just assuming that the logic was "there is only one reference > to an object while it is being unrefed, so there is no need to take > a mutex, i.e. we can avoid blocking some other threads on a global lock." > But I don't have any evidence supporting this. At the same time, I don't > see a clear reason to not lock, except for potentially locking some other > threads while the destruction is ongoing. > I don't see any strong reason either. It could slow down rule updates while some other bond is being deleted (probably not a high-frequency event) but considering that the expensive task, the OFP table update, is serialized anyway (via global ofproto_mutex) impact should not be very high. I'll resend the patch locking at bond_unref() and adding the proper clang flags. Thanks. >> >>>> >>>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 >>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> >>>> --- >>>> ofproto/bond.c | 4 ++++ >>>> 1 file changed, 4 insertions(+) >>>> >>>> diff --git a/ofproto/bond.c b/ofproto/bond.c >>>> index cfdf44f85..ad56bb96b 100644 >>>> --- a/ofproto/bond.c >>>> +++ b/ofproto/bond.c >>>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond) >>>> >>>> VLOG_ERR("failed to remove post recirculation flow %s", err_s); >>>> free(err_s); >>>> + } else if (bond->hash) { >>>> + uint32_t hash = pr_op->hmap_node.hash & BOND_MASK; >>>> + struct bond_entry *entry = &bond->hash[hash]; >>>> + entry->pr_tx_bytes = 0; >>> >>> A few lines below we will be leaning up the rule pointer regardless >>> of the error. Should this code be moved there? Potentially re-ordered >>> with hmap_remove to be sure that the hash is kept intact? >>> >> >> I figured if there is an error deleting the internal flow, we should not reset >> the byte counter. Not that I can imagine a situation where this would happen but >> if we fail to delete the rule and we then account for it, we'll report a >> potentially big rate increase that could lead to an unnecessary rebalancing. > > But the rule reference will be cleared from the hash entry anyway, > so we will not get the stats from this rule again. Or am I missing > something here? > Right, it will be removed from the hash, but it'll remain in OFP. If the rule is then added back (i.e: one bond member comes up), the OFP rule stats will remain and on the first rebalance round we'll account for a big increase. >> >>>> } >>>> >>>> hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node); >>> >> >
On 2/13/24 09:28, Adrian Moreno wrote: > > > On 2/12/24 18:49, Ilya Maximets wrote: >> On 2/12/24 09:43, Adrian Moreno wrote: >>> >>> >>> On 2/9/24 17:15, Ilya Maximets wrote: >>>> On 2/9/24 08:06, Adrian Moreno wrote: >>>>> In order to properly balance bond traffic, ofproto/bond periodically >>>>> reads usage statistics of the post-recirculation rules (which are added >>>>> to a hidden internal table). >>>>> >>>>> To do that, each "struct bond_entry" (which represents a hash within a >>>>> bond) stores the last seen statistics for its rule. When a hash is moved >>>>> to another member (due to a bond rebalance or the previous member going >>>>> down), the rule is typically just modified, i.e: same match different >>>>> actions. In this case, statistics are preserved and accounting continues >>>>> to work. >>>>> >>>>> However, if the rule gets completely deleted (e.g: when all bond members >>>>> go down) and then re-created, the new rule will have 0 tx_bytes but its >>>>> associated entry will still store a non-zero last-seen value. >>>>> This situation leads to an overflow of the delta calculation (computed >>>>> as [current_stats_value - last_seen_value]), which can affect traffic >>>>> as the hash will be considered to carry a lot of traffic and rebalancing >>>>> will kick in. >>>>> >>>>> In order to fix this situation, reset the value of last seen statistics >>>>> on rule deletion. >>>>> >>>>> Implementation note: >>>>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock >>>>> but, as the comment above the function indicates, it's only called >>>>> without having locked it when the bond is being deleted. Given that >>>>> bond->hash is free()ed before deleting the rules (which makes writing to >>>>> the entries a use-after-free anyway), we can use that to avoid >>>>> double-locking. >>>> >>>> clang is not happy about this. All the clang jobs failed in CI. >>>> We have to either remove the guarding annotation or just take the >>>> lock in the unref() function. The latter probably makes more sense. >>>> Is there a good reason to not take the lock there? >>>> >>> >>> Yes, I was kind of expecting that one. TBH, I assumed there was a good reason >>> based on the comment above the function. Given it's a function that frees >>> resources, I assumed, it was handing some kind of corner-case. >>> >>> If you don't remember any context, then I'll take some more time and look at all >>> possible implications of locking in bond_unref(). >> >> I wasn't part of the discussion when this code was introduced, and >> I don't really see any discussion about that part in the list archives. >> So, I'm just assuming that the logic was "there is only one reference >> to an object while it is being unrefed, so there is no need to take >> a mutex, i.e. we can avoid blocking some other threads on a global lock." >> But I don't have any evidence supporting this. At the same time, I don't >> see a clear reason to not lock, except for potentially locking some other >> threads while the destruction is ongoing. >> > > I don't see any strong reason either. It could slow down rule updates while some > other bond is being deleted (probably not a high-frequency event) but > considering that the expensive task, the OFP table update, is serialized anyway > (via global ofproto_mutex) impact should not be very high. > > I'll resend the patch locking at bond_unref() and adding the proper clang flags. > > Thanks. > >>> >>>>> >>>>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 >>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> >>>>> --- >>>>> ofproto/bond.c | 4 ++++ >>>>> 1 file changed, 4 insertions(+) >>>>> >>>>> diff --git a/ofproto/bond.c b/ofproto/bond.c >>>>> index cfdf44f85..ad56bb96b 100644 >>>>> --- a/ofproto/bond.c >>>>> +++ b/ofproto/bond.c >>>>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond) >>>>> >>>>> VLOG_ERR("failed to remove post recirculation flow %s", err_s); >>>>> free(err_s); >>>>> + } else if (bond->hash) { >>>>> + uint32_t hash = pr_op->hmap_node.hash & BOND_MASK; >>>>> + struct bond_entry *entry = &bond->hash[hash]; An empty line here would be nice. >>>>> + entry->pr_tx_bytes = 0; >>>> >>>> A few lines below we will be leaning up the rule pointer regardless >>>> of the error. Should this code be moved there? Potentially re-ordered >>>> with hmap_remove to be sure that the hash is kept intact? >>>> >>> >>> I figured if there is an error deleting the internal flow, we should not reset >>> the byte counter. Not that I can imagine a situation where this would happen but >>> if we fail to delete the rule and we then account for it, we'll report a >>> potentially big rate increase that could lead to an unnecessary rebalancing. >> >> But the rule reference will be cleared from the hash entry anyway, >> so we will not get the stats from this rule again. Or am I missing >> something here? >> > > Right, it will be removed from the hash, but it'll remain in OFP. If the rule is > then added back (i.e: one bond member comes up), the OFP rule stats will remain > and on the first rebalance round we'll account for a big increase. OK, makes sense. Maybe add a small comment explaining that in case of a rule deletion failure the subsequent ofproto_dpif_add_internal_flow() will replace the old rule preserving the stats? Best regards, Ilya Maximets.
On 2/13/24 12:52, Ilya Maximets wrote: > On 2/13/24 09:28, Adrian Moreno wrote: >> >> >> On 2/12/24 18:49, Ilya Maximets wrote: >>> On 2/12/24 09:43, Adrian Moreno wrote: >>>> >>>> >>>> On 2/9/24 17:15, Ilya Maximets wrote: >>>>> On 2/9/24 08:06, Adrian Moreno wrote: >>>>>> In order to properly balance bond traffic, ofproto/bond periodically >>>>>> reads usage statistics of the post-recirculation rules (which are added >>>>>> to a hidden internal table). >>>>>> >>>>>> To do that, each "struct bond_entry" (which represents a hash within a >>>>>> bond) stores the last seen statistics for its rule. When a hash is moved >>>>>> to another member (due to a bond rebalance or the previous member going >>>>>> down), the rule is typically just modified, i.e: same match different >>>>>> actions. In this case, statistics are preserved and accounting continues >>>>>> to work. >>>>>> >>>>>> However, if the rule gets completely deleted (e.g: when all bond members >>>>>> go down) and then re-created, the new rule will have 0 tx_bytes but its >>>>>> associated entry will still store a non-zero last-seen value. >>>>>> This situation leads to an overflow of the delta calculation (computed >>>>>> as [current_stats_value - last_seen_value]), which can affect traffic >>>>>> as the hash will be considered to carry a lot of traffic and rebalancing >>>>>> will kick in. >>>>>> >>>>>> In order to fix this situation, reset the value of last seen statistics >>>>>> on rule deletion. >>>>>> >>>>>> Implementation note: >>>>>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock >>>>>> but, as the comment above the function indicates, it's only called >>>>>> without having locked it when the bond is being deleted. Given that >>>>>> bond->hash is free()ed before deleting the rules (which makes writing to >>>>>> the entries a use-after-free anyway), we can use that to avoid >>>>>> double-locking. >>>>> >>>>> clang is not happy about this. All the clang jobs failed in CI. >>>>> We have to either remove the guarding annotation or just take the >>>>> lock in the unref() function. The latter probably makes more sense. >>>>> Is there a good reason to not take the lock there? >>>>> >>>> >>>> Yes, I was kind of expecting that one. TBH, I assumed there was a good reason >>>> based on the comment above the function. Given it's a function that frees >>>> resources, I assumed, it was handing some kind of corner-case. >>>> >>>> If you don't remember any context, then I'll take some more time and look at all >>>> possible implications of locking in bond_unref(). >>> >>> I wasn't part of the discussion when this code was introduced, and >>> I don't really see any discussion about that part in the list archives. >>> So, I'm just assuming that the logic was "there is only one reference >>> to an object while it is being unrefed, so there is no need to take >>> a mutex, i.e. we can avoid blocking some other threads on a global lock." >>> But I don't have any evidence supporting this. At the same time, I don't >>> see a clear reason to not lock, except for potentially locking some other >>> threads while the destruction is ongoing. >>> >> >> I don't see any strong reason either. It could slow down rule updates while some >> other bond is being deleted (probably not a high-frequency event) but >> considering that the expensive task, the OFP table update, is serialized anyway >> (via global ofproto_mutex) impact should not be very high. >> >> I'll resend the patch locking at bond_unref() and adding the proper clang flags. >> >> Thanks. >> >>>> >>>>>> >>>>>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 >>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> >>>>>> --- >>>>>> ofproto/bond.c | 4 ++++ >>>>>> 1 file changed, 4 insertions(+) >>>>>> >>>>>> diff --git a/ofproto/bond.c b/ofproto/bond.c >>>>>> index cfdf44f85..ad56bb96b 100644 >>>>>> --- a/ofproto/bond.c >>>>>> +++ b/ofproto/bond.c >>>>>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond) >>>>>> >>>>>> VLOG_ERR("failed to remove post recirculation flow %s", err_s); >>>>>> free(err_s); >>>>>> + } else if (bond->hash) { >>>>>> + uint32_t hash = pr_op->hmap_node.hash & BOND_MASK; >>>>>> + struct bond_entry *entry = &bond->hash[hash]; > > An empty line here would be nice. Sure. > >>>>>> + entry->pr_tx_bytes = 0; >>>>> >>>>> A few lines below we will be leaning up the rule pointer regardless >>>>> of the error. Should this code be moved there? Potentially re-ordered >>>>> with hmap_remove to be sure that the hash is kept intact? >>>>> >>>> >>>> I figured if there is an error deleting the internal flow, we should not reset >>>> the byte counter. Not that I can imagine a situation where this would happen but >>>> if we fail to delete the rule and we then account for it, we'll report a >>>> potentially big rate increase that could lead to an unnecessary rebalancing. >>> >>> But the rule reference will be cleared from the hash entry anyway, >>> so we will not get the stats from this rule again. Or am I missing >>> something here? >>> >> >> Right, it will be removed from the hash, but it'll remain in OFP. If the rule is >> then added back (i.e: one bond member comes up), the OFP rule stats will remain >> and on the first rebalance round we'll account for a big increase. > > OK, makes sense. Maybe add a small comment explaining that in case of > a rule deletion failure the subsequent ofproto_dpif_add_internal_flow() > will replace the old rule preserving the stats? > OK > Best regards, Ilya Maximets. >
diff --git a/ofproto/bond.c b/ofproto/bond.c index cfdf44f85..ad56bb96b 100644 --- a/ofproto/bond.c +++ b/ofproto/bond.c @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond) VLOG_ERR("failed to remove post recirculation flow %s", err_s); free(err_s); + } else if (bond->hash) { + uint32_t hash = pr_op->hmap_node.hash & BOND_MASK; + struct bond_entry *entry = &bond->hash[hash]; + entry->pr_tx_bytes = 0; } hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node);
In order to properly balance bond traffic, ofproto/bond periodically reads usage statistics of the post-recirculation rules (which are added to a hidden internal table). To do that, each "struct bond_entry" (which represents a hash within a bond) stores the last seen statistics for its rule. When a hash is moved to another member (due to a bond rebalance or the previous member going down), the rule is typically just modified, i.e: same match different actions. In this case, statistics are preserved and accounting continues to work. However, if the rule gets completely deleted (e.g: when all bond members go down) and then re-created, the new rule will have 0 tx_bytes but its associated entry will still store a non-zero last-seen value. This situation leads to an overflow of the delta calculation (computed as [current_stats_value - last_seen_value]), which can affect traffic as the hash will be considered to carry a lot of traffic and rebalancing will kick in. In order to fix this situation, reset the value of last seen statistics on rule deletion. Implementation note: The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock but, as the comment above the function indicates, it's only called without having locked it when the bond is being deleted. Given that bond->hash is free()ed before deleting the rules (which makes writing to the entries a use-after-free anyway), we can use that to avoid double-locking. Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 Signed-off-by: Adrian Moreno <amorenoz@redhat.com> --- ofproto/bond.c | 4 ++++ 1 file changed, 4 insertions(+)