diff mbox series

[ovs-dev] bond: Reset stats when deleting post recirc rule.

Message ID 20240209070642.2412417-1-amorenoz@redhat.com
State Changes Requested
Headers show
Series [ovs-dev] bond: Reset stats when deleting post recirc rule. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Adrian Moreno Feb. 9, 2024, 7:06 a.m. UTC
In order to properly balance bond traffic, ofproto/bond periodically
reads usage statistics of the post-recirculation rules (which are added
to a hidden internal table).

To do that, each "struct bond_entry" (which represents a hash within a
bond) stores the last seen statistics for its rule. When a hash is moved
to another member (due to a bond rebalance or the previous member going
down), the rule is typically just modified, i.e: same match different
actions. In this case, statistics are preserved and accounting continues
to work.

However, if the rule gets completely deleted (e.g: when all bond members
go down) and then re-created, the new rule will have 0 tx_bytes but its
associated entry will still store a non-zero last-seen value.
This situation leads to an overflow of the delta calculation (computed
as [current_stats_value - last_seen_value]), which can affect traffic
as the hash will be considered to carry a lot of traffic and rebalancing
will kick in.

In order to fix this situation, reset the value of last seen statistics
on rule deletion.

Implementation note:
The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock
but, as the comment above the function indicates, it's only called
without having locked it when the bond is being deleted. Given that
bond->hash is free()ed before deleting the rules (which makes writing to
the entries a use-after-free anyway), we can use that to avoid
double-locking.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 ofproto/bond.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Ilya Maximets Feb. 9, 2024, 4:15 p.m. UTC | #1
On 2/9/24 08:06, Adrian Moreno wrote:
> In order to properly balance bond traffic, ofproto/bond periodically
> reads usage statistics of the post-recirculation rules (which are added
> to a hidden internal table).
> 
> To do that, each "struct bond_entry" (which represents a hash within a
> bond) stores the last seen statistics for its rule. When a hash is moved
> to another member (due to a bond rebalance or the previous member going
> down), the rule is typically just modified, i.e: same match different
> actions. In this case, statistics are preserved and accounting continues
> to work.
> 
> However, if the rule gets completely deleted (e.g: when all bond members
> go down) and then re-created, the new rule will have 0 tx_bytes but its
> associated entry will still store a non-zero last-seen value.
> This situation leads to an overflow of the delta calculation (computed
> as [current_stats_value - last_seen_value]), which can affect traffic
> as the hash will be considered to carry a lot of traffic and rebalancing
> will kick in.
> 
> In order to fix this situation, reset the value of last seen statistics
> on rule deletion.
> 
> Implementation note:
> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock
> but, as the comment above the function indicates, it's only called
> without having locked it when the bond is being deleted. Given that
> bond->hash is free()ed before deleting the rules (which makes writing to
> the entries a use-after-free anyway), we can use that to avoid
> double-locking.

clang is not happy about this.  All the clang jobs failed in CI.
We have to either remove the guarding annotation or just take the
lock in the unref() function.  The latter probably makes more sense.
Is there a good reason to not take the lock there?

> 
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  ofproto/bond.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index cfdf44f85..ad56bb96b 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond)
>  
>                  VLOG_ERR("failed to remove post recirculation flow %s", err_s);
>                  free(err_s);
> +            } else if (bond->hash) {
> +                uint32_t hash = pr_op->hmap_node.hash & BOND_MASK;
> +                struct bond_entry *entry = &bond->hash[hash];
> +                entry->pr_tx_bytes = 0;

A few lines below we will be leaning up the rule pointer regardless
of the error.  Should this code be moved there?  Potentially re-ordered
with hmap_remove to be sure that the hash is kept intact?

>              }
>  
>              hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node);
Adrian Moreno Feb. 12, 2024, 8:43 a.m. UTC | #2
On 2/9/24 17:15, Ilya Maximets wrote:
> On 2/9/24 08:06, Adrian Moreno wrote:
>> In order to properly balance bond traffic, ofproto/bond periodically
>> reads usage statistics of the post-recirculation rules (which are added
>> to a hidden internal table).
>>
>> To do that, each "struct bond_entry" (which represents a hash within a
>> bond) stores the last seen statistics for its rule. When a hash is moved
>> to another member (due to a bond rebalance or the previous member going
>> down), the rule is typically just modified, i.e: same match different
>> actions. In this case, statistics are preserved and accounting continues
>> to work.
>>
>> However, if the rule gets completely deleted (e.g: when all bond members
>> go down) and then re-created, the new rule will have 0 tx_bytes but its
>> associated entry will still store a non-zero last-seen value.
>> This situation leads to an overflow of the delta calculation (computed
>> as [current_stats_value - last_seen_value]), which can affect traffic
>> as the hash will be considered to carry a lot of traffic and rebalancing
>> will kick in.
>>
>> In order to fix this situation, reset the value of last seen statistics
>> on rule deletion.
>>
>> Implementation note:
>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock
>> but, as the comment above the function indicates, it's only called
>> without having locked it when the bond is being deleted. Given that
>> bond->hash is free()ed before deleting the rules (which makes writing to
>> the entries a use-after-free anyway), we can use that to avoid
>> double-locking.
> 
> clang is not happy about this.  All the clang jobs failed in CI.
> We have to either remove the guarding annotation or just take the
> lock in the unref() function.  The latter probably makes more sense.
> Is there a good reason to not take the lock there?
> 

Yes, I was kind of expecting that one. TBH, I assumed there was a good reason 
based on the comment above the function. Given it's a function that frees 
resources, I assumed, it was handing some kind  of corner-case.

If you don't remember any context, then I'll take some more time and look at all 
possible implications of locking in bond_unref().

>>
>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>> ---
>>   ofproto/bond.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/ofproto/bond.c b/ofproto/bond.c
>> index cfdf44f85..ad56bb96b 100644
>> --- a/ofproto/bond.c
>> +++ b/ofproto/bond.c
>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond)
>>   
>>                   VLOG_ERR("failed to remove post recirculation flow %s", err_s);
>>                   free(err_s);
>> +            } else if (bond->hash) {
>> +                uint32_t hash = pr_op->hmap_node.hash & BOND_MASK;
>> +                struct bond_entry *entry = &bond->hash[hash];
>> +                entry->pr_tx_bytes = 0;
> 
> A few lines below we will be leaning up the rule pointer regardless
> of the error.  Should this code be moved there?  Potentially re-ordered
> with hmap_remove to be sure that the hash is kept intact?
> 

I figured if there is an error deleting the internal flow, we should not reset 
the byte counter. Not that I can imagine a situation where this would happen but 
if we fail to delete the rule and we then account for it, we'll report a 
potentially big rate increase that could lead to an unnecessary rebalancing.

>>               }
>>   
>>               hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node);
>
Ilya Maximets Feb. 12, 2024, 5:49 p.m. UTC | #3
On 2/12/24 09:43, Adrian Moreno wrote:
> 
> 
> On 2/9/24 17:15, Ilya Maximets wrote:
>> On 2/9/24 08:06, Adrian Moreno wrote:
>>> In order to properly balance bond traffic, ofproto/bond periodically
>>> reads usage statistics of the post-recirculation rules (which are added
>>> to a hidden internal table).
>>>
>>> To do that, each "struct bond_entry" (which represents a hash within a
>>> bond) stores the last seen statistics for its rule. When a hash is moved
>>> to another member (due to a bond rebalance or the previous member going
>>> down), the rule is typically just modified, i.e: same match different
>>> actions. In this case, statistics are preserved and accounting continues
>>> to work.
>>>
>>> However, if the rule gets completely deleted (e.g: when all bond members
>>> go down) and then re-created, the new rule will have 0 tx_bytes but its
>>> associated entry will still store a non-zero last-seen value.
>>> This situation leads to an overflow of the delta calculation (computed
>>> as [current_stats_value - last_seen_value]), which can affect traffic
>>> as the hash will be considered to carry a lot of traffic and rebalancing
>>> will kick in.
>>>
>>> In order to fix this situation, reset the value of last seen statistics
>>> on rule deletion.
>>>
>>> Implementation note:
>>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock
>>> but, as the comment above the function indicates, it's only called
>>> without having locked it when the bond is being deleted. Given that
>>> bond->hash is free()ed before deleting the rules (which makes writing to
>>> the entries a use-after-free anyway), we can use that to avoid
>>> double-locking.
>>
>> clang is not happy about this.  All the clang jobs failed in CI.
>> We have to either remove the guarding annotation or just take the
>> lock in the unref() function.  The latter probably makes more sense.
>> Is there a good reason to not take the lock there?
>>
> 
> Yes, I was kind of expecting that one. TBH, I assumed there was a good reason 
> based on the comment above the function. Given it's a function that frees 
> resources, I assumed, it was handing some kind  of corner-case.
> 
> If you don't remember any context, then I'll take some more time and look at all 
> possible implications of locking in bond_unref().

I wasn't part of the discussion when this code was introduced, and
I don't really see any discussion about that part in the list archives.
So, I'm just assuming that the logic was "there is only one reference
to an object while it is being unrefed, so there is no need to take
a mutex, i.e. we can avoid blocking some other threads on a global lock."
But I don't have any evidence supporting this.  At the same time, I don't
see a clear reason to not lock, except for potentially locking some other
threads while the destruction is ongoing.

> 
>>>
>>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>> ---
>>>   ofproto/bond.c | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/ofproto/bond.c b/ofproto/bond.c
>>> index cfdf44f85..ad56bb96b 100644
>>> --- a/ofproto/bond.c
>>> +++ b/ofproto/bond.c
>>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond)
>>>   
>>>                   VLOG_ERR("failed to remove post recirculation flow %s", err_s);
>>>                   free(err_s);
>>> +            } else if (bond->hash) {
>>> +                uint32_t hash = pr_op->hmap_node.hash & BOND_MASK;
>>> +                struct bond_entry *entry = &bond->hash[hash];
>>> +                entry->pr_tx_bytes = 0;
>>
>> A few lines below we will be leaning up the rule pointer regardless
>> of the error.  Should this code be moved there?  Potentially re-ordered
>> with hmap_remove to be sure that the hash is kept intact?
>>
> 
> I figured if there is an error deleting the internal flow, we should not reset 
> the byte counter. Not that I can imagine a situation where this would happen but 
> if we fail to delete the rule and we then account for it, we'll report a 
> potentially big rate increase that could lead to an unnecessary rebalancing.

But the rule reference will be cleared from the hash entry anyway,
so we will not get the stats from this rule again.  Or am I missing
something here?

> 
>>>               }
>>>   
>>>               hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node);
>>
>
Adrian Moreno Feb. 13, 2024, 8:28 a.m. UTC | #4
On 2/12/24 18:49, Ilya Maximets wrote:
> On 2/12/24 09:43, Adrian Moreno wrote:
>>
>>
>> On 2/9/24 17:15, Ilya Maximets wrote:
>>> On 2/9/24 08:06, Adrian Moreno wrote:
>>>> In order to properly balance bond traffic, ofproto/bond periodically
>>>> reads usage statistics of the post-recirculation rules (which are added
>>>> to a hidden internal table).
>>>>
>>>> To do that, each "struct bond_entry" (which represents a hash within a
>>>> bond) stores the last seen statistics for its rule. When a hash is moved
>>>> to another member (due to a bond rebalance or the previous member going
>>>> down), the rule is typically just modified, i.e: same match different
>>>> actions. In this case, statistics are preserved and accounting continues
>>>> to work.
>>>>
>>>> However, if the rule gets completely deleted (e.g: when all bond members
>>>> go down) and then re-created, the new rule will have 0 tx_bytes but its
>>>> associated entry will still store a non-zero last-seen value.
>>>> This situation leads to an overflow of the delta calculation (computed
>>>> as [current_stats_value - last_seen_value]), which can affect traffic
>>>> as the hash will be considered to carry a lot of traffic and rebalancing
>>>> will kick in.
>>>>
>>>> In order to fix this situation, reset the value of last seen statistics
>>>> on rule deletion.
>>>>
>>>> Implementation note:
>>>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock
>>>> but, as the comment above the function indicates, it's only called
>>>> without having locked it when the bond is being deleted. Given that
>>>> bond->hash is free()ed before deleting the rules (which makes writing to
>>>> the entries a use-after-free anyway), we can use that to avoid
>>>> double-locking.
>>>
>>> clang is not happy about this.  All the clang jobs failed in CI.
>>> We have to either remove the guarding annotation or just take the
>>> lock in the unref() function.  The latter probably makes more sense.
>>> Is there a good reason to not take the lock there?
>>>
>>
>> Yes, I was kind of expecting that one. TBH, I assumed there was a good reason
>> based on the comment above the function. Given it's a function that frees
>> resources, I assumed, it was handing some kind  of corner-case.
>>
>> If you don't remember any context, then I'll take some more time and look at all
>> possible implications of locking in bond_unref().
> 
> I wasn't part of the discussion when this code was introduced, and
> I don't really see any discussion about that part in the list archives.
> So, I'm just assuming that the logic was "there is only one reference
> to an object while it is being unrefed, so there is no need to take
> a mutex, i.e. we can avoid blocking some other threads on a global lock."
> But I don't have any evidence supporting this.  At the same time, I don't
> see a clear reason to not lock, except for potentially locking some other
> threads while the destruction is ongoing.
> 

I don't see any strong reason either. It could slow down rule updates while some 
other bond is being deleted (probably not a high-frequency event) but 
considering that the expensive task, the OFP table update, is serialized anyway 
(via global ofproto_mutex) impact should not be very high.

I'll resend the patch locking at bond_unref() and adding the proper clang flags.

Thanks.

>>
>>>>
>>>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>> ---
>>>>    ofproto/bond.c | 4 ++++
>>>>    1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/ofproto/bond.c b/ofproto/bond.c
>>>> index cfdf44f85..ad56bb96b 100644
>>>> --- a/ofproto/bond.c
>>>> +++ b/ofproto/bond.c
>>>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond)
>>>>    
>>>>                    VLOG_ERR("failed to remove post recirculation flow %s", err_s);
>>>>                    free(err_s);
>>>> +            } else if (bond->hash) {
>>>> +                uint32_t hash = pr_op->hmap_node.hash & BOND_MASK;
>>>> +                struct bond_entry *entry = &bond->hash[hash];
>>>> +                entry->pr_tx_bytes = 0;
>>>
>>> A few lines below we will be leaning up the rule pointer regardless
>>> of the error.  Should this code be moved there?  Potentially re-ordered
>>> with hmap_remove to be sure that the hash is kept intact?
>>>
>>
>> I figured if there is an error deleting the internal flow, we should not reset
>> the byte counter. Not that I can imagine a situation where this would happen but
>> if we fail to delete the rule and we then account for it, we'll report a
>> potentially big rate increase that could lead to an unnecessary rebalancing.
> 
> But the rule reference will be cleared from the hash entry anyway,
> so we will not get the stats from this rule again.  Or am I missing
> something here?
> 

Right, it will be removed from the hash, but it'll remain in OFP. If the rule is 
then added back (i.e: one bond member comes up), the OFP rule stats will remain 
and on the first rebalance round we'll account for a big increase.


>>
>>>>                }
>>>>    
>>>>                hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node);
>>>
>>
>
Ilya Maximets Feb. 13, 2024, 11:52 a.m. UTC | #5
On 2/13/24 09:28, Adrian Moreno wrote:
> 
> 
> On 2/12/24 18:49, Ilya Maximets wrote:
>> On 2/12/24 09:43, Adrian Moreno wrote:
>>>
>>>
>>> On 2/9/24 17:15, Ilya Maximets wrote:
>>>> On 2/9/24 08:06, Adrian Moreno wrote:
>>>>> In order to properly balance bond traffic, ofproto/bond periodically
>>>>> reads usage statistics of the post-recirculation rules (which are added
>>>>> to a hidden internal table).
>>>>>
>>>>> To do that, each "struct bond_entry" (which represents a hash within a
>>>>> bond) stores the last seen statistics for its rule. When a hash is moved
>>>>> to another member (due to a bond rebalance or the previous member going
>>>>> down), the rule is typically just modified, i.e: same match different
>>>>> actions. In this case, statistics are preserved and accounting continues
>>>>> to work.
>>>>>
>>>>> However, if the rule gets completely deleted (e.g: when all bond members
>>>>> go down) and then re-created, the new rule will have 0 tx_bytes but its
>>>>> associated entry will still store a non-zero last-seen value.
>>>>> This situation leads to an overflow of the delta calculation (computed
>>>>> as [current_stats_value - last_seen_value]), which can affect traffic
>>>>> as the hash will be considered to carry a lot of traffic and rebalancing
>>>>> will kick in.
>>>>>
>>>>> In order to fix this situation, reset the value of last seen statistics
>>>>> on rule deletion.
>>>>>
>>>>> Implementation note:
>>>>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock
>>>>> but, as the comment above the function indicates, it's only called
>>>>> without having locked it when the bond is being deleted. Given that
>>>>> bond->hash is free()ed before deleting the rules (which makes writing to
>>>>> the entries a use-after-free anyway), we can use that to avoid
>>>>> double-locking.
>>>>
>>>> clang is not happy about this.  All the clang jobs failed in CI.
>>>> We have to either remove the guarding annotation or just take the
>>>> lock in the unref() function.  The latter probably makes more sense.
>>>> Is there a good reason to not take the lock there?
>>>>
>>>
>>> Yes, I was kind of expecting that one. TBH, I assumed there was a good reason
>>> based on the comment above the function. Given it's a function that frees
>>> resources, I assumed, it was handing some kind  of corner-case.
>>>
>>> If you don't remember any context, then I'll take some more time and look at all
>>> possible implications of locking in bond_unref().
>>
>> I wasn't part of the discussion when this code was introduced, and
>> I don't really see any discussion about that part in the list archives.
>> So, I'm just assuming that the logic was "there is only one reference
>> to an object while it is being unrefed, so there is no need to take
>> a mutex, i.e. we can avoid blocking some other threads on a global lock."
>> But I don't have any evidence supporting this.  At the same time, I don't
>> see a clear reason to not lock, except for potentially locking some other
>> threads while the destruction is ongoing.
>>
> 
> I don't see any strong reason either. It could slow down rule updates while some 
> other bond is being deleted (probably not a high-frequency event) but 
> considering that the expensive task, the OFP table update, is serialized anyway 
> (via global ofproto_mutex) impact should not be very high.
> 
> I'll resend the patch locking at bond_unref() and adding the proper clang flags.
> 
> Thanks.
> 
>>>
>>>>>
>>>>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>>> ---
>>>>>    ofproto/bond.c | 4 ++++
>>>>>    1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/ofproto/bond.c b/ofproto/bond.c
>>>>> index cfdf44f85..ad56bb96b 100644
>>>>> --- a/ofproto/bond.c
>>>>> +++ b/ofproto/bond.c
>>>>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond)
>>>>>    
>>>>>                    VLOG_ERR("failed to remove post recirculation flow %s", err_s);
>>>>>                    free(err_s);
>>>>> +            } else if (bond->hash) {
>>>>> +                uint32_t hash = pr_op->hmap_node.hash & BOND_MASK;
>>>>> +                struct bond_entry *entry = &bond->hash[hash];

An empty line here would be nice.

>>>>> +                entry->pr_tx_bytes = 0;
>>>>
>>>> A few lines below we will be leaning up the rule pointer regardless
>>>> of the error.  Should this code be moved there?  Potentially re-ordered
>>>> with hmap_remove to be sure that the hash is kept intact?
>>>>
>>>
>>> I figured if there is an error deleting the internal flow, we should not reset
>>> the byte counter. Not that I can imagine a situation where this would happen but
>>> if we fail to delete the rule and we then account for it, we'll report a
>>> potentially big rate increase that could lead to an unnecessary rebalancing.
>>
>> But the rule reference will be cleared from the hash entry anyway,
>> so we will not get the stats from this rule again.  Or am I missing
>> something here?
>>
> 
> Right, it will be removed from the hash, but it'll remain in OFP. If the rule is 
> then added back (i.e: one bond member comes up), the OFP rule stats will remain 
> and on the first rebalance round we'll account for a big increase.

OK, makes sense.  Maybe add a small comment explaining that in case of
a rule deletion failure the subsequent ofproto_dpif_add_internal_flow()
will replace the old rule preserving the stats?

Best regards, Ilya Maximets.
Adrian Moreno Feb. 13, 2024, 7:55 p.m. UTC | #6
On 2/13/24 12:52, Ilya Maximets wrote:
> On 2/13/24 09:28, Adrian Moreno wrote:
>>
>>
>> On 2/12/24 18:49, Ilya Maximets wrote:
>>> On 2/12/24 09:43, Adrian Moreno wrote:
>>>>
>>>>
>>>> On 2/9/24 17:15, Ilya Maximets wrote:
>>>>> On 2/9/24 08:06, Adrian Moreno wrote:
>>>>>> In order to properly balance bond traffic, ofproto/bond periodically
>>>>>> reads usage statistics of the post-recirculation rules (which are added
>>>>>> to a hidden internal table).
>>>>>>
>>>>>> To do that, each "struct bond_entry" (which represents a hash within a
>>>>>> bond) stores the last seen statistics for its rule. When a hash is moved
>>>>>> to another member (due to a bond rebalance or the previous member going
>>>>>> down), the rule is typically just modified, i.e: same match different
>>>>>> actions. In this case, statistics are preserved and accounting continues
>>>>>> to work.
>>>>>>
>>>>>> However, if the rule gets completely deleted (e.g: when all bond members
>>>>>> go down) and then re-created, the new rule will have 0 tx_bytes but its
>>>>>> associated entry will still store a non-zero last-seen value.
>>>>>> This situation leads to an overflow of the delta calculation (computed
>>>>>> as [current_stats_value - last_seen_value]), which can affect traffic
>>>>>> as the hash will be considered to carry a lot of traffic and rebalancing
>>>>>> will kick in.
>>>>>>
>>>>>> In order to fix this situation, reset the value of last seen statistics
>>>>>> on rule deletion.
>>>>>>
>>>>>> Implementation note:
>>>>>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock
>>>>>> but, as the comment above the function indicates, it's only called
>>>>>> without having locked it when the bond is being deleted. Given that
>>>>>> bond->hash is free()ed before deleting the rules (which makes writing to
>>>>>> the entries a use-after-free anyway), we can use that to avoid
>>>>>> double-locking.
>>>>>
>>>>> clang is not happy about this.  All the clang jobs failed in CI.
>>>>> We have to either remove the guarding annotation or just take the
>>>>> lock in the unref() function.  The latter probably makes more sense.
>>>>> Is there a good reason to not take the lock there?
>>>>>
>>>>
>>>> Yes, I was kind of expecting that one. TBH, I assumed there was a good reason
>>>> based on the comment above the function. Given it's a function that frees
>>>> resources, I assumed, it was handing some kind  of corner-case.
>>>>
>>>> If you don't remember any context, then I'll take some more time and look at all
>>>> possible implications of locking in bond_unref().
>>>
>>> I wasn't part of the discussion when this code was introduced, and
>>> I don't really see any discussion about that part in the list archives.
>>> So, I'm just assuming that the logic was "there is only one reference
>>> to an object while it is being unrefed, so there is no need to take
>>> a mutex, i.e. we can avoid blocking some other threads on a global lock."
>>> But I don't have any evidence supporting this.  At the same time, I don't
>>> see a clear reason to not lock, except for potentially locking some other
>>> threads while the destruction is ongoing.
>>>
>>
>> I don't see any strong reason either. It could slow down rule updates while some
>> other bond is being deleted (probably not a high-frequency event) but
>> considering that the expensive task, the OFP table update, is serialized anyway
>> (via global ofproto_mutex) impact should not be very high.
>>
>> I'll resend the patch locking at bond_unref() and adding the proper clang flags.
>>
>> Thanks.
>>
>>>>
>>>>>>
>>>>>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319
>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>>>> ---
>>>>>>     ofproto/bond.c | 4 ++++
>>>>>>     1 file changed, 4 insertions(+)
>>>>>>
>>>>>> diff --git a/ofproto/bond.c b/ofproto/bond.c
>>>>>> index cfdf44f85..ad56bb96b 100644
>>>>>> --- a/ofproto/bond.c
>>>>>> +++ b/ofproto/bond.c
>>>>>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond)
>>>>>>     
>>>>>>                     VLOG_ERR("failed to remove post recirculation flow %s", err_s);
>>>>>>                     free(err_s);
>>>>>> +            } else if (bond->hash) {
>>>>>> +                uint32_t hash = pr_op->hmap_node.hash & BOND_MASK;
>>>>>> +                struct bond_entry *entry = &bond->hash[hash];
> 
> An empty line here would be nice.

Sure.

> 
>>>>>> +                entry->pr_tx_bytes = 0;
>>>>>
>>>>> A few lines below we will be leaning up the rule pointer regardless
>>>>> of the error.  Should this code be moved there?  Potentially re-ordered
>>>>> with hmap_remove to be sure that the hash is kept intact?
>>>>>
>>>>
>>>> I figured if there is an error deleting the internal flow, we should not reset
>>>> the byte counter. Not that I can imagine a situation where this would happen but
>>>> if we fail to delete the rule and we then account for it, we'll report a
>>>> potentially big rate increase that could lead to an unnecessary rebalancing.
>>>
>>> But the rule reference will be cleared from the hash entry anyway,
>>> so we will not get the stats from this rule again.  Or am I missing
>>> something here?
>>>
>>
>> Right, it will be removed from the hash, but it'll remain in OFP. If the rule is
>> then added back (i.e: one bond member comes up), the OFP rule stats will remain
>> and on the first rebalance round we'll account for a big increase.
> 
> OK, makes sense.  Maybe add a small comment explaining that in case of
> a rule deletion failure the subsequent ofproto_dpif_add_internal_flow()
> will replace the old rule preserving the stats?
> 

OK

> Best regards, Ilya Maximets.
>
diff mbox series

Patch

diff --git a/ofproto/bond.c b/ofproto/bond.c
index cfdf44f85..ad56bb96b 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -407,6 +407,10 @@  update_recirc_rules__(struct bond *bond)
 
                 VLOG_ERR("failed to remove post recirculation flow %s", err_s);
                 free(err_s);
+            } else if (bond->hash) {
+                uint32_t hash = pr_op->hmap_node.hash & BOND_MASK;
+                struct bond_entry *entry = &bond->hash[hash];
+                entry->pr_tx_bytes = 0;
             }
 
             hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node);