Message ID | 20230720150607.2544072-1-mkp@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] ofproto-dpif-xlate: Don't reinstall removed XC_LEARN rule | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
ovsrobot/intel-ovs-compilation | fail | test: fail |
On 20 Jul 2023, at 17:06, Mike Pattrick wrote: > When the a revalidator thread is updating statistics for an XC_LEARN > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. > The revalidator will update stats for rules even if they are in a > removed state or marked as invisible. However, ofproto_flow_mod_learn > will detect if a flow has been removed and re-add it in that case. This > can result in an old learn action replacing the new learn action that > had replaced it in the first place. > > This change adds a new parameter to ofproto_flow_mod_learn allowing the > caller to specify an action to take if temp_rule is removed. If it is > removed and the caller is xlate_push_stats_entry, then a new rule will > not be replaced. > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2213892 > Signed-off-by: Mike Pattrick <mkp@redhat.com> Hi Mike, Thanks for this, patch! I went over the code, and the fix looks fine to me. However, due to being on PTO in a couple of hours, I did not fully go over the actual LEARN action implementation to find potential side effects. Maybe someone else has time to look at the potential side effects of this change. I did run all the dp tests, with and without ASAN, and no problems were found. With the above side note, Acked-by: Eelco Chaudron <echaudro@redhat.com> Cheers, Eelco
On 7/20/23 17:06, Mike Pattrick wrote: > When the a revalidator thread is updating statistics for an XC_LEARN > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. > The revalidator will update stats for rules even if they are in a > removed state or marked as invisible. However, ofproto_flow_mod_learn > will detect if a flow has been removed and re-add it in that case. This > can result in an old learn action replacing the new learn action that > had replaced it in the first place. > > This change adds a new parameter to ofproto_flow_mod_learn allowing the > caller to specify an action to take if temp_rule is removed. If it is > removed and the caller is xlate_push_stats_entry, then a new rule will > not be replaced. Hi, Mike. That's an interesting issue, though I'm not sure that is a correct way to fix it. It is important to re-install removed learned flows during stats update. The logic is the following, AFAIU: 1. Packet hits the OF rule (1) with a learning action. 2. Handler translates OF and creates a datapath flow for it, creating a learned rule (2) in the process and filling the translation cache. 3. Datapath flow for the rule (1) is installed into the datapath. 4. Time goes by and timeout on the learned rule (2) hits. Learned rule is removed now. 5. A packet hits the datapath flow for rule (1). 6. Revalidator updates flow stats for the rule (1). 7. Since the packet hit the learning rule (1), the rule (2) has to be added back. IIUC, the step 7 will not happen with the change proposed in this patch. Flow will not be re-translated, so the only way to re-install it is to execute side effects when we see stats update on a parent datapath flow in the revalidator. One potential way to avoid breaking this scenario is to check the removal reason, but it's not bulletproof either as we do still need to reinstate the flow even if it was manually deleted or evicted. It also seems like we may hold the reference to this deleted rule for a long time in the cache until the parent datapath flow is deleted. What is triggering the deletion of the learned rule in your case? Is it deletion or modification of the parent rule? And does it have 'delete_learned' flag? Best regards, Ilya Maximets.
On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maximets@ovn.org> wrote: > On 7/20/23 17:06, Mike Pattrick wrote: > > When the a revalidator thread is updating statistics for an XC_LEARN > > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. > > The revalidator will update stats for rules even if they are in a > > removed state or marked as invisible. However, ofproto_flow_mod_learn > > will detect if a flow has been removed and re-add it in that case. This > > can result in an old learn action replacing the new learn action that > > had replaced it in the first place. > > > > This change adds a new parameter to ofproto_flow_mod_learn allowing the > > caller to specify an action to take if temp_rule is removed. If it is > > removed and the caller is xlate_push_stats_entry, then a new rule will > > not be replaced. > > This is a slightly unrelated point, but while reviewing some code and test results I think I found a different issue. rule_expire() considers rule modified time for hard_timeout and the man page says hard_timeout "Causes the flow to expire after the given number of seconds, regardless of activity", however, modified time is updated in ofproto_flow_mod_learn_refresh() even if the rule isn't changed. Due to the current behaviour, I don't think hard_timeout could actually ever trigger on a learned action if an idle timeout is set or if traffic is ongoing. > Hi, Mike. That's an interesting issue, though I'm not sure that is a > correct way to fix it. > > It is important to re-install removed learned flows during stats update. > The logic is the following, AFAIU: > > 1. Packet hits the OF rule (1) with a learning action. > 2. Handler translates OF and creates a datapath flow for it, creating > a learned rule (2) in the process and filling the translation cache. > 3. Datapath flow for the rule (1) is installed into the datapath. > 4. Time goes by and timeout on the learned rule (2) hits. Learned rule > is removed now. > 5. A packet hits the datapath flow for rule (1). > 6. Revalidator updates flow stats for the rule (1). > 7. Since the packet hit the learning rule (1), the rule (2) has to be > added back. > > IIUC, the step 7 will not happen with the change proposed in this patch. > Flow will not be re-translated, so the only way to re-install it is > to execute side effects when we see stats update on a parent datapath > flow in the revalidator. > > One potential way to avoid breaking this scenario is to check the > removal reason, but it's not bulletproof either as we do still need > to reinstate the flow even if it was manually deleted or evicted. > I added a log to check the removal reason in the XC_LEARN stats update, and the reason always seems to be eviction. I wasn't able to catch an idle or hard removal in that function, though I'm sure I just wasn't winning the race. But in this example if we hit 7, and don't re-add the rule, what is the real penalty? The next packet will have to upcall and we would process the learn action from scratch, but that may be what the user wants as indicated by setting a timeout. The alternative is we re-install a flow that may be different then the flow that the user actually wants installed. Isn't there a risk of different revalidator threads going out of sync with different versions of a learned rule in their xcache? > It also seems like we may hold the reference to this deleted rule for a > long time in the cache until the parent datapath flow is deleted. > > > What is triggering the deletion of the learned rule in your case? > Is it deletion or modification of the parent rule? And does it have > 'delete_learned' flag? > In the case that I was debugging, the parent learn action was not deleted. Instead, new traffic from a different port was supposed to cause the learn action to modify the existing rule. But instead we see the rule modified with the correct information, and then immediately reverted with old stale information. > > Best regards, Ilya Maximets. > > Thank you, Mike
On 7/28/23 23:13, Mike Pattrick wrote: > On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote: > > On 7/20/23 17:06, Mike Pattrick wrote: > > When the a revalidator thread is updating statistics for an XC_LEARN > > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. > > The revalidator will update stats for rules even if they are in a > > removed state or marked as invisible. However, ofproto_flow_mod_learn > > will detect if a flow has been removed and re-add it in that case. This > > can result in an old learn action replacing the new learn action that > > had replaced it in the first place. > > > > This change adds a new parameter to ofproto_flow_mod_learn allowing the > > caller to specify an action to take if temp_rule is removed. If it is > > removed and the caller is xlate_push_stats_entry, then a new rule will > > not be replaced. > > > This is a slightly unrelated point, but while reviewing some code and test > results I think I found a different issue. rule_expire() considers rule > modified time for hard_timeout and the man page says hard_timeout "Causes > the flow to expire after the given number of seconds, regardless of activity", > however, modified time is updated in ofproto_flow_mod_learn_refresh() even if > the rule isn't changed. Due to the current behaviour, I don't think hard_timeout > could actually ever trigger on a learned action if an idle timeout is set or if > traffic is ongoing. The 'modified' time of a learned rule is updated in this function, because the packet hit the learn() rule. If the learned rule didn't exist, we would create it at this moment. Instead of re-creating, we just update the 'modified' time. If packets will stop hitting the learn() rule, but will continue to flow through the learned one, the learned one will be removed when the hard timeout fires up. That is a correct behavior. > > > Hi, Mike. That's an interesting issue, though I'm not sure that is a > correct way to fix it. > > It is important to re-install removed learned flows during stats update. > The logic is the following, AFAIU: > > 1. Packet hits the OF rule (1) with a learning action. > 2. Handler translates OF and creates a datapath flow for it, creating > a learned rule (2) in the process and filling the translation cache. > 3. Datapath flow for the rule (1) is installed into the datapath. > 4. Time goes by and timeout on the learned rule (2) hits. Learned rule > is removed now. > 5. A packet hits the datapath flow for rule (1). > 6. Revalidator updates flow stats for the rule (1). > 7. Since the packet hit the learning rule (1), the rule (2) has to be > added back. > > IIUC, the step 7 will not happen with the change proposed in this patch. > Flow will not be re-translated, so the only way to re-install it is > to execute side effects when we see stats update on a parent datapath > flow in the revalidator. > > One potential way to avoid breaking this scenario is to check the > removal reason, but it's not bulletproof either as we do still need > to reinstate the flow even if it was manually deleted or evicted. > > > I added a log to check the removal reason in the XC_LEARN stats update, and the > reason always seems to be eviction. I wasn't able to catch an idle or hard removal > in that function, though I'm sure I just wasn't winning the race. > > But in this example if we hit 7, and don't re-add the rule, what is the real penalty? > The next packet will have to upcall and we would process the learn action from scratch, > but that may be what the user wants as indicated by setting a timeout. Not really. If the learned flow is removed, we need a packet to hit a learn() rule, which is a different rule, in order for traffic to continue to flow. Let's say we implement a firewall that allows connections only from A to B. So, A can initiate connection to B, but B can't initiate connection to A. Once A sends tcp SYN, we learn a rule that allows traffic from B to A on these particular ports. Now B can send traffic back. Once the learned flow is removed, A will have to establish a new connection in order to re-learn the B->A rule. So, not just an upcall. May not be a best example, but should work as an illustration. > The alternative > is we re-install a flow that may be different then the flow that the user actually > wants installed. If the flow is modified, it means that there was a traffic on a learn() action and we have to learn that specific rule. If it was already removed, we have to reinstate it. In general, the logic in the current OVS code seems correct in this particular place. > > Isn't there a risk of different revalidator threads going out of sync with different > versions of a learned rule in their xcache? > > > It also seems like we may hold the reference to this deleted rule for a > long time in the cache until the parent datapath flow is deleted. > > > What is triggering the deletion of the learned rule in your case? > Is it deletion or modification of the parent rule? And does it have > 'delete_learned' flag? > > > In the case that I was debugging, the parent learn action was not deleted. Instead, > new traffic from a different port was supposed to cause the learn action to modify > the existing rule. But instead we see the rule modified with the correct information, > and then immediately reverted with old stale information. The problem here is that we can't define which one is 'correct'. It's a race where both values are correct, but you like one better for some reason. New rules are only learned if the parent datapath flow that has a learn() action seen packets after the previous stats check. What you're describing is that you have a single rule with a learn action that is capable of re-writing the same learned rule depending on what kind of packet actually hits it. Is that correct? I would consider this kind of setup as fairly fragile and would not recommend using it in general. However, the scenario is likely like this: 1. Upcall on a rule with learn() action. New datapath flow (1) is installed. This flow has a match on all the fields used in a learn action and ukey contains the translation cache with a rule that it produces. The new learned rule is installed from this. 2. Upcall with a different packet that doesn't match the first datapath flow (1), but on the same rule with a learn() action. New datapath flow (2) is installed. This flow has a match on all the fields used in a learn action (they are different from the first one, since we have an upcall) and ukey contains the translation cache with a rule that it produces. Let's say the learned rule has the same match, but different actions. This overwrites the previously installed learned rule. 3. Both datapath flows (1) and (2) get some traffic. 4. Revalidation starts. 5. Revalidator A dumps datapath flow (1). Revalidator B dumps datapath flow (2). 6. Revalidators A and B are trying to revalidate these datapath flows at the same time. Both datapath flows had traffic since the last revalidation. Both revalidators are trying to update the learned rule, which is the same for both of them. 7. One of them does that first, the other one ends up with a rule in a REMOVED state, since it was just replaced by the other revalidator. There is no correct or incorrect rule in this situation. Both of them are correct, because both datapath flows have seen packets since the previous revalidation. We can't guarantee which one will end up being written the last and therefore will take effect. if we refuse to reinstate the REMOVED rule, we still can't say which one will be active in the end. What can resolve the race is the next revalidation cycle. Once one of the datapath flows will stop receiving traffic, there should no longer be a race condition and the next revalidation will set the "correct" rule, i.e. the rule that corresponds to a learning datapath flow that seen some packets last. But until both flows have traffic, there is no "correct" rule and the result will be random, regardless of what we choose to do. One issue that I see in the current code is that it looks like the table version is not getting updated on learned flow refresh and hence the next revalidation may not actually be triggered. In this case the random result may stay for much longer than we would like. The suspicious code (and the potential fix) is the following: @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm) } enum ofperr -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm, - struct ofproto *orig_ofproto) +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm) OVS_REQUIRES(ofproto_mutex) { struct rule *rule = rule_collection_rules(&ofm->new_rules)[0]; enum ofperr error = 0; - /* If learning on a different bridge, must bump its version - * number and flush connmgr afterwards. */ - if (rule->ofproto != orig_ofproto) { - ofproto_bump_tables_version(rule->ofproto); - } + ofproto_bump_tables_version(rule->ofproto); error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL); - if (rule->ofproto != orig_ofproto) { - ofmonitor_flush(rule->ofproto->connmgr); - } + ofmonitor_flush(rule->ofproto->connmgr); return error; } --- We should actually bump the version and request a revalidation in any case, not only when the change is in a different bridge. Could you try that in your setup? The issue appears to come from the commit: 1f4a89336682 ("ofproto: Refactor packet_out handling.") There is a slight chance of launching ourselves into an infinite revlidation loop, I guess. Especially if both learning datapath flows will continue to receive the traffic, but there is a minimal revalidation interval, IIRC. Not sure if it applies in this case. Best regards, Ilya Maximets.
On Fri, Jul 28, 2023 at 6:31 PM Ilya Maximets <i.maximets@ovn.org> wrote: > > On 7/28/23 23:13, Mike Pattrick wrote: > > On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote: > > > > On 7/20/23 17:06, Mike Pattrick wrote: > > > When the a revalidator thread is updating statistics for an XC_LEARN > > > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. > > > The revalidator will update stats for rules even if they are in a > > > removed state or marked as invisible. However, ofproto_flow_mod_learn > > > will detect if a flow has been removed and re-add it in that case. This > > > can result in an old learn action replacing the new learn action that > > > had replaced it in the first place. > > > > > > This change adds a new parameter to ofproto_flow_mod_learn allowing the > > > caller to specify an action to take if temp_rule is removed. If it is > > > removed and the caller is xlate_push_stats_entry, then a new rule will > > > not be replaced. > > > > > > This is a slightly unrelated point, but while reviewing some code and test > > results I think I found a different issue. rule_expire() considers rule > > modified time for hard_timeout and the man page says hard_timeout "Causes > > the flow to expire after the given number of seconds, regardless of activity", > > however, modified time is updated in ofproto_flow_mod_learn_refresh() even if > > the rule isn't changed. Due to the current behaviour, I don't think hard_timeout > > could actually ever trigger on a learned action if an idle timeout is set or if > > traffic is ongoing. > > The 'modified' time of a learned rule is updated in this function, because > the packet hit the learn() rule. If the learned rule didn't exist, we > would create it at this moment. Instead of re-creating, we just update > the 'modified' time. If packets will stop hitting the learn() rule, but > will continue to flow through the learned one, the learned one will be > removed when the hard timeout fires up. That is a correct behavior. > I can accept this is intended, but it doesn't seem like it would operate in that way based on the man page alone. > > > > > > > Hi, Mike. That's an interesting issue, though I'm not sure that is a > > correct way to fix it. > > > > It is important to re-install removed learned flows during stats update. > > The logic is the following, AFAIU: > > > > 1. Packet hits the OF rule (1) with a learning action. > > 2. Handler translates OF and creates a datapath flow for it, creating > > a learned rule (2) in the process and filling the translation cache. > > 3. Datapath flow for the rule (1) is installed into the datapath. > > 4. Time goes by and timeout on the learned rule (2) hits. Learned rule > > is removed now. > > 5. A packet hits the datapath flow for rule (1). > > 6. Revalidator updates flow stats for the rule (1). > > 7. Since the packet hit the learning rule (1), the rule (2) has to be > > added back. > > > > IIUC, the step 7 will not happen with the change proposed in this patch. > > Flow will not be re-translated, so the only way to re-install it is > > to execute side effects when we see stats update on a parent datapath > > flow in the revalidator. > > > > One potential way to avoid breaking this scenario is to check the > > removal reason, but it's not bulletproof either as we do still need > > to reinstate the flow even if it was manually deleted or evicted. > > > > > > I added a log to check the removal reason in the XC_LEARN stats update, and the > > reason always seems to be eviction. I wasn't able to catch an idle or hard removal > > in that function, though I'm sure I just wasn't winning the race. > > > > But in this example if we hit 7, and don't re-add the rule, what is the real penalty? > > The next packet will have to upcall and we would process the learn action from scratch, > > but that may be what the user wants as indicated by setting a timeout. > > Not really. If the learned flow is removed, we need a packet to hit a > learn() rule, which is a different rule, in order for traffic to continue > to flow. Let's say we implement a firewall that allows connections only > from A to B. So, A can initiate connection to B, but B can't initiate > connection to A. Once A sends tcp SYN, we learn a rule that allows traffic > from B to A on these particular ports. Now B can send traffic back. > Once the learned flow is removed, A will have to establish a new connection > in order to re-learn the B->A rule. So, not just an upcall. May not be a > best example, but should work as an illustration. > > > The alternative > > is we re-install a flow that may be different then the flow that the user actually > > wants installed. > > If the flow is modified, it means that there was a traffic on a learn() > action and we have to learn that specific rule. If it was already removed, > we have to reinstate it. In general, the logic in the current OVS code > seems correct in this particular place. > > > > > Isn't there a risk of different revalidator threads going out of sync with different > > versions of a learned rule in their xcache? > > > > > > It also seems like we may hold the reference to this deleted rule for a > > long time in the cache until the parent datapath flow is deleted. > > > > > > What is triggering the deletion of the learned rule in your case? > > Is it deletion or modification of the parent rule? And does it have > > 'delete_learned' flag? > > > > > > In the case that I was debugging, the parent learn action was not deleted. Instead, > > new traffic from a different port was supposed to cause the learn action to modify > > the existing rule. But instead we see the rule modified with the correct information, > > and then immediately reverted with old stale information. > > The problem here is that we can't define which one is 'correct'. > It's a race where both values are correct, but you like one better > for some reason. New rules are only learned if the parent datapath > flow that has a learn() action seen packets after the previous stats > check. > > What you're describing is that you have a single rule with a learn > action that is capable of re-writing the same learned rule depending > on what kind of packet actually hits it. Is that correct? In this case the rule is: learn(table=20,hard_timeout=300,priority=1,cookie=0xcb684d24d14b14ad,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1 And an example of the learned rule is: vlan_tci=0x0005/0x0fff,dl_dst=fc:16:3e:3e:5d:e0,actions=load:0->NXM_OF_VLAN_TCI[],load:0x3->NXM_NX_TUN_ID[],ou tput:2 The scenario is that ingressing packets are supposed to change the route that egress packets take, and that route can change over time. What I see from some additional logging is: 1. traffic begins to ingress on port A 2. learn action installs a rule to route response to A 3. traffic begins to egress on port A 4. port A becomes invalid, no further traffic ingresses from port A 5. traffic ingresses on port B 6. learn action installs a rule to route response to B 7. revalidator replaces (6) with (2) Eventually the egressing traffic will become exausted and the correct rule is installed, but this can take several minutes. The rule regression can even happen many milliseconds after learn installs the correct rule and even after packets have flowed through the new rule. > I would consider this kind of setup as fairly fragile and would not > recommend using it in general. > > However, the scenario is likely like this: > > 1. Upcall on a rule with learn() action. > New datapath flow (1) is installed. This flow has a match on > all the fields used in a learn action and ukey contains the > translation cache with a rule that it produces. > The new learned rule is installed from this. > > 2. Upcall with a different packet that doesn't match the first > datapath flow (1), but on the same rule with a learn() action. > New datapath flow (2) is installed. This flow has a match on > all the fields used in a learn action (they are different from > the first one, since we have an upcall) and ukey contains the > translation cache with a rule that it produces. Let's say the > learned rule has the same match, but different actions. > This overwrites the previously installed learned rule. > > 3. Both datapath flows (1) and (2) get some traffic. > > 4. Revalidation starts. > > 5. Revalidator A dumps datapath flow (1). > Revalidator B dumps datapath flow (2). > > 6. Revalidators A and B are trying to revalidate these datapath flows > at the same time. Both datapath flows had traffic since the > last revalidation. Both revalidators are trying to update the > learned rule, which is the same for both of them. > > 7. One of them does that first, the other one ends up with a rule in a > REMOVED state, since it was just replaced by the other revalidator. > > > There is no correct or incorrect rule in this situation. Both > of them are correct, because both datapath flows have seen packets > since the previous revalidation. We can't guarantee which one will > end up being written the last and therefore will take effect. > if we refuse to reinstate the REMOVED rule, we still can't say which > one will be active in the end. > > What can resolve the race is the next revalidation cycle. > Once one of the datapath flows will stop receiving traffic, there > should no longer be a race condition and the next revalidation will > set the "correct" rule, i.e. the rule that corresponds to a learning > datapath flow that seen some packets last. But until both flows have > traffic, there is no "correct" rule and the result will be random, > regardless of what we choose to do. > > > One issue that I see in the current code is that it looks like the table > version is not getting updated on learned flow refresh and hence the next > revalidation may not actually be triggered. In this case the random > result may stay for much longer than we would like. > > The suspicious code (and the potential fix) is the following: > > @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm) > } > > enum ofperr > -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm, > - struct ofproto *orig_ofproto) > +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm) > OVS_REQUIRES(ofproto_mutex) > { > struct rule *rule = rule_collection_rules(&ofm->new_rules)[0]; > enum ofperr error = 0; > > - /* If learning on a different bridge, must bump its version > - * number and flush connmgr afterwards. */ > - if (rule->ofproto != orig_ofproto) { > - ofproto_bump_tables_version(rule->ofproto); > - } > + ofproto_bump_tables_version(rule->ofproto); > error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL); > - if (rule->ofproto != orig_ofproto) { > - ofmonitor_flush(rule->ofproto->connmgr); > - } > + ofmonitor_flush(rule->ofproto->connmgr); > > return error; > } > --- > > We should actually bump the version and request a revalidation in > any case, not only when the change is in a different bridge. > > Could you try that in your setup? That was actually my initial suspicion when investigating this bug, but this change doesn't improve the buggy behaviour. Thank you, M > > > The issue appears to come from the commit: > 1f4a89336682 ("ofproto: Refactor packet_out handling.") > > There is a slight chance of launching ourselves into an infinite > revlidation loop, I guess. Especially if both learning datapath > flows will continue to receive the traffic, but there is a minimal > revalidation interval, IIRC. Not sure if it applies in this case. > > Best regards, Ilya Maximets. >
On 7/31/23 07:34, Mike Pattrick wrote: > On Fri, Jul 28, 2023 at 6:31 PM Ilya Maximets <i.maximets@ovn.org> wrote: >> >> On 7/28/23 23:13, Mike Pattrick wrote: >>> On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote: >>> >>> On 7/20/23 17:06, Mike Pattrick wrote: >>> > When the a revalidator thread is updating statistics for an XC_LEARN >>> > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. >>> > The revalidator will update stats for rules even if they are in a >>> > removed state or marked as invisible. However, ofproto_flow_mod_learn >>> > will detect if a flow has been removed and re-add it in that case. This >>> > can result in an old learn action replacing the new learn action that >>> > had replaced it in the first place. >>> > >>> > This change adds a new parameter to ofproto_flow_mod_learn allowing the >>> > caller to specify an action to take if temp_rule is removed. If it is >>> > removed and the caller is xlate_push_stats_entry, then a new rule will >>> > not be replaced. >>> >>> >>> This is a slightly unrelated point, but while reviewing some code and test >>> results I think I found a different issue. rule_expire() considers rule >>> modified time for hard_timeout and the man page says hard_timeout "Causes >>> the flow to expire after the given number of seconds, regardless of activity", >>> however, modified time is updated in ofproto_flow_mod_learn_refresh() even if >>> the rule isn't changed. Due to the current behaviour, I don't think hard_timeout >>> could actually ever trigger on a learned action if an idle timeout is set or if >>> traffic is ongoing. >> >> The 'modified' time of a learned rule is updated in this function, because >> the packet hit the learn() rule. If the learned rule didn't exist, we >> would create it at this moment. Instead of re-creating, we just update >> the 'modified' time. If packets will stop hitting the learn() rule, but >> will continue to flow through the learned one, the learned one will be >> removed when the hard timeout fires up. That is a correct behavior. >> > > I can accept this is intended, but it doesn't seem like it would > operate in that way based on the man page alone. It's a generic characteristic of an OpenFlow rule. "activity" means traffic matching it. learn() action is triggering a flow modification equal to a strict mod-flow, as described in the ovs-actions man page. And ovs-ofctl page mentions that "mod-flows restarts the hard_timeout". Maybe it can be clarified somehow better, I'm not sure. Feel free to submit a patch. > >> >>> >>> >>> Hi, Mike. That's an interesting issue, though I'm not sure that is a >>> correct way to fix it. >>> >>> It is important to re-install removed learned flows during stats update. >>> The logic is the following, AFAIU: >>> >>> 1. Packet hits the OF rule (1) with a learning action. >>> 2. Handler translates OF and creates a datapath flow for it, creating >>> a learned rule (2) in the process and filling the translation cache. >>> 3. Datapath flow for the rule (1) is installed into the datapath. >>> 4. Time goes by and timeout on the learned rule (2) hits. Learned rule >>> is removed now. >>> 5. A packet hits the datapath flow for rule (1). >>> 6. Revalidator updates flow stats for the rule (1). >>> 7. Since the packet hit the learning rule (1), the rule (2) has to be >>> added back. >>> >>> IIUC, the step 7 will not happen with the change proposed in this patch. >>> Flow will not be re-translated, so the only way to re-install it is >>> to execute side effects when we see stats update on a parent datapath >>> flow in the revalidator. >>> >>> One potential way to avoid breaking this scenario is to check the >>> removal reason, but it's not bulletproof either as we do still need >>> to reinstate the flow even if it was manually deleted or evicted. >>> >>> >>> I added a log to check the removal reason in the XC_LEARN stats update, and the >>> reason always seems to be eviction. I wasn't able to catch an idle or hard removal >>> in that function, though I'm sure I just wasn't winning the race. >>> >>> But in this example if we hit 7, and don't re-add the rule, what is the real penalty? >>> The next packet will have to upcall and we would process the learn action from scratch, >>> but that may be what the user wants as indicated by setting a timeout. >> >> Not really. If the learned flow is removed, we need a packet to hit a >> learn() rule, which is a different rule, in order for traffic to continue >> to flow. Let's say we implement a firewall that allows connections only >> from A to B. So, A can initiate connection to B, but B can't initiate >> connection to A. Once A sends tcp SYN, we learn a rule that allows traffic >> from B to A on these particular ports. Now B can send traffic back. >> Once the learned flow is removed, A will have to establish a new connection >> in order to re-learn the B->A rule. So, not just an upcall. May not be a >> best example, but should work as an illustration. >> >>> The alternative >>> is we re-install a flow that may be different then the flow that the user actually >>> wants installed. >> >> If the flow is modified, it means that there was a traffic on a learn() >> action and we have to learn that specific rule. If it was already removed, >> we have to reinstate it. In general, the logic in the current OVS code >> seems correct in this particular place. >> >>> >>> Isn't there a risk of different revalidator threads going out of sync with different >>> versions of a learned rule in their xcache? >>> >>> >>> It also seems like we may hold the reference to this deleted rule for a >>> long time in the cache until the parent datapath flow is deleted. >>> >>> >>> What is triggering the deletion of the learned rule in your case? >>> Is it deletion or modification of the parent rule? And does it have >>> 'delete_learned' flag? >>> >>> >>> In the case that I was debugging, the parent learn action was not deleted. Instead, >>> new traffic from a different port was supposed to cause the learn action to modify >>> the existing rule. But instead we see the rule modified with the correct information, >>> and then immediately reverted with old stale information. >> >> The problem here is that we can't define which one is 'correct'. >> It's a race where both values are correct, but you like one better >> for some reason. New rules are only learned if the parent datapath >> flow that has a learn() action seen packets after the previous stats >> check. >> >> What you're describing is that you have a single rule with a learn >> action that is capable of re-writing the same learned rule depending >> on what kind of packet actually hits it. Is that correct? > > In this case the rule is: > > learn(table=20,hard_timeout=300,priority=1,cookie=0xcb684d24d14b14ad,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1 > > And an example of the learned rule is: > > vlan_tci=0x0005/0x0fff,dl_dst=fc:16:3e:3e:5d:e0,actions=load:0->NXM_OF_VLAN_TCI[],load:0x3->NXM_NX_TUN_ID[],ou > tput:2 > > The scenario is that ingressing packets are supposed to change the > route that egress packets take, and that route can change over time. > > What I see from some additional logging is: > > 1. traffic begins to ingress on port A > 2. learn action installs a rule to route response to A > 3. traffic begins to egress on port A > 4. port A becomes invalid, no further traffic ingresses from port A > 5. traffic ingresses on port B > 6. learn action installs a rule to route response to B > 7. revalidator replaces (6) with (2) > > Eventually the egressing traffic will become exausted and the correct > rule is installed, but this can take several minutes. The rule > regression can even happen many milliseconds after learn installs the > correct rule and even after packets have flowed through the new rule. The rule can be replaced if there was no revalidation between steps 4 and 5, i.e. at the point of revalidation there is a stats increase on both datapath flows (ingress A and ingress B). Revalidators can sleep for up to 500ms by default between revalidation cycles, so the replacement can happen many milliseconds after. Is the traffic on port B continuous, or is it a one-shot packet? The point is, at least one revalidation cycle should pass after step 4. And the traffic on port B should be seen after this revalidation cycle is completed. So, step 5 should be at least 500+ ms apart from step 4. > > >> I would consider this kind of setup as fairly fragile and would not >> recommend using it in general. >> >> However, the scenario is likely like this: >> >> 1. Upcall on a rule with learn() action. >> New datapath flow (1) is installed. This flow has a match on >> all the fields used in a learn action and ukey contains the >> translation cache with a rule that it produces. >> The new learned rule is installed from this. >> >> 2. Upcall with a different packet that doesn't match the first >> datapath flow (1), but on the same rule with a learn() action. >> New datapath flow (2) is installed. This flow has a match on >> all the fields used in a learn action (they are different from >> the first one, since we have an upcall) and ukey contains the >> translation cache with a rule that it produces. Let's say the >> learned rule has the same match, but different actions. >> This overwrites the previously installed learned rule. >> >> 3. Both datapath flows (1) and (2) get some traffic. >> >> 4. Revalidation starts. >> >> 5. Revalidator A dumps datapath flow (1). >> Revalidator B dumps datapath flow (2). >> >> 6. Revalidators A and B are trying to revalidate these datapath flows >> at the same time. Both datapath flows had traffic since the >> last revalidation. Both revalidators are trying to update the >> learned rule, which is the same for both of them. >> >> 7. One of them does that first, the other one ends up with a rule in a >> REMOVED state, since it was just replaced by the other revalidator. >> >> >> There is no correct or incorrect rule in this situation. Both >> of them are correct, because both datapath flows have seen packets >> since the previous revalidation. We can't guarantee which one will >> end up being written the last and therefore will take effect. >> if we refuse to reinstate the REMOVED rule, we still can't say which >> one will be active in the end. >> >> What can resolve the race is the next revalidation cycle. >> Once one of the datapath flows will stop receiving traffic, there >> should no longer be a race condition and the next revalidation will >> set the "correct" rule, i.e. the rule that corresponds to a learning >> datapath flow that seen some packets last. But until both flows have >> traffic, there is no "correct" rule and the result will be random, >> regardless of what we choose to do. >> >> >> One issue that I see in the current code is that it looks like the table >> version is not getting updated on learned flow refresh and hence the next >> revalidation may not actually be triggered. In this case the random >> result may stay for much longer than we would like. >> >> The suspicious code (and the potential fix) is the following: >> >> @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm) >> } >> >> enum ofperr >> -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm, >> - struct ofproto *orig_ofproto) >> +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm) >> OVS_REQUIRES(ofproto_mutex) >> { >> struct rule *rule = rule_collection_rules(&ofm->new_rules)[0]; >> enum ofperr error = 0; >> >> - /* If learning on a different bridge, must bump its version >> - * number and flush connmgr afterwards. */ >> - if (rule->ofproto != orig_ofproto) { >> - ofproto_bump_tables_version(rule->ofproto); >> - } >> + ofproto_bump_tables_version(rule->ofproto); >> error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL); >> - if (rule->ofproto != orig_ofproto) { >> - ofmonitor_flush(rule->ofproto->connmgr); >> - } >> + ofmonitor_flush(rule->ofproto->connmgr); >> >> return error; >> } >> --- >> >> We should actually bump the version and request a revalidation in >> any case, not only when the change is in a different bridge. FWIW, the original code in commit 1f4a89336682 was written mostly for a case where learning is triggered by a packet-out executed within a bundle. And we can't actually publish changes for a current bridge in this case. So, the fix should be a bit more clever. >> >> Could you try that in your setup? > > That was actually my initial suspicion when investigating this bug, > but this change doesn't improve the buggy behaviour. > > Thank you, > M > >> >> >> The issue appears to come from the commit: >> 1f4a89336682 ("ofproto: Refactor packet_out handling.") >> >> There is a slight chance of launching ourselves into an infinite >> revlidation loop, I guess. Especially if both learning datapath >> flows will continue to receive the traffic, but there is a minimal >> revalidation interval, IIRC. Not sure if it applies in this case. >> >> Best regards, Ilya Maximets. >> >
On Mon, Jul 31, 2023 at 7:20 AM Ilya Maximets <i.maximets@ovn.org> wrote: > > On 7/31/23 07:34, Mike Pattrick wrote: > > On Fri, Jul 28, 2023 at 6:31 PM Ilya Maximets <i.maximets@ovn.org> wrote: > >> > >> On 7/28/23 23:13, Mike Pattrick wrote: > >>> On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote: > >>> > >>> On 7/20/23 17:06, Mike Pattrick wrote: > >>> > When the a revalidator thread is updating statistics for an XC_LEARN > >>> > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. > >>> > The revalidator will update stats for rules even if they are in a > >>> > removed state or marked as invisible. However, ofproto_flow_mod_learn > >>> > will detect if a flow has been removed and re-add it in that case. This > >>> > can result in an old learn action replacing the new learn action that > >>> > had replaced it in the first place. > >>> > > >>> > This change adds a new parameter to ofproto_flow_mod_learn allowing the > >>> > caller to specify an action to take if temp_rule is removed. If it is > >>> > removed and the caller is xlate_push_stats_entry, then a new rule will > >>> > not be replaced. > >>> > >>> > >>> This is a slightly unrelated point, but while reviewing some code and test > >>> results I think I found a different issue. rule_expire() considers rule > >>> modified time for hard_timeout and the man page says hard_timeout "Causes > >>> the flow to expire after the given number of seconds, regardless of activity", > >>> however, modified time is updated in ofproto_flow_mod_learn_refresh() even if > >>> the rule isn't changed. Due to the current behaviour, I don't think hard_timeout > >>> could actually ever trigger on a learned action if an idle timeout is set or if > >>> traffic is ongoing. > >> > >> The 'modified' time of a learned rule is updated in this function, because > >> the packet hit the learn() rule. If the learned rule didn't exist, we > >> would create it at this moment. Instead of re-creating, we just update > >> the 'modified' time. If packets will stop hitting the learn() rule, but > >> will continue to flow through the learned one, the learned one will be > >> removed when the hard timeout fires up. That is a correct behavior. > >> > > > > I can accept this is intended, but it doesn't seem like it would > > operate in that way based on the man page alone. > > It's a generic characteristic of an OpenFlow rule. "activity" means traffic > matching it. learn() action is triggering a flow modification equal to a > strict mod-flow, as described in the ovs-actions man page. And ovs-ofctl page > mentions that "mod-flows restarts the hard_timeout". > > Maybe it can be clarified somehow better, I'm not sure. Feel free to submit > a patch. > > > > >> > >>> > >>> > >>> Hi, Mike. That's an interesting issue, though I'm not sure that is a > >>> correct way to fix it. > >>> > >>> It is important to re-install removed learned flows during stats update. > >>> The logic is the following, AFAIU: > >>> > >>> 1. Packet hits the OF rule (1) with a learning action. > >>> 2. Handler translates OF and creates a datapath flow for it, creating > >>> a learned rule (2) in the process and filling the translation cache. > >>> 3. Datapath flow for the rule (1) is installed into the datapath. > >>> 4. Time goes by and timeout on the learned rule (2) hits. Learned rule > >>> is removed now. > >>> 5. A packet hits the datapath flow for rule (1). > >>> 6. Revalidator updates flow stats for the rule (1). > >>> 7. Since the packet hit the learning rule (1), the rule (2) has to be > >>> added back. > >>> > >>> IIUC, the step 7 will not happen with the change proposed in this patch. > >>> Flow will not be re-translated, so the only way to re-install it is > >>> to execute side effects when we see stats update on a parent datapath > >>> flow in the revalidator. > >>> > >>> One potential way to avoid breaking this scenario is to check the > >>> removal reason, but it's not bulletproof either as we do still need > >>> to reinstate the flow even if it was manually deleted or evicted. > >>> > >>> > >>> I added a log to check the removal reason in the XC_LEARN stats update, and the > >>> reason always seems to be eviction. I wasn't able to catch an idle or hard removal > >>> in that function, though I'm sure I just wasn't winning the race. > >>> > >>> But in this example if we hit 7, and don't re-add the rule, what is the real penalty? > >>> The next packet will have to upcall and we would process the learn action from scratch, > >>> but that may be what the user wants as indicated by setting a timeout. > >> > >> Not really. If the learned flow is removed, we need a packet to hit a > >> learn() rule, which is a different rule, in order for traffic to continue > >> to flow. Let's say we implement a firewall that allows connections only > >> from A to B. So, A can initiate connection to B, but B can't initiate > >> connection to A. Once A sends tcp SYN, we learn a rule that allows traffic > >> from B to A on these particular ports. Now B can send traffic back. > >> Once the learned flow is removed, A will have to establish a new connection > >> in order to re-learn the B->A rule. So, not just an upcall. May not be a > >> best example, but should work as an illustration. > >> > >>> The alternative > >>> is we re-install a flow that may be different then the flow that the user actually > >>> wants installed. > >> > >> If the flow is modified, it means that there was a traffic on a learn() > >> action and we have to learn that specific rule. If it was already removed, > >> we have to reinstate it. In general, the logic in the current OVS code > >> seems correct in this particular place. > >> > >>> > >>> Isn't there a risk of different revalidator threads going out of sync with different > >>> versions of a learned rule in their xcache? > >>> > >>> > >>> It also seems like we may hold the reference to this deleted rule for a > >>> long time in the cache until the parent datapath flow is deleted. > >>> > >>> > >>> What is triggering the deletion of the learned rule in your case? > >>> Is it deletion or modification of the parent rule? And does it have > >>> 'delete_learned' flag? > >>> > >>> > >>> In the case that I was debugging, the parent learn action was not deleted. Instead, > >>> new traffic from a different port was supposed to cause the learn action to modify > >>> the existing rule. But instead we see the rule modified with the correct information, > >>> and then immediately reverted with old stale information. > >> > >> The problem here is that we can't define which one is 'correct'. > >> It's a race where both values are correct, but you like one better > >> for some reason. New rules are only learned if the parent datapath > >> flow that has a learn() action seen packets after the previous stats > >> check. > >> > >> What you're describing is that you have a single rule with a learn > >> action that is capable of re-writing the same learned rule depending > >> on what kind of packet actually hits it. Is that correct? > > > > In this case the rule is: > > > > learn(table=20,hard_timeout=300,priority=1,cookie=0xcb684d24d14b14ad,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1 > > > > And an example of the learned rule is: > > > > vlan_tci=0x0005/0x0fff,dl_dst=fc:16:3e:3e:5d:e0,actions=load:0->NXM_OF_VLAN_TCI[],load:0x3->NXM_NX_TUN_ID[],ou > > tput:2 > > > > The scenario is that ingressing packets are supposed to change the > > route that egress packets take, and that route can change over time. > > > > What I see from some additional logging is: > > > > 1. traffic begins to ingress on port A > > 2. learn action installs a rule to route response to A > > 3. traffic begins to egress on port A > > 4. port A becomes invalid, no further traffic ingresses from port A > > 5. traffic ingresses on port B > > 6. learn action installs a rule to route response to B > > 7. revalidator replaces (6) with (2) > > > > Eventually the egressing traffic will become exausted and the correct > > rule is installed, but this can take several minutes. The rule > > regression can even happen many milliseconds after learn installs the > > correct rule and even after packets have flowed through the new rule. > > The rule can be replaced if there was no revalidation between steps > 4 and 5, i.e. at the point of revalidation there is a stats increase > on both datapath flows (ingress A and ingress B). > Revalidators can sleep for up to 500ms by default between revalidation > cycles, so the replacement can happen many milliseconds after. > > Is the traffic on port B continuous, or is it a one-shot packet? Continuous but less frequent than traffic in the opposite direction. In a client's environment there were ingress packets every few seconds and egress packets continued to flow incorrectly for several minutes afterwards. If this solution is unacceptable; I could resubmit with an additional check to only suppress the flow_mod if the learn_modify rule is in a removed state and there is already a conflicting rule in the classifier that isn't removed. Or would you prefer a completely different approach be taken here? Thanks, M > The point is, at least one revalidation cycle should pass after step 4. > And the traffic on port B should be seen after this revalidation cycle > is completed. So, step 5 should be at least 500+ ms apart from step 4. > > > > > > >> I would consider this kind of setup as fairly fragile and would not > >> recommend using it in general. > >> > >> However, the scenario is likely like this: > >> > >> 1. Upcall on a rule with learn() action. > >> New datapath flow (1) is installed. This flow has a match on > >> all the fields used in a learn action and ukey contains the > >> translation cache with a rule that it produces. > >> The new learned rule is installed from this. > >> > >> 2. Upcall with a different packet that doesn't match the first > >> datapath flow (1), but on the same rule with a learn() action. > >> New datapath flow (2) is installed. This flow has a match on > >> all the fields used in a learn action (they are different from > >> the first one, since we have an upcall) and ukey contains the > >> translation cache with a rule that it produces. Let's say the > >> learned rule has the same match, but different actions. > >> This overwrites the previously installed learned rule. > >> > >> 3. Both datapath flows (1) and (2) get some traffic. > >> > >> 4. Revalidation starts. > >> > >> 5. Revalidator A dumps datapath flow (1). > >> Revalidator B dumps datapath flow (2). > >> > >> 6. Revalidators A and B are trying to revalidate these datapath flows > >> at the same time. Both datapath flows had traffic since the > >> last revalidation. Both revalidators are trying to update the > >> learned rule, which is the same for both of them. > >> > >> 7. One of them does that first, the other one ends up with a rule in a > >> REMOVED state, since it was just replaced by the other revalidator. > >> > >> > >> There is no correct or incorrect rule in this situation. Both > >> of them are correct, because both datapath flows have seen packets > >> since the previous revalidation. We can't guarantee which one will > >> end up being written the last and therefore will take effect. > >> if we refuse to reinstate the REMOVED rule, we still can't say which > >> one will be active in the end. > >> > >> What can resolve the race is the next revalidation cycle. > >> Once one of the datapath flows will stop receiving traffic, there > >> should no longer be a race condition and the next revalidation will > >> set the "correct" rule, i.e. the rule that corresponds to a learning > >> datapath flow that seen some packets last. But until both flows have > >> traffic, there is no "correct" rule and the result will be random, > >> regardless of what we choose to do. > >> > >> > >> One issue that I see in the current code is that it looks like the table > >> version is not getting updated on learned flow refresh and hence the next > >> revalidation may not actually be triggered. In this case the random > >> result may stay for much longer than we would like. > >> > >> The suspicious code (and the potential fix) is the following: > >> > >> @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm) > >> } > >> > >> enum ofperr > >> -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm, > >> - struct ofproto *orig_ofproto) > >> +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm) > >> OVS_REQUIRES(ofproto_mutex) > >> { > >> struct rule *rule = rule_collection_rules(&ofm->new_rules)[0]; > >> enum ofperr error = 0; > >> > >> - /* If learning on a different bridge, must bump its version > >> - * number and flush connmgr afterwards. */ > >> - if (rule->ofproto != orig_ofproto) { > >> - ofproto_bump_tables_version(rule->ofproto); > >> - } > >> + ofproto_bump_tables_version(rule->ofproto); > >> error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL); > >> - if (rule->ofproto != orig_ofproto) { > >> - ofmonitor_flush(rule->ofproto->connmgr); > >> - } > >> + ofmonitor_flush(rule->ofproto->connmgr); > >> > >> return error; > >> } > >> --- > >> > >> We should actually bump the version and request a revalidation in > >> any case, not only when the change is in a different bridge. > > FWIW, the original code in commit 1f4a89336682 was written mostly > for a case where learning is triggered by a packet-out executed > within a bundle. And we can't actually publish changes for a current > bridge in this case. So, the fix should be a bit more clever. > > >> > >> Could you try that in your setup? > > > > That was actually my initial suspicion when investigating this bug, > > but this change doesn't improve the buggy behaviour. > > > > Thank you, > > M > > > >> > >> > >> The issue appears to come from the commit: > >> 1f4a89336682 ("ofproto: Refactor packet_out handling.") > >> > >> There is a slight chance of launching ourselves into an infinite > >> revlidation loop, I guess. Especially if both learning datapath > >> flows will continue to receive the traffic, but there is a minimal > >> revalidation interval, IIRC. Not sure if it applies in this case. > >> > >> Best regards, Ilya Maximets. > >> > > >
On 7/31/23 18:01, Mike Pattrick wrote: > On Mon, Jul 31, 2023 at 7:20 AM Ilya Maximets <i.maximets@ovn.org> wrote: >> >> On 7/31/23 07:34, Mike Pattrick wrote: >>> On Fri, Jul 28, 2023 at 6:31 PM Ilya Maximets <i.maximets@ovn.org> wrote: >>>> >>>> On 7/28/23 23:13, Mike Pattrick wrote: >>>>> On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote: >>>>> >>>>> On 7/20/23 17:06, Mike Pattrick wrote: >>>>> > When the a revalidator thread is updating statistics for an XC_LEARN >>>>> > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. >>>>> > The revalidator will update stats for rules even if they are in a >>>>> > removed state or marked as invisible. However, ofproto_flow_mod_learn >>>>> > will detect if a flow has been removed and re-add it in that case. This >>>>> > can result in an old learn action replacing the new learn action that >>>>> > had replaced it in the first place. >>>>> > >>>>> > This change adds a new parameter to ofproto_flow_mod_learn allowing the >>>>> > caller to specify an action to take if temp_rule is removed. If it is >>>>> > removed and the caller is xlate_push_stats_entry, then a new rule will >>>>> > not be replaced. >>>>> >>>>> >>>>> This is a slightly unrelated point, but while reviewing some code and test >>>>> results I think I found a different issue. rule_expire() considers rule >>>>> modified time for hard_timeout and the man page says hard_timeout "Causes >>>>> the flow to expire after the given number of seconds, regardless of activity", >>>>> however, modified time is updated in ofproto_flow_mod_learn_refresh() even if >>>>> the rule isn't changed. Due to the current behaviour, I don't think hard_timeout >>>>> could actually ever trigger on a learned action if an idle timeout is set or if >>>>> traffic is ongoing. >>>> >>>> The 'modified' time of a learned rule is updated in this function, because >>>> the packet hit the learn() rule. If the learned rule didn't exist, we >>>> would create it at this moment. Instead of re-creating, we just update >>>> the 'modified' time. If packets will stop hitting the learn() rule, but >>>> will continue to flow through the learned one, the learned one will be >>>> removed when the hard timeout fires up. That is a correct behavior. >>>> >>> >>> I can accept this is intended, but it doesn't seem like it would >>> operate in that way based on the man page alone. >> >> It's a generic characteristic of an OpenFlow rule. "activity" means traffic >> matching it. learn() action is triggering a flow modification equal to a >> strict mod-flow, as described in the ovs-actions man page. And ovs-ofctl page >> mentions that "mod-flows restarts the hard_timeout". >> >> Maybe it can be clarified somehow better, I'm not sure. Feel free to submit >> a patch. >> >>> >>>> >>>>> >>>>> >>>>> Hi, Mike. That's an interesting issue, though I'm not sure that is a >>>>> correct way to fix it. >>>>> >>>>> It is important to re-install removed learned flows during stats update. >>>>> The logic is the following, AFAIU: >>>>> >>>>> 1. Packet hits the OF rule (1) with a learning action. >>>>> 2. Handler translates OF and creates a datapath flow for it, creating >>>>> a learned rule (2) in the process and filling the translation cache. >>>>> 3. Datapath flow for the rule (1) is installed into the datapath. >>>>> 4. Time goes by and timeout on the learned rule (2) hits. Learned rule >>>>> is removed now. >>>>> 5. A packet hits the datapath flow for rule (1). >>>>> 6. Revalidator updates flow stats for the rule (1). >>>>> 7. Since the packet hit the learning rule (1), the rule (2) has to be >>>>> added back. >>>>> >>>>> IIUC, the step 7 will not happen with the change proposed in this patch. >>>>> Flow will not be re-translated, so the only way to re-install it is >>>>> to execute side effects when we see stats update on a parent datapath >>>>> flow in the revalidator. >>>>> >>>>> One potential way to avoid breaking this scenario is to check the >>>>> removal reason, but it's not bulletproof either as we do still need >>>>> to reinstate the flow even if it was manually deleted or evicted. >>>>> >>>>> >>>>> I added a log to check the removal reason in the XC_LEARN stats update, and the >>>>> reason always seems to be eviction. I wasn't able to catch an idle or hard removal >>>>> in that function, though I'm sure I just wasn't winning the race. >>>>> >>>>> But in this example if we hit 7, and don't re-add the rule, what is the real penalty? >>>>> The next packet will have to upcall and we would process the learn action from scratch, >>>>> but that may be what the user wants as indicated by setting a timeout. >>>> >>>> Not really. If the learned flow is removed, we need a packet to hit a >>>> learn() rule, which is a different rule, in order for traffic to continue >>>> to flow. Let's say we implement a firewall that allows connections only >>>> from A to B. So, A can initiate connection to B, but B can't initiate >>>> connection to A. Once A sends tcp SYN, we learn a rule that allows traffic >>>> from B to A on these particular ports. Now B can send traffic back. >>>> Once the learned flow is removed, A will have to establish a new connection >>>> in order to re-learn the B->A rule. So, not just an upcall. May not be a >>>> best example, but should work as an illustration. >>>> >>>>> The alternative >>>>> is we re-install a flow that may be different then the flow that the user actually >>>>> wants installed. >>>> >>>> If the flow is modified, it means that there was a traffic on a learn() >>>> action and we have to learn that specific rule. If it was already removed, >>>> we have to reinstate it. In general, the logic in the current OVS code >>>> seems correct in this particular place. >>>> >>>>> >>>>> Isn't there a risk of different revalidator threads going out of sync with different >>>>> versions of a learned rule in their xcache? >>>>> >>>>> >>>>> It also seems like we may hold the reference to this deleted rule for a >>>>> long time in the cache until the parent datapath flow is deleted. >>>>> >>>>> >>>>> What is triggering the deletion of the learned rule in your case? >>>>> Is it deletion or modification of the parent rule? And does it have >>>>> 'delete_learned' flag? >>>>> >>>>> >>>>> In the case that I was debugging, the parent learn action was not deleted. Instead, >>>>> new traffic from a different port was supposed to cause the learn action to modify >>>>> the existing rule. But instead we see the rule modified with the correct information, >>>>> and then immediately reverted with old stale information. >>>> >>>> The problem here is that we can't define which one is 'correct'. >>>> It's a race where both values are correct, but you like one better >>>> for some reason. New rules are only learned if the parent datapath >>>> flow that has a learn() action seen packets after the previous stats >>>> check. >>>> >>>> What you're describing is that you have a single rule with a learn >>>> action that is capable of re-writing the same learned rule depending >>>> on what kind of packet actually hits it. Is that correct? >>> >>> In this case the rule is: >>> >>> learn(table=20,hard_timeout=300,priority=1,cookie=0xcb684d24d14b14ad,>>> NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[], >>> load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[], >>> output:OXM_OF_IN_PORT[]),output:1 >>> >>> And an example of the learned rule is: >>> >>> vlan_tci=0x0005/0x0fff,dl_dst=fc:16:3e:3e:5d:e0, >>> actions=load:0->NXM_OF_VLAN_TCI[],load:0x3->NXM_NX_TUN_ID[],output:2 Could you also show how the datapath flows look like for the learning flow? For both ingress ports - A and B. Since you have TUN_ID in the actions, I suppose these should have TUN_ID and IN_PORT as a match. >>> >>> The scenario is that ingressing packets are supposed to change the >>> route that egress packets take, and that route can change over time. >>> >>> What I see from some additional logging is: >>> >>> 1. traffic begins to ingress on port A >>> 2. learn action installs a rule to route response to A >>> 3. traffic begins to egress on port A >>> 4. port A becomes invalid, no further traffic ingresses from port A >>> 5. traffic ingresses on port B >>> 6. learn action installs a rule to route response to B >>> 7. revalidator replaces (6) with (2) >>> >>> Eventually the egressing traffic will become exausted and the correct >>> rule is installed, but this can take several minutes. The rule >>> regression can even happen many milliseconds after learn installs the >>> correct rule and even after packets have flowed through the new rule. >> >> The rule can be replaced if there was no revalidation between steps >> 4 and 5, i.e. at the point of revalidation there is a stats increase >> on both datapath flows (ingress A and ingress B). >> Revalidators can sleep for up to 500ms by default between revalidation >> cycles, so the replacement can happen many milliseconds after. >> >> Is the traffic on port B continuous, or is it a one-shot packet? > > Continuous but less frequent than traffic in the opposite direction. IIUC, the traffic in the opposite direction is going through the learned rule, not the learning one. In this case, the frequency of that traffic doesn't matter. We only care about packets flowing through the original learning rule, unless you have a lot of drops or otherwise unhealthy system that is causing packet drops on the learning datapath flow. > In a client's environment there were ingress packets every few seconds > and egress packets continued to flow incorrectly for several minutes > afterwards. If the traffic ingresses from B every few seconds and there is no traffic ingressing from A, I'd expect the issue to be resolved on the next revalidation. And I don't see a reason why this doesn't happen for several minutes. Unless revalidation is not happening due to a bug below. > If this solution is unacceptable; I could resubmit with an additional > check to only suppress the flow_mod if the learn_modify rule is in a > removed state and there is already a conflicting rule in the > classifier that isn't removed. Or would you prefer a completely > different approach be taken here? The "revalidation is not requested" bug described below is the only issue I see in the current code. The rule replacement is correct, AFAICT. If the fixing of the revalidation request is not fixing the issue within a few seconds, then we have a bug that we do not understand. And this bug is likely somewhere else. Meaning, we have not identified a root cause of the issue. Changing the correct rule re-instantiation code doesn't make much sense if we do not understand what is actually going wrong in the setup. > > > Thanks, > M > >> The point is, at least one revalidation cycle should pass after step 4. >> And the traffic on port B should be seen after this revalidation cycle >> is completed. So, step 5 should be at least 500+ ms apart from step 4. >> >>> >>> >>>> I would consider this kind of setup as fairly fragile and would not >>>> recommend using it in general. >>>> >>>> However, the scenario is likely like this: >>>> >>>> 1. Upcall on a rule with learn() action. >>>> New datapath flow (1) is installed. This flow has a match on >>>> all the fields used in a learn action and ukey contains the >>>> translation cache with a rule that it produces. >>>> The new learned rule is installed from this. >>>> >>>> 2. Upcall with a different packet that doesn't match the first >>>> datapath flow (1), but on the same rule with a learn() action. >>>> New datapath flow (2) is installed. This flow has a match on >>>> all the fields used in a learn action (they are different from >>>> the first one, since we have an upcall) and ukey contains the >>>> translation cache with a rule that it produces. Let's say the >>>> learned rule has the same match, but different actions. >>>> This overwrites the previously installed learned rule. >>>> >>>> 3. Both datapath flows (1) and (2) get some traffic. >>>> >>>> 4. Revalidation starts. >>>> >>>> 5. Revalidator A dumps datapath flow (1). >>>> Revalidator B dumps datapath flow (2). >>>> >>>> 6. Revalidators A and B are trying to revalidate these datapath flows >>>> at the same time. Both datapath flows had traffic since the >>>> last revalidation. Both revalidators are trying to update the >>>> learned rule, which is the same for both of them. >>>> >>>> 7. One of them does that first, the other one ends up with a rule in a >>>> REMOVED state, since it was just replaced by the other revalidator. >>>> >>>> >>>> There is no correct or incorrect rule in this situation. Both >>>> of them are correct, because both datapath flows have seen packets >>>> since the previous revalidation. We can't guarantee which one will >>>> end up being written the last and therefore will take effect. >>>> if we refuse to reinstate the REMOVED rule, we still can't say which >>>> one will be active in the end. >>>> >>>> What can resolve the race is the next revalidation cycle. >>>> Once one of the datapath flows will stop receiving traffic, there >>>> should no longer be a race condition and the next revalidation will >>>> set the "correct" rule, i.e. the rule that corresponds to a learning >>>> datapath flow that seen some packets last. But until both flows have >>>> traffic, there is no "correct" rule and the result will be random, >>>> regardless of what we choose to do. >>>> >>>> >>>> One issue that I see in the current code is that it looks like the table >>>> version is not getting updated on learned flow refresh and hence the next >>>> revalidation may not actually be triggered. In this case the random >>>> result may stay for much longer than we would like. >>>> >>>> The suspicious code (and the potential fix) is the following: >>>> >>>> @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm) >>>> } >>>> >>>> enum ofperr >>>> -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm, >>>> - struct ofproto *orig_ofproto) >>>> +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm) >>>> OVS_REQUIRES(ofproto_mutex) >>>> { >>>> struct rule *rule = rule_collection_rules(&ofm->new_rules)[0]; >>>> enum ofperr error = 0; >>>> >>>> - /* If learning on a different bridge, must bump its version >>>> - * number and flush connmgr afterwards. */ >>>> - if (rule->ofproto != orig_ofproto) { >>>> - ofproto_bump_tables_version(rule->ofproto); >>>> - } >>>> + ofproto_bump_tables_version(rule->ofproto); >>>> error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL); >>>> - if (rule->ofproto != orig_ofproto) { >>>> - ofmonitor_flush(rule->ofproto->connmgr); >>>> - } >>>> + ofmonitor_flush(rule->ofproto->connmgr); >>>> >>>> return error; >>>> } >>>> --- >>>> >>>> We should actually bump the version and request a revalidation in >>>> any case, not only when the change is in a different bridge. >> >> FWIW, the original code in commit 1f4a89336682 was written mostly >> for a case where learning is triggered by a packet-out executed >> within a bundle. And we can't actually publish changes for a current >> bridge in this case. So, the fix should be a bit more clever. >> >>>> >>>> Could you try that in your setup? >>> >>> That was actually my initial suspicion when investigating this bug, >>> but this change doesn't improve the buggy behaviour. Could you describe a buggy behavior again? Is it that you can see the traffic still going to an old port several minutes after you stopped receiving any packets from this old port, but did receive packets from a new port hitting the learning rule every few seconds? If that is a correct description and that is happening after the change above is applied, we have a serious issue somewhere else. >>> >>> Thank you, >>> M >>> >>>> >>>> >>>> The issue appears to come from the commit: >>>> 1f4a89336682 ("ofproto: Refactor packet_out handling.") >>>> >>>> There is a slight chance of launching ourselves into an infinite >>>> revlidation loop, I guess. Especially if both learning datapath >>>> flows will continue to receive the traffic, but there is a minimal >>>> revalidation interval, IIRC. Not sure if it applies in this case. >>>> >>>> Best regards, Ilya Maximets. >>>> >>> >> >
diff --git a/ofproto/ofproto-dpif-xlate-cache.c b/ofproto/ofproto-dpif-xlate-cache.c index 9224ee2e6..843903614 100644 --- a/ofproto/ofproto-dpif-xlate-cache.c +++ b/ofproto/ofproto-dpif-xlate-cache.c @@ -125,7 +125,7 @@ xlate_push_stats_entry(struct xc_entry *entry, case XC_LEARN: { enum ofperr error; error = ofproto_flow_mod_learn(entry->learn.ofm, true, - entry->learn.limit, NULL); + entry->learn.limit, NULL, false); if (error) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); VLOG_WARN_RL(&rl, "xcache LEARN action execution failed."); diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 4928ea99c..8aa2ed699 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -5701,7 +5701,7 @@ xlate_learn_action(struct xlate_ctx *ctx, const struct ofpact_learn *learn) bool success = true; if (ctx->xin->allow_side_effects) { error = ofproto_flow_mod_learn(ofm, ctx->xin->xcache != NULL, - learn->limit, &success); + learn->limit, &success, true); } else if (learn->limit) { if (!ofm->temp_rule || ofm->temp_rule->state != RULE_INSERTED) { diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index fad7342b0..465f2a3f4 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -4876,7 +4876,7 @@ packet_xlate(struct ofproto *ofproto_, struct ofproto_packet_out *opo) if (entry->type == XC_LEARN) { struct ofproto_flow_mod *ofm = entry->learn.ofm; - error = ofproto_flow_mod_learn_refresh(ofm); + error = ofproto_flow_mod_learn_refresh(ofm, true); if (error) { goto error_out; } diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h index 143ded690..de1823c89 100644 --- a/ofproto/ofproto-provider.h +++ b/ofproto/ofproto-provider.h @@ -2027,9 +2027,11 @@ enum ofperr ofproto_flow_mod_init_for_learn(struct ofproto *, struct ofproto_flow_mod *) OVS_EXCLUDED(ofproto_mutex); enum ofperr ofproto_flow_mod_learn(struct ofproto_flow_mod *, bool keep_ref, - unsigned limit, bool *below_limit) + unsigned limit, bool *below_limit, + bool recreate) OVS_EXCLUDED(ofproto_mutex); -enum ofperr ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm); +enum ofperr ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm, + bool recreate); enum ofperr ofproto_flow_mod_learn_start(struct ofproto_flow_mod *ofm) OVS_REQUIRES(ofproto_mutex); void ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm) diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index dbf4958bc..dd2562acb 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -5472,7 +5472,7 @@ ofproto_flow_mod_init_for_learn(struct ofproto *ofproto, } enum ofperr -ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm) +ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm, bool recreate) { enum ofperr error = 0; @@ -5492,6 +5492,11 @@ ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm) * after we have read the 'rule->state' below. In this case the next time * this function is executed the rule will be reinstated. */ if (rule->state == RULE_REMOVED) { + if (!recreate) { + /* If the rule is marked as removed and the caller doesn't want it + * recreated, just return an error here as if there was no rule. */ + return OFPERR_OFPFMFC_UNKNOWN; + } struct cls_rule cr; cls_rule_clone(&cr, &rule->cr); @@ -5565,10 +5570,12 @@ ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm, /* Refresh 'ofm->temp_rule', for which the caller holds a reference, if already * in the classifier, insert it otherwise. If the rule has already been - * removed from the classifier, a new rule is created using 'ofm->temp_rule' as - * a template and the reference to the old 'ofm->temp_rule' is freed. If - * 'keep_ref' is true, then a reference to the current rule is held, otherwise - * it is released and 'ofm->temp_rule' is set to NULL. + * removed from the classifier and 'recreate' is true, a new rule is created + * using 'ofm->temp_rule' as a template and the reference to the old + * 'ofm->temp_rule' is freed. If recreate is false and the rule has been + * removed, do not reinsert a new copy of that rule. If 'keep_ref' is true, + * then a reference to the current rule is held, otherwise it is released and + * 'ofm->temp_rule' is set to NULL. * * If 'limit' != 0, insertion will fail if there are more than 'limit' rules * in the same table with the same cookie. If insertion succeeds, @@ -5579,10 +5586,10 @@ ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm, * during the call. */ enum ofperr ofproto_flow_mod_learn(struct ofproto_flow_mod *ofm, bool keep_ref, - unsigned limit, bool *below_limitp) + unsigned limit, bool *below_limitp, bool recreate) OVS_EXCLUDED(ofproto_mutex) { - enum ofperr error = ofproto_flow_mod_learn_refresh(ofm); + enum ofperr error = ofproto_flow_mod_learn_refresh(ofm, recreate); struct rule *rule = ofm->temp_rule; bool below_limit = true;
When the a revalidator thread is updating statistics for an XC_LEARN xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn. The revalidator will update stats for rules even if they are in a removed state or marked as invisible. However, ofproto_flow_mod_learn will detect if a flow has been removed and re-add it in that case. This can result in an old learn action replacing the new learn action that had replaced it in the first place. This change adds a new parameter to ofproto_flow_mod_learn allowing the caller to specify an action to take if temp_rule is removed. If it is removed and the caller is xlate_push_stats_entry, then a new rule will not be replaced. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2213892 Signed-off-by: Mike Pattrick <mkp@redhat.com> --- ofproto/ofproto-dpif-xlate-cache.c | 2 +- ofproto/ofproto-dpif-xlate.c | 2 +- ofproto/ofproto-dpif.c | 2 +- ofproto/ofproto-provider.h | 6 ++++-- ofproto/ofproto.c | 21 ++++++++++++++------- 5 files changed, 21 insertions(+), 12 deletions(-)