Message ID | 165971680356.246236.1723217950402767004.stgit@dceara.remote.csb |
---|---|
Headers | show |
Series | Add OVN component templates. | expand |
Hi Dumitru, I read the patch series, and I think the idea of chassis-specific variables is a good idea to reduce the number of DB records for certain things. Aside from load balancers, I suspect this could have a positive impact for other structures as well. Rather than criticizing the individual lines of code, I'll focus instead on some higher-level questions/ideas. First, one question I had was what happens when a template variable name is used in a load balancer, but there is no appropriate value to substitute? For instance, what if a load balancer applies to chassis-3, but you only have template variables for chassis-1 and chassis-2? This might be addressed in the code but I didn't notice if it was. Second, it seems like template variables are a natural extension of existing concepts like address sets and port groups. In those cases, they were an unconditional collection of IP addresses or ports. For template variables, they're a collection of untyped values with the condition of only applying on certain Chassis. I wonder if this could all be reconciled with a single table that uses untyped values with user-specified conditions. Right now template variables have a "Chassis" column, but maybe this could be replaced with a broader "condition", "when", or "match" column. To get something integrated quickly, this column could just accept the syntax of "chassis.name == <blah>" or "chassis.uuid == <blah>" to allow for chassis-specific application of the values. With this foundation, we could eventually allow unconditional application of the value, or more complex conditions (e.g. only apply to logical switch ports that are connected to a router with a distributed gateway port). Doing this, we could deprecate address sets and port groups eventually in favor of template variables. Third, I was wondering if there could be some layer that exists between the IDL and the application that expands the template variables as early as possible. I'm thinking the application could inject some callback in the IDL layer that might allow for the values to be substituted. This way, the variable substitution is taken care of in a single place, and by the time the application gets the data, it knows that all substitutions have been made and there is no need to special case template variable names vs. plain tokens. They should all be plain tokens. I don't think construction of such a layer should be a barrier to merging the code, but it's something worth considering as a later improvement. Anyway, those were my high-level thoughts on the topic. Let me know what you think. On 8/5/22 12:26, Dumitru Ceara wrote: > Sometimes network components are compute node-specific. Sometimes such > components are replicated, almost identically, for multiple nodes > in the cluster. > > One such example is the case of Kubernetes NodePort services which > translate (in the ovn-kubernetes case) to Load_Balancer > objects being applied to each and every node's logical gateway router. > These load balancers are almost identical, the main difference being > the fact that they use different VIPs (the node's IP). > > With the current OVN load balancer design, this becomes a problem at > scale because the number of load balancers that must be configured is > N x M (N nodes times M services). > > This series proposes a new concept in OVN: virtual network component > templates. The goal of the templates is to help reduce resource > consumption in the OVN central components in specific cases like the one > described above. > > To achieve that, the CMS will instead configure a "templated" load > balancer for every service and apply that single template record to > the cluster-wide load balancer group. This template is then > instantiated differently on different compute nodes. This translation > is controlled through per-chassis "template variables" configured by > the CMS in the new NB.Template_Var table. > > A syntetic benchmark simulating what an OpenShift router (using Node > Port services) scale test would do shows the following preliminary > results: > A. 120 node, 2K NodePort services: > - before: > - Southbound DB size on disk (compacted): ~385MB > - Southbound DB memory usage (RSS): ~3GB > - Southbound DB logical flows: 720K > > - after: > - Southbound DB size on disk (compacted): ~100MB > - Southbound DB memory usage (RSS): ~250MB > - Southbound DB logical flows: 6K > > B. 250 node, 2K NodePort services: > - after (didn't run the "before" test as it was taking way too long): > - Southbound DB size on disk (compacted): ~155MB > - Southbound DB memory usage (RSS): ~760MB > - Southbound DB logical flows: 6K > > The series is sent as RFC because there's still the need to add > some template specific unit tests and the "ovn-nbctl lb-*" helper > utilities need to be adapted to support templated load balancers. > > With these two items addressed the code is self can likely qualify > for acceptance as a new feature in the upcoming release. > > There also exists a more extensive TODO list (also listed in the commit > log of every patch in the series for now) but these are mainly load > balancer related functionalities that are not yet implemented for > templated load balancers but can definitely be implemented as follow ups: > - No support for LB health check if the LB is templated. > - No support for VIP ARP responder if the LB is templated. > - No support for routed VIPs if the LB is templated. > - Figure out a way to deal with templates in ovn-trace > - Determine if we need to allow Template_Var to match against chassis > hostname or other IDs. > - Make ofctrl_inject_pkt() work with template_vars. > - Make test-ovn work with template_vars. > > A basic example of how to configure a templated load balancer follows: > $ ovn-nbctl create load_balancer name=lb-test \ > protocol=tcp options:template=true \ > vips:\"^vip:4200\"="^backends" > > $ ovn-nbctl ls-add ls > $ ovn-nbctl ls-lb-add ls lb-test > > # Instantiate the load balancer on chassis-1 > $ ovn-nbctl create template_var name=vip value=80.80.80.1 chassis=chassis-1 > $ ovn-nbctl create template_var name=backends value='"42.42.42.1:1000"' chassis=chassis-1 > > # Instantiate the load balancer on chassis-2 > $ ovn-nbctl create template_var name=vip value=80.80.80.2 chassis=chassis-2 > $ ovn-nbctl create template_var name=backends value='"42.42.42.2:1000"' chassis=chassis-2 > > Dumitru Ceara (5): > Add NB and SB Template_Var tables. > controller: Add support for templated actions and matches. > controller: Make resource references more generic. > lb: Support using templates. > controller: Add Template_Var <- LB references. > > > controller/lflow.c | 248 ++++++++++++++++++++++++-------- > controller/lflow.h | 98 +++++++------ > controller/ofctrl.c | 9 +- > controller/ofctrl.h | 3 +- > controller/ovn-controller.c | 277 ++++++++++++++++++++++++++++++++++-- > include/ovn/expr.h | 4 +- > include/ovn/lex.h | 14 +- > lib/actions.c | 9 +- > lib/expr.c | 14 +- > lib/lb.c | 201 ++++++++++++++++++++++---- > lib/lb.h | 36 +++-- > lib/lex.c | 55 +++++++ > northd/northd.c | 64 +++++---- > tests/ovn.at | 2 +- > tests/test-ovn.c | 16 ++- > utilities/ovn-trace.c | 26 +++- > 16 files changed, 869 insertions(+), 207 deletions(-) > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On 8/10/22 19:54, Mark Michelson wrote: > Hi Dumitru, > Hi Mark, > I read the patch series, and I think the idea of chassis-specific > variables is a good idea to reduce the number of DB records for certain > things. Aside from load balancers, I suspect this could have a positive > impact for other structures as well. > Thanks for taking a look! Yes, I think this might be applicable to other structures too. > Rather than criticizing the individual lines of code, I'll focus instead > on some higher-level questions/ideas. > Sure, thanks! :) > First, one question I had was what happens when a template variable name > is used in a load balancer, but there is no appropriate value to > substitute? For instance, what if a load balancer applies to chassis-3, > but you only have template variables for chassis-1 and chassis-2? This > might be addressed in the code but I didn't notice if it was. > There are actually two things to consider here: 1. there might be a logical flow that uses a template variable: in this case if template expansion/instantiation fails we currently leave the token untouched (e.g., '^variable' stays '^variable'). That will cause the flow action/match parsing to fail and currently logs a warning. The flow itself is skipped, as it should be. We probably need to avoid logging a warning though. 2. like you pointed out, there might be a load balancer using templates in its backends/vips: if some of those templates cannot be instantiated locally the backend/vip where they're added is skipped. Unless I missed something, the code should already do that. > Second, it seems like template variables are a natural extension of > existing concepts like address sets and port groups. In those cases, > they were an unconditional collection of IP addresses or ports. For You're right to some extent template variables are similar to port groups. The southbound database port group table splits the northbound port group per datapath though not per chassis like template variables. > template variables, they're a collection of untyped values with the > condition of only applying on certain Chassis. I wonder if this could > all be reconciled with a single table that uses untyped values with > user-specified conditions. Right now template variables have a "Chassis" > column, but maybe this could be replaced with a broader "condition", > "when", or "match" column. To get something integrated quickly, this > column could just accept the syntax of "chassis.name == <blah>" or > "chassis.uuid == <blah>" to allow for chassis-specific application of > the values. With this foundation, we could eventually allow > unconditional application of the value, or more complex conditions (e.g. > only apply to logical switch ports that are connected to a router with a > distributed gateway port). Doing this, we could deprecate address sets > and port groups eventually in favor of template variables. This sounds like a good idea to me. I wasn't too happy with the "chassis" string column of the Template_Var table anyway. A generic condition field makes more sense. Regarding deprecating and replacing address sets and port groups, I'm not sure how easy that would be but we can try it when we get to that point. > > Third, I was wondering if there could be some layer that exists between > the IDL and the application that expands the template variables as early > as possible. I'm thinking the application could inject some callback in > the IDL layer that might allow for the values to be substituted. This > way, the variable substitution is taken care of in a single place, and > by the time the application gets the data, it knows that all > substitutions have been made and there is no need to special case > template variable names vs. plain tokens. They should all be plain > tokens. I don't think construction of such a layer should be a barrier > to merging the code, but it's something worth considering as a later > improvement. This could work. I'll think more about it. But like you said, it's probably a longer term goal. It will need some significant changes in the IDL layer (e.g., to re-evaluate some records when template instantiations change). > > Anyway, those were my high-level thoughts on the topic. Let me know what > you think. > I can work on changing the Template_Var schema to add a broader way of specifying conditions (when/match/etc). I'm already working on adding proper nbctl support for templated load balancers and trying to tackle the rest of the todos. I can probably send a v1 sometime in the first half of next week. Do you want to share any specific code related comments that I should already integrate or shall we start a proper review when v1 gets posted? Thanks again for your input on this RFC! Regards, Dumitru > On 8/5/22 12:26, Dumitru Ceara wrote: >> Sometimes network components are compute node-specific. Sometimes such >> components are replicated, almost identically, for multiple nodes >> in the cluster. >> >> One such example is the case of Kubernetes NodePort services which >> translate (in the ovn-kubernetes case) to Load_Balancer >> objects being applied to each and every node's logical gateway router. >> These load balancers are almost identical, the main difference being >> the fact that they use different VIPs (the node's IP). >> >> With the current OVN load balancer design, this becomes a problem at >> scale because the number of load balancers that must be configured is >> N x M (N nodes times M services). >> >> This series proposes a new concept in OVN: virtual network component >> templates. The goal of the templates is to help reduce resource >> consumption in the OVN central components in specific cases like the one >> described above. >> >> To achieve that, the CMS will instead configure a "templated" load >> balancer for every service and apply that single template record to >> the cluster-wide load balancer group. This template is then >> instantiated differently on different compute nodes. This translation >> is controlled through per-chassis "template variables" configured by >> the CMS in the new NB.Template_Var table. >> >> A syntetic benchmark simulating what an OpenShift router (using Node >> Port services) scale test would do shows the following preliminary >> results: >> A. 120 node, 2K NodePort services: >> - before: >> - Southbound DB size on disk (compacted): ~385MB >> - Southbound DB memory usage (RSS): ~3GB >> - Southbound DB logical flows: 720K >> >> - after: >> - Southbound DB size on disk (compacted): ~100MB >> - Southbound DB memory usage (RSS): ~250MB >> - Southbound DB logical flows: 6K >> >> B. 250 node, 2K NodePort services: >> - after (didn't run the "before" test as it was taking way too long): >> - Southbound DB size on disk (compacted): ~155MB >> - Southbound DB memory usage (RSS): ~760MB >> - Southbound DB logical flows: 6K >> >> The series is sent as RFC because there's still the need to add >> some template specific unit tests and the "ovn-nbctl lb-*" helper >> utilities need to be adapted to support templated load balancers. >> >> With these two items addressed the code is self can likely qualify >> for acceptance as a new feature in the upcoming release. >> >> There also exists a more extensive TODO list (also listed in the commit >> log of every patch in the series for now) but these are mainly load >> balancer related functionalities that are not yet implemented for >> templated load balancers but can definitely be implemented as follow ups: >> - No support for LB health check if the LB is templated. >> - No support for VIP ARP responder if the LB is templated. >> - No support for routed VIPs if the LB is templated. >> - Figure out a way to deal with templates in ovn-trace >> - Determine if we need to allow Template_Var to match against chassis >> hostname or other IDs. >> - Make ofctrl_inject_pkt() work with template_vars. >> - Make test-ovn work with template_vars. >> >> A basic example of how to configure a templated load balancer follows: >> $ ovn-nbctl create load_balancer name=lb-test \ >> protocol=tcp options:template=true \ >> vips:\"^vip:4200\"="^backends" >> >> $ ovn-nbctl ls-add ls >> $ ovn-nbctl ls-lb-add ls lb-test >> >> # Instantiate the load balancer on chassis-1 >> $ ovn-nbctl create template_var name=vip value=80.80.80.1 >> chassis=chassis-1 >> $ ovn-nbctl create template_var name=backends >> value='"42.42.42.1:1000"' chassis=chassis-1 >> >> # Instantiate the load balancer on chassis-2 >> $ ovn-nbctl create template_var name=vip value=80.80.80.2 >> chassis=chassis-2 >> $ ovn-nbctl create template_var name=backends >> value='"42.42.42.2:1000"' chassis=chassis-2 >> >> Dumitru Ceara (5): >> Add NB and SB Template_Var tables. >> controller: Add support for templated actions and matches. >> controller: Make resource references more generic. >> lb: Support using templates. >> controller: Add Template_Var <- LB references. >> >> >> controller/lflow.c | 248 ++++++++++++++++++++++++-------- >> controller/lflow.h | 98 +++++++------ >> controller/ofctrl.c | 9 +- >> controller/ofctrl.h | 3 +- >> controller/ovn-controller.c | 277 ++++++++++++++++++++++++++++++++++-- >> include/ovn/expr.h | 4 +- >> include/ovn/lex.h | 14 +- >> lib/actions.c | 9 +- >> lib/expr.c | 14 +- >> lib/lb.c | 201 ++++++++++++++++++++++---- >> lib/lb.h | 36 +++-- >> lib/lex.c | 55 +++++++ >> northd/northd.c | 64 +++++---- >> tests/ovn.at | 2 +- >> tests/test-ovn.c | 16 ++- >> utilities/ovn-trace.c | 26 +++- >> 16 files changed, 869 insertions(+), 207 deletions(-) >> >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >
Thanks Dumitru for this promising optimization! On Thu, Aug 11, 2022 at 1:03 AM Dumitru Ceara <dceara@redhat.com> wrote: > > On 8/10/22 19:54, Mark Michelson wrote: > > Hi Dumitru, > > > > Hi Mark, > > > I read the patch series, and I think the idea of chassis-specific > > variables is a good idea to reduce the number of DB records for certain > > things. Aside from load balancers, I suspect this could have a positive > > impact for other structures as well. > > > > Thanks for taking a look! Yes, I think this might be applicable to > other structures too. > I think it is a good idea to make it more generic, but for my understanding this template concept is for anything that's "node/chassis" specific, and supposed to be instantiated at chassis level. Probably we should name the DB tables as something like chassis_template_var. > > Rather than criticizing the individual lines of code, I'll focus instead > > on some higher-level questions/ideas. > > > > Sure, thanks! :) > > > First, one question I had was what happens when a template variable name > > is used in a load balancer, but there is no appropriate value to > > substitute? For instance, what if a load balancer applies to chassis-3, > > but you only have template variables for chassis-1 and chassis-2? This > > might be addressed in the code but I didn't notice if it was. > > > > There are actually two things to consider here: > > 1. there might be a logical flow that uses a template variable: in this > case if template expansion/instantiation fails we currently leave the > token untouched (e.g., '^variable' stays '^variable'). That will cause > the flow action/match parsing to fail and currently logs a warning. The > flow itself is skipped, as it should be. We probably need to avoid > logging a warning though. > > 2. like you pointed out, there might be a load balancer using templates > in its backends/vips: if some of those templates cannot be instantiated > locally the backend/vip where they're added is skipped. Unless I missed > something, the code should already do that. > > > Second, it seems like template variables are a natural extension of > > existing concepts like address sets and port groups. In those cases, > > they were an unconditional collection of IP addresses or ports. For > > You're right to some extent template variables are similar to port > groups. The southbound database port group table splits the northbound > port group per datapath though not per chassis like template variables. > > > template variables, they're a collection of untyped values with the > > condition of only applying on certain Chassis. I wonder if this could > > all be reconciled with a single table that uses untyped values with > > user-specified conditions. Right now template variables have a "Chassis" > > column, but maybe this could be replaced with a broader "condition", > > "when", or "match" column. To get something integrated quickly, this > > column could just accept the syntax of "chassis.name == <blah>" or > > "chassis.uuid == <blah>" to allow for chassis-specific application of > > the values. With this foundation, we could eventually allow > > unconditional application of the value, or more complex conditions (e.g. > > only apply to logical switch ports that are connected to a router with a > > distributed gateway port). Doing this, we could deprecate address sets > > and port groups eventually in favor of template variables. > > This sounds like a good idea to me. I wasn't too happy with the > "chassis" string column of the Template_Var table anyway. A generic > condition field makes more sense. > If it is chassis-specific template, a column "chassis" seems to be straightforward. With a "match" column it is another burden of parsing (which is costly and error prone). In addition, the LB object (or other structures) is not a logical-flow, and it doesn't directly map to logical-flows (unlike ACLs), so I didn't understand how would a match string be applied to the template. Is there a more detailed example of this? Maybe I am missing something, and hope we will see more details in the formal patch. > Regarding deprecating and replacing address sets and port groups, I'm > not sure how easy that would be but we can try it when we get to that point. > Address sets and port groups are something different in my view. Although they can be treated as variables in a template, they are not really chassis-specific, and each variable needs to be instantiated to a big number of instances (sometimes huge). For this reason, fine-grained I-P embedded in the expression parsing (for Address set) was introduced for the performance of ovn-controller. Maybe we can say there are still some similarities of templates, but I am not really sure if it is really helpful to generalize them and how difficult it would be. Thanks, Han > > > > Third, I was wondering if there could be some layer that exists between > > the IDL and the application that expands the template variables as early > > as possible. I'm thinking the application could inject some callback in > > the IDL layer that might allow for the values to be substituted. This > > way, the variable substitution is taken care of in a single place, and > > by the time the application gets the data, it knows that all > > substitutions have been made and there is no need to special case > > template variable names vs. plain tokens. They should all be plain > > tokens. I don't think construction of such a layer should be a barrier > > to merging the code, but it's something worth considering as a later > > improvement. > > This could work. I'll think more about it. But like you said, it's > probably a longer term goal. It will need some significant changes in > the IDL layer (e.g., to re-evaluate some records when template > instantiations change). > > > > > Anyway, those were my high-level thoughts on the topic. Let me know what > > you think. > > > > I can work on changing the Template_Var schema to add a broader way of > specifying conditions (when/match/etc). I'm already working on adding > proper nbctl support for templated load balancers and trying to tackle > the rest of the todos. I can probably send a v1 sometime in the first > half of next week. Do you want to share any specific code related > comments that I should already integrate or shall we start a proper > review when v1 gets posted? > > Thanks again for your input on this RFC! > > Regards, > Dumitru > > > On 8/5/22 12:26, Dumitru Ceara wrote: > >> Sometimes network components are compute node-specific. Sometimes such > >> components are replicated, almost identically, for multiple nodes > >> in the cluster. > >> > >> One such example is the case of Kubernetes NodePort services which > >> translate (in the ovn-kubernetes case) to Load_Balancer > >> objects being applied to each and every node's logical gateway router. > >> These load balancers are almost identical, the main difference being > >> the fact that they use different VIPs (the node's IP). > >> > >> With the current OVN load balancer design, this becomes a problem at > >> scale because the number of load balancers that must be configured is > >> N x M (N nodes times M services). > >> > >> This series proposes a new concept in OVN: virtual network component > >> templates. The goal of the templates is to help reduce resource > >> consumption in the OVN central components in specific cases like the one > >> described above. > >> > >> To achieve that, the CMS will instead configure a "templated" load > >> balancer for every service and apply that single template record to > >> the cluster-wide load balancer group. This template is then > >> instantiated differently on different compute nodes. This translation > >> is controlled through per-chassis "template variables" configured by > >> the CMS in the new NB.Template_Var table. > >> > >> A syntetic benchmark simulating what an OpenShift router (using Node > >> Port services) scale test would do shows the following preliminary > >> results: > >> A. 120 node, 2K NodePort services: > >> - before: > >> - Southbound DB size on disk (compacted): ~385MB > >> - Southbound DB memory usage (RSS): ~3GB > >> - Southbound DB logical flows: 720K > >> > >> - after: > >> - Southbound DB size on disk (compacted): ~100MB > >> - Southbound DB memory usage (RSS): ~250MB > >> - Southbound DB logical flows: 6K > >> > >> B. 250 node, 2K NodePort services: > >> - after (didn't run the "before" test as it was taking way too long): > >> - Southbound DB size on disk (compacted): ~155MB > >> - Southbound DB memory usage (RSS): ~760MB > >> - Southbound DB logical flows: 6K > >> > >> The series is sent as RFC because there's still the need to add > >> some template specific unit tests and the "ovn-nbctl lb-*" helper > >> utilities need to be adapted to support templated load balancers. > >> > >> With these two items addressed the code is self can likely qualify > >> for acceptance as a new feature in the upcoming release. > >> > >> There also exists a more extensive TODO list (also listed in the commit > >> log of every patch in the series for now) but these are mainly load > >> balancer related functionalities that are not yet implemented for > >> templated load balancers but can definitely be implemented as follow ups: > >> - No support for LB health check if the LB is templated. > >> - No support for VIP ARP responder if the LB is templated. > >> - No support for routed VIPs if the LB is templated. > >> - Figure out a way to deal with templates in ovn-trace > >> - Determine if we need to allow Template_Var to match against chassis > >> hostname or other IDs. > >> - Make ofctrl_inject_pkt() work with template_vars. > >> - Make test-ovn work with template_vars. > >> > >> A basic example of how to configure a templated load balancer follows: > >> $ ovn-nbctl create load_balancer name=lb-test \ > >> protocol=tcp options:template=true \ > >> vips:\"^vip:4200\"="^backends" > >> > >> $ ovn-nbctl ls-add ls > >> $ ovn-nbctl ls-lb-add ls lb-test > >> > >> # Instantiate the load balancer on chassis-1 > >> $ ovn-nbctl create template_var name=vip value=80.80.80.1 > >> chassis=chassis-1 > >> $ ovn-nbctl create template_var name=backends > >> value='"42.42.42.1:1000"' chassis=chassis-1 > >> > >> # Instantiate the load balancer on chassis-2 > >> $ ovn-nbctl create template_var name=vip value=80.80.80.2 > >> chassis=chassis-2 > >> $ ovn-nbctl create template_var name=backends > >> value='"42.42.42.2:1000"' chassis=chassis-2 > >> > >> Dumitru Ceara (5): > >> Add NB and SB Template_Var tables. > >> controller: Add support for templated actions and matches. > >> controller: Make resource references more generic. > >> lb: Support using templates. > >> controller: Add Template_Var <- LB references. > >> > >> > >> controller/lflow.c | 248 ++++++++++++++++++++++++-------- > >> controller/lflow.h | 98 +++++++------ > >> controller/ofctrl.c | 9 +- > >> controller/ofctrl.h | 3 +- > >> controller/ovn-controller.c | 277 ++++++++++++++++++++++++++++++++++-- > >> include/ovn/expr.h | 4 +- > >> include/ovn/lex.h | 14 +- > >> lib/actions.c | 9 +- > >> lib/expr.c | 14 +- > >> lib/lb.c | 201 ++++++++++++++++++++++---- > >> lib/lb.h | 36 +++-- > >> lib/lex.c | 55 +++++++ > >> northd/northd.c | 64 +++++---- > >> tests/ovn.at | 2 +- > >> tests/test-ovn.c | 16 ++- > >> utilities/ovn-trace.c | 26 +++- > >> 16 files changed, 869 insertions(+), 207 deletions(-) > >> > >> _______________________________________________ > >> dev mailing list > >> dev@openvswitch.org > >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Hi Han, On 9/21/22 23:06, Han Zhou wrote: > Thanks Dumitru for this promising optimization! > Thanks for checking it out! > On Thu, Aug 11, 2022 at 1:03 AM Dumitru Ceara <dceara@redhat.com> wrote: >> >> On 8/10/22 19:54, Mark Michelson wrote: >>> Hi Dumitru, >>> >> >> Hi Mark, >> >>> I read the patch series, and I think the idea of chassis-specific >>> variables is a good idea to reduce the number of DB records for certain >>> things. Aside from load balancers, I suspect this could have a positive >>> impact for other structures as well. >>> >> >> Thanks for taking a look! Yes, I think this might be applicable to >> other structures too. >> > > I think it is a good idea to make it more generic, but for my understanding > this template concept is for anything that's "node/chassis" specific, and > supposed to be instantiated at chassis level. Probably we should name the > DB tables as something like chassis_template_var. > If we decide to go for a simple "chassis" column (instead of a generic "match" column) then naming the table Chassis_Template_Var makes sense indeed. >>> Rather than criticizing the individual lines of code, I'll focus instead >>> on some higher-level questions/ideas. >>> >> >> Sure, thanks! :) >> >>> First, one question I had was what happens when a template variable name >>> is used in a load balancer, but there is no appropriate value to >>> substitute? For instance, what if a load balancer applies to chassis-3, >>> but you only have template variables for chassis-1 and chassis-2? This >>> might be addressed in the code but I didn't notice if it was. >>> >> >> There are actually two things to consider here: >> >> 1. there might be a logical flow that uses a template variable: in this >> case if template expansion/instantiation fails we currently leave the >> token untouched (e.g., '^variable' stays '^variable'). That will cause >> the flow action/match parsing to fail and currently logs a warning. The >> flow itself is skipped, as it should be. We probably need to avoid >> logging a warning though. >> >> 2. like you pointed out, there might be a load balancer using templates >> in its backends/vips: if some of those templates cannot be instantiated >> locally the backend/vip where they're added is skipped. Unless I missed >> something, the code should already do that. >> >>> Second, it seems like template variables are a natural extension of >>> existing concepts like address sets and port groups. In those cases, >>> they were an unconditional collection of IP addresses or ports. For >> >> You're right to some extent template variables are similar to port >> groups. The southbound database port group table splits the northbound >> port group per datapath though not per chassis like template variables. >> >>> template variables, they're a collection of untyped values with the >>> condition of only applying on certain Chassis. I wonder if this could >>> all be reconciled with a single table that uses untyped values with >>> user-specified conditions. Right now template variables have a "Chassis" >>> column, but maybe this could be replaced with a broader "condition", >>> "when", or "match" column. To get something integrated quickly, this >>> column could just accept the syntax of "chassis.name == <blah>" or >>> "chassis.uuid == <blah>" to allow for chassis-specific application of >>> the values. With this foundation, we could eventually allow >>> unconditional application of the value, or more complex conditions (e.g. >>> only apply to logical switch ports that are connected to a router with a >>> distributed gateway port). Doing this, we could deprecate address sets >>> and port groups eventually in favor of template variables. >> >> This sounds like a good idea to me. I wasn't too happy with the >> "chassis" string column of the Template_Var table anyway. A generic >> condition field makes more sense. >> > If it is chassis-specific template, a column "chassis" seems to be > straightforward. With a "match" column it is another burden of parsing I have a generic implementation (with a "predicate" column) almost ready for review. I agree it's a bit more work to parse and maintain references. I think it's probably best to discuss these details once I post v1. It's no problem for me to go back to the "chassis" column version if we decide to use that approach. > (which is costly and error prone). In addition, the LB object (or other > structures) is not a logical-flow, and it doesn't directly map to > logical-flows (unlike ACLs), so I didn't understand how would a match > string be applied to the template. Is there a more detailed example of > this? Maybe I am missing something, and hope we will see more details in > the formal patch. > Load balancer VIPs are a collection of key-value pairs (vip:backends). These are currently IPs (and optionally L4 ports). We can allow LBs to support "templated" VIPs/backends. For example (this is with the WIP v1 version of the code): # Create a template LB. ovn-nbctl --template lb-add lb-test "^vip:^vport" "^backends" tcp \ -- ls-lb-add ls1 lb-test \ -- lr-lb-add rtr lb-test # Instantiate the LB template variables, for the chassis where port # vm1 is bound: ovn-nbctl \ -- create template_var name=vip value=66.66.66.66 \ predicate="is_chassis_resident(\"vm1\")" \ -- create template_var name=vport value=666 \ predicate="is_chassis_resident(\"vm1\")" \ -- create template_var name=backends value=\"42.42.42.2:4242\" \ predicate="is_chassis_resident(\"vm1\")" This is equivalent to: ovn-nbctl lb-add lb-test 66.66.66.66:666 42.42.42.2:4242 tcp ovn-nbctl ls-lb-add ls1 lb-test ovn-nbctl lr-lb-add lr1 lb-test >> Regarding deprecating and replacing address sets and port groups, I'm >> not sure how easy that would be but we can try it when we get to that > point. >> > > Address sets and port groups are something different in my view. Although > they can be treated as variables in a template, they are not really > chassis-specific, and each variable needs to be instantiated to a big > number of instances (sometimes huge). For this reason, fine-grained I-P > embedded in the expression parsing (for Address set) was introduced for the > performance of ovn-controller. Maybe we can say there are still some > similarities of templates, but I am not really sure if it is really helpful > to generalize them and how difficult it would be. > It doesn't look straight forward indeed. If everyone agrees, I'd suggest discussing this option (replacing AS and PG) only after/if the template support is accepted. > Thanks, > Han > Thanks, Dumitru >>> >>> Third, I was wondering if there could be some layer that exists between >>> the IDL and the application that expands the template variables as early >>> as possible. I'm thinking the application could inject some callback in >>> the IDL layer that might allow for the values to be substituted. This >>> way, the variable substitution is taken care of in a single place, and >>> by the time the application gets the data, it knows that all >>> substitutions have been made and there is no need to special case >>> template variable names vs. plain tokens. They should all be plain >>> tokens. I don't think construction of such a layer should be a barrier >>> to merging the code, but it's something worth considering as a later >>> improvement. >> >> This could work. I'll think more about it. But like you said, it's >> probably a longer term goal. It will need some significant changes in >> the IDL layer (e.g., to re-evaluate some records when template >> instantiations change). >> >>> >>> Anyway, those were my high-level thoughts on the topic. Let me know what >>> you think. >>> >> >> I can work on changing the Template_Var schema to add a broader way of >> specifying conditions (when/match/etc). I'm already working on adding >> proper nbctl support for templated load balancers and trying to tackle >> the rest of the todos. I can probably send a v1 sometime in the first >> half of next week. Do you want to share any specific code related >> comments that I should already integrate or shall we start a proper >> review when v1 gets posted? >> >> Thanks again for your input on this RFC! >> >> Regards, >> Dumitru >> >>> On 8/5/22 12:26, Dumitru Ceara wrote: >>>> Sometimes network components are compute node-specific. Sometimes such >>>> components are replicated, almost identically, for multiple nodes >>>> in the cluster. >>>> >>>> One such example is the case of Kubernetes NodePort services which >>>> translate (in the ovn-kubernetes case) to Load_Balancer >>>> objects being applied to each and every node's logical gateway router. >>>> These load balancers are almost identical, the main difference being >>>> the fact that they use different VIPs (the node's IP). >>>> >>>> With the current OVN load balancer design, this becomes a problem at >>>> scale because the number of load balancers that must be configured is >>>> N x M (N nodes times M services). >>>> >>>> This series proposes a new concept in OVN: virtual network component >>>> templates. The goal of the templates is to help reduce resource >>>> consumption in the OVN central components in specific cases like the > one >>>> described above. >>>> >>>> To achieve that, the CMS will instead configure a "templated" load >>>> balancer for every service and apply that single template record to >>>> the cluster-wide load balancer group. This template is then >>>> instantiated differently on different compute nodes. This translation >>>> is controlled through per-chassis "template variables" configured by >>>> the CMS in the new NB.Template_Var table. >>>> >>>> A syntetic benchmark simulating what an OpenShift router (using Node >>>> Port services) scale test would do shows the following preliminary >>>> results: >>>> A. 120 node, 2K NodePort services: >>>> - before: >>>> - Southbound DB size on disk (compacted): ~385MB >>>> - Southbound DB memory usage (RSS): ~3GB >>>> - Southbound DB logical flows: 720K >>>> >>>> - after: >>>> - Southbound DB size on disk (compacted): ~100MB >>>> - Southbound DB memory usage (RSS): ~250MB >>>> - Southbound DB logical flows: 6K >>>> >>>> B. 250 node, 2K NodePort services: >>>> - after (didn't run the "before" test as it was taking way too long): >>>> - Southbound DB size on disk (compacted): ~155MB >>>> - Southbound DB memory usage (RSS): ~760MB >>>> - Southbound DB logical flows: 6K >>>> >>>> The series is sent as RFC because there's still the need to add >>>> some template specific unit tests and the "ovn-nbctl lb-*" helper >>>> utilities need to be adapted to support templated load balancers. >>>> >>>> With these two items addressed the code is self can likely qualify >>>> for acceptance as a new feature in the upcoming release. >>>> >>>> There also exists a more extensive TODO list (also listed in the commit >>>> log of every patch in the series for now) but these are mainly load >>>> balancer related functionalities that are not yet implemented for >>>> templated load balancers but can definitely be implemented as follow > ups: >>>> - No support for LB health check if the LB is templated. >>>> - No support for VIP ARP responder if the LB is templated. >>>> - No support for routed VIPs if the LB is templated. >>>> - Figure out a way to deal with templates in ovn-trace >>>> - Determine if we need to allow Template_Var to match against chassis >>>> hostname or other IDs. >>>> - Make ofctrl_inject_pkt() work with template_vars. >>>> - Make test-ovn work with template_vars. >>>> >>>> A basic example of how to configure a templated load balancer follows: >>>> $ ovn-nbctl create load_balancer name=lb-test \ >>>> protocol=tcp options:template=true \ >>>> vips:\"^vip:4200\"="^backends" >>>> >>>> $ ovn-nbctl ls-add ls >>>> $ ovn-nbctl ls-lb-add ls lb-test >>>> >>>> # Instantiate the load balancer on chassis-1 >>>> $ ovn-nbctl create template_var name=vip value=80.80.80.1 >>>> chassis=chassis-1 >>>> $ ovn-nbctl create template_var name=backends >>>> value='"42.42.42.1:1000"' chassis=chassis-1 >>>> >>>> # Instantiate the load balancer on chassis-2 >>>> $ ovn-nbctl create template_var name=vip value=80.80.80.2 >>>> chassis=chassis-2 >>>> $ ovn-nbctl create template_var name=backends >>>> value='"42.42.42.2:1000"' chassis=chassis-2 >>>> >>>> Dumitru Ceara (5): >>>> Add NB and SB Template_Var tables. >>>> controller: Add support for templated actions and matches. >>>> controller: Make resource references more generic. >>>> lb: Support using templates. >>>> controller: Add Template_Var <- LB references. >>>> >>>> >>>> controller/lflow.c | 248 ++++++++++++++++++++++++-------- >>>> controller/lflow.h | 98 +++++++------ >>>> controller/ofctrl.c | 9 +- >>>> controller/ofctrl.h | 3 +- >>>> controller/ovn-controller.c | 277 > ++++++++++++++++++++++++++++++++++-- >>>> include/ovn/expr.h | 4 +- >>>> include/ovn/lex.h | 14 +- >>>> lib/actions.c | 9 +- >>>> lib/expr.c | 14 +- >>>> lib/lb.c | 201 ++++++++++++++++++++++---- >>>> lib/lb.h | 36 +++-- >>>> lib/lex.c | 55 +++++++ >>>> northd/northd.c | 64 +++++---- >>>> tests/ovn.at | 2 +- >>>> tests/test-ovn.c | 16 ++- >>>> utilities/ovn-trace.c | 26 +++- >>>> 16 files changed, 869 insertions(+), 207 deletions(-) >>>> >>>> _______________________________________________ >>>> dev mailing list >>>> dev@openvswitch.org >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>> >>> >> >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On Thu, Sep 22, 2022 at 1:00 AM Dumitru Ceara <dceara@redhat.com> wrote: > > Hi Han, > > On 9/21/22 23:06, Han Zhou wrote: > > Thanks Dumitru for this promising optimization! > > > > Thanks for checking it out! > > > On Thu, Aug 11, 2022 at 1:03 AM Dumitru Ceara <dceara@redhat.com> wrote: > >> > >> On 8/10/22 19:54, Mark Michelson wrote: > >>> Hi Dumitru, > >>> > >> > >> Hi Mark, > >> > >>> I read the patch series, and I think the idea of chassis-specific > >>> variables is a good idea to reduce the number of DB records for certain > >>> things. Aside from load balancers, I suspect this could have a positive > >>> impact for other structures as well. > >>> > >> > >> Thanks for taking a look! Yes, I think this might be applicable to > >> other structures too. > >> > > > > I think it is a good idea to make it more generic, but for my understanding > > this template concept is for anything that's "node/chassis" specific, and > > supposed to be instantiated at chassis level. Probably we should name the > > DB tables as something like chassis_template_var. > > > > If we decide to go for a simple "chassis" column (instead of a generic > "match" column) then naming the table Chassis_Template_Var makes sense > indeed. > > >>> Rather than criticizing the individual lines of code, I'll focus instead > >>> on some higher-level questions/ideas. > >>> > >> > >> Sure, thanks! :) > >> > >>> First, one question I had was what happens when a template variable name > >>> is used in a load balancer, but there is no appropriate value to > >>> substitute? For instance, what if a load balancer applies to chassis-3, > >>> but you only have template variables for chassis-1 and chassis-2? This > >>> might be addressed in the code but I didn't notice if it was. > >>> > >> > >> There are actually two things to consider here: > >> > >> 1. there might be a logical flow that uses a template variable: in this > >> case if template expansion/instantiation fails we currently leave the > >> token untouched (e.g., '^variable' stays '^variable'). That will cause > >> the flow action/match parsing to fail and currently logs a warning. The > >> flow itself is skipped, as it should be. We probably need to avoid > >> logging a warning though. > >> > >> 2. like you pointed out, there might be a load balancer using templates > >> in its backends/vips: if some of those templates cannot be instantiated > >> locally the backend/vip where they're added is skipped. Unless I missed > >> something, the code should already do that. > >> > >>> Second, it seems like template variables are a natural extension of > >>> existing concepts like address sets and port groups. In those cases, > >>> they were an unconditional collection of IP addresses or ports. For > >> > >> You're right to some extent template variables are similar to port > >> groups. The southbound database port group table splits the northbound > >> port group per datapath though not per chassis like template variables. > >> > >>> template variables, they're a collection of untyped values with the > >>> condition of only applying on certain Chassis. I wonder if this could > >>> all be reconciled with a single table that uses untyped values with > >>> user-specified conditions. Right now template variables have a "Chassis" > >>> column, but maybe this could be replaced with a broader "condition", > >>> "when", or "match" column. To get something integrated quickly, this > >>> column could just accept the syntax of "chassis.name == <blah>" or > >>> "chassis.uuid == <blah>" to allow for chassis-specific application of > >>> the values. With this foundation, we could eventually allow > >>> unconditional application of the value, or more complex conditions (e.g. > >>> only apply to logical switch ports that are connected to a router with a > >>> distributed gateway port). Doing this, we could deprecate address sets > >>> and port groups eventually in favor of template variables. > >> > >> This sounds like a good idea to me. I wasn't too happy with the > >> "chassis" string column of the Template_Var table anyway. A generic > >> condition field makes more sense. > >> > > If it is chassis-specific template, a column "chassis" seems to be > > straightforward. With a "match" column it is another burden of parsing > > I have a generic implementation (with a "predicate" column) almost ready > for review. I agree it's a bit more work to parse and maintain > references. I think it's probably best to discuss these details once I > post v1. It's no problem for me to go back to the "chassis" column > version if we decide to use that approach. > > > (which is costly and error prone). In addition, the LB object (or other > > structures) is not a logical-flow, and it doesn't directly map to > > logical-flows (unlike ACLs), so I didn't understand how would a match > > string be applied to the template. Is there a more detailed example of > > this? Maybe I am missing something, and hope we will see more details in > > the formal patch. > > > > Load balancer VIPs are a collection of key-value pairs (vip:backends). > These are currently IPs (and optionally L4 ports). We can allow LBs to > support "templated" VIPs/backends. For example (this is with the WIP v1 > version of the code): > > # Create a template LB. > ovn-nbctl --template lb-add lb-test "^vip:^vport" "^backends" tcp \ > -- ls-lb-add ls1 lb-test \ > -- lr-lb-add rtr lb-test > > # Instantiate the LB template variables, for the chassis where port > # vm1 is bound: > ovn-nbctl \ > -- create template_var name=vip value=66.66.66.66 \ > predicate="is_chassis_resident(\"vm1\")" \ > -- create template_var name=vport value=666 \ > predicate="is_chassis_resident(\"vm1\")" \ > -- create template_var name=backends value=\"42.42.42.2:4242\" \ > predicate="is_chassis_resident(\"vm1\")" > > This is equivalent to: > ovn-nbctl lb-add lb-test 66.66.66.66:666 42.42.42.2:4242 tcp > ovn-nbctl ls-lb-add ls1 lb-test > ovn-nbctl lr-lb-add lr1 lb-test > Thanks, this helps. I guess we wouldn't want to use backends as variables in the real use cases, right? Otherwise, creating backends value for each chassis sounds not helpful for the scale. So, if I understand correctly, what we would do is: # Create a template LB. ovn-nbctl --template lb-add lb-test "^vip:<the real port>" "<real backends>" tcp \ -- ls-lb-add ls1 lb-test \ -- lr-lb-add rtr lb-test # Instantiate the LB template variables, for the chassis where DGPs are bound: ovn-nbctl \ -- create template_var name=vip value=66.66.66.66 \ predicate="is_chassis_resident(\"lrp1\")" \ -- create template_var name=vip value=66.66.66.67 \ predicate="is_chassis_resident(\"lrp2\")" \ -- create template_var name=vip value=66.66.66.68 \ predicate="is_chassis_resident(\"lrp3\")" Now that in ovn-k8s since we use DGP to pin LS to chassis, the lrp1, 2, 3 resides on different nodes, so the "vip" var is instantiated to the chassis-specific value. So this effectively binds the LB VIP to chassis without explicit "chassis" column. Did I understand the magic correctly? If so, the predicate needs to be in logical flow match format, and the string will be directly used as match condition incorporated to related logical flows, right? One problem though, is that for DGPs, is_chassis_resident needs to evaluate the CR port "cr-xxx", which is internally constructed by OVN, and is not exposed to CMS. Once the implementation changes, the predicate wouldn't work. (I am looking forward to seeing your v1, but I think this discussion should help me understand the concept before going to the review :) Thanks, Han > >> Regarding deprecating and replacing address sets and port groups, I'm > >> not sure how easy that would be but we can try it when we get to that > > point. > >> > > > > Address sets and port groups are something different in my view. Although > > they can be treated as variables in a template, they are not really > > chassis-specific, and each variable needs to be instantiated to a big > > number of instances (sometimes huge). For this reason, fine-grained I-P > > embedded in the expression parsing (for Address set) was introduced for the > > performance of ovn-controller. Maybe we can say there are still some > > similarities of templates, but I am not really sure if it is really helpful > > to generalize them and how difficult it would be. > > > > It doesn't look straight forward indeed. If everyone agrees, I'd > suggest discussing this option (replacing AS and PG) only after/if the > template support is accepted. > > > Thanks, > > Han > > > > Thanks, > Dumitru > > >>> > >>> Third, I was wondering if there could be some layer that exists between > >>> the IDL and the application that expands the template variables as early > >>> as possible. I'm thinking the application could inject some callback in > >>> the IDL layer that might allow for the values to be substituted. This > >>> way, the variable substitution is taken care of in a single place, and > >>> by the time the application gets the data, it knows that all > >>> substitutions have been made and there is no need to special case > >>> template variable names vs. plain tokens. They should all be plain > >>> tokens. I don't think construction of such a layer should be a barrier > >>> to merging the code, but it's something worth considering as a later > >>> improvement. > >> > >> This could work. I'll think more about it. But like you said, it's > >> probably a longer term goal. It will need some significant changes in > >> the IDL layer (e.g., to re-evaluate some records when template > >> instantiations change). > >> > >>> > >>> Anyway, those were my high-level thoughts on the topic. Let me know what > >>> you think. > >>> > >> > >> I can work on changing the Template_Var schema to add a broader way of > >> specifying conditions (when/match/etc). I'm already working on adding > >> proper nbctl support for templated load balancers and trying to tackle > >> the rest of the todos. I can probably send a v1 sometime in the first > >> half of next week. Do you want to share any specific code related > >> comments that I should already integrate or shall we start a proper > >> review when v1 gets posted? > >> > >> Thanks again for your input on this RFC! > >> > >> Regards, > >> Dumitru > >> > >>> On 8/5/22 12:26, Dumitru Ceara wrote: > >>>> Sometimes network components are compute node-specific. Sometimes such > >>>> components are replicated, almost identically, for multiple nodes > >>>> in the cluster. > >>>> > >>>> One such example is the case of Kubernetes NodePort services which > >>>> translate (in the ovn-kubernetes case) to Load_Balancer > >>>> objects being applied to each and every node's logical gateway router. > >>>> These load balancers are almost identical, the main difference being > >>>> the fact that they use different VIPs (the node's IP). > >>>> > >>>> With the current OVN load balancer design, this becomes a problem at > >>>> scale because the number of load balancers that must be configured is > >>>> N x M (N nodes times M services). > >>>> > >>>> This series proposes a new concept in OVN: virtual network component > >>>> templates. The goal of the templates is to help reduce resource > >>>> consumption in the OVN central components in specific cases like the > > one > >>>> described above. > >>>> > >>>> To achieve that, the CMS will instead configure a "templated" load > >>>> balancer for every service and apply that single template record to > >>>> the cluster-wide load balancer group. This template is then > >>>> instantiated differently on different compute nodes. This translation > >>>> is controlled through per-chassis "template variables" configured by > >>>> the CMS in the new NB.Template_Var table. > >>>> > >>>> A syntetic benchmark simulating what an OpenShift router (using Node > >>>> Port services) scale test would do shows the following preliminary > >>>> results: > >>>> A. 120 node, 2K NodePort services: > >>>> - before: > >>>> - Southbound DB size on disk (compacted): ~385MB > >>>> - Southbound DB memory usage (RSS): ~3GB > >>>> - Southbound DB logical flows: 720K > >>>> > >>>> - after: > >>>> - Southbound DB size on disk (compacted): ~100MB > >>>> - Southbound DB memory usage (RSS): ~250MB > >>>> - Southbound DB logical flows: 6K > >>>> > >>>> B. 250 node, 2K NodePort services: > >>>> - after (didn't run the "before" test as it was taking way too long): > >>>> - Southbound DB size on disk (compacted): ~155MB > >>>> - Southbound DB memory usage (RSS): ~760MB > >>>> - Southbound DB logical flows: 6K > >>>> > >>>> The series is sent as RFC because there's still the need to add > >>>> some template specific unit tests and the "ovn-nbctl lb-*" helper > >>>> utilities need to be adapted to support templated load balancers. > >>>> > >>>> With these two items addressed the code is self can likely qualify > >>>> for acceptance as a new feature in the upcoming release. > >>>> > >>>> There also exists a more extensive TODO list (also listed in the commit > >>>> log of every patch in the series for now) but these are mainly load > >>>> balancer related functionalities that are not yet implemented for > >>>> templated load balancers but can definitely be implemented as follow > > ups: > >>>> - No support for LB health check if the LB is templated. > >>>> - No support for VIP ARP responder if the LB is templated. > >>>> - No support for routed VIPs if the LB is templated. > >>>> - Figure out a way to deal with templates in ovn-trace > >>>> - Determine if we need to allow Template_Var to match against chassis > >>>> hostname or other IDs. > >>>> - Make ofctrl_inject_pkt() work with template_vars. > >>>> - Make test-ovn work with template_vars. > >>>> > >>>> A basic example of how to configure a templated load balancer follows: > >>>> $ ovn-nbctl create load_balancer name=lb-test \ > >>>> protocol=tcp options:template=true \ > >>>> vips:\"^vip:4200\"="^backends" > >>>> > >>>> $ ovn-nbctl ls-add ls > >>>> $ ovn-nbctl ls-lb-add ls lb-test > >>>> > >>>> # Instantiate the load balancer on chassis-1 > >>>> $ ovn-nbctl create template_var name=vip value=80.80.80.1 > >>>> chassis=chassis-1 > >>>> $ ovn-nbctl create template_var name=backends > >>>> value='"42.42.42.1:1000"' chassis=chassis-1 > >>>> > >>>> # Instantiate the load balancer on chassis-2 > >>>> $ ovn-nbctl create template_var name=vip value=80.80.80.2 > >>>> chassis=chassis-2 > >>>> $ ovn-nbctl create template_var name=backends > >>>> value='"42.42.42.2:1000"' chassis=chassis-2 > >>>> > >>>> Dumitru Ceara (5): > >>>> Add NB and SB Template_Var tables. > >>>> controller: Add support for templated actions and matches. > >>>> controller: Make resource references more generic. > >>>> lb: Support using templates. > >>>> controller: Add Template_Var <- LB references. > >>>> > >>>> > >>>> controller/lflow.c | 248 ++++++++++++++++++++++++-------- > >>>> controller/lflow.h | 98 +++++++------ > >>>> controller/ofctrl.c | 9 +- > >>>> controller/ofctrl.h | 3 +- > >>>> controller/ovn-controller.c | 277 > > ++++++++++++++++++++++++++++++++++-- > >>>> include/ovn/expr.h | 4 +- > >>>> include/ovn/lex.h | 14 +- > >>>> lib/actions.c | 9 +- > >>>> lib/expr.c | 14 +- > >>>> lib/lb.c | 201 ++++++++++++++++++++++---- > >>>> lib/lb.h | 36 +++-- > >>>> lib/lex.c | 55 +++++++ > >>>> northd/northd.c | 64 +++++---- > >>>> tests/ovn.at | 2 +- > >>>> tests/test-ovn.c | 16 ++- > >>>> utilities/ovn-trace.c | 26 +++- > >>>> 16 files changed, 869 insertions(+), 207 deletions(-) > >>>> > >>>> _______________________________________________ > >>>> dev mailing list > >>>> dev@openvswitch.org > >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>>> > >>> > >> > >> _______________________________________________ > >> dev mailing list > >> dev@openvswitch.org > >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > >
On Thu, Sep 22, 2022 at 10:38 AM Han Zhou <hzhou@ovn.org> wrote: > > > > On Thu, Sep 22, 2022 at 1:00 AM Dumitru Ceara <dceara@redhat.com> wrote: > > > > Hi Han, > > > > On 9/21/22 23:06, Han Zhou wrote: > > > Thanks Dumitru for this promising optimization! > > > > > > > Thanks for checking it out! > > > > > On Thu, Aug 11, 2022 at 1:03 AM Dumitru Ceara <dceara@redhat.com> wrote: > > >> > > >> On 8/10/22 19:54, Mark Michelson wrote: > > >>> Hi Dumitru, > > >>> > > >> > > >> Hi Mark, > > >> > > >>> I read the patch series, and I think the idea of chassis-specific > > >>> variables is a good idea to reduce the number of DB records for certain > > >>> things. Aside from load balancers, I suspect this could have a positive > > >>> impact for other structures as well. > > >>> > > >> > > >> Thanks for taking a look! Yes, I think this might be applicable to > > >> other structures too. > > >> > > > > > > I think it is a good idea to make it more generic, but for my understanding > > > this template concept is for anything that's "node/chassis" specific, and > > > supposed to be instantiated at chassis level. Probably we should name the > > > DB tables as something like chassis_template_var. > > > > > > > If we decide to go for a simple "chassis" column (instead of a generic > > "match" column) then naming the table Chassis_Template_Var makes sense > > indeed. > > > > >>> Rather than criticizing the individual lines of code, I'll focus instead > > >>> on some higher-level questions/ideas. > > >>> > > >> > > >> Sure, thanks! :) > > >> > > >>> First, one question I had was what happens when a template variable name > > >>> is used in a load balancer, but there is no appropriate value to > > >>> substitute? For instance, what if a load balancer applies to chassis-3, > > >>> but you only have template variables for chassis-1 and chassis-2? This > > >>> might be addressed in the code but I didn't notice if it was. > > >>> > > >> > > >> There are actually two things to consider here: > > >> > > >> 1. there might be a logical flow that uses a template variable: in this > > >> case if template expansion/instantiation fails we currently leave the > > >> token untouched (e.g., '^variable' stays '^variable'). That will cause > > >> the flow action/match parsing to fail and currently logs a warning. The > > >> flow itself is skipped, as it should be. We probably need to avoid > > >> logging a warning though. > > >> > > >> 2. like you pointed out, there might be a load balancer using templates > > >> in its backends/vips: if some of those templates cannot be instantiated > > >> locally the backend/vip where they're added is skipped. Unless I missed > > >> something, the code should already do that. > > >> > > >>> Second, it seems like template variables are a natural extension of > > >>> existing concepts like address sets and port groups. In those cases, > > >>> they were an unconditional collection of IP addresses or ports. For > > >> > > >> You're right to some extent template variables are similar to port > > >> groups. The southbound database port group table splits the northbound > > >> port group per datapath though not per chassis like template variables. > > >> > > >>> template variables, they're a collection of untyped values with the > > >>> condition of only applying on certain Chassis. I wonder if this could > > >>> all be reconciled with a single table that uses untyped values with > > >>> user-specified conditions. Right now template variables have a "Chassis" > > >>> column, but maybe this could be replaced with a broader "condition", > > >>> "when", or "match" column. To get something integrated quickly, this > > >>> column could just accept the syntax of "chassis.name == <blah>" or > > >>> "chassis.uuid == <blah>" to allow for chassis-specific application of > > >>> the values. With this foundation, we could eventually allow > > >>> unconditional application of the value, or more complex conditions (e.g. > > >>> only apply to logical switch ports that are connected to a router with a > > >>> distributed gateway port). Doing this, we could deprecate address sets > > >>> and port groups eventually in favor of template variables. > > >> > > >> This sounds like a good idea to me. I wasn't too happy with the > > >> "chassis" string column of the Template_Var table anyway. A generic > > >> condition field makes more sense. > > >> > > > If it is chassis-specific template, a column "chassis" seems to be > > > straightforward. With a "match" column it is another burden of parsing > > > > I have a generic implementation (with a "predicate" column) almost ready > > for review. I agree it's a bit more work to parse and maintain > > references. I think it's probably best to discuss these details once I > > post v1. It's no problem for me to go back to the "chassis" column > > version if we decide to use that approach. > > > > > (which is costly and error prone). In addition, the LB object (or other > > > structures) is not a logical-flow, and it doesn't directly map to > > > logical-flows (unlike ACLs), so I didn't understand how would a match > > > string be applied to the template. Is there a more detailed example of > > > this? Maybe I am missing something, and hope we will see more details in > > > the formal patch. > > > > > > > Load balancer VIPs are a collection of key-value pairs (vip:backends). > > These are currently IPs (and optionally L4 ports). We can allow LBs to > > support "templated" VIPs/backends. For example (this is with the WIP v1 > > version of the code): > > > > # Create a template LB. > > ovn-nbctl --template lb-add lb-test "^vip:^vport" "^backends" tcp \ > > -- ls-lb-add ls1 lb-test \ > > -- lr-lb-add rtr lb-test > > > > # Instantiate the LB template variables, for the chassis where port > > # vm1 is bound: > > ovn-nbctl \ > > -- create template_var name=vip value=66.66.66.66 \ > > predicate="is_chassis_resident(\"vm1\")" \ > > -- create template_var name=vport value=666 \ > > predicate="is_chassis_resident(\"vm1\")" \ > > -- create template_var name=backends value=\"42.42.42.2:4242\" \ > > predicate="is_chassis_resident(\"vm1\")" > > > > This is equivalent to: > > ovn-nbctl lb-add lb-test 66.66.66.66:666 42.42.42.2:4242 tcp > > ovn-nbctl ls-lb-add ls1 lb-test > > ovn-nbctl lr-lb-add lr1 lb-test > > > > Thanks, this helps. I guess we wouldn't want to use backends as variables in the real use cases, right? Otherwise, creating backends value for each chassis sounds not helpful for the scale. So, if I understand correctly, what we would do is: > > # Create a template LB. > ovn-nbctl --template lb-add lb-test "^vip:<the real port>" "<real backends>" tcp \ > -- ls-lb-add ls1 lb-test \ > -- lr-lb-add rtr lb-test > > # Instantiate the LB template variables, for the chassis where DGPs are bound: > ovn-nbctl \ > -- create template_var name=vip value=66.66.66.66 \ > predicate="is_chassis_resident(\"lrp1\")" \ > -- create template_var name=vip value=66.66.66.67 \ > predicate="is_chassis_resident(\"lrp2\")" \ > -- create template_var name=vip value=66.66.66.68 \ > predicate="is_chassis_resident(\"lrp3\")" > > Now that in ovn-k8s since we use DGP to pin LS to chassis, the lrp1, 2, 3 resides on different nodes, so the "vip" var is instantiated to the chassis-specific value. So this effectively binds the LB VIP to chassis without explicit "chassis" column. Did I understand the magic correctly? > If so, the predicate needs to be in logical flow match format, and the string will be directly used as match condition incorporated to related logical flows, right? One problem though, is that for DGPs, is_chassis_resident needs to evaluate the CR port "cr-xxx", which is internally constructed by OVN, and is not exposed to CMS. Once the implementation changes, the predicate wouldn't work. > (I am looking forward to seeing your v1, but I think this discussion should help me understand the concept before going to the review :) > Sorry, one more point. With the "chassis" column, it is easy for conditional monitoring, considering that only a small number of values are useful for a specific chassis. The predicate alone wouldn't tell this information. If we go with the predicate approach, I think it may be helpful to add another column to tell the datapath information (in NB, use LS/LR), so that it can be converted to DP in SB and be conditionally monitored (reusing the existing ovn-controller "local datapath" mechanism). It can be "any", to indicate something that applies to multiple/all DPs, if useful for the future use cases. > Thanks, > Han > > > >> Regarding deprecating and replacing address sets and port groups, I'm > > >> not sure how easy that would be but we can try it when we get to that > > > point. > > >> > > > > > > Address sets and port groups are something different in my view. Although > > > they can be treated as variables in a template, they are not really > > > chassis-specific, and each variable needs to be instantiated to a big > > > number of instances (sometimes huge). For this reason, fine-grained I-P > > > embedded in the expression parsing (for Address set) was introduced for the > > > performance of ovn-controller. Maybe we can say there are still some > > > similarities of templates, but I am not really sure if it is really helpful > > > to generalize them and how difficult it would be. > > > > > > > It doesn't look straight forward indeed. If everyone agrees, I'd > > suggest discussing this option (replacing AS and PG) only after/if the > > template support is accepted. > > > > > Thanks, > > > Han > > > > > > > Thanks, > > Dumitru > > > > >>> > > >>> Third, I was wondering if there could be some layer that exists between > > >>> the IDL and the application that expands the template variables as early > > >>> as possible. I'm thinking the application could inject some callback in > > >>> the IDL layer that might allow for the values to be substituted. This > > >>> way, the variable substitution is taken care of in a single place, and > > >>> by the time the application gets the data, it knows that all > > >>> substitutions have been made and there is no need to special case > > >>> template variable names vs. plain tokens. They should all be plain > > >>> tokens. I don't think construction of such a layer should be a barrier > > >>> to merging the code, but it's something worth considering as a later > > >>> improvement. > > >> > > >> This could work. I'll think more about it. But like you said, it's > > >> probably a longer term goal. It will need some significant changes in > > >> the IDL layer (e.g., to re-evaluate some records when template > > >> instantiations change). > > >> > > >>> > > >>> Anyway, those were my high-level thoughts on the topic. Let me know what > > >>> you think. > > >>> > > >> > > >> I can work on changing the Template_Var schema to add a broader way of > > >> specifying conditions (when/match/etc). I'm already working on adding > > >> proper nbctl support for templated load balancers and trying to tackle > > >> the rest of the todos. I can probably send a v1 sometime in the first > > >> half of next week. Do you want to share any specific code related > > >> comments that I should already integrate or shall we start a proper > > >> review when v1 gets posted? > > >> > > >> Thanks again for your input on this RFC! > > >> > > >> Regards, > > >> Dumitru > > >> > > >>> On 8/5/22 12:26, Dumitru Ceara wrote: > > >>>> Sometimes network components are compute node-specific. Sometimes such > > >>>> components are replicated, almost identically, for multiple nodes > > >>>> in the cluster. > > >>>> > > >>>> One such example is the case of Kubernetes NodePort services which > > >>>> translate (in the ovn-kubernetes case) to Load_Balancer > > >>>> objects being applied to each and every node's logical gateway router. > > >>>> These load balancers are almost identical, the main difference being > > >>>> the fact that they use different VIPs (the node's IP). > > >>>> > > >>>> With the current OVN load balancer design, this becomes a problem at > > >>>> scale because the number of load balancers that must be configured is > > >>>> N x M (N nodes times M services). > > >>>> > > >>>> This series proposes a new concept in OVN: virtual network component > > >>>> templates. The goal of the templates is to help reduce resource > > >>>> consumption in the OVN central components in specific cases like the > > > one > > >>>> described above. > > >>>> > > >>>> To achieve that, the CMS will instead configure a "templated" load > > >>>> balancer for every service and apply that single template record to > > >>>> the cluster-wide load balancer group. This template is then > > >>>> instantiated differently on different compute nodes. This translation > > >>>> is controlled through per-chassis "template variables" configured by > > >>>> the CMS in the new NB.Template_Var table. > > >>>> > > >>>> A syntetic benchmark simulating what an OpenShift router (using Node > > >>>> Port services) scale test would do shows the following preliminary > > >>>> results: > > >>>> A. 120 node, 2K NodePort services: > > >>>> - before: > > >>>> - Southbound DB size on disk (compacted): ~385MB > > >>>> - Southbound DB memory usage (RSS): ~3GB > > >>>> - Southbound DB logical flows: 720K > > >>>> > > >>>> - after: > > >>>> - Southbound DB size on disk (compacted): ~100MB > > >>>> - Southbound DB memory usage (RSS): ~250MB > > >>>> - Southbound DB logical flows: 6K > > >>>> > > >>>> B. 250 node, 2K NodePort services: > > >>>> - after (didn't run the "before" test as it was taking way too long): > > >>>> - Southbound DB size on disk (compacted): ~155MB > > >>>> - Southbound DB memory usage (RSS): ~760MB > > >>>> - Southbound DB logical flows: 6K > > >>>> > > >>>> The series is sent as RFC because there's still the need to add > > >>>> some template specific unit tests and the "ovn-nbctl lb-*" helper > > >>>> utilities need to be adapted to support templated load balancers. > > >>>> > > >>>> With these two items addressed the code is self can likely qualify > > >>>> for acceptance as a new feature in the upcoming release. > > >>>> > > >>>> There also exists a more extensive TODO list (also listed in the commit > > >>>> log of every patch in the series for now) but these are mainly load > > >>>> balancer related functionalities that are not yet implemented for > > >>>> templated load balancers but can definitely be implemented as follow > > > ups: > > >>>> - No support for LB health check if the LB is templated. > > >>>> - No support for VIP ARP responder if the LB is templated. > > >>>> - No support for routed VIPs if the LB is templated. > > >>>> - Figure out a way to deal with templates in ovn-trace > > >>>> - Determine if we need to allow Template_Var to match against chassis > > >>>> hostname or other IDs. > > >>>> - Make ofctrl_inject_pkt() work with template_vars. > > >>>> - Make test-ovn work with template_vars. > > >>>> > > >>>> A basic example of how to configure a templated load balancer follows: > > >>>> $ ovn-nbctl create load_balancer name=lb-test \ > > >>>> protocol=tcp options:template=true \ > > >>>> vips:\"^vip:4200\"="^backends" > > >>>> > > >>>> $ ovn-nbctl ls-add ls > > >>>> $ ovn-nbctl ls-lb-add ls lb-test > > >>>> > > >>>> # Instantiate the load balancer on chassis-1 > > >>>> $ ovn-nbctl create template_var name=vip value=80.80.80.1 > > >>>> chassis=chassis-1 > > >>>> $ ovn-nbctl create template_var name=backends > > >>>> value='"42.42.42.1:1000"' chassis=chassis-1 > > >>>> > > >>>> # Instantiate the load balancer on chassis-2 > > >>>> $ ovn-nbctl create template_var name=vip value=80.80.80.2 > > >>>> chassis=chassis-2 > > >>>> $ ovn-nbctl create template_var name=backends > > >>>> value='"42.42.42.2:1000"' chassis=chassis-2 > > >>>> > > >>>> Dumitru Ceara (5): > > >>>> Add NB and SB Template_Var tables. > > >>>> controller: Add support for templated actions and matches. > > >>>> controller: Make resource references more generic. > > >>>> lb: Support using templates. > > >>>> controller: Add Template_Var <- LB references. > > >>>> > > >>>> > > >>>> controller/lflow.c | 248 ++++++++++++++++++++++++-------- > > >>>> controller/lflow.h | 98 +++++++------ > > >>>> controller/ofctrl.c | 9 +- > > >>>> controller/ofctrl.h | 3 +- > > >>>> controller/ovn-controller.c | 277 > > > ++++++++++++++++++++++++++++++++++-- > > >>>> include/ovn/expr.h | 4 +- > > >>>> include/ovn/lex.h | 14 +- > > >>>> lib/actions.c | 9 +- > > >>>> lib/expr.c | 14 +- > > >>>> lib/lb.c | 201 ++++++++++++++++++++++---- > > >>>> lib/lb.h | 36 +++-- > > >>>> lib/lex.c | 55 +++++++ > > >>>> northd/northd.c | 64 +++++---- > > >>>> tests/ovn.at | 2 +- > > >>>> tests/test-ovn.c | 16 ++- > > >>>> utilities/ovn-trace.c | 26 +++- > > >>>> 16 files changed, 869 insertions(+), 207 deletions(-) > > >>>> > > >>>> _______________________________________________ > > >>>> dev mailing list > > >>>> dev@openvswitch.org > > >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > >>>> > > >>> > > >> > > >> _______________________________________________ > > >> dev mailing list > > >> dev@openvswitch.org > > >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > >
On 9/22/22 19:55, Han Zhou wrote: > On Thu, Sep 22, 2022 at 10:38 AM Han Zhou <hzhou@ovn.org> wrote: >> >> >> >> On Thu, Sep 22, 2022 at 1:00 AM Dumitru Ceara <dceara@redhat.com> wrote: >>> >>> Hi Han, >>> >>> On 9/21/22 23:06, Han Zhou wrote: >>>> Thanks Dumitru for this promising optimization! >>>> >>> >>> Thanks for checking it out! >>> >>>> On Thu, Aug 11, 2022 at 1:03 AM Dumitru Ceara <dceara@redhat.com> > wrote: >>>>> >>>>> On 8/10/22 19:54, Mark Michelson wrote: >>>>>> Hi Dumitru, >>>>>> >>>>> >>>>> Hi Mark, >>>>> >>>>>> I read the patch series, and I think the idea of chassis-specific >>>>>> variables is a good idea to reduce the number of DB records for > certain >>>>>> things. Aside from load balancers, I suspect this could have a > positive >>>>>> impact for other structures as well. >>>>>> >>>>> >>>>> Thanks for taking a look! Yes, I think this might be applicable to >>>>> other structures too. >>>>> >>>> >>>> I think it is a good idea to make it more generic, but for my > understanding >>>> this template concept is for anything that's "node/chassis" specific, > and >>>> supposed to be instantiated at chassis level. Probably we should name > the >>>> DB tables as something like chassis_template_var. >>>> >>> >>> If we decide to go for a simple "chassis" column (instead of a generic >>> "match" column) then naming the table Chassis_Template_Var makes sense >>> indeed. >>> >>>>>> Rather than criticizing the individual lines of code, I'll focus > instead >>>>>> on some higher-level questions/ideas. >>>>>> >>>>> >>>>> Sure, thanks! :) >>>>> >>>>>> First, one question I had was what happens when a template variable > name >>>>>> is used in a load balancer, but there is no appropriate value to >>>>>> substitute? For instance, what if a load balancer applies to > chassis-3, >>>>>> but you only have template variables for chassis-1 and chassis-2? > This >>>>>> might be addressed in the code but I didn't notice if it was. >>>>>> >>>>> >>>>> There are actually two things to consider here: >>>>> >>>>> 1. there might be a logical flow that uses a template variable: in > this >>>>> case if template expansion/instantiation fails we currently leave the >>>>> token untouched (e.g., '^variable' stays '^variable'). That will > cause >>>>> the flow action/match parsing to fail and currently logs a warning. > The >>>>> flow itself is skipped, as it should be. We probably need to avoid >>>>> logging a warning though. >>>>> >>>>> 2. like you pointed out, there might be a load balancer using > templates >>>>> in its backends/vips: if some of those templates cannot be > instantiated >>>>> locally the backend/vip where they're added is skipped. Unless I > missed >>>>> something, the code should already do that. >>>>> >>>>>> Second, it seems like template variables are a natural extension of >>>>>> existing concepts like address sets and port groups. In those cases, >>>>>> they were an unconditional collection of IP addresses or ports. For >>>>> >>>>> You're right to some extent template variables are similar to port >>>>> groups. The southbound database port group table splits the > northbound >>>>> port group per datapath though not per chassis like template > variables. >>>>> >>>>>> template variables, they're a collection of untyped values with the >>>>>> condition of only applying on certain Chassis. I wonder if this > could >>>>>> all be reconciled with a single table that uses untyped values with >>>>>> user-specified conditions. Right now template variables have a > "Chassis" >>>>>> column, but maybe this could be replaced with a broader "condition", >>>>>> "when", or "match" column. To get something integrated quickly, this >>>>>> column could just accept the syntax of "chassis.name == <blah>" or >>>>>> "chassis.uuid == <blah>" to allow for chassis-specific application > of >>>>>> the values. With this foundation, we could eventually allow >>>>>> unconditional application of the value, or more complex conditions > (e.g. >>>>>> only apply to logical switch ports that are connected to a router > with a >>>>>> distributed gateway port). Doing this, we could deprecate address > sets >>>>>> and port groups eventually in favor of template variables. >>>>> >>>>> This sounds like a good idea to me. I wasn't too happy with the >>>>> "chassis" string column of the Template_Var table anyway. A generic >>>>> condition field makes more sense. >>>>> >>>> If it is chassis-specific template, a column "chassis" seems to be >>>> straightforward. With a "match" column it is another burden of parsing >>> >>> I have a generic implementation (with a "predicate" column) almost ready >>> for review. I agree it's a bit more work to parse and maintain >>> references. I think it's probably best to discuss these details once I >>> post v1. It's no problem for me to go back to the "chassis" column >>> version if we decide to use that approach. >>> >>>> (which is costly and error prone). In addition, the LB object (or > other >>>> structures) is not a logical-flow, and it doesn't directly map to >>>> logical-flows (unlike ACLs), so I didn't understand how would a match >>>> string be applied to the template. Is there a more detailed example of >>>> this? Maybe I am missing something, and hope we will see more details > in >>>> the formal patch. >>>> >>> >>> Load balancer VIPs are a collection of key-value pairs (vip:backends). >>> These are currently IPs (and optionally L4 ports). We can allow LBs to >>> support "templated" VIPs/backends. For example (this is with the WIP v1 >>> version of the code): >>> >>> # Create a template LB. >>> ovn-nbctl --template lb-add lb-test "^vip:^vport" "^backends" tcp \ >>> -- ls-lb-add ls1 lb-test \ >>> -- lr-lb-add rtr lb-test >>> >>> # Instantiate the LB template variables, for the chassis where port >>> # vm1 is bound: >>> ovn-nbctl \ >>> -- create template_var name=vip value=66.66.66.66 \ >>> predicate="is_chassis_resident(\"vm1\")" \ >>> -- create template_var name=vport value=666 \ >>> predicate="is_chassis_resident(\"vm1\")" \ >>> -- create template_var name=backends value=\"42.42.42.2:4242\" \ >>> predicate="is_chassis_resident(\"vm1\")" >>> >>> This is equivalent to: >>> ovn-nbctl lb-add lb-test 66.66.66.66:666 42.42.42.2:4242 tcp >>> ovn-nbctl ls-lb-add ls1 lb-test >>> ovn-nbctl lr-lb-add lr1 lb-test >>> >> >> Thanks, this helps. I guess we wouldn't want to use backends as variables > in the real use cases, right? Otherwise, creating backends value for each > chassis sounds not helpful for the scale. So, if I understand correctly, > what we would do is: >> >> # Create a template LB. >> ovn-nbctl --template lb-add lb-test "^vip:<the real port>" "<real > backends>" tcp \ >> -- ls-lb-add ls1 lb-test \ >> -- lr-lb-add rtr lb-test >> >> # Instantiate the LB template variables, for the chassis where DGPs are > bound: >> ovn-nbctl \ >> -- create template_var name=vip value=66.66.66.66 \ >> predicate="is_chassis_resident(\"lrp1\")" \ >> -- create template_var name=vip value=66.66.66.67 \ >> predicate="is_chassis_resident(\"lrp2\")" \ >> -- create template_var name=vip value=66.66.66.68 \ >> predicate="is_chassis_resident(\"lrp3\")" >> It can be that we need to use backends as variables too. In the worst case, we still probably end up using less resources than with explicit load balancers. But yes, very likely in a lot of cases with ovn-k8s deployments we'll be able to define load balancers like you did above. >> Now that in ovn-k8s since we use DGP to pin LS to chassis, the lrp1, 2, 3 > resides on different nodes, so the "vip" var is instantiated to the > chassis-specific value. So this effectively binds the LB VIP to chassis > without explicit "chassis" column. Did I understand the magic correctly? That's the idea, yes. >> If so, the predicate needs to be in logical flow match format, and the > string will be directly used as match condition incorporated to related > logical flows, right? One problem though, is that for DGPs, > is_chassis_resident needs to evaluate the CR port "cr-xxx", which is > internally constructed by OVN, and is not exposed to CMS. Once the > implementation changes, the predicate wouldn't work. True. I was thinking of using the logical router port of the GW router in ovn-k8s though. >> (I am looking forward to seeing your v1, but I think this discussion > should help me understand the concept before going to the review :) >> > > Sorry, one more point. With the "chassis" column, it is easy for > conditional monitoring, considering that only a small number of values are > useful for a specific chassis. The predicate alone wouldn't tell this This is a very good point. Conditional monitoring will be hard/impossible. > information. If we go with the predicate approach, I think it may be > helpful to add another column to tell the datapath information (in NB, use > LS/LR), so that it can be converted to DP in SB and be conditionally > monitored (reusing the existing ovn-controller "local datapath" mechanism). > It can be "any", to indicate something that applies to multiple/all DPs, if > useful for the future use cases. I wonder if this is not too complex. Maybe we should just go to the "chassis" column approach instead. To argue against chassis not being a NB OVN entity, we already have NB tables that refer to chassis: HA_Chassis_Group, HA_Chassis, Gateway_Chassis. > >> Thanks, >> Han >> >>>>> Regarding deprecating and replacing address sets and port groups, I'm >>>>> not sure how easy that would be but we can try it when we get to that >>>> point. >>>>> >>>> >>>> Address sets and port groups are something different in my view. > Although >>>> they can be treated as variables in a template, they are not really >>>> chassis-specific, and each variable needs to be instantiated to a big >>>> number of instances (sometimes huge). For this reason, fine-grained > I-P >>>> embedded in the expression parsing (for Address set) was introduced > for the >>>> performance of ovn-controller. Maybe we can say there are still some >>>> similarities of templates, but I am not really sure if it is really > helpful >>>> to generalize them and how difficult it would be. >>>> >>> >>> It doesn't look straight forward indeed. If everyone agrees, I'd >>> suggest discussing this option (replacing AS and PG) only after/if the >>> template support is accepted. >>> >>>> Thanks, >>>> Han >>>> >>> >>> Thanks, >>> Dumitru >>> >>>>>> >>>>>> Third, I was wondering if there could be some layer that exists > between >>>>>> the IDL and the application that expands the template variables as > early >>>>>> as possible. I'm thinking the application could inject some > callback in >>>>>> the IDL layer that might allow for the values to be substituted. > This >>>>>> way, the variable substitution is taken care of in a single place, > and >>>>>> by the time the application gets the data, it knows that all >>>>>> substitutions have been made and there is no need to special case >>>>>> template variable names vs. plain tokens. They should all be plain >>>>>> tokens. I don't think construction of such a layer should be a > barrier >>>>>> to merging the code, but it's something worth considering as a later >>>>>> improvement. >>>>> >>>>> This could work. I'll think more about it. But like you said, it's >>>>> probably a longer term goal. It will need some significant changes > in >>>>> the IDL layer (e.g., to re-evaluate some records when template >>>>> instantiations change). >>>>> >>>>>> >>>>>> Anyway, those were my high-level thoughts on the topic. Let me know > what >>>>>> you think. >>>>>> >>>>> >>>>> I can work on changing the Template_Var schema to add a broader way > of >>>>> specifying conditions (when/match/etc). I'm already working on > adding >>>>> proper nbctl support for templated load balancers and trying to > tackle >>>>> the rest of the todos. I can probably send a v1 sometime in the > first >>>>> half of next week. Do you want to share any specific code related >>>>> comments that I should already integrate or shall we start a proper >>>>> review when v1 gets posted? >>>>> >>>>> Thanks again for your input on this RFC! >>>>> >>>>> Regards, >>>>> Dumitru >>>>> >>>>>> On 8/5/22 12:26, Dumitru Ceara wrote: >>>>>>> Sometimes network components are compute node-specific. Sometimes > such >>>>>>> components are replicated, almost identically, for multiple nodes >>>>>>> in the cluster. >>>>>>> >>>>>>> One such example is the case of Kubernetes NodePort services which >>>>>>> translate (in the ovn-kubernetes case) to Load_Balancer >>>>>>> objects being applied to each and every node's logical gateway > router. >>>>>>> These load balancers are almost identical, the main difference > being >>>>>>> the fact that they use different VIPs (the node's IP). >>>>>>> >>>>>>> With the current OVN load balancer design, this becomes a problem > at >>>>>>> scale because the number of load balancers that must be configured > is >>>>>>> N x M (N nodes times M services). >>>>>>> >>>>>>> This series proposes a new concept in OVN: virtual network > component >>>>>>> templates. The goal of the templates is to help reduce resource >>>>>>> consumption in the OVN central components in specific cases like > the >>>> one >>>>>>> described above. >>>>>>> >>>>>>> To achieve that, the CMS will instead configure a "templated" load >>>>>>> balancer for every service and apply that single template record to >>>>>>> the cluster-wide load balancer group. This template is then >>>>>>> instantiated differently on different compute nodes. This > translation >>>>>>> is controlled through per-chassis "template variables" configured > by >>>>>>> the CMS in the new NB.Template_Var table. >>>>>>> >>>>>>> A syntetic benchmark simulating what an OpenShift router (using > Node >>>>>>> Port services) scale test would do shows the following preliminary >>>>>>> results: >>>>>>> A. 120 node, 2K NodePort services: >>>>>>> - before: >>>>>>> - Southbound DB size on disk (compacted): ~385MB >>>>>>> - Southbound DB memory usage (RSS): ~3GB >>>>>>> - Southbound DB logical flows: 720K >>>>>>> >>>>>>> - after: >>>>>>> - Southbound DB size on disk (compacted): ~100MB >>>>>>> - Southbound DB memory usage (RSS): ~250MB >>>>>>> - Southbound DB logical flows: 6K >>>>>>> >>>>>>> B. 250 node, 2K NodePort services: >>>>>>> - after (didn't run the "before" test as it was taking way too > long): >>>>>>> - Southbound DB size on disk (compacted): ~155MB >>>>>>> - Southbound DB memory usage (RSS): ~760MB >>>>>>> - Southbound DB logical flows: 6K >>>>>>> >>>>>>> The series is sent as RFC because there's still the need to add >>>>>>> some template specific unit tests and the "ovn-nbctl lb-*" helper >>>>>>> utilities need to be adapted to support templated load balancers. >>>>>>> >>>>>>> With these two items addressed the code is self can likely qualify >>>>>>> for acceptance as a new feature in the upcoming release. >>>>>>> >>>>>>> There also exists a more extensive TODO list (also listed in the > commit >>>>>>> log of every patch in the series for now) but these are mainly load >>>>>>> balancer related functionalities that are not yet implemented for >>>>>>> templated load balancers but can definitely be implemented as > follow >>>> ups: >>>>>>> - No support for LB health check if the LB is templated. >>>>>>> - No support for VIP ARP responder if the LB is templated. >>>>>>> - No support for routed VIPs if the LB is templated. >>>>>>> - Figure out a way to deal with templates in ovn-trace >>>>>>> - Determine if we need to allow Template_Var to match against > chassis >>>>>>> hostname or other IDs. >>>>>>> - Make ofctrl_inject_pkt() work with template_vars. >>>>>>> - Make test-ovn work with template_vars. >>>>>>> >>>>>>> A basic example of how to configure a templated load balancer > follows: >>>>>>> $ ovn-nbctl create load_balancer name=lb-test \ >>>>>>> protocol=tcp options:template=true \ >>>>>>> vips:\"^vip:4200\"="^backends" >>>>>>> >>>>>>> $ ovn-nbctl ls-add ls >>>>>>> $ ovn-nbctl ls-lb-add ls lb-test >>>>>>> >>>>>>> # Instantiate the load balancer on chassis-1 >>>>>>> $ ovn-nbctl create template_var name=vip value=80.80.80.1 >>>>>>> chassis=chassis-1 >>>>>>> $ ovn-nbctl create template_var name=backends >>>>>>> value='"42.42.42.1:1000"' chassis=chassis-1 >>>>>>> >>>>>>> # Instantiate the load balancer on chassis-2 >>>>>>> $ ovn-nbctl create template_var name=vip value=80.80.80.2 >>>>>>> chassis=chassis-2 >>>>>>> $ ovn-nbctl create template_var name=backends >>>>>>> value='"42.42.42.2:1000"' chassis=chassis-2 >>>>>>> >>>>>>> Dumitru Ceara (5): >>>>>>> Add NB and SB Template_Var tables. >>>>>>> controller: Add support for templated actions and matches. >>>>>>> controller: Make resource references more generic. >>>>>>> lb: Support using templates. >>>>>>> controller: Add Template_Var <- LB references. >>>>>>> >>>>>>> >>>>>>> controller/lflow.c | 248 > ++++++++++++++++++++++++-------- >>>>>>> controller/lflow.h | 98 +++++++------ >>>>>>> controller/ofctrl.c | 9 +- >>>>>>> controller/ofctrl.h | 3 +- >>>>>>> controller/ovn-controller.c | 277 >>>> ++++++++++++++++++++++++++++++++++-- >>>>>>> include/ovn/expr.h | 4 +- >>>>>>> include/ovn/lex.h | 14 +- >>>>>>> lib/actions.c | 9 +- >>>>>>> lib/expr.c | 14 +- >>>>>>> lib/lb.c | 201 ++++++++++++++++++++++---- >>>>>>> lib/lb.h | 36 +++-- >>>>>>> lib/lex.c | 55 +++++++ >>>>>>> northd/northd.c | 64 +++++---- >>>>>>> tests/ovn.at | 2 +- >>>>>>> tests/test-ovn.c | 16 ++- >>>>>>> utilities/ovn-trace.c | 26 +++- >>>>>>> 16 files changed, 869 insertions(+), 207 deletions(-) >>>>>>> >>>>>>> _______________________________________________ >>>>>>> dev mailing list >>>>>>> dev@openvswitch.org >>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> dev mailing list >>>>> dev@openvswitch.org >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>> >>> >
On Fri, Sep 23, 2022 at 8:10 AM Dumitru Ceara <dceara@redhat.com> wrote: > > On 9/22/22 19:55, Han Zhou wrote: > > On Thu, Sep 22, 2022 at 10:38 AM Han Zhou <hzhou@ovn.org> wrote: > >> > >> > >> > >> On Thu, Sep 22, 2022 at 1:00 AM Dumitru Ceara <dceara@redhat.com> wrote: > >>> > >>> Hi Han, > >>> > >>> On 9/21/22 23:06, Han Zhou wrote: > >>>> Thanks Dumitru for this promising optimization! > >>>> > >>> > >>> Thanks for checking it out! > >>> > >>>> On Thu, Aug 11, 2022 at 1:03 AM Dumitru Ceara <dceara@redhat.com> > > wrote: > >>>>> > >>>>> On 8/10/22 19:54, Mark Michelson wrote: > >>>>>> Hi Dumitru, > >>>>>> > >>>>> > >>>>> Hi Mark, > >>>>> > >>>>>> I read the patch series, and I think the idea of chassis-specific > >>>>>> variables is a good idea to reduce the number of DB records for > > certain > >>>>>> things. Aside from load balancers, I suspect this could have a > > positive > >>>>>> impact for other structures as well. > >>>>>> > >>>>> > >>>>> Thanks for taking a look! Yes, I think this might be applicable to > >>>>> other structures too. > >>>>> > >>>> > >>>> I think it is a good idea to make it more generic, but for my > > understanding > >>>> this template concept is for anything that's "node/chassis" specific, > > and > >>>> supposed to be instantiated at chassis level. Probably we should name > > the > >>>> DB tables as something like chassis_template_var. > >>>> > >>> > >>> If we decide to go for a simple "chassis" column (instead of a generic > >>> "match" column) then naming the table Chassis_Template_Var makes sense > >>> indeed. > >>> > >>>>>> Rather than criticizing the individual lines of code, I'll focus > > instead > >>>>>> on some higher-level questions/ideas. > >>>>>> > >>>>> > >>>>> Sure, thanks! :) > >>>>> > >>>>>> First, one question I had was what happens when a template variable > > name > >>>>>> is used in a load balancer, but there is no appropriate value to > >>>>>> substitute? For instance, what if a load balancer applies to > > chassis-3, > >>>>>> but you only have template variables for chassis-1 and chassis-2? > > This > >>>>>> might be addressed in the code but I didn't notice if it was. > >>>>>> > >>>>> > >>>>> There are actually two things to consider here: > >>>>> > >>>>> 1. there might be a logical flow that uses a template variable: in > > this > >>>>> case if template expansion/instantiation fails we currently leave the > >>>>> token untouched (e.g., '^variable' stays '^variable'). That will > > cause > >>>>> the flow action/match parsing to fail and currently logs a warning. > > The > >>>>> flow itself is skipped, as it should be. We probably need to avoid > >>>>> logging a warning though. > >>>>> > >>>>> 2. like you pointed out, there might be a load balancer using > > templates > >>>>> in its backends/vips: if some of those templates cannot be > > instantiated > >>>>> locally the backend/vip where they're added is skipped. Unless I > > missed > >>>>> something, the code should already do that. > >>>>> > >>>>>> Second, it seems like template variables are a natural extension of > >>>>>> existing concepts like address sets and port groups. In those cases, > >>>>>> they were an unconditional collection of IP addresses or ports. For > >>>>> > >>>>> You're right to some extent template variables are similar to port > >>>>> groups. The southbound database port group table splits the > > northbound > >>>>> port group per datapath though not per chassis like template > > variables. > >>>>> > >>>>>> template variables, they're a collection of untyped values with the > >>>>>> condition of only applying on certain Chassis. I wonder if this > > could > >>>>>> all be reconciled with a single table that uses untyped values with > >>>>>> user-specified conditions. Right now template variables have a > > "Chassis" > >>>>>> column, but maybe this could be replaced with a broader "condition", > >>>>>> "when", or "match" column. To get something integrated quickly, this > >>>>>> column could just accept the syntax of "chassis.name == <blah>" or > >>>>>> "chassis.uuid == <blah>" to allow for chassis-specific application > > of > >>>>>> the values. With this foundation, we could eventually allow > >>>>>> unconditional application of the value, or more complex conditions > > (e.g. > >>>>>> only apply to logical switch ports that are connected to a router > > with a > >>>>>> distributed gateway port). Doing this, we could deprecate address > > sets > >>>>>> and port groups eventually in favor of template variables. > >>>>> > >>>>> This sounds like a good idea to me. I wasn't too happy with the > >>>>> "chassis" string column of the Template_Var table anyway. A generic > >>>>> condition field makes more sense. > >>>>> > >>>> If it is chassis-specific template, a column "chassis" seems to be > >>>> straightforward. With a "match" column it is another burden of parsing > >>> > >>> I have a generic implementation (with a "predicate" column) almost ready > >>> for review. I agree it's a bit more work to parse and maintain > >>> references. I think it's probably best to discuss these details once I > >>> post v1. It's no problem for me to go back to the "chassis" column > >>> version if we decide to use that approach. > >>> > >>>> (which is costly and error prone). In addition, the LB object (or > > other > >>>> structures) is not a logical-flow, and it doesn't directly map to > >>>> logical-flows (unlike ACLs), so I didn't understand how would a match > >>>> string be applied to the template. Is there a more detailed example of > >>>> this? Maybe I am missing something, and hope we will see more details > > in > >>>> the formal patch. > >>>> > >>> > >>> Load balancer VIPs are a collection of key-value pairs (vip:backends). > >>> These are currently IPs (and optionally L4 ports). We can allow LBs to > >>> support "templated" VIPs/backends. For example (this is with the WIP v1 > >>> version of the code): > >>> > >>> # Create a template LB. > >>> ovn-nbctl --template lb-add lb-test "^vip:^vport" "^backends" tcp \ > >>> -- ls-lb-add ls1 lb-test \ > >>> -- lr-lb-add rtr lb-test > >>> > >>> # Instantiate the LB template variables, for the chassis where port > >>> # vm1 is bound: > >>> ovn-nbctl \ > >>> -- create template_var name=vip value=66.66.66.66 \ > >>> predicate="is_chassis_resident(\"vm1\")" \ > >>> -- create template_var name=vport value=666 \ > >>> predicate="is_chassis_resident(\"vm1\")" \ > >>> -- create template_var name=backends value=\"42.42.42.2:4242\" \ > >>> predicate="is_chassis_resident(\"vm1\")" > >>> > >>> This is equivalent to: > >>> ovn-nbctl lb-add lb-test 66.66.66.66:666 42.42.42.2:4242 tcp > >>> ovn-nbctl ls-lb-add ls1 lb-test > >>> ovn-nbctl lr-lb-add lr1 lb-test > >>> > >> > >> Thanks, this helps. I guess we wouldn't want to use backends as variables > > in the real use cases, right? Otherwise, creating backends value for each > > chassis sounds not helpful for the scale. So, if I understand correctly, > > what we would do is: > >> > >> # Create a template LB. > >> ovn-nbctl --template lb-add lb-test "^vip:<the real port>" "<real > > backends>" tcp \ > >> -- ls-lb-add ls1 lb-test \ > >> -- lr-lb-add rtr lb-test > >> > >> # Instantiate the LB template variables, for the chassis where DGPs are > > bound: > >> ovn-nbctl \ > >> -- create template_var name=vip value=66.66.66.66 \ > >> predicate="is_chassis_resident(\"lrp1\")" \ > >> -- create template_var name=vip value=66.66.66.67 \ > >> predicate="is_chassis_resident(\"lrp2\")" \ > >> -- create template_var name=vip value=66.66.66.68 \ > >> predicate="is_chassis_resident(\"lrp3\")" > >> > > It can be that we need to use backends as variables too. In the worst > case, we still probably end up using less resources than with explicit > load balancers. But yes, very likely in a lot of cases with ovn-k8s > deployments we'll be able to define load balancers like you did above. > > >> Now that in ovn-k8s since we use DGP to pin LS to chassis, the lrp1, 2, 3 > > resides on different nodes, so the "vip" var is instantiated to the > > chassis-specific value. So this effectively binds the LB VIP to chassis > > without explicit "chassis" column. Did I understand the magic correctly? > > That's the idea, yes. > > >> If so, the predicate needs to be in logical flow match format, and the > > string will be directly used as match condition incorporated to related > > logical flows, right? One problem though, is that for DGPs, > > is_chassis_resident needs to evaluate the CR port "cr-xxx", which is > > internally constructed by OVN, and is not exposed to CMS. Once the > > implementation changes, the predicate wouldn't work. > > True. I was thinking of using the logical router port of the GW router > in ovn-k8s though. > > >> (I am looking forward to seeing your v1, but I think this discussion > > should help me understand the concept before going to the review :) > >> > > > > Sorry, one more point. With the "chassis" column, it is easy for > > conditional monitoring, considering that only a small number of values are > > useful for a specific chassis. The predicate alone wouldn't tell this > > This is a very good point. Conditional monitoring will be hard/impossible. > > > information. If we go with the predicate approach, I think it may be > > helpful to add another column to tell the datapath information (in NB, use > > LS/LR), so that it can be converted to DP in SB and be conditionally > > monitored (reusing the existing ovn-controller "local datapath" mechanism). > > It can be "any", to indicate something that applies to multiple/all DPs, if > > useful for the future use cases. > > I wonder if this is not too complex. Maybe we should just go to the > "chassis" column approach instead. To argue against chassis not being a > NB OVN entity, we already have NB tables that refer to chassis: > HA_Chassis_Group, HA_Chassis, Gateway_Chassis. > I am not too worried about the chassis not being a NB entity either. I think it is more of a tradeoff between flexibility and simplicity. The predicate approach can support something not related to chassis, such as: "instantiate the backends to value abc for ip.src == xxx". Maybe it is not a meaningful example, and I am not really sure if it is something useful in the future at all. So, just my 2 cents - for the purpose of this series, if we don't want to make it too complex, maybe "chassis" is a better option. If in the future we do need more flexibility, we can still introduce the predicate, although we may also need to take care of the backward compatibility/upgrade, if that's not too bad. Of course maybe I am short-sighted. I'd like to hear if anyone comes up with some realistic use cases that require the flexible approach, and then we will need to figure out a way for cond-monitoring. Thanks, Han > > > >> Thanks, > >> Han > >> > >>>>> Regarding deprecating and replacing address sets and port groups, I'm > >>>>> not sure how easy that would be but we can try it when we get to that > >>>> point. > >>>>> > >>>> > >>>> Address sets and port groups are something different in my view. > > Although > >>>> they can be treated as variables in a template, they are not really > >>>> chassis-specific, and each variable needs to be instantiated to a big > >>>> number of instances (sometimes huge). For this reason, fine-grained > > I-P > >>>> embedded in the expression parsing (for Address set) was introduced > > for the > >>>> performance of ovn-controller. Maybe we can say there are still some > >>>> similarities of templates, but I am not really sure if it is really > > helpful > >>>> to generalize them and how difficult it would be. > >>>> > >>> > >>> It doesn't look straight forward indeed. If everyone agrees, I'd > >>> suggest discussing this option (replacing AS and PG) only after/if the > >>> template support is accepted. > >>> > >>>> Thanks, > >>>> Han > >>>> > >>> > >>> Thanks, > >>> Dumitru > >>> > >>>>>> > >>>>>> Third, I was wondering if there could be some layer that exists > > between > >>>>>> the IDL and the application that expands the template variables as > > early > >>>>>> as possible. I'm thinking the application could inject some > > callback in > >>>>>> the IDL layer that might allow for the values to be substituted. > > This > >>>>>> way, the variable substitution is taken care of in a single place, > > and > >>>>>> by the time the application gets the data, it knows that all > >>>>>> substitutions have been made and there is no need to special case > >>>>>> template variable names vs. plain tokens. They should all be plain > >>>>>> tokens. I don't think construction of such a layer should be a > > barrier > >>>>>> to merging the code, but it's something worth considering as a later > >>>>>> improvement. > >>>>> > >>>>> This could work. I'll think more about it. But like you said, it's > >>>>> probably a longer term goal. It will need some significant changes > > in > >>>>> the IDL layer (e.g., to re-evaluate some records when template > >>>>> instantiations change). > >>>>> > >>>>>> > >>>>>> Anyway, those were my high-level thoughts on the topic. Let me know > > what > >>>>>> you think. > >>>>>> > >>>>> > >>>>> I can work on changing the Template_Var schema to add a broader way > > of > >>>>> specifying conditions (when/match/etc). I'm already working on > > adding > >>>>> proper nbctl support for templated load balancers and trying to > > tackle > >>>>> the rest of the todos. I can probably send a v1 sometime in the > > first > >>>>> half of next week. Do you want to share any specific code related > >>>>> comments that I should already integrate or shall we start a proper > >>>>> review when v1 gets posted? > >>>>> > >>>>> Thanks again for your input on this RFC! > >>>>> > >>>>> Regards, > >>>>> Dumitru > >>>>> > >>>>>> On 8/5/22 12:26, Dumitru Ceara wrote: > >>>>>>> Sometimes network components are compute node-specific. Sometimes > > such > >>>>>>> components are replicated, almost identically, for multiple nodes > >>>>>>> in the cluster. > >>>>>>> > >>>>>>> One such example is the case of Kubernetes NodePort services which > >>>>>>> translate (in the ovn-kubernetes case) to Load_Balancer > >>>>>>> objects being applied to each and every node's logical gateway > > router. > >>>>>>> These load balancers are almost identical, the main difference > > being > >>>>>>> the fact that they use different VIPs (the node's IP). > >>>>>>> > >>>>>>> With the current OVN load balancer design, this becomes a problem > > at > >>>>>>> scale because the number of load balancers that must be configured > > is > >>>>>>> N x M (N nodes times M services). > >>>>>>> > >>>>>>> This series proposes a new concept in OVN: virtual network > > component > >>>>>>> templates. The goal of the templates is to help reduce resource > >>>>>>> consumption in the OVN central components in specific cases like > > the > >>>> one > >>>>>>> described above. > >>>>>>> > >>>>>>> To achieve that, the CMS will instead configure a "templated" load > >>>>>>> balancer for every service and apply that single template record to > >>>>>>> the cluster-wide load balancer group. This template is then > >>>>>>> instantiated differently on different compute nodes. This > > translation > >>>>>>> is controlled through per-chassis "template variables" configured > > by > >>>>>>> the CMS in the new NB.Template_Var table. > >>>>>>> > >>>>>>> A syntetic benchmark simulating what an OpenShift router (using > > Node > >>>>>>> Port services) scale test would do shows the following preliminary > >>>>>>> results: > >>>>>>> A. 120 node, 2K NodePort services: > >>>>>>> - before: > >>>>>>> - Southbound DB size on disk (compacted): ~385MB > >>>>>>> - Southbound DB memory usage (RSS): ~3GB > >>>>>>> - Southbound DB logical flows: 720K > >>>>>>> > >>>>>>> - after: > >>>>>>> - Southbound DB size on disk (compacted): ~100MB > >>>>>>> - Southbound DB memory usage (RSS): ~250MB > >>>>>>> - Southbound DB logical flows: 6K > >>>>>>> > >>>>>>> B. 250 node, 2K NodePort services: > >>>>>>> - after (didn't run the "before" test as it was taking way too > > long): > >>>>>>> - Southbound DB size on disk (compacted): ~155MB > >>>>>>> - Southbound DB memory usage (RSS): ~760MB > >>>>>>> - Southbound DB logical flows: 6K > >>>>>>> > >>>>>>> The series is sent as RFC because there's still the need to add > >>>>>>> some template specific unit tests and the "ovn-nbctl lb-*" helper > >>>>>>> utilities need to be adapted to support templated load balancers. > >>>>>>> > >>>>>>> With these two items addressed the code is self can likely qualify > >>>>>>> for acceptance as a new feature in the upcoming release. > >>>>>>> > >>>>>>> There also exists a more extensive TODO list (also listed in the > > commit > >>>>>>> log of every patch in the series for now) but these are mainly load > >>>>>>> balancer related functionalities that are not yet implemented for > >>>>>>> templated load balancers but can definitely be implemented as > > follow > >>>> ups: > >>>>>>> - No support for LB health check if the LB is templated. > >>>>>>> - No support for VIP ARP responder if the LB is templated. > >>>>>>> - No support for routed VIPs if the LB is templated. > >>>>>>> - Figure out a way to deal with templates in ovn-trace > >>>>>>> - Determine if we need to allow Template_Var to match against > > chassis > >>>>>>> hostname or other IDs. > >>>>>>> - Make ofctrl_inject_pkt() work with template_vars. > >>>>>>> - Make test-ovn work with template_vars. > >>>>>>> > >>>>>>> A basic example of how to configure a templated load balancer > > follows: > >>>>>>> $ ovn-nbctl create load_balancer name=lb-test \ > >>>>>>> protocol=tcp options:template=true \ > >>>>>>> vips:\"^vip:4200\"="^backends" > >>>>>>> > >>>>>>> $ ovn-nbctl ls-add ls > >>>>>>> $ ovn-nbctl ls-lb-add ls lb-test > >>>>>>> > >>>>>>> # Instantiate the load balancer on chassis-1 > >>>>>>> $ ovn-nbctl create template_var name=vip value=80.80.80.1 > >>>>>>> chassis=chassis-1 > >>>>>>> $ ovn-nbctl create template_var name=backends > >>>>>>> value='"42.42.42.1:1000"' chassis=chassis-1 > >>>>>>> > >>>>>>> # Instantiate the load balancer on chassis-2 > >>>>>>> $ ovn-nbctl create template_var name=vip value=80.80.80.2 > >>>>>>> chassis=chassis-2 > >>>>>>> $ ovn-nbctl create template_var name=backends > >>>>>>> value='"42.42.42.2:1000"' chassis=chassis-2 > >>>>>>> > >>>>>>> Dumitru Ceara (5): > >>>>>>> Add NB and SB Template_Var tables. > >>>>>>> controller: Add support for templated actions and matches. > >>>>>>> controller: Make resource references more generic. > >>>>>>> lb: Support using templates. > >>>>>>> controller: Add Template_Var <- LB references. > >>>>>>> > >>>>>>> > >>>>>>> controller/lflow.c | 248 > > ++++++++++++++++++++++++-------- > >>>>>>> controller/lflow.h | 98 +++++++------ > >>>>>>> controller/ofctrl.c | 9 +- > >>>>>>> controller/ofctrl.h | 3 +- > >>>>>>> controller/ovn-controller.c | 277 > >>>> ++++++++++++++++++++++++++++++++++-- > >>>>>>> include/ovn/expr.h | 4 +- > >>>>>>> include/ovn/lex.h | 14 +- > >>>>>>> lib/actions.c | 9 +- > >>>>>>> lib/expr.c | 14 +- > >>>>>>> lib/lb.c | 201 ++++++++++++++++++++++---- > >>>>>>> lib/lb.h | 36 +++-- > >>>>>>> lib/lex.c | 55 +++++++ > >>>>>>> northd/northd.c | 64 +++++---- > >>>>>>> tests/ovn.at | 2 +- > >>>>>>> tests/test-ovn.c | 16 ++- > >>>>>>> utilities/ovn-trace.c | 26 +++- > >>>>>>> 16 files changed, 869 insertions(+), 207 deletions(-) > >>>>>>> > >>>>>>> _______________________________________________ > >>>>>>> dev mailing list > >>>>>>> dev@openvswitch.org > >>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>>>>>> > >>>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> dev mailing list > >>>>> dev@openvswitch.org > >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>>> > >>> > > >
On 9/23/22 18:18, Han Zhou wrote: > On Fri, Sep 23, 2022 at 8:10 AM Dumitru Ceara <dceara@redhat.com> wrote: >> >> On 9/22/22 19:55, Han Zhou wrote: >>> On Thu, Sep 22, 2022 at 10:38 AM Han Zhou <hzhou@ovn.org> wrote: >>>> >>>> >>>> >>>> On Thu, Sep 22, 2022 at 1:00 AM Dumitru Ceara <dceara@redhat.com> > wrote: >>>>> >>>>> Hi Han, >>>>> >>>>> On 9/21/22 23:06, Han Zhou wrote: >>>>>> Thanks Dumitru for this promising optimization! >>>>>> >>>>> >>>>> Thanks for checking it out! >>>>> >>>>>> On Thu, Aug 11, 2022 at 1:03 AM Dumitru Ceara <dceara@redhat.com> >>> wrote: >>>>>>> >>>>>>> On 8/10/22 19:54, Mark Michelson wrote: >>>>>>>> Hi Dumitru, >>>>>>>> >>>>>>> >>>>>>> Hi Mark, >>>>>>> >>>>>>>> I read the patch series, and I think the idea of chassis-specific >>>>>>>> variables is a good idea to reduce the number of DB records for >>> certain >>>>>>>> things. Aside from load balancers, I suspect this could have a >>> positive >>>>>>>> impact for other structures as well. >>>>>>>> >>>>>>> >>>>>>> Thanks for taking a look! Yes, I think this might be applicable to >>>>>>> other structures too. >>>>>>> >>>>>> >>>>>> I think it is a good idea to make it more generic, but for my >>> understanding >>>>>> this template concept is for anything that's "node/chassis" specific, >>> and >>>>>> supposed to be instantiated at chassis level. Probably we should name >>> the >>>>>> DB tables as something like chassis_template_var. >>>>>> >>>>> >>>>> If we decide to go for a simple "chassis" column (instead of a generic >>>>> "match" column) then naming the table Chassis_Template_Var makes sense >>>>> indeed. >>>>> >>>>>>>> Rather than criticizing the individual lines of code, I'll focus >>> instead >>>>>>>> on some higher-level questions/ideas. >>>>>>>> >>>>>>> >>>>>>> Sure, thanks! :) >>>>>>> >>>>>>>> First, one question I had was what happens when a template variable >>> name >>>>>>>> is used in a load balancer, but there is no appropriate value to >>>>>>>> substitute? For instance, what if a load balancer applies to >>> chassis-3, >>>>>>>> but you only have template variables for chassis-1 and chassis-2? >>> This >>>>>>>> might be addressed in the code but I didn't notice if it was. >>>>>>>> >>>>>>> >>>>>>> There are actually two things to consider here: >>>>>>> >>>>>>> 1. there might be a logical flow that uses a template variable: in >>> this >>>>>>> case if template expansion/instantiation fails we currently leave > the >>>>>>> token untouched (e.g., '^variable' stays '^variable'). That will >>> cause >>>>>>> the flow action/match parsing to fail and currently logs a warning. >>> The >>>>>>> flow itself is skipped, as it should be. We probably need to avoid >>>>>>> logging a warning though. >>>>>>> >>>>>>> 2. like you pointed out, there might be a load balancer using >>> templates >>>>>>> in its backends/vips: if some of those templates cannot be >>> instantiated >>>>>>> locally the backend/vip where they're added is skipped. Unless I >>> missed >>>>>>> something, the code should already do that. >>>>>>> >>>>>>>> Second, it seems like template variables are a natural extension of >>>>>>>> existing concepts like address sets and port groups. In those > cases, >>>>>>>> they were an unconditional collection of IP addresses or ports. For >>>>>>> >>>>>>> You're right to some extent template variables are similar to port >>>>>>> groups. The southbound database port group table splits the >>> northbound >>>>>>> port group per datapath though not per chassis like template >>> variables. >>>>>>> >>>>>>>> template variables, they're a collection of untyped values with the >>>>>>>> condition of only applying on certain Chassis. I wonder if this >>> could >>>>>>>> all be reconciled with a single table that uses untyped values with >>>>>>>> user-specified conditions. Right now template variables have a >>> "Chassis" >>>>>>>> column, but maybe this could be replaced with a broader > "condition", >>>>>>>> "when", or "match" column. To get something integrated quickly, > this >>>>>>>> column could just accept the syntax of "chassis.name == <blah>" or >>>>>>>> "chassis.uuid == <blah>" to allow for chassis-specific application >>> of >>>>>>>> the values. With this foundation, we could eventually allow >>>>>>>> unconditional application of the value, or more complex conditions >>> (e.g. >>>>>>>> only apply to logical switch ports that are connected to a router >>> with a >>>>>>>> distributed gateway port). Doing this, we could deprecate address >>> sets >>>>>>>> and port groups eventually in favor of template variables. >>>>>>> >>>>>>> This sounds like a good idea to me. I wasn't too happy with the >>>>>>> "chassis" string column of the Template_Var table anyway. A generic >>>>>>> condition field makes more sense. >>>>>>> >>>>>> If it is chassis-specific template, a column "chassis" seems to be >>>>>> straightforward. With a "match" column it is another burden of > parsing >>>>> >>>>> I have a generic implementation (with a "predicate" column) almost > ready >>>>> for review. I agree it's a bit more work to parse and maintain >>>>> references. I think it's probably best to discuss these details once > I >>>>> post v1. It's no problem for me to go back to the "chassis" column >>>>> version if we decide to use that approach. >>>>> >>>>>> (which is costly and error prone). In addition, the LB object (or >>> other >>>>>> structures) is not a logical-flow, and it doesn't directly map to >>>>>> logical-flows (unlike ACLs), so I didn't understand how would a match >>>>>> string be applied to the template. Is there a more detailed example > of >>>>>> this? Maybe I am missing something, and hope we will see more details >>> in >>>>>> the formal patch. >>>>>> >>>>> >>>>> Load balancer VIPs are a collection of key-value pairs (vip:backends). >>>>> These are currently IPs (and optionally L4 ports). We can allow LBs > to >>>>> support "templated" VIPs/backends. For example (this is with the WIP > v1 >>>>> version of the code): >>>>> >>>>> # Create a template LB. >>>>> ovn-nbctl --template lb-add lb-test "^vip:^vport" "^backends" tcp \ >>>>> -- ls-lb-add ls1 lb-test \ >>>>> -- lr-lb-add rtr lb-test >>>>> >>>>> # Instantiate the LB template variables, for the chassis where port >>>>> # vm1 is bound: >>>>> ovn-nbctl \ >>>>> -- create template_var name=vip value=66.66.66.66 \ >>>>> predicate="is_chassis_resident(\"vm1\")" \ >>>>> -- create template_var name=vport value=666 \ >>>>> predicate="is_chassis_resident(\"vm1\")" \ >>>>> -- create template_var name=backends value=\"42.42.42.2:4242\" \ >>>>> predicate="is_chassis_resident(\"vm1\")" >>>>> >>>>> This is equivalent to: >>>>> ovn-nbctl lb-add lb-test 66.66.66.66:666 42.42.42.2:4242 tcp >>>>> ovn-nbctl ls-lb-add ls1 lb-test >>>>> ovn-nbctl lr-lb-add lr1 lb-test >>>>> >>>> >>>> Thanks, this helps. I guess we wouldn't want to use backends as > variables >>> in the real use cases, right? Otherwise, creating backends value for > each >>> chassis sounds not helpful for the scale. So, if I understand correctly, >>> what we would do is: >>>> >>>> # Create a template LB. >>>> ovn-nbctl --template lb-add lb-test "^vip:<the real port>" "<real >>> backends>" tcp \ >>>> -- ls-lb-add ls1 lb-test \ >>>> -- lr-lb-add rtr lb-test >>>> >>>> # Instantiate the LB template variables, for the chassis where DGPs are >>> bound: >>>> ovn-nbctl \ >>>> -- create template_var name=vip value=66.66.66.66 \ >>>> predicate="is_chassis_resident(\"lrp1\")" \ >>>> -- create template_var name=vip value=66.66.66.67 \ >>>> predicate="is_chassis_resident(\"lrp2\")" \ >>>> -- create template_var name=vip value=66.66.66.68 \ >>>> predicate="is_chassis_resident(\"lrp3\")" >>>> >> >> It can be that we need to use backends as variables too. In the worst >> case, we still probably end up using less resources than with explicit >> load balancers. But yes, very likely in a lot of cases with ovn-k8s >> deployments we'll be able to define load balancers like you did above. >> >>>> Now that in ovn-k8s since we use DGP to pin LS to chassis, the lrp1, > 2, 3 >>> resides on different nodes, so the "vip" var is instantiated to the >>> chassis-specific value. So this effectively binds the LB VIP to chassis >>> without explicit "chassis" column. Did I understand the magic correctly? >> >> That's the idea, yes. >> >>>> If so, the predicate needs to be in logical flow match format, and the >>> string will be directly used as match condition incorporated to related >>> logical flows, right? One problem though, is that for DGPs, >>> is_chassis_resident needs to evaluate the CR port "cr-xxx", which is >>> internally constructed by OVN, and is not exposed to CMS. Once the >>> implementation changes, the predicate wouldn't work. >> >> True. I was thinking of using the logical router port of the GW router >> in ovn-k8s though. >> >>>> (I am looking forward to seeing your v1, but I think this discussion >>> should help me understand the concept before going to the review :) >>>> >>> >>> Sorry, one more point. With the "chassis" column, it is easy for >>> conditional monitoring, considering that only a small number of values > are >>> useful for a specific chassis. The predicate alone wouldn't tell this >> >> This is a very good point. Conditional monitoring will be > hard/impossible. >> >>> information. If we go with the predicate approach, I think it may be >>> helpful to add another column to tell the datapath information (in NB, > use >>> LS/LR), so that it can be converted to DP in SB and be conditionally >>> monitored (reusing the existing ovn-controller "local datapath" > mechanism). >>> It can be "any", to indicate something that applies to multiple/all > DPs, if >>> useful for the future use cases. >> >> I wonder if this is not too complex. Maybe we should just go to the >> "chassis" column approach instead. To argue against chassis not being a >> NB OVN entity, we already have NB tables that refer to chassis: >> HA_Chassis_Group, HA_Chassis, Gateway_Chassis. >> > I am not too worried about the chassis not being a NB entity either. I > think it is more of a tradeoff between flexibility and simplicity. The > predicate approach can support something not related to chassis, such as: > "instantiate the backends to value abc for ip.src == xxx". Maybe it is not > a meaningful example, and I am not really sure if it is something useful in > the future at all. So, just my 2 cents - for the purpose of this series, if We can't allow expressions that use packet fields as matches. We don't have a packet to look at when we evaluate the template variable predicate. So, at least for the moment, the only real expression we can evaluate is "is_chassis_resident(port)". This is slightly more expressive than just "chassis == <value>". If a logical switch port "migrates" to a new chassis then using such a predicate would allow us to also "migrate" other things, e.g., a load balancer. On the other hand, in ovn-kubernetes (for now the only potential user of templates), LSPs never "migrate". PODs are immutable and if they need to "migrate" they're recreated (along with a new LSP) on a different chassis. > we don't want to make it too complex, maybe "chassis" is a better option. I think that's fine for what we need today. It also avoids ambiguity, e.g., what if we have two template variable instances (same name but different values) whose predicates both evaluate to "true" on the local chassis? > If in the future we do need more flexibility, we can still introduce the > predicate, although we may also need to take care of the backward > compatibility/upgrade, if that's not too bad. Of course maybe I am > short-sighted. I'd like to hear if anyone comes up with some realistic use > cases that require the flexible approach, and then we will need to figure > out a way for cond-monitoring. I'll wait another day and then post a series using the simplified, chassis based version. Thanks, Dumitru > > Thanks, > Han > >>> >>>> Thanks, >>>> Han >>>> >>>>>>> Regarding deprecating and replacing address sets and port groups, > I'm >>>>>>> not sure how easy that would be but we can try it when we get to > that >>>>>> point. >>>>>>> >>>>>> >>>>>> Address sets and port groups are something different in my view. >>> Although >>>>>> they can be treated as variables in a template, they are not really >>>>>> chassis-specific, and each variable needs to be instantiated to a big >>>>>> number of instances (sometimes huge). For this reason, fine-grained >>> I-P >>>>>> embedded in the expression parsing (for Address set) was introduced >>> for the >>>>>> performance of ovn-controller. Maybe we can say there are still some >>>>>> similarities of templates, but I am not really sure if it is really >>> helpful >>>>>> to generalize them and how difficult it would be. >>>>>> >>>>> >>>>> It doesn't look straight forward indeed. If everyone agrees, I'd >>>>> suggest discussing this option (replacing AS and PG) only after/if the >>>>> template support is accepted. >>>>> >>>>>> Thanks, >>>>>> Han >>>>>> >>>>> >>>>> Thanks, >>>>> Dumitru >>>>> >>>>>>>> >>>>>>>> Third, I was wondering if there could be some layer that exists >>> between >>>>>>>> the IDL and the application that expands the template variables as >>> early >>>>>>>> as possible. I'm thinking the application could inject some >>> callback in >>>>>>>> the IDL layer that might allow for the values to be substituted. >>> This >>>>>>>> way, the variable substitution is taken care of in a single place, >>> and >>>>>>>> by the time the application gets the data, it knows that all >>>>>>>> substitutions have been made and there is no need to special case >>>>>>>> template variable names vs. plain tokens. They should all be plain >>>>>>>> tokens. I don't think construction of such a layer should be a >>> barrier >>>>>>>> to merging the code, but it's something worth considering as a > later >>>>>>>> improvement. >>>>>>> >>>>>>> This could work. I'll think more about it. But like you said, it's >>>>>>> probably a longer term goal. It will need some significant changes >>> in >>>>>>> the IDL layer (e.g., to re-evaluate some records when template >>>>>>> instantiations change). >>>>>>> >>>>>>>> >>>>>>>> Anyway, those were my high-level thoughts on the topic. Let me know >>> what >>>>>>>> you think. >>>>>>>> >>>>>>> >>>>>>> I can work on changing the Template_Var schema to add a broader way >>> of >>>>>>> specifying conditions (when/match/etc). I'm already working on >>> adding >>>>>>> proper nbctl support for templated load balancers and trying to >>> tackle >>>>>>> the rest of the todos. I can probably send a v1 sometime in the >>> first >>>>>>> half of next week. Do you want to share any specific code related >>>>>>> comments that I should already integrate or shall we start a proper >>>>>>> review when v1 gets posted? >>>>>>> >>>>>>> Thanks again for your input on this RFC! >>>>>>> >>>>>>> Regards, >>>>>>> Dumitru >>>>>>> >>>>>>>> On 8/5/22 12:26, Dumitru Ceara wrote: >>>>>>>>> Sometimes network components are compute node-specific. Sometimes >>> such >>>>>>>>> components are replicated, almost identically, for multiple nodes >>>>>>>>> in the cluster. >>>>>>>>> >>>>>>>>> One such example is the case of Kubernetes NodePort services which >>>>>>>>> translate (in the ovn-kubernetes case) to Load_Balancer >>>>>>>>> objects being applied to each and every node's logical gateway >>> router. >>>>>>>>> These load balancers are almost identical, the main difference >>> being >>>>>>>>> the fact that they use different VIPs (the node's IP). >>>>>>>>> >>>>>>>>> With the current OVN load balancer design, this becomes a problem >>> at >>>>>>>>> scale because the number of load balancers that must be configured >>> is >>>>>>>>> N x M (N nodes times M services). >>>>>>>>> >>>>>>>>> This series proposes a new concept in OVN: virtual network >>> component >>>>>>>>> templates. The goal of the templates is to help reduce resource >>>>>>>>> consumption in the OVN central components in specific cases like >>> the >>>>>> one >>>>>>>>> described above. >>>>>>>>> >>>>>>>>> To achieve that, the CMS will instead configure a "templated" load >>>>>>>>> balancer for every service and apply that single template record > to >>>>>>>>> the cluster-wide load balancer group. This template is then >>>>>>>>> instantiated differently on different compute nodes. This >>> translation >>>>>>>>> is controlled through per-chassis "template variables" configured >>> by >>>>>>>>> the CMS in the new NB.Template_Var table. >>>>>>>>> >>>>>>>>> A syntetic benchmark simulating what an OpenShift router (using >>> Node >>>>>>>>> Port services) scale test would do shows the following preliminary >>>>>>>>> results: >>>>>>>>> A. 120 node, 2K NodePort services: >>>>>>>>> - before: >>>>>>>>> - Southbound DB size on disk (compacted): ~385MB >>>>>>>>> - Southbound DB memory usage (RSS): ~3GB >>>>>>>>> - Southbound DB logical flows: 720K >>>>>>>>> >>>>>>>>> - after: >>>>>>>>> - Southbound DB size on disk (compacted): ~100MB >>>>>>>>> - Southbound DB memory usage (RSS): ~250MB >>>>>>>>> - Southbound DB logical flows: 6K >>>>>>>>> >>>>>>>>> B. 250 node, 2K NodePort services: >>>>>>>>> - after (didn't run the "before" test as it was taking way too >>> long): >>>>>>>>> - Southbound DB size on disk (compacted): ~155MB >>>>>>>>> - Southbound DB memory usage (RSS): ~760MB >>>>>>>>> - Southbound DB logical flows: 6K >>>>>>>>> >>>>>>>>> The series is sent as RFC because there's still the need to add >>>>>>>>> some template specific unit tests and the "ovn-nbctl lb-*" helper >>>>>>>>> utilities need to be adapted to support templated load balancers. >>>>>>>>> >>>>>>>>> With these two items addressed the code is self can likely qualify >>>>>>>>> for acceptance as a new feature in the upcoming release. >>>>>>>>> >>>>>>>>> There also exists a more extensive TODO list (also listed in the >>> commit >>>>>>>>> log of every patch in the series for now) but these are mainly > load >>>>>>>>> balancer related functionalities that are not yet implemented for >>>>>>>>> templated load balancers but can definitely be implemented as >>> follow >>>>>> ups: >>>>>>>>> - No support for LB health check if the LB is templated. >>>>>>>>> - No support for VIP ARP responder if the LB is templated. >>>>>>>>> - No support for routed VIPs if the LB is templated. >>>>>>>>> - Figure out a way to deal with templates in ovn-trace >>>>>>>>> - Determine if we need to allow Template_Var to match against >>> chassis >>>>>>>>> hostname or other IDs. >>>>>>>>> - Make ofctrl_inject_pkt() work with template_vars. >>>>>>>>> - Make test-ovn work with template_vars. >>>>>>>>> >>>>>>>>> A basic example of how to configure a templated load balancer >>> follows: >>>>>>>>> $ ovn-nbctl create load_balancer name=lb-test \ >>>>>>>>> protocol=tcp options:template=true \ >>>>>>>>> vips:\"^vip:4200\"="^backends" >>>>>>>>> >>>>>>>>> $ ovn-nbctl ls-add ls >>>>>>>>> $ ovn-nbctl ls-lb-add ls lb-test >>>>>>>>> >>>>>>>>> # Instantiate the load balancer on chassis-1 >>>>>>>>> $ ovn-nbctl create template_var name=vip value=80.80.80.1 >>>>>>>>> chassis=chassis-1 >>>>>>>>> $ ovn-nbctl create template_var name=backends >>>>>>>>> value='"42.42.42.1:1000"' chassis=chassis-1 >>>>>>>>> >>>>>>>>> # Instantiate the load balancer on chassis-2 >>>>>>>>> $ ovn-nbctl create template_var name=vip value=80.80.80.2 >>>>>>>>> chassis=chassis-2 >>>>>>>>> $ ovn-nbctl create template_var name=backends >>>>>>>>> value='"42.42.42.2:1000"' chassis=chassis-2 >>>>>>>>> >>>>>>>>> Dumitru Ceara (5): >>>>>>>>> Add NB and SB Template_Var tables. >>>>>>>>> controller: Add support for templated actions and matches. >>>>>>>>> controller: Make resource references more generic. >>>>>>>>> lb: Support using templates. >>>>>>>>> controller: Add Template_Var <- LB references. >>>>>>>>> >>>>>>>>> >>>>>>>>> controller/lflow.c | 248 >>> ++++++++++++++++++++++++-------- >>>>>>>>> controller/lflow.h | 98 +++++++------ >>>>>>>>> controller/ofctrl.c | 9 +- >>>>>>>>> controller/ofctrl.h | 3 +- >>>>>>>>> controller/ovn-controller.c | 277 >>>>>> ++++++++++++++++++++++++++++++++++-- >>>>>>>>> include/ovn/expr.h | 4 +- >>>>>>>>> include/ovn/lex.h | 14 +- >>>>>>>>> lib/actions.c | 9 +- >>>>>>>>> lib/expr.c | 14 +- >>>>>>>>> lib/lb.c | 201 ++++++++++++++++++++++---- >>>>>>>>> lib/lb.h | 36 +++-- >>>>>>>>> lib/lex.c | 55 +++++++ >>>>>>>>> northd/northd.c | 64 +++++---- >>>>>>>>> tests/ovn.at | 2 +- >>>>>>>>> tests/test-ovn.c | 16 ++- >>>>>>>>> utilities/ovn-trace.c | 26 +++- >>>>>>>>> 16 files changed, 869 insertions(+), 207 deletions(-) >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> dev mailing list >>>>>>>>> dev@openvswitch.org >>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> dev mailing list >>>>>>> dev@openvswitch.org >>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>>> >>>>> >>> >> >