[ovs-dev] northd: support HW VTEP with stateful datapath

Message ID 20210917215602.10633-1-odivlad@gmail.com
State Accepted
Series [ovs-dev] northd: support HW VTEP with stateful datapath

Checks

Context                                  Check    Description
ovsrobot/apply-robot                     success  apply and check: success
ovsrobot/github-robot-_Build_and_Test    success  github build: passed
ovsrobot/github-robot-_ovn-kubernetes    fail     github build: failed

Commit Message

Vladislav Odintsov Sept. 17, 2021, 9:56 p.m. UTC
A packet going from a HW VTEP device to a VIF port, when it arrives at
the hypervisor chassis, should go through the LS ingress pipeline to the
l2_lkp stage without any match. In the l2_lkp stage an output port is
determined, and the packet is then passed to the LS egress pipeline for
further processing and delivery to the VIF port.

Prior to this commit, a packet received from a HW VTEP device was
dropped in the LS ingress datapath when stateful services (ACLs, LBs)
were defined.

To fix this issue we add a special flag bit which can be used in LS
pipelines to check whether the packet came from a HW VTEP device.
In ls_in_pre_acl and ls_in_pre_lb we add a new flow with priority 110
to skip such packets.
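
The effect of the new priority-110 flow can be illustrated with a toy model of
a single pipeline stage. This is only a sketch under stated assumptions, not
OVN code: `struct pkt_md`, `struct flow`, `lookup()` and `pre_acl_lookup()` are
hypothetical names, and the priority-100 "send IP traffic to conntrack" entry
is a simplification of what the pre-ACL stage does. Only the bit position
(reg0[14]) and the priority (110) come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 14 of reg0 marks packets received from a vtep (RAMP) port,
 * mirroring REGBIT_FROM_RAMP = "reg0[14]" in the patch. */
#define FROM_RAMP_BIT (UINT32_C(1) << 14)

enum action { ACT_NEXT, ACT_CONNTRACK };

/* Hypothetical, heavily simplified packet metadata. */
struct pkt_md {
    uint32_t reg0;   /* Logical register 0. */
    int is_ip;       /* Nonzero for IP traffic. */
};

/* Hypothetical flow entry: priority, match predicate, action. */
struct flow {
    int priority;
    int (*match)(const struct pkt_md *);
    enum action act;
};

static int
match_from_ramp(const struct pkt_md *md)
{
    return (md->reg0 & FROM_RAMP_BIT) != 0;
}

static int
match_ip(const struct pkt_md *md)
{
    return md->is_ip;
}

static int
match_any(const struct pkt_md *md)
{
    (void) md;
    return 1;
}

/* Toy pre-ACL stage.  The priority-100 entry is an assumed stand-in for
 * the flow that sends IP traffic to conntrack. */
static const struct flow pre_acl_table[] = {
    { 110, match_from_ramp, ACT_NEXT },      /* Added by the patch. */
    { 100, match_ip,        ACT_CONNTRACK }, /* Assumed, simplified. */
    {   0, match_any,       ACT_NEXT },      /* Default. */
};

/* Returns the action of the highest-priority flow that matches 'md'. */
static enum action
lookup(const struct flow *table, size_t n, const struct pkt_md *md)
{
    assert(table != NULL && md != NULL);

    const struct flow *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (table[i].match(md)
            && (!best || table[i].priority > best->priority)) {
            best = &table[i];
        }
    }
    return best ? best->act : ACT_NEXT;
}

static enum action
pre_acl_lookup(const struct pkt_md *md)
{
    return lookup(pre_acl_table,
                  sizeof pre_acl_table / sizeof pre_acl_table[0], md);
}
```

In this model an IP packet with reg0[14] set takes the priority-110 entry and
moves on to the next table, while the same packet with the bit clear falls
through to the priority-100 entry and is sent to conntrack.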

Signed-off-by: Vladislav Odintsov <odivlad@gmail.com>
---
 northd/northd.c         | 14 ++++++++++++++
 northd/ovn-northd.8.xml | 29 +++++++++++++++++++++++++++++
 northd/ovn_northd.dl    | 33 +++++++++++++++++++++++++++++++--
 tests/ovn-northd.at     |  2 ++
 4 files changed, 76 insertions(+), 2 deletions(-)

Comments

Numan Siddique Sept. 18, 2021, 1:04 a.m. UTC | #1
On Fri, Sep 17, 2021 at 5:56 PM Vladislav Odintsov <odivlad@gmail.com> wrote:
>
> A packet going from a HW VTEP device to a VIF port, when it arrives at
> the hypervisor chassis, should go through the LS ingress pipeline to the
> l2_lkp stage without any match. In the l2_lkp stage an output port is
> determined, and the packet is then passed to the LS egress pipeline for
> further processing and delivery to the VIF port.
>
> Prior to this commit, a packet received from a HW VTEP device was
> dropped in the LS ingress datapath when stateful services (ACLs, LBs)
> were defined.
>
> To fix this issue we add a special flag bit which can be used in LS
> pipelines to check whether the packet came from a HW VTEP device.
> In ls_in_pre_acl and ls_in_pre_lb we add a new flow with priority 110
> to skip such packets.
>
> Signed-off-by: Vladislav Odintsov <odivlad@gmail.com>

Thanks.  I applied this patch to master and to the newly created
branch-21.09 (considering it as a bug fix).

I didn't backport to other branches.  Let me know if you need
backports to other branches.

I applied it with the changes below:

--------------
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 7bb39d2ab..39f4eaa0c 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -263,16 +263,14 @@
         packets that match the <code>inport</code>.
       </li>
       <li>
-        Logical flows for RAMP (controller-vtep) devices are created for each
-        physical switch. Packets came from such devices hit these flows and set
-        the 14'th bit of OVS register 0 (REG0[14]) to 1. This regbit indicates
-        that packet came from RAMP (controller-vtep) device. Later in logical
-        switch ingress pipeline this register is checked in ls_in_acl_pre and
-        ls_in_lb_pre stages whether to skip sending packet to conntrack in
-        ingress pipeline or not. Packets from RAMP devices should go though
-        ingress pipeline without any flow match till ls_in_l2_lkup stage to
-        determine output port. Stateful ACLs for coming from RAMP device
-        packets are checked within logical switch egress pipeline.
+        For logical ports of type <code>vtep</code>, the above logical flow
+        will also apply the action <code>REGBIT_FROM_RAMP = 1;</code> to
+        indicate that the packet is coming from a RAMP (controller-vtep)
+        device.  Later pipelines will use this information to skip
+        sending the packet to the conntrack.  Packets from <code>vtep</code>
+        logical ports should go though ingress pipeline only to determine
+        the output port and they should not be subjected to any ACL checks.
+        Egress pipeline will do the ACL checks.
       </li>
     </ul>

@@ -467,10 +465,11 @@

     <p>
       This table has a priority-110 flow with the match
-      <code>reg0[14] == 1</code> for all logical switch datapaths to resubmit
-      traffic to the next table. <code>reg0[14]</code> is the register bit,
-      which indicates that packet was received from RAMP device. Packets from
-      RAMP device are handled by ACLs only in Logical Switch egress pipeline.
+      <code>REGBIT_FROM_RAMP == 1</code> for all logical switch datapaths to
+      resubmit traffic to the next table. <code>REGBIT_FROM_RAMP</code>
+      indicates that packet was received from <code>vtep</code> logical ports
+      and it can be skipped from the stateful ACL processing in the ingress
+      pipeline.
     </p>

     <p>
@@ -534,11 +533,11 @@

     <p>
       This table has a priority-110 flow with the match
-      <code>reg0[14] == 1</code> for all logical switch datapaths to resubmit
-      traffic to the next table. <code>reg0[14]</code> is the register bit,
-      which indicates that packet was received from RAMP device. Packets from
-      RAMP device could be handled by load balancing flows only in Logical
-      Switch egress pipeline.
+      <code>REGBIT_FROM_RAMP == 1</code> for all logical switch datapaths to
+      resubmit traffic to the next table. <code>REGBIT_FROM_RAMP</code>
+      indicates that packet was received from <code>vtep</code> logical ports
+      and it can be skipped from the load balancer processing in the ingress
+      pipeline.
     </p>

     <p>
--------------------
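
For readers who want to see what the resulting port-security action string
looks like, here is a minimal standalone sketch. It assumes plain `snprintf`
instead of the `ds_put_format` dynamic-string helpers used in northd.c, and
`build_port_sec_actions()` is a hypothetical name; only the `"vtep"` type
check, the `REGBIT_FROM_RAMP` definition and the action order (set_queue,
then the register bit, then next) are taken from the C part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REGBIT_FROM_RAMP "reg0[14]"

/* Builds the action string for a logical port's port-security flow into
 * 'buf' (hypothetical helper, not part of OVN).
 *
 * 'port_type' is the logical switch port type, e.g. "" or "vtep".
 * 'queue_id' may be NULL when no QoS queue is configured.
 *
 * Returns what snprintf() returns: the length the full string needs. */
static int
build_port_sec_actions(char *buf, size_t size,
                       const char *port_type, const char *queue_id)
{
    assert(buf != NULL && port_type != NULL);

    int is_vtep = !strcmp(port_type, "vtep");

    return snprintf(buf, size, "%s%s%s%snext;",
                    queue_id ? "set_queue(" : "",
                    queue_id ? queue_id : "",
                    queue_id ? "); " : "",
                    is_vtep ? REGBIT_FROM_RAMP " = 1; " : "");
}
```

Calling it with a port type of `"vtep"` and no queue yields
`reg0[14] = 1; next;`, so every packet entering through such a port carries
the flag into the later pre-ACL and pre-LB stages.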

Numan

> ---
>  northd/northd.c         | 14 ++++++++++++++
>  northd/ovn-northd.8.xml | 29 +++++++++++++++++++++++++++++
>  northd/ovn_northd.dl    | 33 +++++++++++++++++++++++++++++++--
>  tests/ovn-northd.at     |  2 ++
>  4 files changed, 76 insertions(+), 2 deletions(-)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index 688a6e4ef..1b84874a7 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -196,6 +196,7 @@ enum ovn_stage {
>  #define REGBIT_LKUP_FDB           "reg0[11]"
>  #define REGBIT_HAIRPIN_REPLY      "reg0[12]"
>  #define REGBIT_ACL_LABEL          "reg0[13]"
> +#define REGBIT_FROM_RAMP          "reg0[14]"
>
>  #define REG_ORIG_DIP_IPV4         "reg1"
>  #define REG_ORIG_DIP_IPV6         "xxreg1"
> @@ -5112,6 +5113,11 @@ build_lswitch_input_port_sec_op(
>      if (queue_id) {
>          ds_put_format(actions, "set_queue(%s); ", queue_id);
>      }
> +
> +    if (!strcmp(op->nbsp->type, "vtep")) {
> +        ds_put_format(actions, REGBIT_FROM_RAMP" = 1; ");
> +    }
> +
>      ds_put_cstr(actions, "next;");
>      ovn_lflow_add_with_lport_and_hint(lflows, op->od, S_SWITCH_IN_PORT_SEC_L2,
>                                        50, ds_cstr(match), ds_cstr(actions),
> @@ -5359,6 +5365,10 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *port_groups,
>                        "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
>                        "(udp && udp.src == 546 && udp.dst == 547)", "next;");
>
> +        /* Do not send coming from RAMP switch packets to conntrack. */
> +        ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110,
> +                      REGBIT_FROM_RAMP" == 1", "next;");
> +
>          /* Ingress and Egress Pre-ACL Table (Priority 100).
>           *
>           * Regardless of whether the ACL is "from-lport" or "to-lport",
> @@ -5463,6 +5473,10 @@ build_pre_lb(struct ovn_datapath *od, struct hmap *lflows,
>      ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110,
>                    "eth.src == $svc_monitor_mac", "next;");
>
> +    /* Do not send coming from RAMP switch packets to conntrack. */
> +    ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110,
> +                  REGBIT_FROM_RAMP" == 1", "next;");
> +
>      /* Allow all packets to go to next tables by default. */
>      ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 0, "1", "next;");
>      ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 0, "1", "next;");
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index eebf0d717..7bb39d2ab 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -262,6 +262,18 @@
>          logical ports on which port security is not enabled, these advance all
>          packets that match the <code>inport</code>.
>        </li>
> +      <li>
> +        Logical flows for RAMP (controller-vtep) devices are created for each
> +        physical switch. Packets came from such devices hit these flows and set
> +        the 14'th bit of OVS register 0 (REG0[14]) to 1. This regbit indicates
> +        that packet came from RAMP (controller-vtep) device. Later in logical
> +        switch ingress pipeline this register is checked in ls_in_acl_pre and
> +        ls_in_lb_pre stages whether to skip sending packet to conntrack in
> +        ingress pipeline or not. Packets from RAMP devices should go though
> +        ingress pipeline without any flow match till ls_in_l2_lkup stage to
> +        determine output port. Stateful ACLs for coming from RAMP device
> +        packets are checked within logical switch egress pipeline.
> +      </li>
>      </ul>
>
>      <p>
> @@ -453,6 +465,14 @@
>        processing.
>      </p>
>
> +    <p>
> +      This table has a priority-110 flow with the match
> +      <code>reg0[14] == 1</code> for all logical switch datapaths to resubmit
> +      traffic to the next table. <code>reg0[14]</code> is the register bit,
> +      which indicates that packet was received from RAMP device. Packets from
> +      RAMP device are handled by ACLs only in Logical Switch egress pipeline.
> +    </p>
> +
>      <p>
>        This table also has a priority-110 flow with the match
>        <code>eth.dst == <var>E</var></code> for all logical switch
> @@ -512,6 +532,15 @@
>        configured. We can now add a lflow to drop ct.inv packets.
>      </p>
>
> +    <p>
> +      This table has a priority-110 flow with the match
> +      <code>reg0[14] == 1</code> for all logical switch datapaths to resubmit
> +      traffic to the next table. <code>reg0[14]</code> is the register bit,
> +      which indicates that packet was received from RAMP device. Packets from
> +      RAMP device could be handled by load balancing flows only in Logical
> +      Switch egress pipeline.
> +    </p>
> +
>      <p>
>        This table also has a priority-110 flow with the match
>        <code>eth.dst == <var>E</var></code> for all logical switch
> diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
> index 669728497..0202af5dc 100644
> --- a/northd/ovn_northd.dl
> +++ b/northd/ovn_northd.dl
> @@ -1631,6 +1631,7 @@ function rEGBIT_ACL_HINT_BLOCK()   : istring = i"reg0[10]"
>  function rEGBIT_LKUP_FDB()         : istring = i"reg0[11]"
>  function rEGBIT_HAIRPIN_REPLY()    : istring = i"reg0[12]"
>  function rEGBIT_ACL_LABEL()        : istring = i"reg0[13]"
> +function rEGBIT_FROM_RAMP()        : istring = i"reg0[14]"
>
>  function rEG_ORIG_DIP_IPV4()       : istring = i"reg1"
>  function rEG_ORIG_DIP_IPV6()       : istring = i"xxreg1"
> @@ -2070,6 +2071,16 @@ for (&Switch(._uuid = ls_uuid, .has_stateful_acl = true)) {
>           .io_port          = None,
>           .controller_meter = None);
>
> +    /* Do not send coming from RAMP switch packets to conntrack. */
> +    Flow(.logical_datapath = ls_uuid,
> +         .stage            = s_SWITCH_IN_PRE_ACL(),
> +         .priority         = 110,
> +         .__match          = i"${rEGBIT_FROM_RAMP()} == 1",
> +         .actions          = i"next;",
> +         .stage_hint       = 0,
> +         .io_port          = None,
> +         .controller_meter = None);
> +
>      /* Ingress and Egress Pre-ACL Table (Priority 100).
>       *
>       * Regardless of whether the ACL is "from-lport" or "to-lport",
> @@ -2136,6 +2147,16 @@ for (&Switch(._uuid = ls_uuid)) {
>           .io_port          = None,
>           .controller_meter = None);
>
> +    /* Do not send coming from RAMP switch packets to conntrack. */
> +    Flow(.logical_datapath = ls_uuid,
> +         .stage            = s_SWITCH_IN_PRE_LB(),
> +         .priority         = 110,
> +         .__match          = i"${rEGBIT_FROM_RAMP()} == 1",
> +         .actions          = i"next;",
> +         .stage_hint       = 0,
> +         .io_port          = None,
> +         .controller_meter = None);
> +
>      /* Allow all packets to go to next tables by default. */
>      Flow(.logical_datapath = ls_uuid,
>           .stage            = s_SWITCH_IN_PRE_LB(),
> @@ -3361,10 +3382,18 @@ for (&SwitchPort(.lsp = lsp, .sw = sw, .json_name = json_name, .ps_eth_addresses
>              } else {
>                  i"inport == ${json_name} && eth.src == {${ps_eth_addresses.join(\" \")}}"
>              } in
> -        var actions = match (pbinding.options.get(i"qdisc_queue_id")) {
> +        var actions = {
> +            var ramp = if (lsp.__type == i"vtep") {
> +                i"${rEGBIT_FROM_RAMP()} = 1; "
> +            } else {
> +                i""
> +            };
> +            var queue = match (pbinding.options.get(i"qdisc_queue_id")) {
>                  None -> i"next;",
>                  Some{id} -> i"set_queue(${id}); next;"
> -            } in
> +            };
> +            i"${ramp}${queue}"
> +        } in
>          Flow(.logical_datapath = sw._uuid,
>               .stage            = s_SWITCH_IN_PORT_SEC_L2(),
>               .priority         = 50,
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 2af3f2096..5de554455 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -3597,6 +3597,7 @@ check_stateful_flows() {
>    table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
>    table=6 (ls_in_pre_lb       ), priority=110  , match=(ip && inport == "sw0-lr0"), action=(next;)
>    table=6 (ls_in_pre_lb       ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
> +  table=6 (ls_in_pre_lb       ), priority=110  , match=(reg0[[14]] == 1), action=(next;)
>  ])
>
>      AT_CHECK([grep "ls_in_pre_stateful" sw0flows | sort], [0], [dnl
> @@ -3660,6 +3661,7 @@ AT_CHECK([grep "ls_in_pre_lb" sw0flows | sort], [0], [dnl
>    table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
>    table=6 (ls_in_pre_lb       ), priority=110  , match=(ip && inport == "sw0-lr0"), action=(next;)
>    table=6 (ls_in_pre_lb       ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
> +  table=6 (ls_in_pre_lb       ), priority=110  , match=(reg0[[14]] == 1), action=(next;)
>  ])
>
>  AT_CHECK([grep "ls_in_pre_stateful" sw0flows | sort], [0], [dnl
> --
> 2.30.0
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Vladislav Odintsov Sept. 18, 2021, 12:51 p.m. UTC | #2
Hi Numan,

Thanks. I’m okay with your changes.
I recently saw a report about this problem with RAMP/VTEP on the list,
so since it’s a bug fix, I think it would be great to backport it to the older branches.

Although there are a lot of conflicts with older branches, I’ve submitted the backport for 21.06 here:
https://patchwork.ozlabs.org/project/ovn/patch/20210918125121.8257-1-odivlad@gmail.com/

21.03 and older branches have more non-trivial conflicts, so backporting there should be done more carefully.
Anyone who needs that can try to do it on their own.

Regards,
Vladislav Odintsov


Patch

diff --git a/northd/northd.c b/northd/northd.c
index 688a6e4ef..1b84874a7 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -196,6 +196,7 @@  enum ovn_stage {
 #define REGBIT_LKUP_FDB           "reg0[11]"
 #define REGBIT_HAIRPIN_REPLY      "reg0[12]"
 #define REGBIT_ACL_LABEL          "reg0[13]"
+#define REGBIT_FROM_RAMP          "reg0[14]"
 
 #define REG_ORIG_DIP_IPV4         "reg1"
 #define REG_ORIG_DIP_IPV6         "xxreg1"
@@ -5112,6 +5113,11 @@  build_lswitch_input_port_sec_op(
     if (queue_id) {
         ds_put_format(actions, "set_queue(%s); ", queue_id);
     }
+
+    if (!strcmp(op->nbsp->type, "vtep")) {
+        ds_put_format(actions, REGBIT_FROM_RAMP" = 1; ");
+    }
+
     ds_put_cstr(actions, "next;");
     ovn_lflow_add_with_lport_and_hint(lflows, op->od, S_SWITCH_IN_PORT_SEC_L2,
                                       50, ds_cstr(match), ds_cstr(actions),
@@ -5359,6 +5365,10 @@  build_pre_acls(struct ovn_datapath *od, struct hmap *port_groups,
                       "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
                       "(udp && udp.src == 546 && udp.dst == 547)", "next;");
 
+        /* Do not send coming from RAMP switch packets to conntrack. */
+        ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110,
+                      REGBIT_FROM_RAMP" == 1", "next;");
+
         /* Ingress and Egress Pre-ACL Table (Priority 100).
          *
          * Regardless of whether the ACL is "from-lport" or "to-lport",
@@ -5463,6 +5473,10 @@  build_pre_lb(struct ovn_datapath *od, struct hmap *lflows,
     ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110,
                   "eth.src == $svc_monitor_mac", "next;");
 
+    /* Do not send coming from RAMP switch packets to conntrack. */
+    ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110,
+                  REGBIT_FROM_RAMP" == 1", "next;");
+
     /* Allow all packets to go to next tables by default. */
     ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 0, "1", "next;");
     ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 0, "1", "next;");
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index eebf0d717..7bb39d2ab 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -262,6 +262,18 @@ 
         logical ports on which port security is not enabled, these advance all
         packets that match the <code>inport</code>.
       </li>
+      <li>
+        Logical flows for RAMP (controller-vtep) devices are created for each
+        physical switch. Packets coming from such devices hit these flows,
+        which set bit 14 of OVS register 0 (<code>reg0[14]</code>) to 1 to
+        mark the packet as received from a RAMP (controller-vtep) device.
+        Later in the logical switch ingress pipeline, the ls_in_pre_acl and
+        ls_in_pre_lb stages check this bit and skip sending such packets to
+        conntrack. Packets from RAMP devices should go through the ingress
+        pipeline without any flow match until the ls_in_l2_lkup stage, which
+        determines the output port. Stateful ACLs for packets coming from
+        RAMP devices are checked in the logical switch egress pipeline.
+      </li>
     </ul>
 
     <p>
@@ -453,6 +465,14 @@ 
       processing.
     </p>
 
+    <p>
+      This table has a priority-110 flow matching <code>reg0[14] == 1</code>
+      for all logical switch datapaths to resubmit traffic to the next table.
+      <code>reg0[14]</code> is the register bit that indicates the packet was
+      received from a RAMP device. Packets from RAMP devices are handled by
+      ACLs only in the logical switch egress pipeline.
+    </p>
+
     <p>
       This table also has a priority-110 flow with the match
       <code>eth.dst == <var>E</var></code> for all logical switch
@@ -512,6 +532,15 @@ 
       configured. We can now add a lflow to drop ct.inv packets.
     </p>
 
+    <p>
+      This table has a priority-110 flow matching <code>reg0[14] == 1</code>
+      for all logical switch datapaths to resubmit traffic to the next table.
+      <code>reg0[14]</code> is the register bit that indicates the packet was
+      received from a RAMP device. Packets from RAMP devices can be handled
+      by load balancing flows only in the logical switch egress
+      pipeline.
+    </p>
+
     <p>
       This table also has a priority-110 flow with the match
       <code>eth.dst == <var>E</var></code> for all logical switch
diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
index 669728497..0202af5dc 100644
--- a/northd/ovn_northd.dl
+++ b/northd/ovn_northd.dl
@@ -1631,6 +1631,7 @@  function rEGBIT_ACL_HINT_BLOCK()   : istring = i"reg0[10]"
 function rEGBIT_LKUP_FDB()         : istring = i"reg0[11]"
 function rEGBIT_HAIRPIN_REPLY()    : istring = i"reg0[12]"
 function rEGBIT_ACL_LABEL()        : istring = i"reg0[13]"
+function rEGBIT_FROM_RAMP()        : istring = i"reg0[14]"
 
 function rEG_ORIG_DIP_IPV4()       : istring = i"reg1"
 function rEG_ORIG_DIP_IPV6()       : istring = i"xxreg1"
@@ -2070,6 +2071,16 @@  for (&Switch(._uuid = ls_uuid, .has_stateful_acl = true)) {
          .io_port          = None,
          .controller_meter = None);
 
+    /* Do not send packets coming from a RAMP switch to conntrack. */
+    Flow(.logical_datapath = ls_uuid,
+         .stage            = s_SWITCH_IN_PRE_ACL(),
+         .priority         = 110,
+         .__match          = i"${rEGBIT_FROM_RAMP()} == 1",
+         .actions          = i"next;",
+         .stage_hint       = 0,
+         .io_port          = None,
+         .controller_meter = None);
+
     /* Ingress and Egress Pre-ACL Table (Priority 100).
      *
      * Regardless of whether the ACL is "from-lport" or "to-lport",
@@ -2136,6 +2147,16 @@  for (&Switch(._uuid = ls_uuid)) {
          .io_port          = None,
          .controller_meter = None);
 
+    /* Do not send packets coming from a RAMP switch to conntrack. */
+    Flow(.logical_datapath = ls_uuid,
+         .stage            = s_SWITCH_IN_PRE_LB(),
+         .priority         = 110,
+         .__match          = i"${rEGBIT_FROM_RAMP()} == 1",
+         .actions          = i"next;",
+         .stage_hint       = 0,
+         .io_port          = None,
+         .controller_meter = None);
+
     /* Allow all packets to go to next tables by default. */
     Flow(.logical_datapath = ls_uuid,
          .stage            = s_SWITCH_IN_PRE_LB(),
@@ -3361,10 +3382,18 @@  for (&SwitchPort(.lsp = lsp, .sw = sw, .json_name = json_name, .ps_eth_addresses
             } else {
                 i"inport == ${json_name} && eth.src == {${ps_eth_addresses.join(\" \")}}"
             } in
-        var actions = match (pbinding.options.get(i"qdisc_queue_id")) {
+        var actions = {
+            var ramp = if (lsp.__type == i"vtep") {
+                i"${rEGBIT_FROM_RAMP()} = 1; "
+            } else {
+                i""
+            };
+            var queue = match (pbinding.options.get(i"qdisc_queue_id")) {
                 None -> i"next;",
                 Some{id} -> i"set_queue(${id}); next;"
-            } in
+            };
+            i"${ramp}${queue}"
+        } in
         Flow(.logical_datapath = sw._uuid,
              .stage            = s_SWITCH_IN_PORT_SEC_L2(),
              .priority         = 50,
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 2af3f2096..5de554455 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -3597,6 +3597,7 @@  check_stateful_flows() {
   table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
   table=6 (ls_in_pre_lb       ), priority=110  , match=(ip && inport == "sw0-lr0"), action=(next;)
   table=6 (ls_in_pre_lb       ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
+  table=6 (ls_in_pre_lb       ), priority=110  , match=(reg0[[14]] == 1), action=(next;)
 ])
 
     AT_CHECK([grep "ls_in_pre_stateful" sw0flows | sort], [0], [dnl
@@ -3660,6 +3661,7 @@  AT_CHECK([grep "ls_in_pre_lb" sw0flows | sort], [0], [dnl
   table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
   table=6 (ls_in_pre_lb       ), priority=110  , match=(ip && inport == "sw0-lr0"), action=(next;)
   table=6 (ls_in_pre_lb       ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
+  table=6 (ls_in_pre_lb       ), priority=110  , match=(reg0[[14]] == 1), action=(next;)
 ])
 
 AT_CHECK([grep "ls_in_pre_stateful" sw0flows | sort], [0], [dnl