diff mbox series

[ovs-dev] ovn: Fix IPv6 DAD failure for child ports

Message ID 20180917092410.7792-1-nusiddiq@redhat.com
State Changes Requested
Delegated to: Guru Shetty
Headers show
Series [ovs-dev] ovn: Fix IPv6 DAD failure for child ports | expand

Commit Message

Numan Siddique Sept. 17, 2018, 9:24 a.m. UTC
From: Numan Siddique <nusiddiq@redhat.com>

When a child vlan interface is created inside a VM, the below kernel message
is seen and IPv6 doesn't work on that interface.

[  138.000753] IPv6: vlan4: IPv6 duplicate address <IPv6 LLA> detected!

When a child port sends a broadcast packet, OVN delivers the same
packet back to the child port (and hence the DAD check fails).
This is because 'MLF_ALLOW_LOOPBACK_BIT' is set in REG10 in table 0 when
the packet is received from any child port and table 'OFTABLE_CHECK_LOOPBACK'
doesn't drop the packet.

This patch fixes the issue by using a new register bit (MLF_NESTED_CONTAINER_BIT)
instead of 'MLF_ALLOW_LOOPBACK_BIT' and sets it in REG10 for the packets received
from child ports.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
---
 ovn/controller/ofctrl.c   | 29 ++++++++++++++++++++---------
 ovn/controller/ofctrl.h   |  5 +++++
 ovn/controller/physical.c | 38 ++++++++++++++++++++++++++++++++------
 ovn/lib/logical-fields.h  |  4 ++++
 tests/ovn.at              | 11 +++++++++++
 5 files changed, 72 insertions(+), 15 deletions(-)

Comments

Ben Pfaff Sept. 18, 2018, 8:25 a.m. UTC | #1
On Mon, Sep 17, 2018 at 02:54:10PM +0530, nusiddiq@redhat.com wrote:
> From: Numan Siddique <nusiddiq@redhat.com>
> 
> When a child vlan interface is created inside a VM, the below kernel message
> is seen and IPv6 doesn't work on that interface.
> 
> [  138.000753] IPv6: vlan4: IPv6 duplicate address <IPv6 LLA> detected!
> 
> When a child port sends a broadcast packet, OVN delivers the same
> packet back to the child port (and hence the DAD check fails).
> This is because 'MLF_ALLOW_LOOPBACK_BIT' is set in REG10 in table 0 when
> the packet is received from any child port and table 'OFTABLE_CHECK_LOOPBACK'
> doesn't drop the packet.
> 
> This patch fixes the issue by using a new register bit (MLF_NESTED_CONTAINER_BIT)
> instead of 'MLF_ALLOW_LOOPBACK_BIT' and sets it in REG10 for the packets received
> from child ports.
> 
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>

I'd appreciate a review on this from someone who better understands the
container design and implementation.  Guru, are you a good person to
review it?

Thanks,

Ben.
Gurucharan Shetty Sept. 20, 2018, 2:47 p.m. UTC | #2
On Tue, 18 Sep 2018 at 01:25, Ben Pfaff <blp@ovn.org> wrote:

> On Mon, Sep 17, 2018 at 02:54:10PM +0530, nusiddiq@redhat.com wrote:
> > From: Numan Siddique <nusiddiq@redhat.com>
> >
> > When a child vlan interface is created inside a VM, the below kernel
> message
> > is seen and IPv6 doesn't work on that interface.
> >
> > [  138.000753] IPv6: vlan4: IPv6 duplicate address <IPv6 LLA> detected!
> >
> > When a child port sends a broadcast packet, OVN delivers the same
> > packet back to the child port (and hence the DAD check fails).
> > This is because 'MLF_ALLOW_LOOPBACK_BIT' is set in REG10 in table 0 when
> > the packet is received from any child port and table
> 'OFTABLE_CHECK_LOOPBACK'
> > doesn't drop the packet.
> >
> > This patch fixes the issue by using a new register bit
> (MLF_NESTED_CONTAINER_BIT)
> > instead of 'MLF_ALLOW_LOOPBACK_BIT' and sets it in REG10 for the packets
> received
> > from child ports.
> >
> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>
> I'd appreciate a review on this from someone who better understands the
> container design and implementation.  Guru, are you a good person to
> review it?
>
I will take a look.


>
> Thanks,
>
> Ben.
>
Gurucharan Shetty Oct. 4, 2018, 6:39 p.m. UTC | #3
On Mon, 17 Sep 2018 at 02:24, <nusiddiq@redhat.com> wrote:

> From: Numan Siddique <nusiddiq@redhat.com>
>
> When a child vlan interface is created inside a VM, the below kernel
> message
> is seen and IPv6 doesn't work on that interface.
>

On which interface doesn't IPv6 work? On the Vm's interface or on the
container's interface?


>
> [  138.000753] IPv6: vlan4: IPv6 duplicate address <IPv6 LLA> detected!
>
> When a child port sends a broadcast packet, OVN delivers the same
> packet back to the child port (and hence the DAD check fails).
>

Is this limited to IP broadcast only? Or does this happen with mac
broadcast too? (I am surprised why I hadn't seen this behavior with ARP
packets of containers).


This is because 'MLF_ALLOW_LOOPBACK_BIT' is set in REG10 in table 0 when
> the packet is received from any child port and table
> 'OFTABLE_CHECK_LOOPBACK'
> doesn't drop the packet.
>

I had thought that the previous tables only replicate packets to ports
other than the inport. But I may be wrong.
Some of my comments further down could be stupid because I may have not
understood the problem well.


>
> This patch fixes the issue by using a new register bit
> (MLF_NESTED_CONTAINER_BIT)
> instead of 'MLF_ALLOW_LOOPBACK_BIT' and sets it in REG10 for the packets
> received
> from child ports.
>
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> ---
>  ovn/controller/ofctrl.c   | 29 ++++++++++++++++++++---------
>  ovn/controller/ofctrl.h   |  5 +++++
>  ovn/controller/physical.c | 38 ++++++++++++++++++++++++++++++++------
>  ovn/lib/logical-fields.h  |  4 ++++
>  tests/ovn.at              | 11 +++++++++++
>  5 files changed, 72 insertions(+), 15 deletions(-)
>
> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
> index 96c57f143..2f2b185ae 100644
> --- a/ovn/controller/ofctrl.c
> +++ b/ovn/controller/ofctrl.c
> @@ -630,9 +630,11 @@ ofctrl_recv(const struct ofp_header *oh, enum ofptype
> type)
>   *
>   * The caller should initialize its own hmap to hold the flows. */
>  void
> -ofctrl_add_flow(struct hmap *desired_flows,
> -                uint8_t table_id, uint16_t priority, uint64_t cookie,
> -                const struct match *match, const struct ofpbuf *actions)
> +ofctrl_check_and_add_flow(struct hmap *desired_flows,
> +                          uint8_t table_id, uint16_t priority, uint64_t
> cookie,
> +                          const struct match *match,
> +                          const struct ofpbuf *actions,
> +                          bool log_duplicate_flow)
>  {
>      struct ovn_flow *f = xmalloc(sizeof *f);
>      f->table_id = table_id;
> @@ -644,13 +646,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>      f->cookie = cookie;
>
>      if (ovn_flow_lookup(desired_flows, f)) {
> -        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
> -        if (!VLOG_DROP_DBG(&rl)) {
> -            char *s = ovn_flow_to_string(f);
> -            VLOG_DBG("dropping duplicate flow: %s", s);
> -            free(s);
> +        if (log_duplicate_flow) {
> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
> +            if (!VLOG_DROP_DBG(&rl)) {
> +                char *s = ovn_flow_to_string(f);
> +                VLOG_DBG("dropping duplicate flow: %s", s);
> +                free(s);
> +            }
>          }
> -
>          ovn_flow_destroy(f);
>          return;
>      }
> @@ -658,6 +661,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>      hmap_insert(desired_flows, &f->hmap_node, f->hmap_node.hash);
>  }
>
> +void
> +ofctrl_add_flow(struct hmap *desired_flows,
> +                uint8_t table_id, uint16_t priority, uint64_t cookie,
> +                const struct match *match, const struct ofpbuf *actions)
> +{
> +    ofctrl_check_and_add_flow(desired_flows, table_id, priority, cookie,
> +                              match, actions, true);
> +}
>
>  /* ovn_flow. */
>
> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
> index e3fc95b05..ae0cfa513 100644
> --- a/ovn/controller/ofctrl.h
> +++ b/ovn/controller/ofctrl.h
> @@ -55,4 +55,9 @@ void ofctrl_add_flow(struct hmap *desired_flows, uint8_t
> table_id,
>                       uint16_t priority, uint64_t cookie,
>                       const struct match *, const struct ofpbuf *ofpacts);
>
> +void ofctrl_check_and_add_flow(struct hmap *desired_flows, uint8_t
> table_id,
> +                               uint16_t priority, uint64_t cookie,
> +                               const struct match *,
> +                               const struct ofpbuf *ofpacts,
> +                               bool log_duplicate_flow);
>  #endif /* ovn/ofctrl.h */
> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index c38d7b09f..d781ae459 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -189,7 +189,8 @@ get_zone_ids(const struct sbrec_port_binding *binding,
>
>  static void
>  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
> -                       bool nested_container, const struct zone_ids
> *zone_ids,
> +                       uint32_t parent_port_key,
> +                       const struct zone_ids *zone_ids,
>                         struct ofpbuf *ofpacts_p, struct hmap *flow_table)
>  {
>      struct match match;
> @@ -253,7 +254,7 @@ put_local_common_flows(uint32_t dp_key, uint32_t
> port_key,
>      ofpbuf_clear(ofpacts_p);
>      match_set_metadata(&match, htonll(dp_key));
>      match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
> -    if (!nested_container) {
> +    if (!parent_port_key) {
>          match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>                               MLF_ALLOW_LOOPBACK, MLF_ALLOW_LOOPBACK);
>      }
>
I think before your patch, the above code would always save inport for
nested containers. They continue to do so now. I hope that was your
intention?



> @@ -264,6 +265,25 @@ put_local_common_flows(uint32_t dp_key, uint32_t
> port_key,
>      put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>      ofctrl_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100, 0,
>

                   &match, ofpacts_p);
>
+
>
Can you add a comment here on what we are doing below?

When a packet is headed to a nested container, the outport in OVN should be
that of the destination nested container.
But here it looks like we are matching on the OVN outport being the parent
port and saying that if the outport is parent port, we should save the
physical inport.
Is that what we are doing? I don't quite understand how this helps to
prevent packet from going back to the originator child port. I am sure I am
missing something simple here.

+    if (parent_port_key) {
> +        match_init_catchall(&match);
> +        ofpbuf_clear(ofpacts_p);
> +        match_set_metadata(&match, htonll(dp_key));
> +        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0,
> parent_port_key);
> +        match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
> +                             MLF_NESTED_CONTAINER, MLF_NESTED_CONTAINER);
> +
> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_p));
> +        put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p);
> +        put_resubmit(OFTABLE_LOG_TO_PHY, ofpacts_p);
> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
> +        /* When a parent port has multiple child ports, we don't want to
> +         * log the duplicate flow message.
> +         */
> +        ofctrl_check_and_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100, 0,
> +                                  &match, ofpacts_p, false);
> +    }
>  }
>
>  static void
> @@ -328,7 +348,7 @@ consider_port_binding(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
>          }
>
>          struct zone_ids binding_zones = get_zone_ids(binding, ct_zones);
> -        put_local_common_flows(dp_key, port_key, false, &binding_zones,
> +        put_local_common_flows(dp_key, port_key, 0, &binding_zones,
>                                 ofpacts_p, flow_table);
>
>          match_init_catchall(&match);
> @@ -452,6 +472,7 @@ consider_port_binding(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
>
>      int tag = 0;
>      bool nested_container = false;
> +    const struct sbrec_port_binding *parent_port = NULL;
>      ofp_port_t ofport;
>      bool is_remote = false;
>      if (binding->parent_port && *binding->parent_port) {
> @@ -463,6 +484,8 @@ consider_port_binding(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
>          if (ofport) {
>              tag = *binding->tag;
>              nested_container = true;
> +            parent_port = lport_lookup_by_name(
> +                sbrec_port_binding_by_name, binding->parent_port);
>          }
>      } else {
>          ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
> @@ -523,7 +546,8 @@ consider_port_binding(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
>           */
>
>          struct zone_ids zone_ids = get_zone_ids(binding, ct_zones);
> -        put_local_common_flows(dp_key, port_key, nested_container,
> &zone_ids,
> +        uint32_t parent_port_key = parent_port ? parent_port->tunnel_key
> : 0;
> +        put_local_common_flows(dp_key, port_key, parent_port_key,
> &zone_ids,
>                                 ofpacts_p, flow_table);
>
>          /* Table 0, Priority 150 and 100.
> @@ -553,8 +577,10 @@ consider_port_binding(struct ovsdb_idl_index
> *sbrec_chassis_by_name,
>              if (nested_container) {
>                  /* When a packet comes from a container sitting behind a
>                   * parent_port, we should let it loopback to other
> containers
> -                 * or the parent_port itself. */
> -                put_load(MLF_ALLOW_LOOPBACK, MFF_LOG_FLAGS, 0, 1,
> ofpacts_p);
> +                 * or the parent_port itself. Indicate this by setting the
> +                 * MLF_NESTED_CONTAINER_BIT in MFF_LOG_FLAGS.*/
> +                put_load(1, MFF_LOG_FLAGS, MLF_NESTED_CONTAINER_BIT, 1,
> +                         ofpacts_p);
>              }
>              ofpact_put_STRIP_VLAN(ofpacts_p);
>          }
> diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h
> index b1dbb035a..ab31327e9 100644
> --- a/ovn/lib/logical-fields.h
> +++ b/ovn/lib/logical-fields.h
> @@ -50,6 +50,7 @@ enum mff_log_flags_bits {
>      MLF_FORCE_SNAT_FOR_DNAT_BIT = 2,
>      MLF_FORCE_SNAT_FOR_LB_BIT = 3,
>      MLF_LOCAL_ONLY_BIT = 4,
> +    MLF_NESTED_CONTAINER_BIT = 5,
>  };
>
>  /* MFF_LOG_FLAGS_REG flag assignments */
> @@ -75,6 +76,9 @@ enum mff_log_flags {
>       * hypervisors should instead only be output to local targets
>       */
>      MLF_LOCAL_ONLY = (1 << MLF_LOCAL_ONLY_BIT),
> +
> +    /* Indicate that a packet has received from a nested container. */
> +    MLF_NESTED_CONTAINER = (1 << MLF_NESTED_CONTAINER_BIT),
>  };
>
>  #endif /* ovn/lib/logical-fields.h */
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 769e09f81..a7aba5ccd 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -7038,6 +7038,17 @@
> packet=${dst_mac}${src_mac}8100000208004500001c000000003f110100${src_ip}${dst_ip
>  echo  $packet >> expected1
>  OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>
> +# Send broadcast packet from foo1. foo1 should not receive the same
> packet.
> +src_mac="f00000010205"
> +dst_mac="ffffffffffff"
> +src_ip=`ip_to_hex 192 168 1 2`
> +dst_ip=`ip_to_hex 255 255 255 255`
>
> +packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
> +as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
> +
> +# expected packet at VM1
> +OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
> +
>  OVN_CLEANUP([hv1],[hv2])
>
>  AT_CLEANUP
> --
> 2.17.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Numan Siddique Oct. 4, 2018, 8:28 p.m. UTC | #4
Thanks for the review Guru.

Please see below for the comments.



On Fri, Oct 5, 2018 at 12:09 AM Guru Shetty <guru@ovn.org> wrote:

>
>
> On Mon, 17 Sep 2018 at 02:24, <nusiddiq@redhat.com> wrote:
>
>> From: Numan Siddique <nusiddiq@redhat.com>
>>
>> When a child vlan interface is created inside a VM, the below kernel
>> message
>> is seen and IPv6 doesn't work on that interface.
>>
>
> On which interface doesn't IPv6 work? On the Vm's interface or on the
> container's interface?
>
>

It is seen on the container's interface.

>
>> [  138.000753] IPv6: vlan4: IPv6 duplicate address <IPv6 LLA> detected!
>>
>> When a child port sends a broadcast packet, OVN delivers the same
>> packet back to the child port (and hence the DAD check fails).
>>
>
> Is this limited to IP broadcast only? Or does this happen with mac
> broadcast too? (I am surprised why I hadn't seen this behavior with ARP
> packets of containers).
>

It happens with mac broadcast. You can actually reproduce the issue by
applying the code in tests/ovn.at without this patch.

Looks like I have confused you with the patch.  Let me try to give an
example. Suppose we have switch - sw0 - with ports p1, p2 and p3 and all of
them reside in the same chassis.
When p1 sends a broadcast/multicast packet, the below logical flow is hit

table=17(ls_in_l2_lkup      ), priority=100  , match=(eth.mcast),
action=(outport = "_MC_flood"; output;)

This translates to the OF flow
- metadata=0x1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00
actions=load:0xffff->NXM_NX_REG15[],resubmit(,32)

In the table 33 if reg15 has 0xffff, it will resubmit the flow with reg15
set to each logical port

i.e reg15=0xffff,metadata=0x1
actions=load:0x1->NXM_NX_REG15[],resubmit(,34),load:0x2>NXM_NX_REG15[],resubmit(,34),load:0x3->NXM_NX_REG15[],resubmit(,34),load:0xffff->NXM_NX_REG15[]

In the normal case, the packet will not be delivered back to p1 and instead
will be dropped in table 34 (OFTABLE_CHECK_LOOPBACK)  since reg10[0] is not
set.

table=34, n_packets=0, n_bytes=0,
priority=100,reg10=0/0x1,reg14=0x2,reg15=0x2,metadata=0x1 actions=drop

When the register reg10[0] is set, it means the packet needs to be loop
backed to the source port. So table 64 has flows to clear the in_port
i,.e
table=64,  priority=100,reg10=0x1/0x1,reg15=0x1,metadata=0x1
actions=push:NXM_OF_IN_PORT[],load:0->NXM_OF_IN_PORT[],resubmit(,65),pop:NXM_OF_IN_PORT[]

Now if we take the example of child ports, let's say "p2" is the child port
of "p1" and in_port of p1 is 1
In table 0, we will see 2 flows for in_port=1

match=(in_port=1) -
actions=(load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[] and submit to
table 8)
match=(in_port=1, dl_vlan=4) actions(load:0x1->NXM_NX_REG10[0], strip_vlan,
load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[] , and submit to table
8)

Suppose if p2 sends a braodcast packet, reg10[0] is set in table 0.

In this case, since reg10[0] is set, the broadcast packet is not dropped in
table 34 for port "p2" and it gets delivered back.
So this patch instead of using reg10[0], it uses a new register bit
-MLF_NESTED_CONTAINER_BIT

We also need to clear the in_port in table 64 if MLF_NESTED_CONTAINER_BIT
is set.

I will address the review comments and submit p2.

Hope this clears your confusion.

Thanks
Numan



>
>
> This is because 'MLF_ALLOW_LOOPBACK_BIT' is set in REG10 in table 0 when
>> the packet is received from any child port and table
>> 'OFTABLE_CHECK_LOOPBACK'
>> doesn't drop the packet.
>>
>
> I had thought that the previous tables only replicate packets to ports
> other than the inport. But I may be wrong.
> Some of my comments further down could be stupid because I may have not
> understood the problem well.
>
>
>>
>> This patch fixes the issue by using a new register bit
>> (MLF_NESTED_CONTAINER_BIT)
>> instead of 'MLF_ALLOW_LOOPBACK_BIT' and sets it in REG10 for the packets
>> received
>> from child ports.
>>
>> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>> ---
>>  ovn/controller/ofctrl.c   | 29 ++++++++++++++++++++---------
>>  ovn/controller/ofctrl.h   |  5 +++++
>>  ovn/controller/physical.c | 38 ++++++++++++++++++++++++++++++++------
>>  ovn/lib/logical-fields.h  |  4 ++++
>>  tests/ovn.at              | 11 +++++++++++
>>  5 files changed, 72 insertions(+), 15 deletions(-)
>>
>> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
>> index 96c57f143..2f2b185ae 100644
>> --- a/ovn/controller/ofctrl.c
>> +++ b/ovn/controller/ofctrl.c
>> @@ -630,9 +630,11 @@ ofctrl_recv(const struct ofp_header *oh, enum
>> ofptype type)
>>   *
>>   * The caller should initialize its own hmap to hold the flows. */
>>  void
>> -ofctrl_add_flow(struct hmap *desired_flows,
>> -                uint8_t table_id, uint16_t priority, uint64_t cookie,
>> -                const struct match *match, const struct ofpbuf *actions)
>> +ofctrl_check_and_add_flow(struct hmap *desired_flows,
>> +                          uint8_t table_id, uint16_t priority, uint64_t
>> cookie,
>> +                          const struct match *match,
>> +                          const struct ofpbuf *actions,
>> +                          bool log_duplicate_flow)
>>  {
>>      struct ovn_flow *f = xmalloc(sizeof *f);
>>      f->table_id = table_id;
>> @@ -644,13 +646,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>>      f->cookie = cookie;
>>
>>      if (ovn_flow_lookup(desired_flows, f)) {
>> -        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
>> -        if (!VLOG_DROP_DBG(&rl)) {
>> -            char *s = ovn_flow_to_string(f);
>> -            VLOG_DBG("dropping duplicate flow: %s", s);
>> -            free(s);
>> +        if (log_duplicate_flow) {
>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>> 5);
>> +            if (!VLOG_DROP_DBG(&rl)) {
>> +                char *s = ovn_flow_to_string(f);
>> +                VLOG_DBG("dropping duplicate flow: %s", s);
>> +                free(s);
>> +            }
>>          }
>> -
>>          ovn_flow_destroy(f);
>>          return;
>>      }
>> @@ -658,6 +661,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>>      hmap_insert(desired_flows, &f->hmap_node, f->hmap_node.hash);
>>  }
>>
>> +void
>> +ofctrl_add_flow(struct hmap *desired_flows,
>> +                uint8_t table_id, uint16_t priority, uint64_t cookie,
>> +                const struct match *match, const struct ofpbuf *actions)
>> +{
>> +    ofctrl_check_and_add_flow(desired_flows, table_id, priority, cookie,
>> +                              match, actions, true);
>> +}
>>
>>  /* ovn_flow. */
>>
>> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
>> index e3fc95b05..ae0cfa513 100644
>> --- a/ovn/controller/ofctrl.h
>> +++ b/ovn/controller/ofctrl.h
>> @@ -55,4 +55,9 @@ void ofctrl_add_flow(struct hmap *desired_flows,
>> uint8_t table_id,
>>                       uint16_t priority, uint64_t cookie,
>>                       const struct match *, const struct ofpbuf *ofpacts);
>>
>> +void ofctrl_check_and_add_flow(struct hmap *desired_flows, uint8_t
>> table_id,
>> +                               uint16_t priority, uint64_t cookie,
>> +                               const struct match *,
>> +                               const struct ofpbuf *ofpacts,
>> +                               bool log_duplicate_flow);
>>  #endif /* ovn/ofctrl.h */
>> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
>> index c38d7b09f..d781ae459 100644
>> --- a/ovn/controller/physical.c
>> +++ b/ovn/controller/physical.c
>> @@ -189,7 +189,8 @@ get_zone_ids(const struct sbrec_port_binding *binding,
>>
>>  static void
>>  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
>> -                       bool nested_container, const struct zone_ids
>> *zone_ids,
>> +                       uint32_t parent_port_key,
>> +                       const struct zone_ids *zone_ids,
>>                         struct ofpbuf *ofpacts_p, struct hmap *flow_table)
>>  {
>>      struct match match;
>> @@ -253,7 +254,7 @@ put_local_common_flows(uint32_t dp_key, uint32_t
>> port_key,
>>      ofpbuf_clear(ofpacts_p);
>>      match_set_metadata(&match, htonll(dp_key));
>>      match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
>> -    if (!nested_container) {
>> +    if (!parent_port_key) {
>>          match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>>                               MLF_ALLOW_LOOPBACK, MLF_ALLOW_LOOPBACK);
>>      }
>>
> I think before your patch, the above code would always save inport for
> nested containers. They continue to do so now. I hope that was your
> intention?
>
>
>
>> @@ -264,6 +265,25 @@ put_local_common_flows(uint32_t dp_key, uint32_t
>> port_key,
>>      put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>>      ofctrl_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100, 0,
>>
>
>                    &match, ofpacts_p);
>>
> +
>>
> Can you add a comment here on what we are doing below?
>
> When a packet is headed to a nested container, the outport in OVN should
> be that of the destination nested container.
> But here it looks like we are matching on the OVN outport being the parent
> port and saying that if the outport is parent port, we should save the
> physical inport.
> Is that what we are doing? I don't quite understand how this helps to
> prevent packet from going back to the originator child port. I am sure I am
> missing something simple here.
>
> +    if (parent_port_key) {
>> +        match_init_catchall(&match);
>> +        ofpbuf_clear(ofpacts_p);
>> +        match_set_metadata(&match, htonll(dp_key));
>> +        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0,
>> parent_port_key);
>> +        match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>> +                             MLF_NESTED_CONTAINER, MLF_NESTED_CONTAINER);
>> +
>> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_p));
>> +        put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p);
>> +        put_resubmit(OFTABLE_LOG_TO_PHY, ofpacts_p);
>> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>> +        /* When a parent port has multiple child ports, we don't want to
>> +         * log the duplicate flow message.
>> +         */
>> +        ofctrl_check_and_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100,
>> 0,
>> +                                  &match, ofpacts_p, false);
>> +    }
>>  }
>>
>>  static void
>> @@ -328,7 +348,7 @@ consider_port_binding(struct ovsdb_idl_index
>> *sbrec_chassis_by_name,
>>          }
>>
>>          struct zone_ids binding_zones = get_zone_ids(binding, ct_zones);
>> -        put_local_common_flows(dp_key, port_key, false, &binding_zones,
>> +        put_local_common_flows(dp_key, port_key, 0, &binding_zones,
>>                                 ofpacts_p, flow_table);
>>
>>          match_init_catchall(&match);
>> @@ -452,6 +472,7 @@ consider_port_binding(struct ovsdb_idl_index
>> *sbrec_chassis_by_name,
>>
>>      int tag = 0;
>>      bool nested_container = false;
>> +    const struct sbrec_port_binding *parent_port = NULL;
>>      ofp_port_t ofport;
>>      bool is_remote = false;
>>      if (binding->parent_port && *binding->parent_port) {
>> @@ -463,6 +484,8 @@ consider_port_binding(struct ovsdb_idl_index
>> *sbrec_chassis_by_name,
>>          if (ofport) {
>>              tag = *binding->tag;
>>              nested_container = true;
>> +            parent_port = lport_lookup_by_name(
>> +                sbrec_port_binding_by_name, binding->parent_port);
>>          }
>>      } else {
>>          ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
>> @@ -523,7 +546,8 @@ consider_port_binding(struct ovsdb_idl_index
>> *sbrec_chassis_by_name,
>>           */
>>
>>          struct zone_ids zone_ids = get_zone_ids(binding, ct_zones);
>> -        put_local_common_flows(dp_key, port_key, nested_container,
>> &zone_ids,
>> +        uint32_t parent_port_key = parent_port ? parent_port->tunnel_key
>> : 0;
>> +        put_local_common_flows(dp_key, port_key, parent_port_key,
>> &zone_ids,
>>                                 ofpacts_p, flow_table);
>>
>>          /* Table 0, Priority 150 and 100.
>> @@ -553,8 +577,10 @@ consider_port_binding(struct ovsdb_idl_index
>> *sbrec_chassis_by_name,
>>              if (nested_container) {
>>                  /* When a packet comes from a container sitting behind a
>>                   * parent_port, we should let it loopback to other
>> containers
>> -                 * or the parent_port itself. */
>> -                put_load(MLF_ALLOW_LOOPBACK, MFF_LOG_FLAGS, 0, 1,
>> ofpacts_p);
>> +                 * or the parent_port itself. Indicate this by setting
>> the
>> +                 * MLF_NESTED_CONTAINER_BIT in MFF_LOG_FLAGS.*/
>> +                put_load(1, MFF_LOG_FLAGS, MLF_NESTED_CONTAINER_BIT, 1,
>> +                         ofpacts_p);
>>              }
>>              ofpact_put_STRIP_VLAN(ofpacts_p);
>>          }
>> diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h
>> index b1dbb035a..ab31327e9 100644
>> --- a/ovn/lib/logical-fields.h
>> +++ b/ovn/lib/logical-fields.h
>> @@ -50,6 +50,7 @@ enum mff_log_flags_bits {
>>      MLF_FORCE_SNAT_FOR_DNAT_BIT = 2,
>>      MLF_FORCE_SNAT_FOR_LB_BIT = 3,
>>      MLF_LOCAL_ONLY_BIT = 4,
>> +    MLF_NESTED_CONTAINER_BIT = 5,
>>  };
>>
>>  /* MFF_LOG_FLAGS_REG flag assignments */
>> @@ -75,6 +76,9 @@ enum mff_log_flags {
>>       * hypervisors should instead only be output to local targets
>>       */
>>      MLF_LOCAL_ONLY = (1 << MLF_LOCAL_ONLY_BIT),
>> +
>> +    /* Indicate that a packet has received from a nested container. */
>> +    MLF_NESTED_CONTAINER = (1 << MLF_NESTED_CONTAINER_BIT),
>>  };
>>
>>  #endif /* ovn/lib/logical-fields.h */
>> diff --git a/tests/ovn.at b/tests/ovn.at
>> index 769e09f81..a7aba5ccd 100644
>> --- a/tests/ovn.at
>> +++ b/tests/ovn.at
>> @@ -7038,6 +7038,17 @@
>> packet=${dst_mac}${src_mac}8100000208004500001c000000003f110100${src_ip}${dst_ip
>>  echo  $packet >> expected1
>>  OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>>
>> +# Send broadcast packet from foo1. foo1 should not receive the same
>> packet.
>> +src_mac="f00000010205"
>> +dst_mac="ffffffffffff"
>> +src_ip=`ip_to_hex 192 168 1 2`
>> +dst_ip=`ip_to_hex 255 255 255 255`
>>
>> +packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
>> +as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
>> +
>> +# expected packet at VM1
>> +OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>> +
>>  OVN_CLEANUP([hv1],[hv2])
>>
>>  AT_CLEANUP
>> --
>> 2.17.1
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
Gurucharan Shetty Oct. 4, 2018, 9:02 p.m. UTC | #5
On Thu, 4 Oct 2018 at 13:29, Numan Siddique <nusiddiq@redhat.com> wrote:

>
> Thanks for the review Guru.
>
> Please see below for the comments.
>
>
>
> On Fri, Oct 5, 2018 at 12:09 AM Guru Shetty <guru@ovn.org> wrote:
>
>>
>>
>> On Mon, 17 Sep 2018 at 02:24, <nusiddiq@redhat.com> wrote:
>>
>>> From: Numan Siddique <nusiddiq@redhat.com>
>>>
>>> When a child vlan interface is created inside a VM, the below kernel
>>> message
>>> is seen and IPv6 doesn't work on that interface.
>>>
>>
>> On which interface doesn't IPv6 work? On the Vm's interface or on the
>> container's interface?
>>
>>
>
> It is seen on the container's interface.
>
>>
>>> [  138.000753] IPv6: vlan4: IPv6 duplicate address <IPv6 LLA> detected!
>>>
>>> When a child port sends a broadcast packet, OVN delivers the same
>>> packet back to the child port (and hence the DAD check fails).
>>>
>>
>> Is this limited to IP broadcast only? Or does this happen with mac
>> broadcast too? (I am surprised why I hadn't seen this behavior with ARP
>> packets of containers).
>>
>
> It happens with mac broadcast. You can actually reproduce the issue by
> applying the code in tests/ovn.at without this patch.
>
> Looks like I have confused you with the patch.  Let me try to give an
> example. Suppose we have switch - sw0 - with ports p1, p2 and p3 and all of
> them reside in the same chassis.
> When p1 sends a broadcast/multicast packet, the below logical flow is hit
>
> table=17(ls_in_l2_lkup      ), priority=100  , match=(eth.mcast),
> action=(outport = "_MC_flood"; output;)
>
> This translates to the OF flow
> - metadata=0x1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00
> actions=load:0xffff->NXM_NX_REG15[],resubmit(,32)
>
> In the table 33 if reg15 has 0xffff, it will resubmit the flow with reg15
> set to each logical port
>
> i.e reg15=0xffff,metadata=0x1
> actions=load:0x1->NXM_NX_REG15[],resubmit(,34),load:0x2>NXM_NX_REG15[],resubmit(,34),load:0x3->NXM_NX_REG15[],resubmit(,34),load:0xffff->NXM_NX_REG15[]
>
> In the normal case, the packet will not be delivered back to p1 and
> instead will be dropped in table 34 (OFTABLE_CHECK_LOOPBACK)
> since reg10[0] is not set.
>
> table=34, n_packets=0, n_bytes=0,
> priority=100,reg10=0/0x1,reg14=0x2,reg15=0x2,metadata=0x1 actions=drop
>
> When the register reg10[0] is set, it means the packet needs to be loop
> backed to the source port. So table 64 has flows to clear the in_port
> i,.e
> table=64,  priority=100,reg10=0x1/0x1,reg15=0x1,metadata=0x1
> actions=push:NXM_OF_IN_PORT[],load:0->NXM_OF_IN_PORT[],resubmit(,65),pop:NXM_OF_IN_PORT[]
>
> Now if we take the example of child ports, let's say "p2" is the child
> port of "p1" and in_port of p1 is 1
> In table 0, we will see 2 flows for in_port=1
>
> match=(in_port=1) -
> actions=(load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[] and submit to
> table 8)
> match=(in_port=1, dl_vlan=4) actions(load:0x1->NXM_NX_REG10[0],
> strip_vlan, load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[] , and
> submit to table 8)
>
> Suppose if p2 sends a braodcast packet, reg10[0] is set in table 0.
>
> In this case, since reg10[0] is set, the broadcast packet is not dropped
> in table 34 for port "p2" and it gets delivered back.
> So this patch instead of using reg10[0], it uses a new register bit
> -MLF_NESTED_CONTAINER_BIT
>

Thanks for the detailed response. I understand till the above point. So not
using NXM_NX_REG10[0] (or MLF_ALLOW_LOOPBACK_BIT) actually lets the packet
drop in table 34.  Can you please add that information to the commit
message. I think it will be helpful in the future.

I then don't understand, what we are doing with the new code
in OFTABLE_SAVE_INPORT.  I would expect that the new code is needed to save
inport if the destination is a nested container. Not quite understand why
we have a match for parent port. I will wait for the comment in the code to
understand it.





>
> We also need to clear the in_port in table 64 if MLF_NESTED_CONTAINER_BIT
> is set.
>
> I will address the review comments and submit p2.
>
> Hope this clears your confusion.
>
> Thanks
> Numan
>
>
>
>>
>>
>> This is because 'MLF_ALLOW_LOOPBACK_BIT' is set in REG10 in table 0 when
>>> the packet is received from any child port and table
>>> 'OFTABLE_CHECK_LOOPBACK'
>>> doesn't drop the packet.
>>>
>>
>> I had thought that the previous tables only replicate packets to ports
>> other than the inport. But I may be wrong.
>> Some of my comments further down could be stupid because I may have not
>> understood the problem well.
>>
>>
>>>
>>> This patch fixes the issue by using a new register bit
>>> (MLF_NESTED_CONTAINER_BIT)
>>> instead of 'MLF_ALLOW_LOOPBACK_BIT' and sets it in REG10 for the packets
>>> received
>>> from child ports.
>>>
>>> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>>> ---
>>>  ovn/controller/ofctrl.c   | 29 ++++++++++++++++++++---------
>>>  ovn/controller/ofctrl.h   |  5 +++++
>>>  ovn/controller/physical.c | 38 ++++++++++++++++++++++++++++++++------
>>>  ovn/lib/logical-fields.h  |  4 ++++
>>>  tests/ovn.at              | 11 +++++++++++
>>>  5 files changed, 72 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
>>> index 96c57f143..2f2b185ae 100644
>>> --- a/ovn/controller/ofctrl.c
>>> +++ b/ovn/controller/ofctrl.c
>>> @@ -630,9 +630,11 @@ ofctrl_recv(const struct ofp_header *oh, enum
>>> ofptype type)
>>>   *
>>>   * The caller should initialize its own hmap to hold the flows. */
>>>  void
>>> -ofctrl_add_flow(struct hmap *desired_flows,
>>> -                uint8_t table_id, uint16_t priority, uint64_t cookie,
>>> -                const struct match *match, const struct ofpbuf *actions)
>>> +ofctrl_check_and_add_flow(struct hmap *desired_flows,
>>> +                          uint8_t table_id, uint16_t priority, uint64_t
>>> cookie,
>>> +                          const struct match *match,
>>> +                          const struct ofpbuf *actions,
>>> +                          bool log_duplicate_flow)
>>>  {
>>>      struct ovn_flow *f = xmalloc(sizeof *f);
>>>      f->table_id = table_id;
>>> @@ -644,13 +646,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>>>      f->cookie = cookie;
>>>
>>>      if (ovn_flow_lookup(desired_flows, f)) {
>>> -        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
>>> -        if (!VLOG_DROP_DBG(&rl)) {
>>> -            char *s = ovn_flow_to_string(f);
>>> -            VLOG_DBG("dropping duplicate flow: %s", s);
>>> -            free(s);
>>> +        if (log_duplicate_flow) {
>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 5);
>>> +            if (!VLOG_DROP_DBG(&rl)) {
>>> +                char *s = ovn_flow_to_string(f);
>>> +                VLOG_DBG("dropping duplicate flow: %s", s);
>>> +                free(s);
>>> +            }
>>>          }
>>> -
>>>          ovn_flow_destroy(f);
>>>          return;
>>>      }
>>> @@ -658,6 +661,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>>>      hmap_insert(desired_flows, &f->hmap_node, f->hmap_node.hash);
>>>  }
>>>
>>> +void
>>> +ofctrl_add_flow(struct hmap *desired_flows,
>>> +                uint8_t table_id, uint16_t priority, uint64_t cookie,
>>> +                const struct match *match, const struct ofpbuf *actions)
>>> +{
>>> +    ofctrl_check_and_add_flow(desired_flows, table_id, priority, cookie,
>>> +                              match, actions, true);
>>> +}
>>>
>>>  /* ovn_flow. */
>>>
>>> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
>>> index e3fc95b05..ae0cfa513 100644
>>> --- a/ovn/controller/ofctrl.h
>>> +++ b/ovn/controller/ofctrl.h
>>> @@ -55,4 +55,9 @@ void ofctrl_add_flow(struct hmap *desired_flows,
>>> uint8_t table_id,
>>>                       uint16_t priority, uint64_t cookie,
>>>                       const struct match *, const struct ofpbuf
>>> *ofpacts);
>>>
>>> +void ofctrl_check_and_add_flow(struct hmap *desired_flows, uint8_t
>>> table_id,
>>> +                               uint16_t priority, uint64_t cookie,
>>> +                               const struct match *,
>>> +                               const struct ofpbuf *ofpacts,
>>> +                               bool log_duplicate_flow);
>>>  #endif /* ovn/ofctrl.h */
>>> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
>>> index c38d7b09f..d781ae459 100644
>>> --- a/ovn/controller/physical.c
>>> +++ b/ovn/controller/physical.c
>>> @@ -189,7 +189,8 @@ get_zone_ids(const struct sbrec_port_binding
>>> *binding,
>>>
>>>  static void
>>>  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
>>> -                       bool nested_container, const struct zone_ids
>>> *zone_ids,
>>> +                       uint32_t parent_port_key,
>>> +                       const struct zone_ids *zone_ids,
>>>                         struct ofpbuf *ofpacts_p, struct hmap
>>> *flow_table)
>>>  {
>>>      struct match match;
>>> @@ -253,7 +254,7 @@ put_local_common_flows(uint32_t dp_key, uint32_t
>>> port_key,
>>>      ofpbuf_clear(ofpacts_p);
>>>      match_set_metadata(&match, htonll(dp_key));
>>>      match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
>>> -    if (!nested_container) {
>>> +    if (!parent_port_key) {
>>>          match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>>>                               MLF_ALLOW_LOOPBACK, MLF_ALLOW_LOOPBACK);
>>>      }
>>>
>> I think before your patch, the above code would always save inport for
>> nested containers. They continue to do so now. I hope that was your
>> intention?
>>
>>
>>
>>> @@ -264,6 +265,25 @@ put_local_common_flows(uint32_t dp_key, uint32_t
>>> port_key,
>>>      put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>>>      ofctrl_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100, 0,
>>>
>>
>>                    &match, ofpacts_p);
>>>
>> +
>>>
>> Can you add a comment here on what we are doing below?
>>
>> When a packet is headed to a nested container, the outport in OVN should
>> be that of the destination nested container.
>> But here it looks like we are matching on the OVN outport being the
>> parent port and saying that if the outport is parent port, we should save
>> the physical inport.
>> Is that what we are doing? I don't quite understand how this helps to
>> prevent packet from going back to the originator child port. I am sure I am
>> missing something simple here.
>>
>> +    if (parent_port_key) {
>>> +        match_init_catchall(&match);
>>> +        ofpbuf_clear(ofpacts_p);
>>> +        match_set_metadata(&match, htonll(dp_key));
>>> +        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0,
>>> parent_port_key);
>>> +        match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>>> +                             MLF_NESTED_CONTAINER,
>>> MLF_NESTED_CONTAINER);
>>> +
>>> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_p));
>>> +        put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p);
>>> +        put_resubmit(OFTABLE_LOG_TO_PHY, ofpacts_p);
>>> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>>> +        /* When a parent port has multiple child ports, we don't want to
>>> +         * log the duplicate flow message.
>>> +         */
>>> +        ofctrl_check_and_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100,
>>> 0,
>>> +                                  &match, ofpacts_p, false);
>>> +    }
>>>  }
>>>
>>>  static void
>>> @@ -328,7 +348,7 @@ consider_port_binding(struct ovsdb_idl_index
>>> *sbrec_chassis_by_name,
>>>          }
>>>
>>>          struct zone_ids binding_zones = get_zone_ids(binding, ct_zones);
>>> -        put_local_common_flows(dp_key, port_key, false, &binding_zones,
>>> +        put_local_common_flows(dp_key, port_key, 0, &binding_zones,
>>>                                 ofpacts_p, flow_table);
>>>
>>>          match_init_catchall(&match);
>>> @@ -452,6 +472,7 @@ consider_port_binding(struct ovsdb_idl_index
>>> *sbrec_chassis_by_name,
>>>
>>>      int tag = 0;
>>>      bool nested_container = false;
>>> +    const struct sbrec_port_binding *parent_port = NULL;
>>>      ofp_port_t ofport;
>>>      bool is_remote = false;
>>>      if (binding->parent_port && *binding->parent_port) {
>>> @@ -463,6 +484,8 @@ consider_port_binding(struct ovsdb_idl_index
>>> *sbrec_chassis_by_name,
>>>          if (ofport) {
>>>              tag = *binding->tag;
>>>              nested_container = true;
>>> +            parent_port = lport_lookup_by_name(
>>> +                sbrec_port_binding_by_name, binding->parent_port);
>>>          }
>>>      } else {
>>>          ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
>>> @@ -523,7 +546,8 @@ consider_port_binding(struct ovsdb_idl_index
>>> *sbrec_chassis_by_name,
>>>           */
>>>
>>>          struct zone_ids zone_ids = get_zone_ids(binding, ct_zones);
>>> -        put_local_common_flows(dp_key, port_key, nested_container,
>>> &zone_ids,
>>> +        uint32_t parent_port_key = parent_port ?
>>> parent_port->tunnel_key : 0;
>>> +        put_local_common_flows(dp_key, port_key, parent_port_key,
>>> &zone_ids,
>>>                                 ofpacts_p, flow_table);
>>>
>>>          /* Table 0, Priority 150 and 100.
>>> @@ -553,8 +577,10 @@ consider_port_binding(struct ovsdb_idl_index
>>> *sbrec_chassis_by_name,
>>>              if (nested_container) {
>>>                  /* When a packet comes from a container sitting behind a
>>>                   * parent_port, we should let it loopback to other
>>> containers
>>> -                 * or the parent_port itself. */
>>> -                put_load(MLF_ALLOW_LOOPBACK, MFF_LOG_FLAGS, 0, 1,
>>> ofpacts_p);
>>> +                 * or the parent_port itself. Indicate this by setting
>>> the
>>> +                 * MLF_NESTED_CONTAINER_BIT in MFF_LOG_FLAGS.*/
>>> +                put_load(1, MFF_LOG_FLAGS, MLF_NESTED_CONTAINER_BIT, 1,
>>> +                         ofpacts_p);
>>>              }
>>>              ofpact_put_STRIP_VLAN(ofpacts_p);
>>>          }
>>> diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h
>>> index b1dbb035a..ab31327e9 100644
>>> --- a/ovn/lib/logical-fields.h
>>> +++ b/ovn/lib/logical-fields.h
>>> @@ -50,6 +50,7 @@ enum mff_log_flags_bits {
>>>      MLF_FORCE_SNAT_FOR_DNAT_BIT = 2,
>>>      MLF_FORCE_SNAT_FOR_LB_BIT = 3,
>>>      MLF_LOCAL_ONLY_BIT = 4,
>>> +    MLF_NESTED_CONTAINER_BIT = 5,
>>>  };
>>>
>>>  /* MFF_LOG_FLAGS_REG flag assignments */
>>> @@ -75,6 +76,9 @@ enum mff_log_flags {
>>>       * hypervisors should instead only be output to local targets
>>>       */
>>>      MLF_LOCAL_ONLY = (1 << MLF_LOCAL_ONLY_BIT),
>>> +
>>> +    /* Indicate that a packet has received from a nested container. */
>>> +    MLF_NESTED_CONTAINER = (1 << MLF_NESTED_CONTAINER_BIT),
>>>  };
>>>
>>>  #endif /* ovn/lib/logical-fields.h */
>>> diff --git a/tests/ovn.at b/tests/ovn.at
>>> index 769e09f81..a7aba5ccd 100644
>>> --- a/tests/ovn.at
>>> +++ b/tests/ovn.at
>>> @@ -7038,6 +7038,17 @@
>>> packet=${dst_mac}${src_mac}8100000208004500001c000000003f110100${src_ip}${dst_ip
>>>  echo  $packet >> expected1
>>>  OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>>>
>>> +# Send broadcast packet from foo1. foo1 should not receive the same
>>> packet.
>>> +src_mac="f00000010205"
>>> +dst_mac="ffffffffffff"
>>> +src_ip=`ip_to_hex 192 168 1 2`
>>> +dst_ip=`ip_to_hex 255 255 255 255`
>>>
>>> +packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
>>> +as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
>>> +
>>> +# expected packet at VM1
>>> +OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>>> +
>>>  OVN_CLEANUP([hv1],[hv2])
>>>
>>>  AT_CLEANUP
>>> --
>>> 2.17.1
>>>
>>> _______________________________________________
>>> dev mailing list
>>> dev@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>
Numan Siddique Oct. 5, 2018, 11:24 a.m. UTC | #6
On Fri, Oct 5, 2018 at 2:32 AM Guru Shetty <guru@ovn.org> wrote:

>
>
> On Thu, 4 Oct 2018 at 13:29, Numan Siddique <nusiddiq@redhat.com> wrote:
>
>>
>> Thanks for the review Guru.
>>
>> Please see below for the comments.
>>
>>
>>
>> On Fri, Oct 5, 2018 at 12:09 AM Guru Shetty <guru@ovn.org> wrote:
>>
>>>
>>>
>>> On Mon, 17 Sep 2018 at 02:24, <nusiddiq@redhat.com> wrote:
>>>
>>>> From: Numan Siddique <nusiddiq@redhat.com>
>>>>
>>>> When a child vlan interface is created inside a VM, the below kernel
>>>> message
>>>> is seen and IPv6 doesn't work on that interface.
>>>>
>>>
>>> On which interface doesn't IPv6 work? On the Vm's interface or on the
>>> container's interface?
>>>
>>>
>>
>> It is seen on the container's interface.
>>
>>>
>>>> [  138.000753] IPv6: vlan4: IPv6 duplicate address <IPv6 LLA> detected!
>>>>
>>>> When a child port sends a broadcast packet, OVN delivers the same
>>>> packet back to the child port (and hence the DAD check fails).
>>>>
>>>
>>> Is this limited to IP broadcast only? Or does this happen with mac
>>> broadcast too? (I am surprised why I hadn't seen this behavior with ARP
>>> packets of containers).
>>>
>>
>> It happens with mac broadcast. You can actually reproduce the issue by
>> applying the code in tests/ovn.at without this patch.
>>
>> Looks like I have confused you with the patch.  Let me try to give an
>> example. Suppose we have switch - sw0 - with ports p1, p2 and p3 and all of
>> them reside in the same chassis.
>> When p1 sends a broadcast/multicast packet, the below logical flow is hit
>>
>> table=17(ls_in_l2_lkup      ), priority=100  , match=(eth.mcast),
>> action=(outport = "_MC_flood"; output;)
>>
>> This translates to the OF flow
>> - metadata=0x1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00
>> actions=load:0xffff->NXM_NX_REG15[],resubmit(,32)
>>
>> In the table 33 if reg15 has 0xffff, it will resubmit the flow with reg15
>> set to each logical port
>>
>> i.e reg15=0xffff,metadata=0x1
>> actions=load:0x1->NXM_NX_REG15[],resubmit(,34),load:0x2>NXM_NX_REG15[],resubmit(,34),load:0x3->NXM_NX_REG15[],resubmit(,34),load:0xffff->NXM_NX_REG15[]
>>
>> In the normal case, the packet will not be delivered back to p1 and
>> instead will be dropped in table 34 (OFTABLE_CHECK_LOOPBACK)
>> since reg10[0] is not set.
>>
>> table=34, n_packets=0, n_bytes=0,
>> priority=100,reg10=0/0x1,reg14=0x2,reg15=0x2,metadata=0x1 actions=drop
>>
>> When the register reg10[0] is set, it means the packet needs to be loop
>> backed to the source port. So table 64 has flows to clear the in_port
>> i,.e
>> table=64,  priority=100,reg10=0x1/0x1,reg15=0x1,metadata=0x1
>> actions=push:NXM_OF_IN_PORT[],load:0->NXM_OF_IN_PORT[],resubmit(,65),pop:NXM_OF_IN_PORT[]
>>
>> Now if we take the example of child ports, let's say "p2" is the child
>> port of "p1" and in_port of p1 is 1
>> In table 0, we will see 2 flows for in_port=1
>>
>> match=(in_port=1) -
>> actions=(load:0x1->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[] and submit to
>> table 8)
>> match=(in_port=1, dl_vlan=4) actions(load:0x1->NXM_NX_REG10[0],
>> strip_vlan, load:0x1->OXM_OF_METADATA[],load:0x2->NXM_NX_REG14[] , and
>> submit to table 8)
>>
>> Suppose if p2 sends a braodcast packet, reg10[0] is set in table 0.
>>
>> In this case, since reg10[0] is set, the broadcast packet is not dropped
>> in table 34 for port "p2" and it gets delivered back.
>> So this patch instead of using reg10[0], it uses a new register bit
>> -MLF_NESTED_CONTAINER_BIT
>>
>
> Thanks for the detailed response. I understand till the above point. So
> not using NXM_NX_REG10[0] (or MLF_ALLOW_LOOPBACK_BIT) actually lets the
> packet drop in table 34.  Can you please add that information to the commit
> message. I think it will be helpful in the future.
>
> I then don't understand, what we are doing with the new code
> in OFTABLE_SAVE_INPORT.  I would expect that the new code is needed to save
> inport if the destination is a nested container. Not quite understand why
> we have a match for parent port. I will wait for the comment in the code to
> understand it.
>

The code to push, clear the inport and pop the inport is already present
for the child ports -
https://github.com/openvswitch/ovs/blob/master/ovn/controller/physical.c#L265
And the code to clear the inport for the parent port or any port when
reg10[0] is set is here -
https://github.com/openvswitch/ovs/blob/master/ovn/controller/physical.c#L256

Now with this patch since we are using reg10[5] to indicate the packet is
from child ports, we need to clear the in_port if the packet from the child
port needs to be sent to the parent port.
And that's why have a new match for the parent ports.

I changed the signature of the function 'put_local_common_flows' because I
wanted to add the flow to check reg10[5] only if a port is a parent port.
Otherwise we will add the flow to check reg10[5] for all
the ports in the switch.

I will add proper documentation and make it clear.

Thanks
Numan


>
>
>>
>> We also need to clear the in_port in table 64 if MLF_NESTED_CONTAINER_BIT
>> is set.
>>
>> I will address the review comments and submit p2.
>>
>> Hope this clears your confusion.
>>
>> Thanks
>> Numan
>>
>>
>>
>>>
>>>
>>> This is because 'MLF_ALLOW_LOOPBACK_BIT' is set in REG10 in table 0 when
>>>> the packet is received from any child port and table
>>>> 'OFTABLE_CHECK_LOOPBACK'
>>>> doesn't drop the packet.
>>>>
>>>
>>> I had thought that the previous tables only replicate packets to ports
>>> other than the inport. But I may be wrong.
>>> Some of my comments further down could be stupid because I may have not
>>> understood the problem well.
>>>
>>>
>>>>
>>>> This patch fixes the issue by using a new register bit
>>>> (MLF_NESTED_CONTAINER_BIT)
>>>> instead of 'MLF_ALLOW_LOOPBACK_BIT' and sets it in REG10 for the
>>>> packets received
>>>> from child ports.
>>>>
>>>> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>>>> ---
>>>>  ovn/controller/ofctrl.c   | 29 ++++++++++++++++++++---------
>>>>  ovn/controller/ofctrl.h   |  5 +++++
>>>>  ovn/controller/physical.c | 38 ++++++++++++++++++++++++++++++++------
>>>>  ovn/lib/logical-fields.h  |  4 ++++
>>>>  tests/ovn.at              | 11 +++++++++++
>>>>  5 files changed, 72 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
>>>> index 96c57f143..2f2b185ae 100644
>>>> --- a/ovn/controller/ofctrl.c
>>>> +++ b/ovn/controller/ofctrl.c
>>>> @@ -630,9 +630,11 @@ ofctrl_recv(const struct ofp_header *oh, enum
>>>> ofptype type)
>>>>   *
>>>>   * The caller should initialize its own hmap to hold the flows. */
>>>>  void
>>>> -ofctrl_add_flow(struct hmap *desired_flows,
>>>> -                uint8_t table_id, uint16_t priority, uint64_t cookie,
>>>> -                const struct match *match, const struct ofpbuf
>>>> *actions)
>>>> +ofctrl_check_and_add_flow(struct hmap *desired_flows,
>>>> +                          uint8_t table_id, uint16_t priority,
>>>> uint64_t cookie,
>>>> +                          const struct match *match,
>>>> +                          const struct ofpbuf *actions,
>>>> +                          bool log_duplicate_flow)
>>>>  {
>>>>      struct ovn_flow *f = xmalloc(sizeof *f);
>>>>      f->table_id = table_id;
>>>> @@ -644,13 +646,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>>>>      f->cookie = cookie;
>>>>
>>>>      if (ovn_flow_lookup(desired_flows, f)) {
>>>> -        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
>>>> -        if (!VLOG_DROP_DBG(&rl)) {
>>>> -            char *s = ovn_flow_to_string(f);
>>>> -            VLOG_DBG("dropping duplicate flow: %s", s);
>>>> -            free(s);
>>>> +        if (log_duplicate_flow) {
>>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 5);
>>>> +            if (!VLOG_DROP_DBG(&rl)) {
>>>> +                char *s = ovn_flow_to_string(f);
>>>> +                VLOG_DBG("dropping duplicate flow: %s", s);
>>>> +                free(s);
>>>> +            }
>>>>          }
>>>> -
>>>>          ovn_flow_destroy(f);
>>>>          return;
>>>>      }
>>>> @@ -658,6 +661,14 @@ ofctrl_add_flow(struct hmap *desired_flows,
>>>>      hmap_insert(desired_flows, &f->hmap_node, f->hmap_node.hash);
>>>>  }
>>>>
>>>> +void
>>>> +ofctrl_add_flow(struct hmap *desired_flows,
>>>> +                uint8_t table_id, uint16_t priority, uint64_t cookie,
>>>> +                const struct match *match, const struct ofpbuf
>>>> *actions)
>>>> +{
>>>> +    ofctrl_check_and_add_flow(desired_flows, table_id, priority,
>>>> cookie,
>>>> +                              match, actions, true);
>>>> +}
>>>>
>>>>  /* ovn_flow. */
>>>>
>>>> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
>>>> index e3fc95b05..ae0cfa513 100644
>>>> --- a/ovn/controller/ofctrl.h
>>>> +++ b/ovn/controller/ofctrl.h
>>>> @@ -55,4 +55,9 @@ void ofctrl_add_flow(struct hmap *desired_flows,
>>>> uint8_t table_id,
>>>>                       uint16_t priority, uint64_t cookie,
>>>>                       const struct match *, const struct ofpbuf
>>>> *ofpacts);
>>>>
>>>> +void ofctrl_check_and_add_flow(struct hmap *desired_flows, uint8_t
>>>> table_id,
>>>> +                               uint16_t priority, uint64_t cookie,
>>>> +                               const struct match *,
>>>> +                               const struct ofpbuf *ofpacts,
>>>> +                               bool log_duplicate_flow);
>>>>  #endif /* ovn/ofctrl.h */
>>>> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
>>>> index c38d7b09f..d781ae459 100644
>>>> --- a/ovn/controller/physical.c
>>>> +++ b/ovn/controller/physical.c
>>>> @@ -189,7 +189,8 @@ get_zone_ids(const struct sbrec_port_binding
>>>> *binding,
>>>>
>>>>  static void
>>>>  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
>>>> -                       bool nested_container, const struct zone_ids
>>>> *zone_ids,
>>>> +                       uint32_t parent_port_key,
>>>> +                       const struct zone_ids *zone_ids,
>>>>                         struct ofpbuf *ofpacts_p, struct hmap
>>>> *flow_table)
>>>>  {
>>>>      struct match match;
>>>> @@ -253,7 +254,7 @@ put_local_common_flows(uint32_t dp_key, uint32_t
>>>> port_key,
>>>>      ofpbuf_clear(ofpacts_p);
>>>>      match_set_metadata(&match, htonll(dp_key));
>>>>      match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
>>>> -    if (!nested_container) {
>>>> +    if (!parent_port_key) {
>>>>          match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>>>>                               MLF_ALLOW_LOOPBACK, MLF_ALLOW_LOOPBACK);
>>>>      }
>>>>
>>> I think before your patch, the above code would always save inport for
>>> nested containers. They continue to do so now. I hope that was your
>>> intention?
>>>
>>>
>>>
>>>> @@ -264,6 +265,25 @@ put_local_common_flows(uint32_t dp_key, uint32_t
>>>> port_key,
>>>>      put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>>>>      ofctrl_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100, 0,
>>>>
>>>
>>>                    &match, ofpacts_p);
>>>>
>>> +
>>>>
>>> Can you add a comment here on what we are doing below?
>>>
>>> When a packet is headed to a nested container, the outport in OVN should
>>> be that of the destination nested container.
>>> But here it looks like we are matching on the OVN outport being the
>>> parent port and saying that if the outport is parent port, we should save
>>> the physical inport.
>>> Is that what we are doing? I don't quite understand how this helps to
>>> prevent packet from going back to the originator child port. I am sure I am
>>> missing something simple here.
>>>
>>> +    if (parent_port_key) {
>>>> +        match_init_catchall(&match);
>>>> +        ofpbuf_clear(ofpacts_p);
>>>> +        match_set_metadata(&match, htonll(dp_key));
>>>> +        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0,
>>>> parent_port_key);
>>>> +        match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
>>>> +                             MLF_NESTED_CONTAINER,
>>>> MLF_NESTED_CONTAINER);
>>>> +
>>>> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_p));
>>>> +        put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p);
>>>> +        put_resubmit(OFTABLE_LOG_TO_PHY, ofpacts_p);
>>>> +        put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
>>>> +        /* When a parent port has multiple child ports, we don't want
>>>> to
>>>> +         * log the duplicate flow message.
>>>> +         */
>>>> +        ofctrl_check_and_add_flow(flow_table, OFTABLE_SAVE_INPORT,
>>>> 100, 0,
>>>> +                                  &match, ofpacts_p, false);
>>>> +    }
>>>>  }
>>>>
>>>>  static void
>>>> @@ -328,7 +348,7 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>          }
>>>>
>>>>          struct zone_ids binding_zones = get_zone_ids(binding,
>>>> ct_zones);
>>>> -        put_local_common_flows(dp_key, port_key, false, &binding_zones,
>>>> +        put_local_common_flows(dp_key, port_key, 0, &binding_zones,
>>>>                                 ofpacts_p, flow_table);
>>>>
>>>>          match_init_catchall(&match);
>>>> @@ -452,6 +472,7 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>
>>>>      int tag = 0;
>>>>      bool nested_container = false;
>>>> +    const struct sbrec_port_binding *parent_port = NULL;
>>>>      ofp_port_t ofport;
>>>>      bool is_remote = false;
>>>>      if (binding->parent_port && *binding->parent_port) {
>>>> @@ -463,6 +484,8 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>          if (ofport) {
>>>>              tag = *binding->tag;
>>>>              nested_container = true;
>>>> +            parent_port = lport_lookup_by_name(
>>>> +                sbrec_port_binding_by_name, binding->parent_port);
>>>>          }
>>>>      } else {
>>>>          ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
>>>> @@ -523,7 +546,8 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>           */
>>>>
>>>>          struct zone_ids zone_ids = get_zone_ids(binding, ct_zones);
>>>> -        put_local_common_flows(dp_key, port_key, nested_container,
>>>> &zone_ids,
>>>> +        uint32_t parent_port_key = parent_port ?
>>>> parent_port->tunnel_key : 0;
>>>> +        put_local_common_flows(dp_key, port_key, parent_port_key,
>>>> &zone_ids,
>>>>                                 ofpacts_p, flow_table);
>>>>
>>>>          /* Table 0, Priority 150 and 100.
>>>> @@ -553,8 +577,10 @@ consider_port_binding(struct ovsdb_idl_index
>>>> *sbrec_chassis_by_name,
>>>>              if (nested_container) {
>>>>                  /* When a packet comes from a container sitting behind
>>>> a
>>>>                   * parent_port, we should let it loopback to other
>>>> containers
>>>> -                 * or the parent_port itself. */
>>>> -                put_load(MLF_ALLOW_LOOPBACK, MFF_LOG_FLAGS, 0, 1,
>>>> ofpacts_p);
>>>> +                 * or the parent_port itself. Indicate this by setting
>>>> the
>>>> +                 * MLF_NESTED_CONTAINER_BIT in MFF_LOG_FLAGS.*/
>>>> +                put_load(1, MFF_LOG_FLAGS, MLF_NESTED_CONTAINER_BIT, 1,
>>>> +                         ofpacts_p);
>>>>              }
>>>>              ofpact_put_STRIP_VLAN(ofpacts_p);
>>>>          }
>>>> diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h
>>>> index b1dbb035a..ab31327e9 100644
>>>> --- a/ovn/lib/logical-fields.h
>>>> +++ b/ovn/lib/logical-fields.h
>>>> @@ -50,6 +50,7 @@ enum mff_log_flags_bits {
>>>>      MLF_FORCE_SNAT_FOR_DNAT_BIT = 2,
>>>>      MLF_FORCE_SNAT_FOR_LB_BIT = 3,
>>>>      MLF_LOCAL_ONLY_BIT = 4,
>>>> +    MLF_NESTED_CONTAINER_BIT = 5,
>>>>  };
>>>>
>>>>  /* MFF_LOG_FLAGS_REG flag assignments */
>>>> @@ -75,6 +76,9 @@ enum mff_log_flags {
>>>>       * hypervisors should instead only be output to local targets
>>>>       */
>>>>      MLF_LOCAL_ONLY = (1 << MLF_LOCAL_ONLY_BIT),
>>>> +
>>>> +    /* Indicate that a packet has received from a nested container. */
>>>> +    MLF_NESTED_CONTAINER = (1 << MLF_NESTED_CONTAINER_BIT),
>>>>  };
>>>>
>>>>  #endif /* ovn/lib/logical-fields.h */
>>>> diff --git a/tests/ovn.at b/tests/ovn.at
>>>> index 769e09f81..a7aba5ccd 100644
>>>> --- a/tests/ovn.at
>>>> +++ b/tests/ovn.at
>>>> @@ -7038,6 +7038,17 @@
>>>> packet=${dst_mac}${src_mac}8100000208004500001c000000003f110100${src_ip}${dst_ip
>>>>  echo  $packet >> expected1
>>>>  OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>>>>
>>>> +# Send broadcast packet from foo1. foo1 should not receive the same
>>>> packet.
>>>> +src_mac="f00000010205"
>>>> +dst_mac="ffffffffffff"
>>>> +src_ip=`ip_to_hex 192 168 1 2`
>>>> +dst_ip=`ip_to_hex 255 255 255 255`
>>>>
>>>> +packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
>>>> +as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
>>>> +
>>>> +# expected packet at VM1
>>>> +OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
>>>> +
>>>>  OVN_CLEANUP([hv1],[hv2])
>>>>
>>>>  AT_CLEANUP
>>>> --
>>>> 2.17.1
>>>>
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev@openvswitch.org
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>
>>>
diff mbox series

Patch

diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
index 96c57f143..2f2b185ae 100644
--- a/ovn/controller/ofctrl.c
+++ b/ovn/controller/ofctrl.c
@@ -630,9 +630,11 @@  ofctrl_recv(const struct ofp_header *oh, enum ofptype type)
  *
  * The caller should initialize its own hmap to hold the flows. */
 void
-ofctrl_add_flow(struct hmap *desired_flows,
-                uint8_t table_id, uint16_t priority, uint64_t cookie,
-                const struct match *match, const struct ofpbuf *actions)
+ofctrl_check_and_add_flow(struct hmap *desired_flows,
+                          uint8_t table_id, uint16_t priority, uint64_t cookie,
+                          const struct match *match,
+                          const struct ofpbuf *actions,
+                          bool log_duplicate_flow)
 {
     struct ovn_flow *f = xmalloc(sizeof *f);
     f->table_id = table_id;
@@ -644,13 +646,14 @@  ofctrl_add_flow(struct hmap *desired_flows,
     f->cookie = cookie;
 
     if (ovn_flow_lookup(desired_flows, f)) {
-        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
-        if (!VLOG_DROP_DBG(&rl)) {
-            char *s = ovn_flow_to_string(f);
-            VLOG_DBG("dropping duplicate flow: %s", s);
-            free(s);
+        if (log_duplicate_flow) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
+            if (!VLOG_DROP_DBG(&rl)) {
+                char *s = ovn_flow_to_string(f);
+                VLOG_DBG("dropping duplicate flow: %s", s);
+                free(s);
+            }
         }
-
         ovn_flow_destroy(f);
         return;
     }
@@ -658,6 +661,14 @@  ofctrl_add_flow(struct hmap *desired_flows,
     hmap_insert(desired_flows, &f->hmap_node, f->hmap_node.hash);
 }
 
+void
+ofctrl_add_flow(struct hmap *desired_flows,
+                uint8_t table_id, uint16_t priority, uint64_t cookie,
+                const struct match *match, const struct ofpbuf *actions)
+{
+    ofctrl_check_and_add_flow(desired_flows, table_id, priority, cookie,
+                              match, actions, true);
+}
 
 /* ovn_flow. */
 
diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
index e3fc95b05..ae0cfa513 100644
--- a/ovn/controller/ofctrl.h
+++ b/ovn/controller/ofctrl.h
@@ -55,4 +55,9 @@  void ofctrl_add_flow(struct hmap *desired_flows, uint8_t table_id,
                      uint16_t priority, uint64_t cookie,
                      const struct match *, const struct ofpbuf *ofpacts);
 
+void ofctrl_check_and_add_flow(struct hmap *desired_flows, uint8_t table_id,
+                               uint16_t priority, uint64_t cookie,
+                               const struct match *,
+                               const struct ofpbuf *ofpacts,
+                               bool log_duplicate_flow);
 #endif /* ovn/ofctrl.h */
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index c38d7b09f..d781ae459 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -189,7 +189,8 @@  get_zone_ids(const struct sbrec_port_binding *binding,
 
 static void
 put_local_common_flows(uint32_t dp_key, uint32_t port_key,
-                       bool nested_container, const struct zone_ids *zone_ids,
+                       uint32_t parent_port_key,
+                       const struct zone_ids *zone_ids,
                        struct ofpbuf *ofpacts_p, struct hmap *flow_table)
 {
     struct match match;
@@ -253,7 +254,7 @@  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
     ofpbuf_clear(ofpacts_p);
     match_set_metadata(&match, htonll(dp_key));
     match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
-    if (!nested_container) {
+    if (!parent_port_key) {
         match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
                              MLF_ALLOW_LOOPBACK, MLF_ALLOW_LOOPBACK);
     }
@@ -264,6 +265,25 @@  put_local_common_flows(uint32_t dp_key, uint32_t port_key,
     put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
     ofctrl_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100, 0,
                     &match, ofpacts_p);
+
+    if (parent_port_key) {
+        match_init_catchall(&match);
+        ofpbuf_clear(ofpacts_p);
+        match_set_metadata(&match, htonll(dp_key));
+        match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, parent_port_key);
+        match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
+                             MLF_NESTED_CONTAINER, MLF_NESTED_CONTAINER);
+
+        put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_p));
+        put_load(0, MFF_IN_PORT, 0, 16, ofpacts_p);
+        put_resubmit(OFTABLE_LOG_TO_PHY, ofpacts_p);
+        put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_p));
+        /* When a parent port has multiple child ports, we don't want to
+         * log the duplicate flow message.
+         */
+        ofctrl_check_and_add_flow(flow_table, OFTABLE_SAVE_INPORT, 100, 0,
+                                  &match, ofpacts_p, false);
+    }
 }
 
 static void
@@ -328,7 +348,7 @@  consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name,
         }
 
         struct zone_ids binding_zones = get_zone_ids(binding, ct_zones);
-        put_local_common_flows(dp_key, port_key, false, &binding_zones,
+        put_local_common_flows(dp_key, port_key, 0, &binding_zones,
                                ofpacts_p, flow_table);
 
         match_init_catchall(&match);
@@ -452,6 +472,7 @@  consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name,
 
     int tag = 0;
     bool nested_container = false;
+    const struct sbrec_port_binding *parent_port = NULL;
     ofp_port_t ofport;
     bool is_remote = false;
     if (binding->parent_port && *binding->parent_port) {
@@ -463,6 +484,8 @@  consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name,
         if (ofport) {
             tag = *binding->tag;
             nested_container = true;
+            parent_port = lport_lookup_by_name(
+                sbrec_port_binding_by_name, binding->parent_port);
         }
     } else {
         ofport = u16_to_ofp(simap_get(&localvif_to_ofport,
@@ -523,7 +546,8 @@  consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name,
          */
 
         struct zone_ids zone_ids = get_zone_ids(binding, ct_zones);
-        put_local_common_flows(dp_key, port_key, nested_container, &zone_ids,
+        uint32_t parent_port_key = parent_port ? parent_port->tunnel_key : 0;
+        put_local_common_flows(dp_key, port_key, parent_port_key, &zone_ids,
                                ofpacts_p, flow_table);
 
         /* Table 0, Priority 150 and 100.
@@ -553,8 +577,10 @@  consider_port_binding(struct ovsdb_idl_index *sbrec_chassis_by_name,
             if (nested_container) {
                 /* When a packet comes from a container sitting behind a
                  * parent_port, we should let it loopback to other containers
-                 * or the parent_port itself. */
-                put_load(MLF_ALLOW_LOOPBACK, MFF_LOG_FLAGS, 0, 1, ofpacts_p);
+                 * or the parent_port itself. Indicate this by setting the
+                 * MLF_NESTED_CONTAINER_BIT in MFF_LOG_FLAGS.*/
+                put_load(1, MFF_LOG_FLAGS, MLF_NESTED_CONTAINER_BIT, 1,
+                         ofpacts_p);
             }
             ofpact_put_STRIP_VLAN(ofpacts_p);
         }
diff --git a/ovn/lib/logical-fields.h b/ovn/lib/logical-fields.h
index b1dbb035a..ab31327e9 100644
--- a/ovn/lib/logical-fields.h
+++ b/ovn/lib/logical-fields.h
@@ -50,6 +50,7 @@  enum mff_log_flags_bits {
     MLF_FORCE_SNAT_FOR_DNAT_BIT = 2,
     MLF_FORCE_SNAT_FOR_LB_BIT = 3,
     MLF_LOCAL_ONLY_BIT = 4,
+    MLF_NESTED_CONTAINER_BIT = 5,
 };
 
 /* MFF_LOG_FLAGS_REG flag assignments */
@@ -75,6 +76,9 @@  enum mff_log_flags {
      * hypervisors should instead only be output to local targets
      */
     MLF_LOCAL_ONLY = (1 << MLF_LOCAL_ONLY_BIT),
+
+    /* Indicate that a packet has received from a nested container. */
+    MLF_NESTED_CONTAINER = (1 << MLF_NESTED_CONTAINER_BIT),
 };
 
 #endif /* ovn/lib/logical-fields.h */
diff --git a/tests/ovn.at b/tests/ovn.at
index 769e09f81..a7aba5ccd 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -7038,6 +7038,17 @@  packet=${dst_mac}${src_mac}8100000208004500001c000000003f110100${src_ip}${dst_ip
 echo  $packet >> expected1
 OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
 
+# Send broadcast packet from foo1. foo1 should not receive the same packet.
+src_mac="f00000010205"
+dst_mac="ffffffffffff"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 255 255 255 255`
+packet=${dst_mac}${src_mac}8100000108004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive vm1 $packet
+
+# expected packet at VM1
+OVN_CHECK_PACKETS([hv1/vm1-tx.pcap], [expected1])
+
 OVN_CLEANUP([hv1],[hv2])
 
 AT_CLEANUP