diff mbox

[ovs-dev,v21,2/8] Persist ovn flow tables

Message ID 1467560133-5700-3-git-send-email-rmoats@us.ibm.com
State Changes Requested
Headers show

Commit Message

Ryan Moats July 3, 2016, 3:35 p.m. UTC
Ensure that ovn flow tables are persisted so that changes to
them chan be applied incrementally - this is a prereq for
making lflow_run and physical_run incremental.

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
---
 ovn/controller/lflow.c          |  26 ++--
 ovn/controller/lflow.h          |   3 +-
 ovn/controller/ofctrl.c         | 271 +++++++++++++++++++++++++++++-----------
 ovn/controller/ofctrl.h         |  18 ++-
 ovn/controller/ovn-controller.c |   9 +-
 ovn/controller/physical.c       |  59 +++++----
 ovn/controller/physical.h       |   2 +-
 7 files changed, 264 insertions(+), 124 deletions(-)

Comments

Ben Pfaff July 3, 2016, 10:40 p.m. UTC | #1
On Sun, Jul 03, 2016 at 10:35:27AM -0500, Ryan Moats wrote:
> Ensure that ovn flow tables are persisted so that changes to
> them chan be applied incrementally - this is a prereq for
> making lflow_run and physical_run incremental.
> 
> Signed-off-by: Ryan Moats <rmoats@us.ibm.com>

This time I think I've figured out properly what I'm concerned about.
Before, I think I was distracted enough by 'is_new' and the remaining
patches that I hadn't formulated it correctly yet.

ofctrl needs to support the following operations on physical flows:

    1. Add flow.
    2. Update flow.
    3. Remove flow.

lflow generates physical flows from logical flows in a one-to-many
fashion.  That is, a single logical flow can yield zero, one, or
multiple physical flows.  Other sources can yield physical flows too,
and ofctrl needs those sources to pretend that they are generated from
something similar enough to a logical flow that it can be uniquely
identified with a UUID.  All that makes sense.

Case #1, "add flow", for physical flows is implemented via
ofctrl_add_flow() with is_new == true.  This is straightforward enough.
Add the new flow to the flow table, add it to the list of physical flows
associated with the UUID passed in, suppress duplicates.

Case #3, "remove flow", for physical flows is implemented via
ofctrl_remove_flow().  This removes all the physical flows associated
with a UUID.  This is almost as straightforward.  The implementation
here does appear to mean that, if there is a bug in the logical flow
table such that two different logical flows generate physical flows with
the same match, then there is a behavioral change versus the previous
implementation: previously, one of the flows to be added in the main
loop would end up in the flow table, but after this change the removal
will not reveal an underlying flow but just delete it, at least until
something happens to fully refresh flows instead of just incrementally
adding and removing them.  That is an edge case; it might be necessary
to fix it but it is not the top priority.

Case #2, "update flow", is implemented via ofctrl_add_flow() with is_new
== false.  This is the one that I'm worried about and where I do not
understand the strategy.  What I'd expect is that the client would be
able to say, one way or another, "Here is the new set of physical flows
F2 associated with uuid U."  If some of the flows in F2 coincide with
the set of physical flows F1 that had previously been associated with U,
then it would make sense that nothing actually changes in the physical
flow table.  But there are many more possibilities.  For example, if F1
contains more flows than F2, then there needs to be a way to indicate
that some of the physical flows associated with U should be removed.
There is code in ovn_flow_lookup_by_uuid() that tries to tolerate some
kind of changes to flow matches from F1 to F2 (match fields that appear
or disappear), but I don't know a reason to believe that only those
changes can happen from F1 to F2; even after explanation, they look like
magic to me.

Maybe there is a belief here that a given Logical_Flow has some kind of
consistency, e.g. that its match and action, etc. do not change after
the logical flow is added.  That might be a useful assumption, and we
could enforce it if we made some (all?) Logical_Flow columns immutable.
But it is not currently guaranteed to be true: I can use ovn-sbctl (or
whatever) to modify all of the columns in a Logical_Flow row in
arbitrary ways.  This means that F1 and F2 might also have nothing in
common for a given Logical_Flow U.

Now let's go back to the edge case concern I expressed about case #3.
Thinking harder, I don't think it's so much of an edge case, at least
not to the extent that exhibiting it requires a buggy logical flow
table.  Suppose that a single transaction from ovn-northd deletes a
Logical_Flow with uuid U1 and adds a new Logical_Flow with uuid U2.
This will yield a call to ofctrl_remove_flow(U1) and (except in
pathological cases) one or more calls to ofctrl_add_flow(U2).  Suppose
that the sets of physical flows for U1 and U2 overlap in terms of their
key fields (table_id, pipeline, match, ...).  Then I believe that the
result will currently depend on the order of the calls to
ofctrl_remove_flow() and ofctrl_add_flow():

        * If the removal precedes all the adds, all is well.

        * If the removal follows any of the adds, the remove will
          falsely delete all the flows that should be added.

Does any of this ring a bell?

Thanks,

Ben.
Ryan Moats July 3, 2016, 11:08 p.m. UTC | #2
Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 05:40:59 PM:

> From: Ben Pfaff <blp@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: dev@openvswitch.org
> Date: 07/03/2016 05:41 PM
> Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
>
> On Sun, Jul 03, 2016 at 10:35:27AM -0500, Ryan Moats wrote:
> > Ensure that ovn flow tables are persisted so that changes to
> > them chan be applied incrementally - this is a prereq for
> > making lflow_run and physical_run incremental.
> >
> > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
>
> This time I think I've figured out properly what I'm concerned about.
> Before, I think I was distracted enough by 'is_new' and the remaining
> patches that I hadn't formulated it correctly yet.
>
> ofctrl needs to support the following operations on physical flows:
>
>     1. Add flow.
>     2. Update flow.
>     3. Remove flow.
>
> lflow generates physical flows from logical flows in a one-to-many
> fashion.  That is, a single logical flow can yield zero, one, or
> multiple physical flows.  Other sources can yield physical flows too,
> and ofctrl needs those sources to pretend that they are generated from
> something similar enough to a logical flow that it can be uniquely
> identified with a UUID.  All that makes sense.
>
> Case #1, "add flow", for physical flows is implemented via
> ofctrl_add_flow() with is_new == true.  This is straightforward enough.
> Add the new flow to the flow table, add it to the list of physical flows
> associated with the UUID passed in, suppress duplicates.
>
> Case #3, "remove flow", for physical flows is implemented via
> ofctrl_remove_flow().  This removes all the physical flows associated
> with a UUID.  This is almost as straightforward.  The implementation
> here does appear to mean that, if there is a bug in the logical flow
> table such that two different logical flows generate physical flows with
> the same match, then there is a behavioral change versus the previous
> implementation: previously, one of the flows to be added in the main
> loop would end up in the flow table, but after this change the removal
> will not reveal an underlying flow but just delete it, at least until
> something happens to fully refresh flows instead of just incrementally
> adding and removing them.  That is an edge case; it might be necessary
> to fix it but it is not the top priority.

I'm going to pull your text for case 3 up here and let's deal with that
first:

> Now let's go back to the edge case concern I expressed about case #3.
> Thinking harder, I don't think it's so much of an edge case, at least
> not to the extent that exhibiting it requires a buggy logical flow
> table.  Suppose that a single transaction from ovn-northd deletes a
> Logical_Flow with uuid U1 and adds a new Logical_Flow with uuid U2.
> This will yield a call to ofctrl_remove_flow(U1) and (except in
> pathological cases) one or more calls to ofctrl_add_flow(U2).  Suppose
> that the sets of physical flows for U1 and U2 overlap in terms of their
> key fields (table_id, pipeline, match, ...).  Then I believe that the
> result will currently depend on the order of the calls to
> ofctrl_remove_flow() and ofctrl_add_flow():
>
>         * If the removal precedes all the adds, all is well.
>
>         * If the removal follows any of the adds, the remove will
>           falsely delete all the flows that should be added.
>
> Does any of this ring a bell?

I see where your analysis is going, but I think the problem is on the
add side :) : On the delete side, the code checks for U1 and so none
of the U2 flows will be removed.  However, I see the problem on the add
side: because if U2 creates a duplicate flow to what's in U1, then
right now the U2 flow is discarded silently and after U1 is deleted,
the flow incorrectly disappears.  The simplest thing that comes to
mind is to make two passes through the changed flows: the first to
handle deletes and the second to handle adding/modifying flows.

> Case #2, "update flow", is implemented via ofctrl_add_flow() with is_new
> == false.  This is the one that I'm worried about and where I do not
> understand the strategy.  What I'd expect is that the client would be
> able to say, one way or another, "Here is the new set of physical flows
> F2 associated with uuid U."  If some of the flows in F2 coincide with
> the set of physical flows F1 that had previously been associated with U,
> then it would make sense that nothing actually changes in the physical
> flow table.  But there are many more possibilities.  For example, if F1
> contains more flows than F2, then there needs to be a way to indicate
> that some of the physical flows associated with U should be removed.
> There is code in ovn_flow_lookup_by_uuid() that tries to tolerate some
> kind of changes to flow matches from F1 to F2 (match fields that appear
> or disappear), but I don't know a reason to believe that only those
> changes can happen from F1 to F2; even after explanation, they look like
> magic to me.
>
> Maybe there is a belief here that a given Logical_Flow has some kind of
> consistency, e.g. that its match and action, etc. do not change after
> the logical flow is added.  That might be a useful assumption, and we
> could enforce it if we made some (all?) Logical_Flow columns immutable.
> But it is not currently guaranteed to be true: I can use ovn-sbctl (or
> whatever) to modify all of the columns in a Logical_Flow row in
> arbitrary ways.  This means that F1 and F2 might also have nothing in
> common for a given Logical_Flow U.

If that's not the case, then I think the simplest thing to do is to
delete all the existing physical flows for the modified logical flow
and then recreate them as if they were a new add. That sacrifices some
performance, but it should be correct :)

Now, the question that arises is how to handle the flows created in
physical.c, but I think that should be relatively simple to handle.

If this makes sense, I can throw together v22 with these changes...

Ryan
Ben Pfaff July 4, 2016, 12:19 a.m. UTC | #3
On Sun, Jul 03, 2016 at 06:08:23PM -0500, Ryan Moats wrote:
> Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 05:40:59 PM:
> 
> > From: Ben Pfaff <blp@ovn.org>
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: dev@openvswitch.org
> > Date: 07/03/2016 05:41 PM
> > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> >
> > On Sun, Jul 03, 2016 at 10:35:27AM -0500, Ryan Moats wrote:
> > > Ensure that ovn flow tables are persisted so that changes to
> > > them chan be applied incrementally - this is a prereq for
> > > making lflow_run and physical_run incremental.
> > >
> > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> >
> > This time I think I've figured out properly what I'm concerned about.
> > Before, I think I was distracted enough by 'is_new' and the remaining
> > patches that I hadn't formulated it correctly yet.
> >
> > ofctrl needs to support the following operations on physical flows:
> >
> >     1. Add flow.
> >     2. Update flow.
> >     3. Remove flow.
> >
> > lflow generates physical flows from logical flows in a one-to-many
> > fashion.  That is, a single logical flow can yield zero, one, or
> > multiple physical flows.  Other sources can yield physical flows too,
> > and ofctrl needs those sources to pretend that they are generated from
> > something similar enough to a logical flow that it can be uniquely
> > identified with a UUID.  All that makes sense.
> >
> > Case #1, "add flow", for physical flows is implemented via
> > ofctrl_add_flow() with is_new == true.  This is straightforward enough.
> > Add the new flow to the flow table, add it to the list of physical flows
> > associated with the UUID passed in, suppress duplicates.
> >
> > Case #3, "remove flow", for physical flows is implemented via
> > ofctrl_remove_flow().  This removes all the physical flows associated
> > with a UUID.  This is almost as straightforward.  The implementation
> > here does appear to mean that, if there is a bug in the logical flow
> > table such that two different logical flows generate physical flows with
> > the same match, then there is a behavioral change versus the previous
> > implementation: previously, one of the flows to be added in the main
> > loop would end up in the flow table, but after this change the removal
> > will not reveal an underlying flow but just delete it, at least until
> > something happens to fully refresh flows instead of just incrementally
> > adding and removing them.  That is an edge case; it might be necessary
> > to fix it but it is not the top priority.
> 
> I'm going to pull your text for case 3 up here and let's deal with that
> first:
> 
> > Now let's go back to the edge case concern I expressed about case #3.
> > Thinking harder, I don't think it's so much of an edge case, at least
> > not to the extent that exhibiting it requires a buggy logical flow
> > table.  Suppose that a single transaction from ovn-northd deletes a
> > Logical_Flow with uuid U1 and adds a new Logical_Flow with uuid U2.
> > This will yield a call to ofctrl_remove_flow(U1) and (except in
> > pathological cases) one or more calls to ofctrl_add_flow(U2).  Suppose
> > that the sets of physical flows for U1 and U2 overlap in terms of their
> > key fields (table_id, pipeline, match, ...).  Then I believe that the
> > result will currently depend on the order of the calls to
> > ofctrl_remove_flow() and ofctrl_add_flow():
> >
> >         * If the removal precedes all the adds, all is well.
> >
> >         * If the removal follows any of the adds, the remove will
> >           falsely delete all the flows that should be added.
> >
> > Does any of this ring a bell?
> 
> I see where your analysis is going, but I think the problem is on the
> add side :) : On the delete side, the code checks for U1 and so none
> of the U2 flows will be removed.  However, I see the problem on the add
> side: because if U2 creates a duplicate flow to what's in U1, then
> right now the U2 flow is discarded silently and after U1 is deleted,
> the flow incorrectly disappears.  The simplest thing that comes to
> mind is to make two passes through the changed flows: the first to
> handle deletes and the second to handle adding/modifying flows.

That will only make the problem harder to trigger.  It will work if the
only way that overlaps can come up is from within Logical_Flow itself.
If ovn-controller has two independent pieces of code that can generate
overlapping flows, then the same problem can arise again.  That is
unlikely today, but it will be sitting in the code like a time bomb
waiting to surprise us months or years down the road with unpredictable
behavior.  With this approach, the cure would be to make two passes
through the entire pipeline of code in ovn-controller that can generate
flows, the first pass to remove and the second pass to add.  That would
be a pain.

I suggest that we avoid going down that path by making ofctrl track
flows with duplicate keys, when those keys have different UUIDs.  This
could be done within an hmap, if struct ovn_flow would add an ovs_list
of duplicates (probably preferred).  We'd want to use some kind of
deterministic method to determine which one actually gets installed, so
as to avoid oscillation; for example, it could be done based on UUID
ordering.  This will add a little bit of code now, but I think that it's
well worth it for predictability and for making code harder to screw up
in subtle ways by ordering things wrong.

> > Case #2, "update flow", is implemented via ofctrl_add_flow() with is_new
> > == false.  This is the one that I'm worried about and where I do not
> > understand the strategy.  What I'd expect is that the client would be
> > able to say, one way or another, "Here is the new set of physical flows
> > F2 associated with uuid U."  If some of the flows in F2 coincide with
> > the set of physical flows F1 that had previously been associated with U,
> > then it would make sense that nothing actually changes in the physical
> > flow table.  But there are many more possibilities.  For example, if F1
> > contains more flows than F2, then there needs to be a way to indicate
> > that some of the physical flows associated with U should be removed.
> > There is code in ovn_flow_lookup_by_uuid() that tries to tolerate some
> > kind of changes to flow matches from F1 to F2 (match fields that appear
> > or disappear), but I don't know a reason to believe that only those
> > changes can happen from F1 to F2; even after explanation, they look like
> > magic to me.
> >
> > Maybe there is a belief here that a given Logical_Flow has some kind of
> > consistency, e.g. that its match and action, etc. do not change after
> > the logical flow is added.  That might be a useful assumption, and we
> > could enforce it if we made some (all?) Logical_Flow columns immutable.
> > But it is not currently guaranteed to be true: I can use ovn-sbctl (or
> > whatever) to modify all of the columns in a Logical_Flow row in
> > arbitrary ways.  This means that F1 and F2 might also have nothing in
> > common for a given Logical_Flow U.
> 
> If that's not the case, then I think the simplest thing to do is to
> delete all the existing physical flows for the modified logical flow
> and then recreate them as if they were a new add. That sacrifices some
> performance, but it should be correct :)
> 
> Now, the question that arises is how to handle the flows created in
> physical.c, but I think that should be relatively simple to handle.

My thought is that the API should change to something like this, where
"key" is (table_id, priority, match) and "value" is actions.  These
"key" and "value" names wouldn't actually be used this way but I find it
a good conceptualization:

    ofctrl_add_flow(uuid, key, value)

        Add a flow that maps from key to value.  If there's a duplicate
        key with a different uuid, add the flow anyway but use a
        deterministic rule (e.g. based on UUID order) to determine which
        one will be in the real flow table.  If there's a duplicate key
        with the same uuid, do nothing and log a warning.

    ofctrl_remove_flow(uuid)

        Remove all the flows that have the given uuid.

We could also add a shortcut for the case where a key maps to exactly
one value, e.g.:

    ofctl_set_flow(uuid, key, value)

        Equivalent to:

            ofctrl_remove_flow(uuid);
            ofctrl_add_flow(uuid, key, value);

        but easy to optimize for the case where nothing actually
        changes.

Thanks,

Ben.
Ryan Moats July 4, 2016, 1:03 a.m. UTC | #4
Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 07:19:11 PM:

> From: Ben Pfaff <blp@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: dev@openvswitch.org
> Date: 07/03/2016 07:19 PM
> Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
>
> On Sun, Jul 03, 2016 at 06:08:23PM -0500, Ryan Moats wrote:
> > Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 05:40:59 PM:
> >
> > > From: Ben Pfaff <blp@ovn.org>
> > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > Cc: dev@openvswitch.org
> > > Date: 07/03/2016 05:41 PM
> > > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> > >
> > > On Sun, Jul 03, 2016 at 10:35:27AM -0500, Ryan Moats wrote:
> > > > Ensure that ovn flow tables are persisted so that changes to
> > > > them chan be applied incrementally - this is a prereq for
> > > > making lflow_run and physical_run incremental.
> > > >
> > > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> > >
> > > This time I think I've figured out properly what I'm concerned about.
> > > Before, I think I was distracted enough by 'is_new' and the remaining
> > > patches that I hadn't formulated it correctly yet.
> > >
> > > ofctrl needs to support the following operations on physical flows:
> > >
> > >     1. Add flow.
> > >     2. Update flow.
> > >     3. Remove flow.
> > >
> > > lflow generates physical flows from logical flows in a one-to-many
> > > fashion.  That is, a single logical flow can yield zero, one, or
> > > multiple physical flows.  Other sources can yield physical flows too,
> > > and ofctrl needs those sources to pretend that they are generated
from
> > > something similar enough to a logical flow that it can be uniquely
> > > identified with a UUID.  All that makes sense.
> > >
> > > Case #1, "add flow", for physical flows is implemented via
> > > ofctrl_add_flow() with is_new == true.  This is straightforward
enough.
> > > Add the new flow to the flow table, add it to the list of physical
flows
> > > associated with the UUID passed in, suppress duplicates.
> > >
> > > Case #3, "remove flow", for physical flows is implemented via
> > > ofctrl_remove_flow().  This removes all the physical flows associated
> > > with a UUID.  This is almost as straightforward.  The implementation
> > > here does appear to mean that, if there is a bug in the logical flow
> > > table such that two different logical flows generate physical flows
with
> > > the same match, then there is a behavioral change versus the previous
> > > implementation: previously, one of the flows to be added in the main
> > > loop would end up in the flow table, but after this change the
removal
> > > will not reveal an underlying flow but just delete it, at least until
> > > something happens to fully refresh flows instead of just
incrementally
> > > adding and removing them.  That is an edge case; it might be
necessary
> > > to fix it but it is not the top priority.
> >
> > I'm going to pull your text for case 3 up here and let's deal with that
> > first:
> >
> > > Now let's go back to the edge case concern I expressed about case #3.
> > > Thinking harder, I don't think it's so much of an edge case, at least
> > > not to the extent that exhibiting it requires a buggy logical flow
> > > table.  Suppose that a single transaction from ovn-northd deletes a
> > > Logical_Flow with uuid U1 and adds a new Logical_Flow with uuid U2.
> > > This will yield a call to ofctrl_remove_flow(U1) and (except in
> > > pathological cases) one or more calls to ofctrl_add_flow(U2).
Suppose
> > > that the sets of physical flows for U1 and U2 overlap in terms of
their
> > > key fields (table_id, pipeline, match, ...).  Then I believe that the
> > > result will currently depend on the order of the calls to
> > > ofctrl_remove_flow() and ofctrl_add_flow():
> > >
> > >         * If the removal precedes all the adds, all is well.
> > >
> > >         * If the removal follows any of the adds, the remove will
> > >           falsely delete all the flows that should be added.
> > >
> > > Does any of this ring a bell?
> >
> > I see where your analysis is going, but I think the problem is on the
> > add side :) : On the delete side, the code checks for U1 and so none
> > of the U2 flows will be removed.  However, I see the problem on the add
> > side: because if U2 creates a duplicate flow to what's in U1, then
> > right now the U2 flow is discarded silently and after U1 is deleted,
> > the flow incorrectly disappears.  The simplest thing that comes to
> > mind is to make two passes through the changed flows: the first to
> > handle deletes and the second to handle adding/modifying flows.
>
> That will only make the problem harder to trigger.  It will work if the
> only way that overlaps can come up is from within Logical_Flow itself.
> If ovn-controller has two independent pieces of code that can generate
> overlapping flows, then the same problem can arise again.  That is
> unlikely today, but it will be sitting in the code like a time bomb
> waiting to surprise us months or years down the road with unpredictable
> behavior.  With this approach, the cure would be to make two passes
> through the entire pipeline of code in ovn-controller that can generate
> flows, the first pass to remove and the second pass to add.  That would
> be a pain.
>
> I suggest that we avoid going down that path by making ofctrl track
> flows with duplicate keys, when those keys have different UUIDs.  This
> could be done within an hmap, if struct ovn_flow would add an ovs_list
> of duplicates (probably preferred).  We'd want to use some kind of
> deterministic method to determine which one actually gets installed, so
> as to avoid oscillation; for example, it could be done based on UUID
> ordering.  This will add a little bit of code now, but I think that it's
> well worth it for predictability and for making code harder to screw up
> in subtle ways by ordering things wrong.
>
> > > Case #2, "update flow", is implemented via ofctrl_add_flow() with
is_new
> > > == false.  This is the one that I'm worried about and where I do not
> > > understand the strategy.  What I'd expect is that the client would be
> > > able to say, one way or another, "Here is the new set of physical
flows
> > > F2 associated with uuid U."  If some of the flows in F2 coincide with
> > > the set of physical flows F1 that had previously been associated with
U,
> > > then it would make sense that nothing actually changes in the
physical
> > > flow table.  But there are many more possibilities.  For example, if
F1
> > > contains more flows than F2, then there needs to be a way to indicate
> > > that some of the physical flows associated with U should be removed.
> > > There is code in ovn_flow_lookup_by_uuid() that tries to tolerate
some
> > > kind of changes to flow matches from F1 to F2 (match fields that
appear
> > > or disappear), but I don't know a reason to believe that only those
> > > changes can happen from F1 to F2; even after explanation, they look
like
> > > magic to me.
> > >
> > > Maybe there is a belief here that a given Logical_Flow has some kind
of
> > > consistency, e.g. that its match and action, etc. do not change after
> > > the logical flow is added.  That might be a useful assumption, and we
> > > could enforce it if we made some (all?) Logical_Flow columns
immutable.
> > > But it is not currently guaranteed to be true: I can use ovn-sbctl
(or
> > > whatever) to modify all of the columns in a Logical_Flow row in
> > > arbitrary ways.  This means that F1 and F2 might also have nothing in
> > > common for a given Logical_Flow U.
> >
> > If that's not the case, then I think the simplest thing to do is to
> > delete all the existing physical flows for the modified logical flow
> > and then recreate them as if they were a new add. That sacrifices some
> > performance, but it should be correct :)
> >
> > Now, the question that arises is how to handle the flows created in
> > physical.c, but I think that should be relatively simple to handle.
>
> My thought is that the API should change to something like this, where
> "key" is (table_id, priority, match) and "value" is actions.  These
> "key" and "value" names wouldn't actually be used this way but I find it
> a good conceptualization:
>
>     ofctrl_add_flow(uuid, key, value)
>
>         Add a flow that maps from key to value.  If there's a duplicate
>         key with a different uuid, add the flow anyway but use a
>         deterministic rule (e.g. based on UUID order) to determine which
>         one will be in the real flow table.  If there's a duplicate key
>         with the same uuid, do nothing and log a warning.
>
>     ofctrl_remove_flow(uuid)
>
>         Remove all the flows that have the given uuid.
>
> We could also add a shortcut for the case where a key maps to exactly
> one value, e.g.:
>
>     ofctl_set_flow(uuid, key, value)
>
>         Equivalent to:
>
>             ofctrl_remove_flow(uuid);
>             ofctrl_add_flow(uuid, key, value);
>
>         but easy to optimize for the case where nothing actually
>         changes.

I think we are converging, because I generally agree with the above APIs.
(I'm not quite sure when ovn_set_flow would get used, but I can put it
in).

I think we want the key to track both uuid/value pairs (you may be
thinking that already, but it wasn't 100% clear to me from your
text above). Given that we want a deterministic rule for picking
which value based on uuid if the key maps to more than one uuid/value
"pair", an hmap may be the answer, but I'll look around some more
to see if another structure strikes me as being a better candidate.

For now, I think UUID ordering is a fine rule for breaking ties
between uuid/value "pairs".

Obviously, this will take a bit of recoding, but I like this direction,
because I've had some of these items on my "how to address that in the
next pass" list and now they will be addressed...

Ryan
Ben Pfaff July 4, 2016, 4:51 a.m. UTC | #5
On Sun, Jul 03, 2016 at 08:03:54PM -0500, Ryan Moats wrote:
> Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 07:19:11 PM:
> 
> > From: Ben Pfaff <blp@ovn.org>
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: dev@openvswitch.org
> > Date: 07/03/2016 07:19 PM
> > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> >
> > On Sun, Jul 03, 2016 at 06:08:23PM -0500, Ryan Moats wrote:
> > > Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 05:40:59 PM:
> > >
> > > > From: Ben Pfaff <blp@ovn.org>
> > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > Cc: dev@openvswitch.org
> > > > Date: 07/03/2016 05:41 PM
> > > > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> > > >
> > > > On Sun, Jul 03, 2016 at 10:35:27AM -0500, Ryan Moats wrote:
> > > > > Ensure that ovn flow tables are persisted so that changes to
> > > > > them chan be applied incrementally - this is a prereq for
> > > > > making lflow_run and physical_run incremental.
> > > > >
> > > > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> > > >
> > > > This time I think I've figured out properly what I'm concerned about.
> > > > Before, I think I was distracted enough by 'is_new' and the remaining
> > > > patches that I hadn't formulated it correctly yet.
> > > >
> > > > ofctrl needs to support the following operations on physical flows:
> > > >
> > > >     1. Add flow.
> > > >     2. Update flow.
> > > >     3. Remove flow.
> > > >
> > > > lflow generates physical flows from logical flows in a one-to-many
> > > > fashion.  That is, a single logical flow can yield zero, one, or
> > > > multiple physical flows.  Other sources can yield physical flows too,
> > > > and ofctrl needs those sources to pretend that they are generated
> from
> > > > something similar enough to a logical flow that it can be uniquely
> > > > identified with a UUID.  All that makes sense.
> > > >
> > > > Case #1, "add flow", for physical flows is implemented via
> > > > ofctrl_add_flow() with is_new == true.  This is straightforward
> enough.
> > > > Add the new flow to the flow table, add it to the list of physical
> flows
> > > > associated with the UUID passed in, suppress duplicates.
> > > >
> > > > Case #3, "remove flow", for physical flows is implemented via
> > > > ofctrl_remove_flow().  This removes all the physical flows associated
> > > > with a UUID.  This is almost as straightforward.  The implementation
> > > > here does appear to mean that, if there is a bug in the logical flow
> > > > table such that two different logical flows generate physical flows
> with
> > > > the same match, then there is a behavioral change versus the previous
> > > > implementation: previously, one of the flows to be added in the main
> > > > loop would end up in the flow table, but after this change the
> removal
> > > > will not reveal an underlying flow but just delete it, at least until
> > > > something happens to fully refresh flows instead of just
> incrementally
> > > > adding and removing them.  That is an edge case; it might be
> necessary
> > > > to fix it but it is not the top priority.
> > >
> > > I'm going to pull your text for case 3 up here and let's deal with that
> > > first:
> > >
> > > > Now let's go back to the edge case concern I expressed about case #3.
> > > > Thinking harder, I don't think it's so much of an edge case, at least
> > > > not to the extent that exhibiting it requires a buggy logical flow
> > > > table.  Suppose that a single transaction from ovn-northd deletes a
> > > > Logical_Flow with uuid U1 and adds a new Logical_Flow with uuid U2.
> > > > This will yield a call to ofctrl_remove_flow(U1) and (except in
> > > > pathological cases) one or more calls to ofctrl_add_flow(U2).
> Suppose
> > > > that the sets of physical flows for U1 and U2 overlap in terms of
> their
> > > > key fields (table_id, pipeline, match, ...).  Then I believe that the
> > > > result will currently depend on the order of the calls to
> > > > ofctrl_remove_flow() and ofctrl_add_flow():
> > > >
> > > >         * If the removal precedes all the adds, all is well.
> > > >
> > > >         * If the removal follows any of the adds, the remove will
> > > >           falsely delete all the flows that should be added.
> > > >
> > > > Does any of this ring a bell?
> > >
> > > I see where your analysis is going, but I think the problem is on the
> > > add side :) : On the delete side, the code checks for U1 and so none
> > > of the U2 flows will be removed.  However, I see the problem on the add
> > > side: because if U2 creates a duplicate flow to what's in U1, then
> > > right now the U2 flow is discarded silently and after U1 is deleted,
> > > the flow incorrectly disappears.  The simplest thing that comes to
> > > mind is to make two passes through the changed flows: the first to
> > > handle deletes and the second to handle adding/modifying flows.
> >
> > That will only make the problem harder to trigger.  It will work if the
> > only way that overlaps can come up is from within Logical_Flow itself.
> > If ovn-controller has two independent pieces of code that can generate
> > overlapping flows, then the same problem can arise again.  That is
> > unlikely today, but it will be sitting in the code like a time bomb
> > waiting to surprise us months or years down the road with unpredictable
> > behavior.  With this approach, the cure would be to make two passes
> > through the entire pipeline of code in ovn-controller that can generate
> > flows, the first pass to remove and the second pass to add.  That would
> > be a pain.
> >
> > I suggest that we avoid going down that path by making ofctrl track
> > flows with duplicate keys, when those keys have different UUIDs.  This
> > could be done within an hmap, if struct ovn_flow would add an ovs_list
> > of duplicates (probably preferred).  We'd want to use some kind of
> > deterministic method to determine which one actually gets installed, so
> > as to avoid oscillation; for example, it could be done based on UUID
> > ordering.  This will add a little bit of code now, but I think that it's
> > well worth it for predictability and for making code harder to screw up
> > in subtle ways by ordering things wrong.
> >
> > > > Case #2, "update flow", is implemented via ofctrl_add_flow() with
> is_new
> > > > == false.  This is the one that I'm worried about and where I do not
> > > > understand the strategy.  What I'd expect is that the client would be
> > > > able to say, one way or another, "Here is the new set of physical
> flows
> > > > F2 associated with uuid U."  If some of the flows in F2 coincide with
> > > > the set of physical flows F1 that had previously been associated with
> U,
> > > > then it would make sense that nothing actually changes in the
> physical
> > > > flow table.  But there are many more possibilities.  For example, if
> F1
> > > > contains more flows than F2, then there needs to be a way to indicate
> > > > that some of the physical flows associated with U should be removed.
> > > > There is code in ovn_flow_lookup_by_uuid() that tries to tolerate
> some
> > > > kind of changes to flow matches from F1 to F2 (match fields that
> appear
> > > > or disappear), but I don't know a reason to believe that only those
> > > > changes can happen from F1 to F2; even after explanation, they look
> like
> > > > magic to me.
> > > >
> > > > Maybe there is a belief here that a given Logical_Flow has some kind
> of
> > > > consistency, e.g. that its match and action, etc. do not change after
> > > > the logical flow is added.  That might be a useful assumption, and we
> > > > could enforce it if we made some (all?) Logical_Flow columns
> immutable.
> > > > But it is not currently guaranteed to be true: I can use ovn-sbctl
> (or
> > > > whatever) to modify all of the columns in a Logical_Flow row in
> > > > arbitrary ways.  This means that F1 and F2 might also have nothing in
> > > > common for a given Logical_Flow U.
> > >
> > > If that's not the case, then I think the simplest thing to do is to
> > > delete all the existing physical flows for the modified logical flow
> > > and then recreate them as if they were a new add. That sacrifices some
> > > performance, but it should be correct :)
> > >
> > > Now, the question that arises is how to handle the flows created in
> > > physical.c, but I think that should be relatively simple to handle.
> >
> > My thought is that the API should change to something like this, where
> > "key" is (table_id, priority, match) and "value" is actions.  These
> > "key" and "value" names wouldn't actually be used this way but I find it
> > a good conceptualization:
> >
> >     ofctrl_add_flow(uuid, key, value)
> >
> >         Add a flow that maps from key to value.  If there's a duplicate
> >         key with a different uuid, add the flow anyway but use a
> >         deterministic rule (e.g. based on UUID order) to determine which
> >         one will be in the real flow table.  If there's a duplicate key
> >         with the same uuid, do nothing and log a warning.
> >
> >     ofctrl_remove_flow(uuid)
> >
> >         Remove all the flows that have the given uuid.
> >
> > We could also add a shortcut for the case where a key maps to exactly
> > one value, e.g.:
> >
> >     ofctl_set_flow(uuid, key, value)
> >
> >         Equivalent to:
> >
> >             ofctrl_remove_flow(uuid);
> >             ofctrl_add_flow(uuid, key, value);
> >
> >         but easy to optimize for the case where nothing actually
> >         changes.
> 
> I think we are converging, because I generally agree with the above APIs.
> (I'm not quite sure when ovn_set_flow would get used, but I can put it
> in).

I think that there are a few cases where a UUID is associated with
exactly one physical flow; consider_neighbor_flow() might be an
example.  Maybe this is premature optimization, dunno.

> I think we want the key to track both uuid/value pairs (you may be
> thinking that already, but it wasn't 100% clear to me from your
> text above). Given that we want a deterministic rule for picking
> which value based on uuid if the key maps to more than one uuid/value
> "pair", an hmap may be the answer, but I'll look around some more
> to see if another structure strikes me as being a better candidate.

Here's a summary of the data structure model I have in mind.  It is
really not so far from what you've implemented already:

        one-to-many map from uuid to ovn_flow
        one-to-many map from (table_id,priority,match) to ovn_flow

where the latter map should ordinarily be one-to-one in steady state but
might have duplicates either intermittently while doing adds and removes
within a given iteration of the poll loop or more permanently if there's
a bug in, e.g., the logical flow table output by ovn-northd.

> For now, I think UUID ordering is a fine rule for breaking ties
> between uuid/value "pairs".

Seems fine to start.

> Obviously, this will take a bit of recoding, but I like this direction,
> because I've had some of these items on my "how to address that in the
> next pass" list and now they will be addressed...

Great.
Ryan Moats July 4, 2016, 10:20 p.m. UTC | #6
Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 11:51:17 PM:

> From: Ben Pfaff <blp@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: dev@openvswitch.org
> Date: 07/03/2016 11:51 PM
> Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
>
> On Sun, Jul 03, 2016 at 08:03:54PM -0500, Ryan Moats wrote:
> > Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 07:19:11 PM:
> >
> > > From: Ben Pfaff <blp@ovn.org>
> > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > Cc: dev@openvswitch.org
> > > Date: 07/03/2016 07:19 PM
> > > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> > >
> > > On Sun, Jul 03, 2016 at 06:08:23PM -0500, Ryan Moats wrote:
> > > > Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 05:40:59 PM:
> > > >
> > > > > From: Ben Pfaff <blp@ovn.org>
> > > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > > Cc: dev@openvswitch.org
> > > > > Date: 07/03/2016 05:41 PM
> > > > > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> > > > >
> > > > > On Sun, Jul 03, 2016 at 10:35:27AM -0500, Ryan Moats wrote:
> > > > > > Ensure that ovn flow tables are persisted so that changes to
> > > > > > them chan be applied incrementally - this is a prereq for
> > > > > > making lflow_run and physical_run incremental.
> > > > > >
> > > > > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> > > > >
> > > > > This time I think I've figured out properly what I'm concerned
about.
> > > > > Before, I think I was distracted enough by 'is_new' and the
remaining
> > > > > patches that I hadn't formulated it correctly yet.
> > > > >
> > > > > ofctrl needs to support the following operations on physical
flows:
> > > > >
> > > > >     1. Add flow.
> > > > >     2. Update flow.
> > > > >     3. Remove flow.
> > > > >
> > > > > lflow generates physical flows from logical flows in a
one-to-many
> > > > > fashion.  That is, a single logical flow can yield zero, one, or
> > > > > multiple physical flows.  Other sources can yield physical flows
too,
> > > > > and ofctrl needs those sources to pretend that they are generated
> > from
> > > > > something similar enough to a logical flow that it can be
uniquely
> > > > > identified with a UUID.  All that makes sense.
> > > > >
> > > > > Case #1, "add flow", for physical flows is implemented via
> > > > > ofctrl_add_flow() with is_new == true.  This is straightforward
> > enough.
> > > > > Add the new flow to the flow table, add it to the list of
physical
> > flows
> > > > > associated with the UUID passed in, suppress duplicates.
> > > > >
> > > > > Case #3, "remove flow", for physical flows is implemented via
> > > > > ofctrl_remove_flow().  This removes all the physical flows
associated
> > > > > with a UUID.  This is almost as straightforward.  The
implementation
> > > > > here does appear to mean that, if there is a bug in the logical
flow
> > > > > table such that two different logical flows generate physical
flows
> > with
> > > > > the same match, then there is a behavioral change versus the
previous
> > > > > implementation: previously, one of the flows to be added in the
main
> > > > > loop would end up in the flow table, but after this change the
> > removal
> > > > > will not reveal an underlying flow but just delete it, at least
until
> > > > > something happens to fully refresh flows instead of just
> > incrementally
> > > > > adding and removing them.  That is an edge case; it might be
> > necessary
> > > > > to fix it but it is not the top priority.
> > > >
> > > > I'm going to pull your text for case 3 up here and let's deal with
that
> > > > first:
> > > >
> > > > > Now let's go back to the edge case concern I expressed about case
#3.
> > > > > Thinking harder, I don't think it's so much of an edge case, at
least
> > > > > not to the extent that exhibiting it requires a buggy logical
flow
> > > > > table.  Suppose that a single transaction from ovn-northd deletes
a
> > > > > Logical_Flow with uuid U1 and adds a new Logical_Flow with uuid
U2.
> > > > > This will yield a call to ofctrl_remove_flow(U1) and (except in
> > > > > pathological cases) one or more calls to ofctrl_add_flow(U2).
> > Suppose
> > > > > that the sets of physical flows for U1 and U2 overlap in terms of
> > their
> > > > > key fields (table_id, pipeline, match, ...).  Then I believe that
the
> > > > > result will currently depend on the order of the calls to
> > > > > ofctrl_remove_flow() and ofctrl_add_flow():
> > > > >
> > > > >         * If the removal precedes all the adds, all is well.
> > > > >
> > > > >         * If the removal follows any of the adds, the remove will
> > > > >           falsely delete all the flows that should be added.
> > > > >
> > > > > Does any of this ring a bell?
> > > >
> > > > I see where your analysis is going, but I think the problem is on
the
> > > > add side :) : On the delete side, the code checks for U1 and so
none
> > > > of the U2 flows will be removed.  However, I see the problem on the
add
> > > > side: because if U2 creates a duplicate flow to what's in U1, then
> > > > right now the U2 flow is discarded silently and after U1 is
deleted,
> > > > the flow incorrectly disappears.  The simplest thing that comes to
> > > > mind is to make two passes through the changed flows: the first to
> > > > handle deletes and the second to handle adding/modifying flows.
> > >
> > > That will only make the problem harder to trigger.  It will work if
the
> > > only way that overlaps can come up is from within Logical_Flow
itself.
> > > If ovn-controller has two independent pieces of code that can
generate
> > > overlapping flows, then the same problem can arise again.  That is
> > > unlikely today, but it will be sitting in the code like a time bomb
> > > waiting to surprise us months or years down the road with
unpredictable
> > > behavior.  With this approach, the cure would be to make two passes
> > > through the entire pipeline of code in ovn-controller that can
generate
> > > flows, the first pass to remove and the second pass to add.  That
would
> > > be a pain.
> > >
> > > I suggest that we avoid going down that path by making ofctrl track
> > > flows with duplicate keys, when those keys have different UUIDs.
This
> > > could be done within an hmap, if struct ovn_flow would add an
ovs_list
> > > of duplicates (probably preferred).  We'd want to use some kind of
> > > deterministic method to determine which one actually gets installed,
so
> > > as to avoid oscillation; for example, it could be done based on UUID
> > > ordering.  This will add a little bit of code now, but I think that
it's
> > > well worth it for predictability and for making code harder to screw
up
> > > in subtle ways by ordering things wrong.
> > >
> > > > > Case #2, "update flow", is implemented via ofctrl_add_flow() with
> > is_new
> > > > > == false.  This is the one that I'm worried about and where I do
not
> > > > > understand the strategy.  What I'd expect is that the client
would be
> > > > > able to say, one way or another, "Here is the new set of physical
> > flows
> > > > > F2 associated with uuid U."  If some of the flows in F2 coincide
with
> > > > > the set of physical flows F1 that had previously been associated
with
> > U,
> > > > > then it would make sense that nothing actually changes in the
> > physical
> > > > > flow table.  But there are many more possibilities.  For example,
if
> > F1
> > > > > contains more flows than F2, then there needs to be a way to
indicate
> > > > > that some of the physical flows associated with U should be
removed.
> > > > > There is code in ovn_flow_lookup_by_uuid() that tries to tolerate
> > some
> > > > > kind of changes to flow matches from F1 to F2 (match fields that
> > appear
> > > > > or disappear), but I don't know a reason to believe that only
those
> > > > > changes can happen from F1 to F2; even after explanation, they
look
> > like
> > > > > magic to me.
> > > > >
> > > > > Maybe there is a belief here that a given Logical_Flow has some
kind
> > of
> > > > > consistency, e.g. that its match and action, etc. do not change
after
> > > > > the logical flow is added.  That might be a useful assumption,
and we
> > > > > could enforce it if we made some (all?) Logical_Flow columns
> > immutable.
> > > > > But it is not currently guaranteed to be true: I can use
ovn-sbctl
> > (or
> > > > > whatever) to modify all of the columns in a Logical_Flow row in
> > > > > arbitrary ways.  This means that F1 and F2 might also have
nothing in
> > > > > common for a given Logical_Flow U.
> > > >
> > > > If that's not the case, then I think the simplest thing to do is to
> > > > delete all the existing physical flows for the modified logical
flow
> > > > and then recreate them as if they were a new add. That sacrifices
some
> > > > performance, but it should be correct :)
> > > >
> > > > Now, the question that arises is how to handle the flows created in
> > > > physical.c, but I think that should be relatively simple to handle.
> > >
> > > My thought is that the API should change to something like this,
where
> > > "key" is (table_id, priority, match) and "value" is actions.  These
> > > "key" and "value" names wouldn't actually be used this way but I find
it
> > > a good conceptualization:
> > >
> > >     ofctrl_add_flow(uuid, key, value)
> > >
> > >         Add a flow that maps from key to value.  If there's a
duplicate
> > >         key with a different uuid, add the flow anyway but use a
> > >         deterministic rule (e.g. based on UUID order) to determine
which
> > >         one will be in the real flow table.  If there's a duplicate
key
> > >         with the same uuid, do nothing and log a warning.
> > >
> > >     ofctrl_remove_flow(uuid)
> > >
> > >         Remove all the flows that have the given uuid.
> > >
> > > We could also add a shortcut for the case where a key maps to exactly
> > > one value, e.g.:
> > >
> > >     ofctl_set_flow(uuid, key, value)
> > >
> > >         Equivalent to:
> > >
> > >             ofctrl_remove_flow(uuid);
> > >             ofctrl_add_flow(uuid, key, value);
> > >
> > >         but easy to optimize for the case where nothing actually
> > >         changes.
> >
> > I think we are converging, because I generally agree with the above
APIs.
> > (I'm not quite sure when ovn_set_flow would get used, but I can put it
> > in).
>
> I think that there are a few cases where a UUID is associated with
> exactly one physical flow; consider_neighbor_flow() might be an
> example.  Maybe this is premature optimization, dunno.
>
> > I think we want the key to track both uuid/value pairs (you may be
> > thinking that already, but it wasn't 100% clear to me from your
> > text above). Given that we want a deterministic rule for picking
> > which value based on uuid if the key maps to more than one uuid/value
> > "pair", an hmap may be the answer, but I'll look around some more
> > to see if another structure strikes me as being a better candidate.
>
> Here's a summary of the data structure model I have in mind.  It is
> really not so far from what you've implemented already:
>
>         one-to-many map from uuid to ovn_flow
>         one-to-many map from (table_id,priority,match) to ovn_flow
>
> where the latter map should ordinarily be one-to-one in steady state but
> might have duplicates either intermittently while doing adds and removes
> within a given iteration of the poll loop or more permanently if there's
> a bug in, e.g., the logical flow table output by ovn-northd.
>
> > For now, I think UUID ordering is a fine rule for breaking ties
> > between uuid/value "pairs".
>
> Seems fine to start.
>
> > Obviously, this will take a bit of recoding, but I like this direction,
> > because I've had some of these items on my "how to address that in the
> > next pass" list and now they will be addressed...
>
> Great.

Surprise!  It turned out to be simpler than I thought it would be, since
(as you said) it wasn't really that far from what was already there.

Anyway, v22 is posted and I believe it does everything we discussed
above...

Safe travels,
Ryan
Ben Pfaff July 5, 2016, 1:01 a.m. UTC | #7
On Mon, Jul 04, 2016 at 05:20:38PM -0500, Ryan Moats wrote:
> Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 11:51:17 PM:
> 
> > From: Ben Pfaff <blp@ovn.org>
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: dev@openvswitch.org
> > Date: 07/03/2016 11:51 PM
> > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> >
> > On Sun, Jul 03, 2016 at 08:03:54PM -0500, Ryan Moats wrote:
> > > Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 07:19:11 PM:
> > >
> > > > From: Ben Pfaff <blp@ovn.org>
> > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > Cc: dev@openvswitch.org
> > > > Date: 07/03/2016 07:19 PM
> > > > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> > > >
> > > > On Sun, Jul 03, 2016 at 06:08:23PM -0500, Ryan Moats wrote:
> > > > > Ben Pfaff <blp@ovn.org> wrote on 07/03/2016 05:40:59 PM:
> > > > >
> > > > > > From: Ben Pfaff <blp@ovn.org>
> > > > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > > > Cc: dev@openvswitch.org
> > > > > > Date: 07/03/2016 05:41 PM
> > > > > > Subject: Re: [ovs-dev,v21,2/8] Persist ovn flow tables
> > > > > >
> > > > > > On Sun, Jul 03, 2016 at 10:35:27AM -0500, Ryan Moats wrote:
> > > > > > > Ensure that ovn flow tables are persisted so that changes to
> > > > > > > them chan be applied incrementally - this is a prereq for
> > > > > > > making lflow_run and physical_run incremental.
> > > > > > >
> > > > > > > Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
> > > > > >
> > > > > > This time I think I've figured out properly what I'm concerned
> about.
> > > > > > Before, I think I was distracted enough by 'is_new' and the
> remaining
> > > > > > patches that I hadn't formulated it correctly yet.
> > > > > >
> > > > > > ofctrl needs to support the following operations on physical
> flows:
> > > > > >
> > > > > >     1. Add flow.
> > > > > >     2. Update flow.
> > > > > >     3. Remove flow.
> > > > > >
> > > > > > lflow generates physical flows from logical flows in a
> one-to-many
> > > > > > fashion.  That is, a single logical flow can yield zero, one, or
> > > > > > multiple physical flows.  Other sources can yield physical flows
> too,
> > > > > > and ofctrl needs those sources to pretend that they are generated
> > > from
> > > > > > something similar enough to a logical flow that it can be
> uniquely
> > > > > > identified with a UUID.  All that makes sense.
> > > > > >
> > > > > > Case #1, "add flow", for physical flows is implemented via
> > > > > > ofctrl_add_flow() with is_new == true.  This is straightforward
> > > enough.
> > > > > > Add the new flow to the flow table, add it to the list of
> physical
> > > flows
> > > > > > associated with the UUID passed in, suppress duplicates.
> > > > > >
> > > > > > Case #3, "remove flow", for physical flows is implemented via
> > > > > > ofctrl_remove_flow().  This removes all the physical flows
> associated
> > > > > > with a UUID.  This is almost as straightforward.  The
> implementation
> > > > > > here does appear to mean that, if there is a bug in the logical
> flow
> > > > > > table such that two different logical flows generate physical
> flows
> > > with
> > > > > > the same match, then there is a behavioral change versus the
> previous
> > > > > > implementation: previously, one of the flows to be added in the
> main
> > > > > > loop would end up in the flow table, but after this change the
> > > removal
> > > > > > will not reveal an underlying flow but just delete it, at least
> until
> > > > > > something happens to fully refresh flows instead of just
> > > incrementally
> > > > > > adding and removing them.  That is an edge case; it might be
> > > necessary
> > > > > > to fix it but it is not the top priority.
> > > > >
> > > > > I'm going to pull your text for case 3 up here and let's deal with
> that
> > > > > first:
> > > > >
> > > > > > Now let's go back to the edge case concern I expressed about case
> #3.
> > > > > > Thinking harder, I don't think it's so much of an edge case, at
> least
> > > > > > not to the extent that exhibiting it requires a buggy logical
> flow
> > > > > > table.  Suppose that a single transaction from ovn-northd deletes
> a
> > > > > > Logical_Flow with uuid U1 and adds a new Logical_Flow with uuid
> U2.
> > > > > > This will yield a call to ofctrl_remove_flow(U1) and (except in
> > > > > > pathological cases) one or more calls to ofctrl_add_flow(U2).
> > > Suppose
> > > > > > that the sets of physical flows for U1 and U2 overlap in terms of
> > > their
> > > > > > key fields (table_id, pipeline, match, ...).  Then I believe that
> the
> > > > > > result will currently depend on the order of the calls to
> > > > > > ofctrl_remove_flow() and ofctrl_add_flow():
> > > > > >
> > > > > >         * If the removal precedes all the adds, all is well.
> > > > > >
> > > > > >         * If the removal follows any of the adds, the remove will
> > > > > >           falsely delete all the flows that should be added.
> > > > > >
> > > > > > Does any of this ring a bell?
> > > > >
> > > > > I see where your analysis is going, but I think the problem is on
> the
> > > > > add side :) : On the delete side, the code checks for U1 and so
> none
> > > > > of the U2 flows will be removed.  However, I see the problem on the
> add
> > > > > side: because if U2 creates a duplicate flow to what's in U1, then
> > > > > right now the U2 flow is discarded silently and after U1 is
> deleted,
> > > > > the flow incorrectly disappears.  The simplest thing that comes to
> > > > > mind is to make two passes through the changed flows: the first to
> > > > > handle deletes and the second to handle adding/modifying flows.
> > > >
> > > > That will only make the problem harder to trigger.  It will work if
> the
> > > > only way that overlaps can come up is from within Logical_Flow
> itself.
> > > > If ovn-controller has two independent pieces of code that can
> generate
> > > > overlapping flows, then the same problem can arise again.  That is
> > > > unlikely today, but it will be sitting in the code like a time bomb
> > > > waiting to surprise us months or years down the road with
> unpredictable
> > > > behavior.  With this approach, the cure would be to make two passes
> > > > through the entire pipeline of code in ovn-controller that can
> generate
> > > > flows, the first pass to remove and the second pass to add.  That
> would
> > > > be a pain.
> > > >
> > > > I suggest that we avoid going down that path by making ofctrl track
> > > > flows with duplicate keys, when those keys have different UUIDs.
> This
> > > > could be done within an hmap, if struct ovn_flow would add an
> ovs_list
> > > > of duplicates (probably preferred).  We'd want to use some kind of
> > > > deterministic method to determine which one actually gets installed,
> so
> > > > as to avoid oscillation; for example, it could be done based on UUID
> > > > ordering.  This will add a little bit of code now, but I think that
> it's
> > > > well worth it for predictability and for making code harder to screw
> up
> > > > in subtle ways by ordering things wrong.
> > > >
> > > > > > Case #2, "update flow", is implemented via ofctrl_add_flow() with
> > > is_new
> > > > > > == false.  This is the one that I'm worried about and where I do
> not
> > > > > > understand the strategy.  What I'd expect is that the client
> would be
> > > > > > able to say, one way or another, "Here is the new set of physical
> > > flows
> > > > > > F2 associated with uuid U."  If some of the flows in F2 coincide
> with
> > > > > > the set of physical flows F1 that had previously been associated
> with
> > > U,
> > > > > > then it would make sense that nothing actually changes in the
> > > physical
> > > > > > flow table.  But there are many more possibilities.  For example,
> if
> > > F1
> > > > > > contains more flows than F2, then there needs to be a way to
> indicate
> > > > > > that some of the physical flows associated with U should be
> removed.
> > > > > > There is code in ovn_flow_lookup_by_uuid() that tries to tolerate
> > > some
> > > > > > kind of changes to flow matches from F1 to F2 (match fields that
> > > appear
> > > > > > or disappear), but I don't know a reason to believe that only
> those
> > > > > > changes can happen from F1 to F2; even after explanation, they
> look
> > > like
> > > > > > magic to me.
> > > > > >
> > > > > > Maybe there is a belief here that a given Logical_Flow has some
> kind
> > > of
> > > > > > consistency, e.g. that its match and action, etc. do not change
> after
> > > > > > the logical flow is added.  That might be a useful assumption,
> and we
> > > > > > could enforce it if we made some (all?) Logical_Flow columns
> > > immutable.
> > > > > > But it is not currently guaranteed to be true: I can use
> ovn-sbctl
> > > (or
> > > > > > whatever) to modify all of the columns in a Logical_Flow row in
> > > > > > arbitrary ways.  This means that F1 and F2 might also have
> nothing in
> > > > > > common for a given Logical_Flow U.
> > > > >
> > > > > If that's not the case, then I think the simplest thing to do is to
> > > > > delete all the existing physical flows for the modified logical
> flow
> > > > > and then recreate them as if they were a new add. That sacrifices
> some
> > > > > performance, but it should be correct :)
> > > > >
> > > > > Now, the question that arises is how to handle the flows created in
> > > > > physical.c, but I think that should be relatively simple to handle.
> > > >
> > > > My thought is that the API should change to something like this,
> where
> > > > "key" is (table_id, priority, match) and "value" is actions.  These
> > > > "key" and "value" names wouldn't actually be used this way but I find
> it
> > > > a good conceptualization:
> > > >
> > > >     ofctrl_add_flow(uuid, key, value)
> > > >
> > > >         Add a flow that maps from key to value.  If there's a
> duplicate
> > > >         key with a different uuid, add the flow anyway but use a
> > > >         deterministic rule (e.g. based on UUID order) to determine
> which
> > > >         one will be in the real flow table.  If there's a duplicate
> key
> > > >         with the same uuid, do nothing and log a warning.
> > > >
> > > >     ofctrl_remove_flow(uuid)
> > > >
> > > >         Remove all the flows that have the given uuid.
> > > >
> > > > We could also add a shortcut for the case where a key maps to exactly
> > > > one value, e.g.:
> > > >
> > > >     ofctl_set_flow(uuid, key, value)
> > > >
> > > >         Equivalent to:
> > > >
> > > >             ofctrl_remove_flow(uuid);
> > > >             ofctrl_add_flow(uuid, key, value);
> > > >
> > > >         but easy to optimize for the case where nothing actually
> > > >         changes.
> > >
> > > I think we are converging, because I generally agree with the above
> APIs.
> > > (I'm not quite sure when ovn_set_flow would get used, but I can put it
> > > in).
> >
> > I think that there are a few cases where a UUID is associated with
> > exactly one physical flow; consider_neighbor_flow() might be an
> > example.  Maybe this is premature optimization, dunno.
> >
> > > I think we want the key to track both uuid/value pairs (you may be
> > > thinking that already, but it wasn't 100% clear to me from your
> > > text above). Given that we want a deterministic rule for picking
> > > which value based on uuid if the key maps to more than one uuid/value
> > > "pair", an hmap may be the answer, but I'll look around some more
> > > to see if another structure strikes me as being a better candidate.
> >
> > Here's a summary of the data structure model I have in mind.  It is
> > really not so far from what you've implemented already:
> >
> >         one-to-many map from uuid to ovn_flow
> >         one-to-many map from (table_id,priority,match) to ovn_flow
> >
> > where the latter map should ordinarily be one-to-one in steady state but
> > might have duplicates either intermittently while doing adds and removes
> > within a given iteration of the poll loop or more permanently if there's
> > a bug in, e.g., the logical flow table output by ovn-northd.
> >
> > > For now, I think UUID ordering is a fine rule for breaking ties
> > > between uuid/value "pairs".
> >
> > Seems fine to start.
> >
> > > Obviously, this will take a bit of recoding, but I like this direction,
> > > because I've had some of these items on my "how to address that in the
> > > next pass" list and now they will be addressed...
> >
> > Great.
> 
> Surprise!  It turned out to be simpler than I thought it would be, since
> (as you said) it wasn't really that far from what was already there.
> 
> Anyway, v22 is posted and I believe it does everything we discussed
> above...

Great, thanks!
diff mbox

Patch

diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 7eebac8..6e2d5e3 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -343,13 +343,13 @@  is_switch(const struct sbrec_datapath_binding *ldp)
 
 }
 
-/* Adds the logical flows from the Logical_Flow table to 'flow_table'. */
+/* Adds the logical flows from the Logical_Flow table to flow tables. */
 static void
 add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
                   const struct mcgroup_index *mcgroups,
                   const struct hmap *local_datapaths,
                   const struct hmap *patched_datapaths,
-                  const struct simap *ct_zones, struct hmap *flow_table)
+                  const struct simap *ct_zones)
 {
     uint32_t conj_id_ofs = 1;
 
@@ -489,8 +489,8 @@  add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
                 m->match.flow.conj_id += conj_id_ofs;
             }
             if (!m->n) {
-                ofctrl_add_flow(flow_table, ptable, lflow->priority,
-                                &m->match, &ofpacts);
+                ofctrl_add_flow(ptable, lflow->priority, &m->match, &ofpacts,
+                                &lflow->header_.uuid, true);
             } else {
                 uint64_t conj_stubs[64 / 8];
                 struct ofpbuf conj;
@@ -505,8 +505,8 @@  add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
                     dst->clause = src->clause;
                     dst->n_clauses = src->n_clauses;
                 }
-                ofctrl_add_flow(flow_table, ptable, lflow->priority,
-                                &m->match, &conj);
+                ofctrl_add_flow(ptable, lflow->priority, &m->match, &conj,
+                                &lflow->header_.uuid, true);
                 ofpbuf_uninit(&conj);
             }
         }
@@ -533,12 +533,12 @@  put_load(const uint8_t *data, size_t len,
     bitwise_one(&sf->mask, sf->field->n_bytes, ofs, n_bits);
 }
 
-/* Adds an OpenFlow flow to 'flow_table' for each MAC binding in the OVN
+/* Adds an OpenFlow flow to flow tables for each MAC binding in the OVN
  * southbound database, using 'lports' to resolve logical port names to
  * numbers. */
 static void
 add_neighbor_flows(struct controller_ctx *ctx,
-                   const struct lport_index *lports, struct hmap *flow_table)
+                   const struct lport_index *lports)
 {
     struct ofpbuf ofpacts;
     struct match match;
@@ -574,8 +574,8 @@  add_neighbor_flows(struct controller_ctx *ctx,
         ofpbuf_clear(&ofpacts);
         put_load(mac.ea, sizeof mac.ea, MFF_ETH_DST, 0, 48, &ofpacts);
 
-        ofctrl_add_flow(flow_table, OFTABLE_MAC_BINDING, 100,
-                        &match, &ofpacts);
+        ofctrl_add_flow(OFTABLE_MAC_BINDING, 100, &match, &ofpacts,
+                        &b->header_.uuid, true);
     }
     ofpbuf_uninit(&ofpacts);
 }
@@ -587,12 +587,12 @@  lflow_run(struct controller_ctx *ctx, const struct lport_index *lports,
           const struct mcgroup_index *mcgroups,
           const struct hmap *local_datapaths,
           const struct hmap *patched_datapaths,
-          const struct simap *ct_zones, struct hmap *flow_table)
+          const struct simap *ct_zones)
 {
     update_address_sets(ctx);
     add_logical_flows(ctx, lports, mcgroups, local_datapaths,
-                      patched_datapaths, ct_zones, flow_table);
-    add_neighbor_flows(ctx, lports, flow_table);
+                      patched_datapaths, ct_zones);
+    add_neighbor_flows(ctx, lports);
 }
 
 void
diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
index a3fc50c..8f8f81a 100644
--- a/ovn/controller/lflow.h
+++ b/ovn/controller/lflow.h
@@ -63,8 +63,7 @@  void lflow_run(struct controller_ctx *, const struct lport_index *,
                const struct mcgroup_index *,
                const struct hmap *local_datapaths,
                const struct hmap *patched_datapaths,
-               const struct simap *ct_zones,
-               struct hmap *flow_table);
+               const struct simap *ct_zones);
 void lflow_destroy(void);
 
 #endif /* ovn/lflow.h */
diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
index f537bc0..06d118c 100644
--- a/ovn/controller/ofctrl.c
+++ b/ovn/controller/ofctrl.c
@@ -16,7 +16,9 @@ 
 #include <config.h>
 #include "byte-order.h"
 #include "dirs.h"
+#include "flow.h"
 #include "hash.h"
+#include "hindex.h"
 #include "hmap.h"
 #include "ofctrl.h"
 #include "openflow/openflow.h"
@@ -40,19 +42,23 @@  VLOG_DEFINE_THIS_MODULE(ofctrl);
 /* An OpenFlow flow. */
 struct ovn_flow {
     /* Key. */
-    struct hmap_node hmap_node;
+    struct hmap_node match_hmap_node; /* For match based hashing. */
+    struct hindex_node uuid_hmap_node; /* For uuid based hashing. */
     uint8_t table_id;
     uint16_t priority;
-    struct match match;
+    struct uuid uuid;
 
     /* Data. */
+    struct match match;
     struct ofpact *ofpacts;
     size_t ofpacts_len;
 };
 
-static uint32_t ovn_flow_hash(const struct ovn_flow *);
-static struct ovn_flow *ovn_flow_lookup(struct hmap *flow_table,
-                                        const struct ovn_flow *target);
+static uint32_t ovn_flow_match_hash(const struct ovn_flow *);
+static struct ovn_flow *ovn_flow_lookup_by_uuid(struct hindex *,
+    const struct ovn_flow *target);
+static struct ovn_flow *ovn_flow_lookup_by_match(struct hmap *,
+    const struct ovn_flow *target);
 static char *ovn_flow_to_string(const struct ovn_flow *);
 static void ovn_flow_log(const struct ovn_flow *, const char *action);
 static void ovn_flow_destroy(struct ovn_flow *);
@@ -100,11 +106,13 @@  static struct hmap installed_flows;
  * S_CLEAR_FLOWS or S_UPDATE_FLOWS, this is really the option we have. */
 static enum mf_field_id mff_ovn_geneve;
 
-static void ovn_flow_table_clear(struct hmap *flow_table);
-static void ovn_flow_table_destroy(struct hmap *flow_table);
+static void ovn_flow_table_destroy(void);
 
 static void ofctrl_recv(const struct ofp_header *, enum ofptype);
 
+static struct hmap match_flow_table = HMAP_INITIALIZER(&match_flow_table);
+static struct hindex uuid_flow_table = HINDEX_INITIALIZER(&uuid_flow_table);
+
 void
 ofctrl_init(void)
 {
@@ -313,7 +321,7 @@  run_S_CLEAR_FLOWS(void)
     VLOG_DBG("clearing all flows");
 
     /* Clear installed_flows, to match the state of the switch. */
-    ovn_flow_table_clear(&installed_flows);
+    ovn_flow_table_clear();
 
     state = S_UPDATE_FLOWS;
 }
@@ -431,7 +439,7 @@  void
 ofctrl_destroy(void)
 {
     rconn_destroy(swconn);
-    ovn_flow_table_destroy(&installed_flows);
+    ovn_flow_table_destroy();
     rconn_packet_counter_destroy(tx_counter);
 }
 
@@ -464,63 +472,132 @@  ofctrl_recv(const struct ofp_header *oh, enum ofptype type)
     }
 }
 
-/* Flow table interface to the rest of ovn-controller. */
+/* Flow table interfaces to the rest of ovn-controller. */
 
-/* Adds a flow to 'desired_flows' with the specified 'match' and 'actions' to
+/* Adds a flow to flow tables with the specified 'match' and 'actions' to
  * the OpenFlow table numbered 'table_id' with the given 'priority'.  The
  * caller retains ownership of 'match' and 'actions'.
  *
- * This just assembles the desired flow table in memory.  Nothing is actually
+ * Because it is possible for both actions and matches to change on a rule,
+ * and because the hmap struct only supports a single hash, this method
+ * uses two hashing mechanisms - a map that uses table_id+priority+matches
+ * for its hash and an index that uses the appropriate ovsdb row UUID
+ * alone.
+ *
+ * This just assembles the desired flow tables in memory.  Nothing is actually
  * sent to the switch until a later call to ofctrl_run().
  *
- * The caller should initialize its own hmap to hold the flows. */
+ * The is_new boolean is used when tracking changes to indicate whether
+ * or not the logical flow driving the flow to be added is a new one or
+ * previously existed and has been modified. */
 void
-ofctrl_add_flow(struct hmap *desired_flows,
-                uint8_t table_id, uint16_t priority,
-                const struct match *match, const struct ofpbuf *actions)
+ofctrl_add_flow(uint8_t table_id, uint16_t priority,
+                const struct match *match, const struct ofpbuf *actions,
+                const struct uuid *uuid, bool is_new)
 {
+    /* Structure that uses table_id+priority+various things as hashes. */
     struct ovn_flow *f = xmalloc(sizeof *f);
     f->table_id = table_id;
     f->priority = priority;
     f->match = *match;
     f->ofpacts = xmemdup(actions->data, actions->size);
     f->ofpacts_len = actions->size;
-    f->hmap_node.hash = ovn_flow_hash(f);
+    memcpy(&f->uuid, uuid, sizeof f->uuid);
+    f->match_hmap_node.hash = ovn_flow_match_hash(f);
+    f->uuid_hmap_node.hash = uuid_hash(&f->uuid);
 
-    if (ovn_flow_lookup(desired_flows, f)) {
-        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
-        if (!VLOG_DROP_INFO(&rl)) {
-            char *s = ovn_flow_to_string(f);
-            VLOG_INFO("dropping duplicate flow: %s", s);
-            free(s);
+    if (!is_new) {
+        struct ovn_flow *d = ovn_flow_lookup_by_match(&match_flow_table, f);
+        if (!d) {
+            d = ovn_flow_lookup_by_uuid(&uuid_flow_table, f);
         }
 
-        ovn_flow_destroy(f);
-        return;
+        if (d) {
+            hmap_remove(&match_flow_table, &d->match_hmap_node);
+            hindex_remove(&uuid_flow_table, &d->uuid_hmap_node);
+            ovn_flow_destroy(d);
+        }
+    } else {
+        /* This is an insert operation, so check to see if this
+         * is a duplicate via the match hash.  If so, then
+         * check if the actions have changed.  If it is a complete
+         * duplicate (i.e. the actions are the same) drop the new
+         * flow. If not, then drop the old flow as superseded.
+         * If the new rule is not a duplicate, check the action
+         * hash to see if this flow is superseding a previous
+         * flow and if so, drop the old flow and insert the
+         * new one. */
+
+        struct ovn_flow *d = ovn_flow_lookup_by_match(&match_flow_table, f);
+
+        if (d) {
+            if (ofpacts_equal(f->ofpacts, f->ofpacts_len,
+                              d->ofpacts, d->ofpacts_len)) {
+                ovn_flow_destroy(f);
+                return;
+            }
+            hmap_remove(&match_flow_table, &d->match_hmap_node);
+            hindex_remove(&uuid_flow_table, &d->uuid_hmap_node);
+            ovn_flow_destroy(d);
+        }
     }
+    hmap_insert(&match_flow_table, &f->match_hmap_node,
+                f->match_hmap_node.hash);
+    hindex_insert(&uuid_flow_table, &f->uuid_hmap_node,
+                f->uuid_hmap_node.hash);
+}
 
-    hmap_insert(desired_flows, &f->hmap_node, f->hmap_node.hash);
+/* Removes a bundles of flows from the flow table. */
+
+void
+ofctrl_remove_flows(const struct uuid *uuid)
+{
+    struct ovn_flow *f, *next;
+    HMAP_FOR_EACH_SAFE (f, next, match_hmap_node, &match_flow_table) {
+        if (uuid_equals(&f->uuid, uuid)) {
+            hmap_remove(&match_flow_table, &f->match_hmap_node);
+            hindex_remove(&uuid_flow_table, &f->uuid_hmap_node);
+            ovn_flow_destroy(f);
+        }
+    }
 }
+
 
 /* ovn_flow. */
 
-/* Returns a hash of the key in 'f'. */
+/* Duplicate an ovn_flow structure. */
+struct ovn_flow *
+ofctrl_dup_flow(struct ovn_flow *source)
+{
+    struct ovn_flow *answer = xmalloc(sizeof *answer);
+    answer->table_id = source->table_id;
+    answer->priority = source->priority;
+    answer->match = source->match;
+    answer->ofpacts = xmemdup(source->ofpacts, source->ofpacts_len);
+    answer->ofpacts_len = source->ofpacts_len;
+    answer->uuid = source->uuid;
+    answer->match_hmap_node.hash = ovn_flow_match_hash(answer);
+    answer->uuid_hmap_node.hash = uuid_hash(&source->uuid);
+    return answer;
+}
+
+/* Returns a hash of the match key in 'f'. */
 static uint32_t
-ovn_flow_hash(const struct ovn_flow *f)
+ovn_flow_match_hash(const struct ovn_flow *f)
 {
     return hash_2words((f->table_id << 16) | f->priority,
                        match_hash(&f->match, 0));
-
 }
 
 /* Finds and returns an ovn_flow in 'flow_table' whose key is identical to
- * 'target''s key, or NULL if there is none. */
+ * 'target''s key, or NULL if there is none, using the match hashmap. */
 static struct ovn_flow *
-ovn_flow_lookup(struct hmap *flow_table, const struct ovn_flow *target)
+ovn_flow_lookup_by_match(struct hmap* flow_table,
+                         const struct ovn_flow *target)
 {
     struct ovn_flow *f;
 
-    HMAP_FOR_EACH_WITH_HASH (f, hmap_node, target->hmap_node.hash,
+    HMAP_FOR_EACH_WITH_HASH (f, match_hmap_node, target->match_hmap_node.hash,
                              flow_table) {
         if (f->table_id == target->table_id
             && f->priority == target->priority
@@ -531,6 +608,50 @@  ovn_flow_lookup(struct hmap *flow_table, const struct ovn_flow *target)
     return NULL;
 }
 
+/* Find the first flow that the target flow can replace.  The target flow
+ * can replace a flow in the flow table that:
+ * 1) has the same uuid as the target flow (i.e. comes from the same
+ *    ovsdb table row),
+ * 2) has the same table_id and priority as the target flow,
+ * 3) either has the same match criteria as the target flow, or
+      has a match criteria that is either a proper subset or proper
+      superset of the target flow.
+ *
+ * The reason for the proper set and proper subet part of condition 3
+ * is that adding/removing logical flow criteria will lead to flows
+ * being either more or less specific and we want to find and replace
+ * the right flow to maintain correct behavior. */
+static struct ovn_flow *
+ovn_flow_lookup_by_uuid(struct hindex* flow_table,
+                          const struct ovn_flow *target)
+{
+    struct ovn_flow *f;
+
+    HINDEX_FOR_EACH_WITH_HASH (f, uuid_hmap_node,
+                               target->uuid_hmap_node.hash, flow_table) {
+        if (f->table_id == target->table_id
+            && f->priority == target->priority
+            && (match_equal(&f->match, &target->match)
+                || ((flow_wildcards_has_extra(&f->match.wc,
+                                              &target->match.wc)
+                     && flow_equal_except(&f->match.flow,
+                                          &target->match.flow,
+                                          &f->match.wc)
+                     && !flow_wildcards_has_extra(&target->match.wc,
+                                                  &f->match.wc))
+                    || (flow_wildcards_has_extra(&target->match.wc,
+                                                 &f->match.wc)
+                        && flow_equal_except(&target->match.flow,
+                                             &f->match.flow,
+                                             &target->match.wc)
+                        && !flow_wildcards_has_extra(&f->match.wc,
+                                                     &target->match.wc))))) {
+            return f;
+        }
+    }
+    return NULL;
+}
+
 static char *
 ovn_flow_to_string(const struct ovn_flow *f)
 {
@@ -558,26 +679,29 @@  ovn_flow_destroy(struct ovn_flow *f)
 {
     if (f) {
         free(f->ofpacts);
-        free(f);
     }
+    free(f);
 }
 
 /* Flow tables of struct ovn_flow. */
 
-static void
-ovn_flow_table_clear(struct hmap *flow_table)
+void
+ovn_flow_table_clear(void)
 {
-    struct ovn_flow *f;
-    HMAP_FOR_EACH_POP (f, hmap_node, flow_table) {
+    struct ovn_flow *f, *next;
+    HMAP_FOR_EACH_SAFE (f, next, match_hmap_node, &match_flow_table) {
+        hmap_remove(&match_flow_table, &f->match_hmap_node);
+        hindex_remove(&uuid_flow_table, &f->uuid_hmap_node);
         ovn_flow_destroy(f);
     }
 }
 
 static void
-ovn_flow_table_destroy(struct hmap *flow_table)
+ovn_flow_table_destroy(void)
 {
-    ovn_flow_table_clear(flow_table);
-    hmap_destroy(flow_table);
+    ovn_flow_table_clear();
+    hmap_destroy(&match_flow_table);
+    hindex_destroy(&uuid_flow_table);
 }
 
 /* Flow table update. */
@@ -597,19 +721,16 @@  queue_flow_mod(struct ofputil_flow_mod *fm)
  * flows from 'flow_table' and frees them.  (The hmap itself isn't
  * destroyed.)
  *
- * This called be called be ofctrl_run() within the main loop. */
+ * This can be called by ofctrl_run() within the main loop. */
 void
-ofctrl_put(struct hmap *flow_table)
+ofctrl_put(void)
 {
     /* The flow table can be updated if the connection to the switch is up and
      * in the correct state and not backlogged with existing flow_mods.  (Our
      * criteria for being backlogged appear very conservative, but the socket
-     * between ovn-controller and OVS provides some buffering.)  Otherwise,
-     * discard the flows.  A solution to either of those problems will cause us
-     * to wake up and retry. */
+     * between ovn-controller and OVS provides some buffering.) */
     if (state != S_UPDATE_FLOWS
         || rconn_packet_counter_n_packets(tx_counter)) {
-        ovn_flow_table_clear(flow_table);
         return;
     }
 
@@ -617,8 +738,8 @@  ofctrl_put(struct hmap *flow_table)
      * longer desired, delete them; if any of them should have different
      * actions, update them. */
     struct ovn_flow *i, *next;
-    HMAP_FOR_EACH_SAFE (i, next, hmap_node, &installed_flows) {
-        struct ovn_flow *d = ovn_flow_lookup(flow_table, i);
+    HMAP_FOR_EACH_SAFE (i, next, match_hmap_node, &installed_flows) {
+        struct ovn_flow *d = ovn_flow_lookup_by_match(&match_flow_table, i);
         if (!d) {
             /* Installed flow is no longer desirable.  Delete it from the
              * switch and from installed_flows. */
@@ -629,9 +750,9 @@  ofctrl_put(struct hmap *flow_table)
                 .command = OFPFC_DELETE_STRICT,
             };
             queue_flow_mod(&fm);
-            ovn_flow_log(i, "removing");
+            ovn_flow_log(i, "removing installed");
 
-            hmap_remove(&installed_flows, &i->hmap_node);
+            hmap_remove(&installed_flows, &i->match_hmap_node);
             ovn_flow_destroy(i);
         } else {
             if (!ofpacts_equal(i->ofpacts, i->ofpacts_len,
@@ -646,40 +767,38 @@  ofctrl_put(struct hmap *flow_table)
                     .command = OFPFC_MODIFY_STRICT,
                 };
                 queue_flow_mod(&fm);
-                ovn_flow_log(i, "updating");
+                ovn_flow_log(i, "updating installed");
 
                 /* Replace 'i''s actions by 'd''s. */
                 free(i->ofpacts);
-                i->ofpacts = d->ofpacts;
+                i->ofpacts = xmemdup(d->ofpacts, d->ofpacts_len);
                 i->ofpacts_len = d->ofpacts_len;
-                d->ofpacts = NULL;
-                d->ofpacts_len = 0;
             }
-
-            hmap_remove(flow_table, &d->hmap_node);
-            ovn_flow_destroy(d);
         }
     }
 
-    /* The previous loop removed from 'flow_table' all of the flows that are
-     * already installed.  Thus, any flows remaining in 'flow_table' need to
-     * be added to the flow table. */
+    /* Iterate through the new flows and add those that aren't found
+     * in the installed flow table. */
     struct ovn_flow *d;
-    HMAP_FOR_EACH_SAFE (d, next, hmap_node, flow_table) {
-        /* Send flow_mod to add flow. */
-        struct ofputil_flow_mod fm = {
-            .match = d->match,
-            .priority = d->priority,
-            .table_id = d->table_id,
-            .ofpacts = d->ofpacts,
-            .ofpacts_len = d->ofpacts_len,
-            .command = OFPFC_ADD,
-        };
-        queue_flow_mod(&fm);
-        ovn_flow_log(d, "adding");
-
-        /* Move 'd' from 'flow_table' to installed_flows. */
-        hmap_remove(flow_table, &d->hmap_node);
-        hmap_insert(&installed_flows, &d->hmap_node, d->hmap_node.hash);
+    HMAP_FOR_EACH_SAFE (d, next, match_hmap_node, &match_flow_table) {
+        struct ovn_flow *i = ovn_flow_lookup_by_match(&installed_flows, d);
+        if (!i) {
+            /* Send flow_mod to add flow. */
+            struct ofputil_flow_mod fm = {
+                .match = d->match,
+                .priority = d->priority,
+                .table_id = d->table_id,
+                .ofpacts = d->ofpacts,
+                .ofpacts_len = d->ofpacts_len,
+                .command = OFPFC_ADD,
+            };
+            queue_flow_mod(&fm);
+            ovn_flow_log(d, "adding installed");
+
+            /* Copy 'd' from 'flow_table' to installed_flows. */
+            struct ovn_flow *new_node = ofctrl_dup_flow(d);
+            hmap_insert(&installed_flows, &new_node->match_hmap_node,
+                        new_node->match_hmap_node.hash);
+        }
     }
 }
diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
index bc9cfba..d594be7 100644
--- a/ovn/controller/ofctrl.h
+++ b/ovn/controller/ofctrl.h
@@ -20,6 +20,7 @@ 
 #include <stdint.h>
 
 #include "openvswitch/meta-flow.h"
+#include "ovsdb-idl.h"
 
 struct controller_ctx;
 struct hmap;
@@ -30,12 +31,21 @@  struct ovsrec_bridge;
 /* Interface for OVN main loop. */
 void ofctrl_init(void);
 enum mf_field_id ofctrl_run(const struct ovsrec_bridge *br_int);
-void ofctrl_put(struct hmap *flows);
+void ofctrl_put(void);
 void ofctrl_wait(void);
 void ofctrl_destroy(void);
 
-/* Flow table interface to the rest of ovn-controller. */
-void ofctrl_add_flow(struct hmap *flows, uint8_t table_id, uint16_t priority,
-                     const struct match *, const struct ofpbuf *ofpacts);
+struct ovn_flow *ofctrl_dup_flow(struct ovn_flow *source);
+
+/* Flow table interfaces to the rest of ovn-controller. */
+void ofctrl_add_flow(uint8_t table_id, uint16_t priority,
+                     const struct match *, const struct ofpbuf *ofpacts,
+                     const struct uuid *uuid, bool is_new);
+
+void ofctrl_remove_flows(const struct uuid *uuid);
+
+void ofctrl_flow_table_clear(void);
+
+void ovn_flow_table_clear(void);
 
 #endif /* ovn/ofctrl.h */
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 3257eab..6a8ff9b 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -433,16 +433,15 @@  main(int argc, char *argv[])
             update_ct_zones(&all_lports, &patched_datapaths, &ct_zones,
                             ct_zone_bitmap);
 
-            struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
+            ovn_flow_table_clear();
             lflow_run(&ctx, &lports, &mcgroups, &local_datapaths,
-                      &patched_datapaths, &ct_zones, &flow_table);
+                      &patched_datapaths, &ct_zones);
             if (chassis_id) {
                 physical_run(&ctx, mff_ovn_geneve,
-                             br_int, chassis_id, &ct_zones, &flow_table,
+                             br_int, chassis_id, &ct_zones,
                              &local_datapaths, &patched_datapaths);
             }
-            ofctrl_put(&flow_table);
-            hmap_destroy(&flow_table);
+            ofctrl_put();
         }
 
         sset_destroy(&all_lports);
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index d7637f2..2b5ef0a 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -51,6 +51,9 @@  physical_register_ovs_idl(struct ovsdb_idl *ovs_idl)
     ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_external_ids);
 }
 
+/* UUID to identify OF flows not associated with ovsdb rows. */
+static struct uuid *hc_uuid = NULL;
+
 /* Maps from a chassis to the OpenFlow port number of the tunnel that can be
  * used to reach that chassis. */
 struct chassis_tunnel {
@@ -149,11 +152,15 @@  get_localnet_port(struct hmap *local_datapaths, int64_t tunnel_key)
 void
 physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
              const struct ovsrec_bridge *br_int, const char *this_chassis_id,
-             const struct simap *ct_zones, struct hmap *flow_table,
+             const struct simap *ct_zones,
              struct hmap *local_datapaths, struct hmap *patched_datapaths)
 {
     struct simap localvif_to_ofport = SIMAP_INITIALIZER(&localvif_to_ofport);
     struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+    if (!hc_uuid) {
+        hc_uuid = xmalloc(sizeof(struct uuid));
+        uuid_generate(hc_uuid);
+    }
 
     for (int i = 0; i < br_int->n_ports; i++) {
         const struct ovsrec_port *port_rec = br_int->ports[i];
@@ -398,8 +405,9 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
             /* Resubmit to first logical ingress pipeline table. */
             put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
-            ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG,
-                            tag ? 150 : 100, &match, &ofpacts);
+            ofctrl_add_flow(OFTABLE_PHY_TO_LOG,
+                            tag ? 150 : 100, &match, &ofpacts,
+                            &binding->header_.uuid, true);
 
             if (!tag && (!strcmp(binding->type, "localnet")
                          || !strcmp(binding->type, "l2gateway"))) {
@@ -408,7 +416,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                  * action. */
                 ofpbuf_pull(&ofpacts, ofpacts_orig_size);
                 match_set_dl_tci_masked(&match, 0, htons(VLAN_CFI));
-                ofctrl_add_flow(flow_table, 0, 100, &match, &ofpacts);
+                ofctrl_add_flow(0, 100, &match, &ofpacts,
+                            &binding->header_.uuid, true);
             }
 
             /* Table 33, priority 100.
@@ -438,8 +447,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
             /* Resubmit to table 34. */
             put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
-            ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 100, &match,
-                            &ofpacts);
+            ofctrl_add_flow(OFTABLE_LOCAL_OUTPUT, 100, &match, &ofpacts,
+                            &binding->header_.uuid, true);
 
             /* Table 34, Priority 100.
              * =======================
@@ -450,8 +459,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             match_set_metadata(&match, htonll(dp_key));
             match_set_reg(&match, MFF_LOG_INPORT - MFF_REG0, port_key);
             match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
-            ofctrl_add_flow(flow_table, OFTABLE_DROP_LOOPBACK, 100,
-                            &match, &ofpacts);
+            ofctrl_add_flow(OFTABLE_DROP_LOOPBACK, 100, &match, &ofpacts,
+                            &binding->header_.uuid, true);
 
             /* Table 64, Priority 100.
              * =======================
@@ -485,8 +494,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 ofpact_put_STRIP_VLAN(&ofpacts);
                 put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(&ofpacts));
             }
-            ofctrl_add_flow(flow_table, OFTABLE_LOG_TO_PHY, 100,
-                            &match, &ofpacts);
+            ofctrl_add_flow(OFTABLE_LOG_TO_PHY, 100, &match, &ofpacts,
+                            &binding->header_.uuid, true);
         } else if (!tun) {
             /* Remote port connected by localnet port */
             /* Table 33, priority 100.
@@ -508,8 +517,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
             /* Resubmit to table 33. */
             put_resubmit(OFTABLE_LOCAL_OUTPUT, &ofpacts);
-            ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 100, &match,
-                            &ofpacts);
+            ofctrl_add_flow(OFTABLE_LOCAL_OUTPUT, 100, &match, &ofpacts,
+                            &binding->header_.uuid, true);
         } else {
             /* Remote port connected by tunnel */
             /* Table 32, priority 100.
@@ -532,8 +541,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
             /* Output to tunnel. */
             ofpact_put_OUTPUT(&ofpacts)->port = ofport;
-            ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 100,
-                            &match, &ofpacts);
+            ofctrl_add_flow(OFTABLE_REMOTE_OUTPUT, 100, &match, &ofpacts,
+                            &binding->header_.uuid, true);
         }
     }
 
@@ -608,8 +617,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
              * group as the logical output port. */
             put_load(mc->tunnel_key, MFF_LOG_OUTPORT, 0, 32, &ofpacts);
 
-            ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 100,
-                            &match, &ofpacts);
+            ofctrl_add_flow(OFTABLE_LOCAL_OUTPUT, 100, &match, &ofpacts,
+                            &mc->header_.uuid, true);
         }
 
         /* Table 32, priority 100.
@@ -646,8 +655,9 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 if (local_ports) {
                     put_resubmit(OFTABLE_LOCAL_OUTPUT, &remote_ofpacts);
                 }
-                ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 100,
-                                &match, &remote_ofpacts);
+                ofctrl_add_flow(OFTABLE_REMOTE_OUTPUT, 100,
+                                &match, &remote_ofpacts,
+                                &mc->header_.uuid, true);
             }
         }
         sset_destroy(&remote_chassis);
@@ -690,7 +700,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
         put_resubmit(OFTABLE_LOCAL_OUTPUT, &ofpacts);
 
-        ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, 100, &match, &ofpacts);
+        ofctrl_add_flow(OFTABLE_PHY_TO_LOG, 100, &match, &ofpacts,
+                        hc_uuid, true);
     }
 
     /* Add flows for VXLAN encapsulations.  Due to the limited amount of
@@ -723,8 +734,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 15, &ofpacts);
             put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
 
-            ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, 100, &match,
-                    &ofpacts);
+            ofctrl_add_flow(OFTABLE_PHY_TO_LOG, 100, &match, &ofpacts,
+                            hc_uuid, true);
         }
     }
 
@@ -737,7 +748,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     match_init_catchall(&match);
     ofpbuf_clear(&ofpacts);
     put_resubmit(OFTABLE_LOCAL_OUTPUT, &ofpacts);
-    ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 0, &match, &ofpacts);
+    ofctrl_add_flow(OFTABLE_REMOTE_OUTPUT, 0, &match, &ofpacts,
+                    hc_uuid, true);
 
     /* Table 34, Priority 0.
      * =======================
@@ -751,7 +763,8 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     MFF_LOG_REGS;
 #undef MFF_LOG_REGS
     put_resubmit(OFTABLE_LOG_EGRESS_PIPELINE, &ofpacts);
-    ofctrl_add_flow(flow_table, OFTABLE_DROP_LOOPBACK, 0, &match, &ofpacts);
+    ofctrl_add_flow(OFTABLE_DROP_LOOPBACK, 0, &match, &ofpacts,
+                    hc_uuid, true);
 
     ofpbuf_uninit(&ofpacts);
     simap_destroy(&localvif_to_ofport);
diff --git a/ovn/controller/physical.h b/ovn/controller/physical.h
index 2f8b58a..1f98f71 100644
--- a/ovn/controller/physical.h
+++ b/ovn/controller/physical.h
@@ -43,7 +43,7 @@  struct simap;
 void physical_register_ovs_idl(struct ovsdb_idl *);
 void physical_run(struct controller_ctx *, enum mf_field_id mff_ovn_geneve,
                   const struct ovsrec_bridge *br_int, const char *chassis_id,
-                  const struct simap *ct_zones, struct hmap *flow_table,
+                  const struct simap *ct_zones,
                   struct hmap *local_datapaths, struct hmap *patched_datapaths);
 
 #endif /* ovn/physical.h */