[ovs-dev,RFC,00/14] ovn-controller incremental processing.

Message ID	1532480380-97578-1-git-send-email-hzhou8@ebay.com
Headers	show Return-Path: <ovs-dev-bounces@openvswitch.org> From: Han Zhou <zhouhan@gmail.com> To: dev@openvswitch.org Date: Tue, 24 Jul 2018 17:59:26 -0700 Message-Id: <1532480380-97578-1-git-send-email-hzhou8@ebay.com> Subject: [ovs-dev] [RFC 00/14] ovn-controller incremental processing. Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org
Series	ovn-controller incremental processing. \| expand [ovs-dev,RFC,00/14] ovn-controller incremental processing. [ovs-dev,RFC,01/14] ovsdb-idlc.in: Support more interfaces for passing pointers of individual table… [ovs-dev,RFC,02/14] ovn-controller: Incremental processing engine [ovs-dev,RFC,03/14] ovn-controller: Track OVSDB changes [ovs-dev,RFC,04/14] ovn-controller: Initial use of incremental engine. [ovs-dev,RFC,05/14] ovn-controller: Incremental logical flow processing [ovs-dev,RFC,06/14] ovn-controller: runtime_data change handler for SB port-binding [ovs-dev,RFC,07/14] ovn-controller: port-binding incremental processing for physical flows [ovs-dev,RFC,08/14] ovn-controller: incremental processing for multicast group changes [ovs-dev,RFC,09/14] ovsdb-idl: Tracking - preserve data for deleted rows. [ovs-dev,RFC,10/14] ovn-controller: Split addr_sets from runtime_data. [ovs-dev,RFC,11/14] ovn-controller: Maintain resource references for logical flows. [ovs-dev,RFC,12/14] ovn-controller: Incremental processing for address-set changes. [ovs-dev,RFC,13/14] ovn-controller: Split port_groups from runtime_data. [ovs-dev,RFC,14/14] ovn-controller: Incremental processing for port-group changes.

Han Zhou July 25, 2018, 12:59 a.m. UTC

This patch series is the rebase of previous patch series [1] on top of master
where a series of changes that avoided passing ovsdb IDL (eaa4ead5) has been
merged. The major concern of the previous patches are the maintainability
of the dependencies. This patch series, thanks for the removal of passing
IDL directly, eliminates the access to any tables within any engine node
processing, and all dependencies are naturally exposed because otherwise
the code will either not pass compile or abort in the very first iteration,
except for one case - accessing data of other tables through references.

For exposing the dependencies introduced by reference access, it is a big
TODO item and it is the major reason this patch series is RFC only.

Other than this, there is another problem found while using ovsdb IDL index
to query chassis: current ovsdb IDL index is not updated for the changes made
in current transaction. It would be better if we fix the index implementation,
although this patch series worked around this problem by postpone some
processing to next iterations in such circumstances.

[1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/347808.html

Han Zhou (14):
  ovsdb-idlc.in: Support more interfaces for passing pointers of
    individual tables.
  ovn-controller: Incremental processing engine
  ovn-controller: Track OVSDB changes
  ovn-controller: Initial use of incremental engine.
  ovn-controller: Incremental logical flow processing
  ovn-controller: runtime_data change handler for SB port-binding
  ovn-controller: port-binding incremental processing for physical flows
  ovn-controller: incremental processing for multicast group changes
  ovsdb-idl: Tracking - preserve data for deleted rows.
  ovn-controller: Split addr_sets from runtime_data.
  ovn-controller: Maintain resource references for logical flows.
  ovn-controller: Incremental processing for address-set changes.
  ovn-controller: Split port_groups from runtime_data.
  ovn-controller: Incremental processing for port-group changes.

 include/ovn/actions.h           |    3 +
 include/ovn/expr.h              |    5 +-
 lib/ovsdb-idl-provider.h        |    2 +
 lib/ovsdb-idl.c                 |   36 +-
 ovn/controller/bfd.c            |    4 +-
 ovn/controller/binding.c        |  108 ++-
 ovn/controller/binding.h        |    7 +
 ovn/controller/encaps.c         |   12 +-
 ovn/controller/lflow.c          |  428 +++++++++++-
 ovn/controller/lflow.h          |  100 ++-
 ovn/controller/ofctrl.c         |  262 +++++---
 ovn/controller/ofctrl.h         |   32 +-
 ovn/controller/ovn-controller.c | 1398 +++++++++++++++++++++++++++++++++------
 ovn/controller/physical.c       |  232 +++++--
 ovn/controller/physical.h       |   20 +-
 ovn/lib/actions.c               |    8 +-
 ovn/lib/automake.mk             |    4 +-
 ovn/lib/expr.c                  |   21 +-
 ovn/lib/extend-table.c          |   60 +-
 ovn/lib/extend-table.h          |   16 +-
 ovn/lib/inc-proc-eng.c          |  201 ++++++
 ovn/lib/inc-proc-eng.h          |  240 +++++++
 ovn/utilities/ovn-trace.c       |    2 +-
 ovsdb/ovsdb-idlc.in             |   25 +
 tests/ovn.at                    |   75 +++
 tests/test-ovn.c                |    7 +-
 26 files changed, 2877 insertions(+), 431 deletions(-)
 create mode 100644 ovn/lib/inc-proc-eng.c
 create mode 100644 ovn/lib/inc-proc-eng.h

Mark Michelson July 31, 2018, 1:47 p.m. UTC | #1

Hi Han,

I've given this patchset a look, and I was following along pretty well 
until I got to about patch 11. From that point on, I had to re-read the 
code more times than I care to admit before I finally understood what 
was going on :)

What you have is a structure (lflow_ref_list_node) that is 
simultaneously in two lists. These two lists are each the data in 
separate hmap nodes. The hmap nodes are in two separate hmaps. One hmap 
uses the reference type and name as a key, and the other uses the lflow 
UUID as a key. This way given an address set name, you can find the 
associated logical flow UUIDs in the ref_lflow_table. Or given a logical 
flow UUID, you can find address sets.

I'm wondering if this can be simplified somehow.

Right now, if logical flows change, the change handler has to update the 
ref_lflow_table so that address sets no longer reference that logical 
flow. If address sets change, then the lflow_ref_table is updated. In 
both cases consider_logical_flow() gets called and realigns the tables 
as appropriate.

The problem with this is that it reeks of cross-cutting concerns, and it 
seems like it wouldn't scale well (consider a 3- or 4-chain of 
dependencies). Ideally, the dependency chain would make sure that the 
change handler for logical flows only deals with logical flows, and the 
change handler for address sets only deals with address sets.

If we generalize things a bit, there are likely to be two ways 
dependencies manifest in the database. In this particular case, text in 
one row expands to data of a separate database row. The other case would 
be where a database row contains the UUID (or list of UUIDs) of other 
database rows.

For the textual case, I think the easiest way to handle this is to 
replace the text with what it expands to earlier than when we currently 
do it. Consider that a logical flow references address set $foo. 
Currently, the logical flow in the southbound database has the text 
"$foo" in it. If $foo were replaced with the actual addresses from the 
address set, then when an address set changes, the text of the logical 
flow would change as well, thus resulting in a direct change of the 
logical flow. A less disruptive version of this might be to use some 
reserved character automatically in the logical flow match followed by a 
sequence number. So for instance, if a logical flow were set up to 
reference address set $foo, then the actual logical flow might be 
something like $foo?1. Then if northbound address set foo changes, the 
logical flow could be updated to $foo?2 by ovn-northd. Again, the 
textual change in the logical flow would result in triggering the change 
tracker.

For the database referencing case, it would be nice if the IDL change 
tracking code could automatically do this for us. This way if record foo 
has a column that references row bar, then if bar changes, we would be 
told that foo also changed. This strikes me as difficult to implement 
and could result in some interesting dependency graphs within the IDL 
code though.

What do you think?

On 07/24/2018 08:59 PM, Han Zhou wrote:
> This patch series is the rebase of previous patch series [1] on top of master
> where a series of changes that avoided passing ovsdb IDL (eaa4ead5) has been
> merged. The major concern of the previous patches are the maintainability
> of the dependencies. This patch series, thanks for the removal of passing
> IDL directly, eliminates the access to any tables within any engine node
> processing, and all dependencies are naturally exposed because otherwise
> the code will either not pass compile or abort in the very first iteration,
> except for one case - accessing data of other tables through references.
> 
> For exposing the dependencies introduced by reference access, it is a big
> TODO item and it is the major reason this patch series is RFC only.
> 
> Other than this, there is another problem found while using ovsdb IDL index
> to query chassis: current ovsdb IDL index is not updated for the changes made
> in current transaction. It would be better if we fix the index implementation,
> although this patch series worked around this problem by postpone some
> processing to next iterations in such circumstances.
> 
> [1] https://mail.openvswitch.org/pipermail/ovs-dev/2018-June/347808.html
> 
> Han Zhou (14):
>    ovsdb-idlc.in: Support more interfaces for passing pointers of
>      individual tables.
>    ovn-controller: Incremental processing engine
>    ovn-controller: Track OVSDB changes
>    ovn-controller: Initial use of incremental engine.
>    ovn-controller: Incremental logical flow processing
>    ovn-controller: runtime_data change handler for SB port-binding
>    ovn-controller: port-binding incremental processing for physical flows
>    ovn-controller: incremental processing for multicast group changes
>    ovsdb-idl: Tracking - preserve data for deleted rows.
>    ovn-controller: Split addr_sets from runtime_data.
>    ovn-controller: Maintain resource references for logical flows.
>    ovn-controller: Incremental processing for address-set changes.
>    ovn-controller: Split port_groups from runtime_data.
>    ovn-controller: Incremental processing for port-group changes.
> 
>   include/ovn/actions.h           |    3 +
>   include/ovn/expr.h              |    5 +-
>   lib/ovsdb-idl-provider.h        |    2 +
>   lib/ovsdb-idl.c                 |   36 +-
>   ovn/controller/bfd.c            |    4 +-
>   ovn/controller/binding.c        |  108 ++-
>   ovn/controller/binding.h        |    7 +
>   ovn/controller/encaps.c         |   12 +-
>   ovn/controller/lflow.c          |  428 +++++++++++-
>   ovn/controller/lflow.h          |  100 ++-
>   ovn/controller/ofctrl.c         |  262 +++++---
>   ovn/controller/ofctrl.h         |   32 +-
>   ovn/controller/ovn-controller.c | 1398 +++++++++++++++++++++++++++++++++------
>   ovn/controller/physical.c       |  232 +++++--
>   ovn/controller/physical.h       |   20 +-
>   ovn/lib/actions.c               |    8 +-
>   ovn/lib/automake.mk             |    4 +-
>   ovn/lib/expr.c                  |   21 +-
>   ovn/lib/extend-table.c          |   60 +-
>   ovn/lib/extend-table.h          |   16 +-
>   ovn/lib/inc-proc-eng.c          |  201 ++++++
>   ovn/lib/inc-proc-eng.h          |  240 +++++++
>   ovn/utilities/ovn-trace.c       |    2 +-
>   ovsdb/ovsdb-idlc.in             |   25 +
>   tests/ovn.at                    |   75 +++
>   tests/test-ovn.c                |    7 +-
>   26 files changed, 2877 insertions(+), 431 deletions(-)
>   create mode 100644 ovn/lib/inc-proc-eng.c
>   create mode 100644 ovn/lib/inc-proc-eng.h
>

Han Zhou Aug. 5, 2018, 7:11 p.m. UTC | #2

Hi Mark,

Thanks for the review and very valuable comments! (I was on vacation last
week so sorry for slow response).

On Tue, Jul 31, 2018 at 3:47 AM, Mark Michelson <mmichels@redhat.com> wrote:
>
> Hi Han,
>
> I've given this patchset a look, and I was following along pretty well
until I got to about patch 11. From that point on, I had to re-read the
code more times than I care to admit before I finally understood what was
going on :)
>
> What you have is a structure (lflow_ref_list_node) that is simultaneously
in two lists. These two lists are each the data in separate hmap nodes. The
hmap nodes are in two separate hmaps. One hmap uses the reference type and
name as a key, and the other uses the lflow UUID as a key. This way given
an address set name, you can find the associated logical flow UUIDs in the
ref_lflow_table. Or given a logical flow UUID, you can find address sets.
>
> I'm wondering if this can be simplified somehow.
>
> Right now, if logical flows change, the change handler has to update the
ref_lflow_table so that address sets no longer reference that logical flow.
If address sets change, then the lflow_ref_table is updated. In both cases
consider_logical_flow() gets called and realigns the tables as appropriate.
>
> The problem with this is that it reeks of cross-cutting concerns, and it
seems like it wouldn't scale well (consider a 3- or 4-chain of
dependencies). Ideally, the dependency chain would make sure that the
change handler for logical flows only deals with logical flows, and the
change handler for address sets only deals with address sets.

I agree that maintaining the cross reference has some overhead, but I don't
see a scaling issue in this case. Adding entries to the cross reference
table is a by-product of the consider_logical_flow() when parsing the
lflow, and deleting the entries is also efficient with O(N), N = number of
address sets used by a lflow, which in most cases should be a very small
number (correct me if I am wrong). As to memory consumption, it maintains
only the mapping between resource names and lflow uuids, so I don't expect
it to be a significant cost either. Could you explain a little more about
the "3- or 4-chain of dependencies" example?

For the "cross-cutting concern", I don't see it that way. I see it as a
pattern of change handler implementation. In general, output of an engine
node is the result of a "join" operation of its inputs. When there are
multiple inputs and one of them changes, for a change handler to compute
the output incrementally, we will need to use the changed row to probe all
the other inputs to update the final output. For the address-set and
port-group handlers, it is join between two tables only, and the cross
reference table is built to make the probing efficient in change handler.
The cross reference table is also generalized so that any resources
referenced by logical flows can reuse the same data structure and
interfaces, and now it is reused by both address-sets and port-groups. We
can make it more generalized to be used for other mappings if needed.

If we really wants to make it more generalized, I think the answer is the
datalog approach. I would be great if it can be implemented that way, but I
am pessimistic for it to be applied to ovn-controller in a practical time
line, given that ovn-controller is more complex in terms of both data
sources and processing logic compared with ovn-northd. And I think it is
practical and simple to implement the probing for most frequent scenarios
as demonstrated by this RFC.

>
> If we generalize things a bit, there are likely to be two ways
dependencies manifest in the database. In this particular case, text in one
row expands to data of a separate database row. The other case would be
where a database row contains the UUID (or list of UUIDs) of other database
rows.
>
> For the textual case, I think the easiest way to handle this is to
replace the text with what it expands to earlier than when we currently do
it. Consider that a logical flow references address set $foo. Currently,
the logical flow in the southbound database has the text "$foo" in it. If
$foo were replaced with the actual addresses from the address set, then
when an address set changes, the text of the logical flow would change as
well, thus resulting in a direct change of the logical flow. A less
disruptive version of this might be to use some reserved character
automatically in the logical flow match followed by a sequence number. So
for instance, if a logical flow were set up to reference address set $foo,
then the actual logical flow might be something like $foo?1. Then if
northbound address set foo changes, the logical flow could be updated to
$foo?2 by ovn-northd. Again, the textual change in the logical flow would
result in triggering the change tracker.
>

This proposal is interesting and I think it is a valid alternative. It is
trying to implement the probing without maintaining a cross reference table
in ovn-controller. In fact it moves the effort of building the reference
table from ovn-controller to maintaining the sequence number for each
address-set/port-group resources in ovn-northd. I am just not sure if this
makes the system simpler or more complex. I will need to think more about
it.

> For the database referencing case, it would be nice if the IDL change
tracking code could automatically do this for us. This way if record foo
has a column that references row bar, then if bar changes, we would be told
that foo also changed. This strikes me as difficult to implement and could
result in some interesting dependency graphs within the IDL code though.
>
> What do you think?
>

For the database referencing case, it seems not directly related to your
concern regarding address sets handling (or port-group handling). Please
correct me if I misunderstood something here. But I agree with idea of
utilizing and improving the IDL capability to build the dependency graph
for table references, and this is exactly in my TODO as mentioned in the
cover letter:

"For exposing the dependencies introduced by reference access, it is a big
TODO item and it is the major reason this patch series is RFC only."

Thanks,
Han

Mark Michelson Aug. 6, 2018, 1:13 p.m. UTC | #3

Hi Han,

I thought about this more over the weekend, and I was hoping I'd get to 
respond to my own e-mail before you saw it, because I realized I had a 
fundamental misunderstanding of the scope and nature of change handlers. 
I'll reply to your comments in-line below.

On 08/05/2018 03:11 PM, Han Zhou wrote:
> Hi Mark,
> 
> Thanks for the review and very valuable comments! (I was on vacation 
> last week so sorry for slow response).
> 
> On Tue, Jul 31, 2018 at 3:47 AM, Mark Michelson <mmichels@redhat.com 
> <mailto:mmichels@redhat.com>> wrote:
>  >
>  > Hi Han,
>  >
>  > I've given this patchset a look, and I was following along pretty 
> well until I got to about patch 11. From that point on, I had to re-read 
> the code more times than I care to admit before I finally understood 
> what was going on :)
>  >
>  > What you have is a structure (lflow_ref_list_node) that is 
> simultaneously in two lists. These two lists are each the data in 
> separate hmap nodes. The hmap nodes are in two separate hmaps. One hmap 
> uses the reference type and name as a key, and the other uses the lflow 
> UUID as a key. This way given an address set name, you can find the 
> associated logical flow UUIDs in the ref_lflow_table. Or given a logical 
> flow UUID, you can find address sets.
>  >
>  > I'm wondering if this can be simplified somehow.
>  >
>  > Right now, if logical flows change, the change handler has to update 
> the ref_lflow_table so that address sets no longer reference that 
> logical flow. If address sets change, then the lflow_ref_table is 
> updated. In both cases consider_logical_flow() gets called and realigns 
> the tables as appropriate.
>  >
>  > The problem with this is that it reeks of cross-cutting concerns, and 
> it seems like it wouldn't scale well (consider a 3- or 4-chain of 
> dependencies). Ideally, the dependency chain would make sure that the 
> change handler for logical flows only deals with logical flows, and the 
> change handler for address sets only deals with address sets.
> 
> I agree that maintaining the cross reference has some overhead, but I 
> don't see a scaling issue in this case. Adding entries to the cross 
> reference table is a by-product of the consider_logical_flow() when 
> parsing the lflow, and deleting the entries is also efficient with O(N), 
> N = number of address sets used by a lflow, which in most cases should 
> be a very small number (correct me if I am wrong). As to memory 
> consumption, it maintains only the mapping between resource names and 
> lflow uuids, so I don't expect it to be a significant cost either. Could 
> you explain a little more about the "3- or 4-chain of dependencies" example?

My thought was along the lines that table A references table B, which 
references table C. A change in table A might result in a change to 
table B, which then results in a change to table C. In my head, I 
thought this would mean having to maintain a large series of hashmaps 
that cross-referenced each other. I realize this isn't correct, though, 
and that as long as the dependency chain is linear, this isn't any 
different from what you already have proposed here.

In reality, I can't think of any example changes in the southbound 
database that would cause such a series of events, but I may not be 
thinking hard enough :)

> 
> For the "cross-cutting concern", I don't see it that way. I see it as a 
> pattern of change handler implementation. In general, output of an 
> engine node is the result of a "join" operation of its inputs. When 
> there are multiple inputs and one of them changes, for a change handler 
> to compute the output incrementally, we will need to use the changed row 
> to probe all the other inputs to update the final output. For the 
> address-set and port-group handlers, it is join between two tables only, 
> and the cross reference table is built to make the probing efficient in 
> change handler. The cross reference table is also generalized so that 
> any resources referenced by logical flows can reuse the same data 
> structure and interfaces, and now it is reused by both address-sets and 
> port-groups. We can make it more generalized to be used for other 
> mappings if needed.

Yes, and this sort of thinking is what I had over the weekend that made 
me have a "Eureka!" moment and realize what I had been missing here.

I had been looking at the address set change handler and thinking of the 
change handler as being an "owner" of address set data. The reason is 
that the engine node is tied directly to updates to the address set 
table. It felt like it was overstepping its boundaries by then stepping 
through data that I thought was owned by the logical flow change 
handler. The fact that both change handlers acted on the same data just 
struck me as wrong.

However, I realized I need to stop thinking about data ownership in that 
sort of way when it comes to change handlers. Engine nodes do have 
ownership like I was imagining in their run() method, but change 
handlers are very different. They are responsible for analyzing more 
than just the data that has changed, but also for analyzing the 
relationship that data has with other data. That other data may or may 
not be tied to other engine nodes.

> 
> If we really wants to make it more generalized, I think the answer is 
> the datalog approach. I would be great if it can be implemented that 
> way, but I am pessimistic for it to be applied to ovn-controller in a 
> practical time line, given that ovn-controller is more complex in terms 
> of both data sources and processing logic compared with ovn-northd. And 
> I think it is practical and simple to implement the probing for most 
> frequent scenarios as demonstrated by this RFC.

I agree. I don't think datalog is the correct approach for ovn-controller.

> 
>  >
>  > If we generalize things a bit, there are likely to be two ways 
> dependencies manifest in the database. In this particular case, text in 
> one row expands to data of a separate database row. The other case would 
> be where a database row contains the UUID (or list of UUIDs) of other 
> database rows.
>  >
>  > For the textual case, I think the easiest way to handle this is to 
> replace the text with what it expands to earlier than when we currently 
> do it. Consider that a logical flow references address set $foo. 
> Currently, the logical flow in the southbound database has the text 
> "$foo" in it. If $foo were replaced with the actual addresses from the 
> address set, then when an address set changes, the text of the logical 
> flow would change as well, thus resulting in a direct change of the 
> logical flow. A less disruptive version of this might be to use some 
> reserved character automatically in the logical flow match followed by a 
> sequence number. So for instance, if a logical flow were set up to 
> reference address set $foo, then the actual logical flow might be 
> something like $foo?1. Then if northbound address set foo changes, the 
> logical flow could be updated to $foo?2 by ovn-northd. Again, the 
> textual change in the logical flow would result in triggering the change 
> tracker.
>  >
> 
> This proposal is interesting and I think it is a valid alternative. It 
> is trying to implement the probing without maintaining a cross reference 
> table in ovn-controller. In fact it moves the effort of building the 
> reference table from ovn-controller to maintaining the sequence number 
> for each address-set/port-group resources in ovn-northd. I am just not 
> sure if this makes the system simpler or more complex. I will need to 
> think more about it.

Yes, having thought about this some more, I agree that this could just 
be trading one complexity for another. Plus, aside from address sets and 
port groups, I'm not sure that there is any other text expansion type 
references in the southbound database. So engineering a big solution for 
this may not have a lot of bang for your buck. It may be worth 
workshopping just to see, though.

> 
>  > For the database referencing case, it would be nice if the IDL change 
> tracking code could automatically do this for us. This way if record foo 
> has a column that references row bar, then if bar changes, we would be 
> told that foo also changed. This strikes me as difficult to implement 
> and could result in some interesting dependency graphs within the IDL 
> code though.
>  >
>  > What do you think?
>  >
> 
> For the database referencing case, it seems not directly related to your 
> concern regarding address sets handling (or port-group handling). Please 
> correct me if I misunderstood something here. But I agree with idea of 
> utilizing and improving the IDL capability to build the dependency graph 
> for table references, and this is exactly in my TODO as mentioned in the 
> cover letter:

You are correct that this does not apply to address set or port group 
referencing by logical flows. The relationship between logical flows and 
address sets and port groups is not currently expressible at the IDL 
level. I was thinking ahead a bit about how other tables may refer to 
each other. I foresaw similar structures in change handlers for those 
tables and wondered if that could be handled at the IDL level instead.

> 
> "For exposing the dependencies introduced by reference access, it is a big
> TODO item and it is the major reason this patch series is RFC only."


> 
> 
> Thanks,
> Han

Han Zhou Aug. 7, 2018, 5:30 a.m. UTC | #4

On Mon, Aug 6, 2018 at 3:13 AM, Mark Michelson <mmichels@redhat.com> wrote:
>
> Hi Han,
>
> I thought about this more over the weekend, and I was hoping I'd get to
respond to my own e-mail before you saw it, because I realized I had a
fundamental misunderstanding of the scope and nature of change handlers.
I'll reply to your comments in-line below.
>
> On 08/05/2018 03:11 PM, Han Zhou wrote:
>>
>> Hi Mark,
>>
>> Thanks for the review and very valuable comments! (I was on vacation
last week so sorry for slow response).
>>
>> On Tue, Jul 31, 2018 at 3:47 AM, Mark Michelson <mmichels@redhat.com
<mailto:mmichels@redhat.com>> wrote:
>>  >
>>  > Hi Han,
>>  >
>>  > I've given this patchset a look, and I was following along pretty
well until I got to about patch 11. From that point on, I had to re-read
the code more times than I care to admit before I finally understood what
was going on :)
>>  >
>>  > What you have is a structure (lflow_ref_list_node) that is
simultaneously in two lists. These two lists are each the data in separate
hmap nodes. The hmap nodes are in two separate hmaps. One hmap uses the
reference type and name as a key, and the other uses the lflow UUID as a
key. This way given an address set name, you can find the associated
logical flow UUIDs in the ref_lflow_table. Or given a logical flow UUID,
you can find address sets.
>>  >
>>  > I'm wondering if this can be simplified somehow.
>>  >
>>  > Right now, if logical flows change, the change handler has to update
the ref_lflow_table so that address sets no longer reference that logical
flow. If address sets change, then the lflow_ref_table is updated. In both
cases consider_logical_flow() gets called and realigns the tables as
appropriate.
>>  >
>>  > The problem with this is that it reeks of cross-cutting concerns, and
it seems like it wouldn't scale well (consider a 3- or 4-chain of
dependencies). Ideally, the dependency chain would make sure that the
change handler for logical flows only deals with logical flows, and the
change handler for address sets only deals with address sets.
>>
>> I agree that maintaining the cross reference has some overhead, but I
don't see a scaling issue in this case. Adding entries to the cross
reference table is a by-product of the consider_logical_flow() when parsing
the lflow, and deleting the entries is also efficient with O(N), N = number
of address sets used by a lflow, which in most cases should be a very small
number (correct me if I am wrong). As to memory consumption, it maintains
only the mapping between resource names and lflow uuids, so I don't expect
it to be a significant cost either. Could you explain a little more about
the "3- or 4-chain of dependencies" example?
>
>
> My thought was along the lines that table A references table B, which
references table C. A change in table A might result in a change to table
B, which then results in a change to table C. In my head, I thought this
would mean having to maintain a large series of hashmaps that
cross-referenced each other. I realize this isn't correct, though, and that
as long as the dependency chain is linear, this isn't any different from
what you already have proposed here.
>
> In reality, I can't think of any example changes in the southbound
database that would cause such a series of events, but I may not be
thinking hard enough :)
>
>>
>> For the "cross-cutting concern", I don't see it that way. I see it as a
pattern of change handler implementation. In general, output of an engine
node is the result of a "join" operation of its inputs. When there are
multiple inputs and one of them changes, for a change handler to compute
the output incrementally, we will need to use the changed row to probe all
the other inputs to update the final output. For the address-set and
port-group handlers, it is join between two tables only, and the cross
reference table is built to make the probing efficient in change handler.
The cross reference table is also generalized so that any resources
referenced by logical flows can reuse the same data structure and
interfaces, and now it is reused by both address-sets and port-groups. We
can make it more generalized to be used for other mappings if needed.
>
>
> Yes, and this sort of thinking is what I had over the weekend that made
me have a "Eureka!" moment and realize what I had been missing here.
>
> I had been looking at the address set change handler and thinking of the
change handler as being an "owner" of address set data. The reason is that
the engine node is tied directly to updates to the address set table. It
felt like it was overstepping its boundaries by then stepping through data
that I thought was owned by the logical flow change handler. The fact that
both change handlers acted on the same data just struck me as wrong.
>
> However, I realized I need to stop thinking about data ownership in that
sort of way when it comes to change handlers. Engine nodes do have
ownership like I was imagining in their run() method, but change handlers
are very different. They are responsible for analyzing more than just the
data that has changed, but also for analyzing the relationship that data
has with other data. That other data may or may not be tied to other engine
nodes.
>
>>
>> If we really wants to make it more generalized, I think the answer is
the datalog approach. I would be great if it can be implemented that way,
but I am pessimistic for it to be applied to ovn-controller in a practical
time line, given that ovn-controller is more complex in terms of both data
sources and processing logic compared with ovn-northd. And I think it is
practical and simple to implement the probing for most frequent scenarios
as demonstrated by this RFC.
>
>
> I agree. I don't think datalog is the correct approach for ovn-controller.
>
>>
>>  >
>>  > If we generalize things a bit, there are likely to be two ways
dependencies manifest in the database. In this particular case, text in one
row expands to data of a separate database row. The other case would be
where a database row contains the UUID (or list of UUIDs) of other database
rows.
>>  >
>>  > For the textual case, I think the easiest way to handle this is to
replace the text with what it expands to earlier than when we currently do
it. Consider that a logical flow references address set $foo. Currently,
the logical flow in the southbound database has the text "$foo" in it. If
$foo were replaced with the actual addresses from the address set, then
when an address set changes, the text of the logical flow would change as
well, thus resulting in a direct change of the logical flow. A less
disruptive version of this might be to use some reserved character
automatically in the logical flow match followed by a sequence number. So
for instance, if a logical flow were set up to reference address set $foo,
then the actual logical flow might be something like $foo?1. Then if
northbound address set foo changes, the logical flow could be updated to
$foo?2 by ovn-northd. Again, the textual change in the logical flow would
result in triggering the change tracker.
>>  >
>>
>> This proposal is interesting and I think it is a valid alternative. It
is trying to implement the probing without maintaining a cross reference
table in ovn-controller. In fact it moves the effort of building the
reference table from ovn-controller to maintaining the sequence number for
each address-set/port-group resources in ovn-northd. I am just not sure if
this makes the system simpler or more complex. I will need to think more
about it.
>
>
> Yes, having thought about this some more, I agree that this could just be
trading one complexity for another. Plus, aside from address sets and port
groups, I'm not sure that there is any other text expansion type references
in the southbound database. So engineering a big solution for this may not
have a lot of bang for your buck. It may be worth workshopping just to see,
though.
>
>>
>>  > For the database referencing case, it would be nice if the IDL change
tracking code could automatically do this for us. This way if record foo
has a column that references row bar, then if bar changes, we would be told
that foo also changed. This strikes me as difficult to implement and could
result in some interesting dependency graphs within the IDL code though.
>>  >
>>  > What do you think?
>>  >
>>
>> For the database referencing case, it seems not directly related to your
concern regarding address sets handling (or port-group handling). Please
correct me if I misunderstood something here. But I agree with idea of
utilizing and improving the IDL capability to build the dependency graph
for table references, and this is exactly in my TODO as mentioned in the
cover letter:
>
>
> You are correct that this does not apply to address set or port group
referencing by logical flows. The relationship between logical flows and
address sets and port groups is not currently expressible at the IDL level.
I was thinking ahead a bit about how other tables may refer to each other.
I foresaw similar structures in change handlers for those tables and
wondered if that could be handled at the IDL level instead.
>
>
>>
>> "For exposing the dependencies introduced by reference access, it is a
big
>> TODO item and it is the major reason this patch series is RFC only."
>
>
Hi Mark,

Thank you very much for sharing the thought process. I am glad we are on
the same page :)

Regards,
Han

Han Zhou Aug. 13, 2018, 5:57 p.m. UTC | #5

On Mon, Aug 6, 2018 at 10:30 PM, Han Zhou <zhouhan@gmail.com> wrote:
>
>
>
> On Mon, Aug 6, 2018 at 3:13 AM, Mark Michelson <mmichels@redhat.com>
wrote:
> >
> > Hi Han,
> >
> > I thought about this more over the weekend, and I was hoping I'd get to
respond to my own e-mail before you saw it, because I realized I had a
fundamental misunderstanding of the scope and nature of change handlers.
I'll reply to your comments in-line below.
> >
> > On 08/05/2018 03:11 PM, Han Zhou wrote:
> >>
> >> Hi Mark,
> >>
> >> Thanks for the review and very valuable comments! (I was on vacation
last week so sorry for slow response).
> >>
> >> On Tue, Jul 31, 2018 at 3:47 AM, Mark Michelson <mmichels@redhat.com
<mailto:mmichels@redhat.com>> wrote:
> >>  >
> >>  > Hi Han,
> >>  >
> >>  > I've given this patchset a look, and I was following along pretty
well until I got to about patch 11. From that point on, I had to re-read
the code more times than I care to admit before I finally understood what
was going on :)
> >>  >
> >>  > What you have is a structure (lflow_ref_list_node) that is
simultaneously in two lists. These two lists are each the data in separate
hmap nodes. The hmap nodes are in two separate hmaps. One hmap uses the
reference type and name as a key, and the other uses the lflow UUID as a
key. This way given an address set name, you can find the associated
logical flow UUIDs in the ref_lflow_table. Or given a logical flow UUID,
you can find address sets.
> >>  >
> >>  > I'm wondering if this can be simplified somehow.
> >>  >
> >>  > Right now, if logical flows change, the change handler has to
update the ref_lflow_table so that address sets no longer reference that
logical flow. If address sets change, then the lflow_ref_table is updated.
In both cases consider_logical_flow() gets called and realigns the tables
as appropriate.
> >>  >
> >>  > The problem with this is that it reeks of cross-cutting concerns,
and it seems like it wouldn't scale well (consider a 3- or 4-chain of
dependencies). Ideally, the dependency chain would make sure that the
change handler for logical flows only deals with logical flows, and the
change handler for address sets only deals with address sets.
> >>
> >> I agree that maintaining the cross reference has some overhead, but I
don't see a scaling issue in this case. Adding entries to the cross
reference table is a by-product of the consider_logical_flow() when parsing
the lflow, and deleting the entries is also efficient with O(N), N = number
of address sets used by a lflow, which in most cases should be a very small
number (correct me if I am wrong). As to memory consumption, it maintains
only the mapping between resource names and lflow uuids, so I don't expect
it to be a significant cost either. Could you explain a little more about
the "3- or 4-chain of dependencies" example?
> >
> >
> > My thought was along the lines that table A references table B, which
references table C. A change in table A might result in a change to table
B, which then results in a change to table C. In my head, I thought this
would mean having to maintain a large series of hashmaps that
cross-referenced each other. I realize this isn't correct, though, and that
as long as the dependency chain is linear, this isn't any different from
what you already have proposed here.
> >
> > In reality, I can't think of any example changes in the southbound
database that would cause such a series of events, but I may not be
thinking hard enough :)
> >
> >>
> >> For the "cross-cutting concern", I don't see it that way. I see it as
a pattern of change handler implementation. In general, output of an engine
node is the result of a "join" operation of its inputs. When there are
multiple inputs and one of them changes, for a change handler to compute
the output incrementally, we will need to use the changed row to probe all
the other inputs to update the final output. For the address-set and
port-group handlers, it is join between two tables only, and the cross
reference table is built to make the probing efficient in change handler.
The cross reference table is also generalized so that any resources
referenced by logical flows can reuse the same data structure and
interfaces, and now it is reused by both address-sets and port-groups. We
can make it more generalized to be used for other mappings if needed.
> >
> >
> > Yes, and this sort of thinking is what I had over the weekend that made
me have a "Eureka!" moment and realize what I had been missing here.
> >
> > I had been looking at the address set change handler and thinking of
the change handler as being an "owner" of address set data. The reason is
that the engine node is tied directly to updates to the address set table.
It felt like it was overstepping its boundaries by then stepping through
data that I thought was owned by the logical flow change handler. The fact
that both change handlers acted on the same data just struck me as wrong.
> >
> > However, I realized I need to stop thinking about data ownership in
that sort of way when it comes to change handlers. Engine nodes do have
ownership like I was imagining in their run() method, but change handlers
are very different. They are responsible for analyzing more than just the
data that has changed, but also for analyzing the relationship that data
has with other data. That other data may or may not be tied to other engine
nodes.
> >
> >>
> >> If we really wants to make it more generalized, I think the answer is
the datalog approach. I would be great if it can be implemented that way,
but I am pessimistic for it to be applied to ovn-controller in a practical
time line, given that ovn-controller is more complex in terms of both data
sources and processing logic compared with ovn-northd. And I think it is
practical and simple to implement the probing for most frequent scenarios
as demonstrated by this RFC.
> >
> >
> > I agree. I don't think datalog is the correct approach for
ovn-controller.
> >
> >>
> >>  >
> >>  > If we generalize things a bit, there are likely to be two ways
dependencies manifest in the database. In this particular case, text in one
row expands to data of a separate database row. The other case would be
where a database row contains the UUID (or list of UUIDs) of other database
rows.
> >>  >
> >>  > For the textual case, I think the easiest way to handle this is to
replace the text with what it expands to earlier than when we currently do
it. Consider that a logical flow references address set $foo. Currently,
the logical flow in the southbound database has the text "$foo" in it. If
$foo were replaced with the actual addresses from the address set, then
when an address set changes, the text of the logical flow would change as
well, thus resulting in a direct change of the logical flow. A less
disruptive version of this might be to use some reserved character
automatically in the logical flow match followed by a sequence number. So
for instance, if a logical flow were set up to reference address set $foo,
then the actual logical flow might be something like $foo?1. Then if
northbound address set foo changes, the logical flow could be updated to
$foo?2 by ovn-northd. Again, the textual change in the logical flow would
result in triggering the change tracker.
> >>  >
> >>
> >> This proposal is interesting and I think it is a valid alternative. It
is trying to implement the probing without maintaining a cross reference
table in ovn-controller. In fact it moves the effort of building the
reference table from ovn-controller to maintaining the sequence number for
each address-set/port-group resources in ovn-northd. I am just not sure if
this makes the system simpler or more complex. I will need to think more
about it.
> >
> >
> > Yes, having thought about this some more, I agree that this could just
be trading one complexity for another. Plus, aside from address sets and
port groups, I'm not sure that there is any other text expansion type
references in the southbound database. So engineering a big solution for
this may not have a lot of bang for your buck. It may be worth workshopping
just to see, though.
> >
> >>
> >>  > For the database referencing case, it would be nice if the IDL
change tracking code could automatically do this for us. This way if record
foo has a column that references row bar, then if bar changes, we would be
told that foo also changed. This strikes me as difficult to implement and
could result in some interesting dependency graphs within the IDL code
though.
> >>  >
> >>  > What do you think?
> >>  >
> >>
> >> For the database referencing case, it seems not directly related to
your concern regarding address sets handling (or port-group handling).
Please correct me if I misunderstood something here. But I agree with idea
of utilizing and improving the IDL capability to build the dependency graph
for table references, and this is exactly in my TODO as mentioned in the
cover letter:
> >
> >
> > You are correct that this does not apply to address set or port group
referencing by logical flows. The relationship between logical flows and
address sets and port groups is not currently expressible at the IDL level.
I was thinking ahead a bit about how other tables may refer to each other.
I foresaw similar structures in change handlers for those tables and
wondered if that could be handled at the IDL level instead.
> >
> >
> >>
> >> "For exposing the dependencies introduced by reference access, it is a
big
> >> TODO item and it is the major reason this patch series is RFC only."
> >
> >
> Hi Mark,
>
> Thank you very much for sharing the thought process. I am glad we are on
the same page :)
>
> Regards,
> Han

Hi Mark, Ben,

Table reference access is handled in the new patch series I sent:
https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351060.html
Please see the cover letter for a briefing of the changes and ideas, and me
know your thoughts.

Thanks,
Han

[ovs-dev,RFC,00/14] ovn-controller incremental processing.

Message

Comments