diff mbox

[ovs-dev] ovn: Add second ACL stage

Message ID 1469767624-20966-1-git-send-email-mickeys.dev@gmail.com
State Superseded
Headers show

Commit Message

Mickey Spiegel July 29, 2016, 4:47 a.m. UTC
From: Mickey Spiegel <emspiege@us.ibm.com>

This patch adds a second logical switch ingress ACL stage, and
correspondingly a second logical switch egress ACL stage.  This
allows for more than one ACL-based feature to be applied in the
ingress and egress logical switch pipelines.  The features
driving the different ACL stages may be configured by different
users, for example an application deployer managing security
groups and a network or security admin configuring network ACLs
or firewall rules.

Each ACL stage is self contained.  The "action" for the
highest-"priority" matching row in an ACL stage determines a
packet's treatment.  A separate "action" will be determined in
each ACL stage, according to the ACL rules configured for that
ACL stage.  The "priority" values are only relevant within the
context of an ACL stage.

ACL rules that do not specify an ACL stage are applied to the
default "acl" stage.

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
---
 ovn/northd/ovn-northd.c   | 319 +++++++++++++++++++++++++++-------------------
 ovn/ovn-nb.ovsschema      |   7 +-
 ovn/ovn-nb.xml            |  25 ++++
 ovn/utilities/ovn-nbctl.c |  35 +++--
 tests/ovn-nbctl.at        |  30 +++--
 5 files changed, 264 insertions(+), 152 deletions(-)

Comments

Russell Bryant July 29, 2016, 5:01 p.m. UTC | #1
On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <mickeys.dev@gmail.com>
wrote:

>
> This patch adds a second logical switch ingress ACL stage, and
> correspondingly a second logical switch egress ACL stage.  This
> allows for more than one ACL-based feature to be applied in the
> ingress and egress logical switch pipelines.  The features
> driving the different ACL stages may be configured by different
> users, for example an application deployer managing security
> groups and a network or security admin configuring network ACLs
> or firewall rules.
>
> Each ACL stage is self contained.  The "action" for the
> highest-"priority" matching row in an ACL stage determines a
> packet's treatment.  A separate "action" will be determined in
> each ACL stage, according to the ACL rules configured for that
> ACL stage.  The "priority" values are only relevant within the
> context of an ACL stage.
>
> ACL rules that do not specify an ACL stage are applied to the
> default "acl" stage.
>
> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>


Could you expand on why priorities in a single stage aren't enough to
satisfy the use case?
Mickey Spiegel July 29, 2016, 5:28 p.m. UTC | #2
-----"dev" <dev-bounces@openvswitch.org> wrote: -----
To: Mickey Spiegel <mickeys.dev@gmail.com>
From: Russell Bryant 
Sent by: "dev" 
Date: 07/29/2016 10:02AM
Cc: ovs dev <dev@openvswitch.org>
Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage

On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <mickeys.dev@gmail.com>
wrote:

>
> This patch adds a second logical switch ingress ACL stage, and
> correspondingly a second logical switch egress ACL stage.  This
> allows for more than one ACL-based feature to be applied in the
> ingress and egress logical switch pipelines.  The features
> driving the different ACL stages may be configured by different
> users, for example an application deployer managing security
> groups and a network or security admin configuring network ACLs
> or firewall rules.
>
> Each ACL stage is self contained.  The "action" for the
> highest-"priority" matching row in an ACL stage determines a
> packet's treatment.  A separate "action" will be determined in
> each ACL stage, according to the ACL rules configured for that
> ACL stage.  The "priority" values are only relevant within the
> context of an ACL stage.
>
> ACL rules that do not specify an ACL stage are applied to the
> default "acl" stage.
>
> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>


Could you expand on why priorities in a single stage aren't enough to
satisfy the use case?

<Mickey>
If two features are configured independently with a mix of
prioritized allow and drop rules, then with a single stage, a
new set of ACL rules must be produced that achieves the same
behavior.  This is sometimes referred to as an "ACL merge"
algorithm, for example:
http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514

In the worst case, for example when the features act on different
packet fields (e.g. one on IP address and another on L4 port),
the number of rules required can approach
(# of ACL1 rules) * (# of ACL2 rules).

While it is possible to code up such an algorithm, it adds
significant complexity and complicates whichever layer
implements the merge algorithm, either OVN or the CMS above.

By using multiple independent pipeline stages, all of this
software complexity is avoided, achieving the proper result
in a simple and straightforward manner.

Recent network hardware ASICs tend to have around 8 or 10 ACL
stages, though they tend to evaluate these in parallel given
all the emphasis on low latency these days.

Mickey
Mickey Spiegel July 30, 2016, 8:19 p.m. UTC | #3
On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <emspiege@us.ibm.com>
wrote:
>
> -----"dev" <dev-bounces@openvswitch.org> wrote: -----
>> To: Mickey Spiegel <mickeys.dev@gmail.com>
>> From: Russell Bryant
>> Sent by: "dev"
>> Date: 07/29/2016 10:02AM
>> Cc: ovs dev <dev@openvswitch.org>
>> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
>>
>> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <mickeys.dev@gmail.com>
>> wrote:
>>
>>>
>>> This patch adds a second logical switch ingress ACL stage, and
>>> correspondingly a second logical switch egress ACL stage.  This
>>> allows for more than one ACL-based feature to be applied in the
>>> ingress and egress logical switch pipelines.  The features
>>> driving the different ACL stages may be configured by different
>>> users, for example an application deployer managing security
>>> groups and a network or security admin configuring network ACLs
>>> or firewall rules.
>>>
>>> Each ACL stage is self contained.  The "action" for the
>>> highest-"priority" matching row in an ACL stage determines a
>>> packet's treatment.  A separate "action" will be determined in
>>> each ACL stage, according to the ACL rules configured for that
>>> ACL stage.  The "priority" values are only relevant within the
>>> context of an ACL stage.
>>>
>>> ACL rules that do not specify an ACL stage are applied to the
>>> default "acl" stage.
>>>
>>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
>>
>>
>> Could you expand on why priorities in a single stage aren't enough to
>> satisfy the use case?
>
> If two features are configured independently with a mix of
> prioritized allow and drop rules, then with a single stage, a
> new set of ACL rules must be produced that achieves the same
> behavior.  This is sometimes referred to as an "ACL merge"
> algorithm, for example:
>
http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>
> In the worst case, for example when the features act on different
> packet fields (e.g. one on IP address and another on L4 port),
> the number of rules required can approach
> (# of ACL1 rules) * (# of ACL2 rules).
>
> While it is possible to code up such an algorithm, it adds
> significant complexity and complicates whichever layer
> implements the merge algorithm, either OVN or the CMS above.
>
> By using multiple independent pipeline stages, all of this
> software complexity is avoided, achieving the proper result
> in a simple and straightforward manner.
>
> Recent network hardware ASICs tend to have around 8 or 10 ACL
> stages, though they tend to evaluate these in parallel given
> all the emphasis on low latency these days.

Throwing in an example to illustrate the difference between one
ACL stage and two ACL stages:

If two separate ACL stages:
Feature 1
acl  from-lport  100 (tcp == 80) allow-related
acl  from-lport  100 (tcp == 8080) allow-related
acl  from-lport  100 (udp) allow-related
acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related

Feature 2
acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related

Combined in one stage, to get the equivalent behavior, this would require:
from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80) allow-related
from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080) allow-related
from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24 &&
tcp) allow-related
from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80) allow-related
from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080) allow-related
from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src == 10.1.1.0/24 &&
tcp) allow-related
from-lport  200 (ip4.dst == 172.16.0.0/20) drop
from-lport  200 (ip4.dst == 192.168.0.0/16) drop
from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80) allow-related
from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080) allow-related
from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24 && tcp)
allow-related

If there are more IP addresses in feature 2, then the number
of ACL rules will climb geometrically:
(4 feature 1 rules * # feature 2 allow-related rules + # feature 2 drop
rules)

With 2 separate ACL stages, the rules just go straight into
the corresponding ACL table, no merge required:
(# feature 1 rules + # feature 2 rules)

Mickey

> Mickey
>
>>
>> --
>> Russell Bryant
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
>>
Russell Bryant Aug. 2, 2016, 11:52 a.m. UTC | #4
On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <mickeys.dev@gmail.com>
wrote:

> On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <emspiege@us.ibm.com>
> wrote:
> >
> > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
> >> To: Mickey Spiegel <mickeys.dev@gmail.com>
> >> From: Russell Bryant
> >> Sent by: "dev"
> >> Date: 07/29/2016 10:02AM
> >> Cc: ovs dev <dev@openvswitch.org>
> >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
> >>
> >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <mickeys.dev@gmail.com
> >
> >> wrote:
> >>
> >>>
> >>> This patch adds a second logical switch ingress ACL stage, and
> >>> correspondingly a second logical switch egress ACL stage.  This
> >>> allows for more than one ACL-based feature to be applied in the
> >>> ingress and egress logical switch pipelines.  The features
> >>> driving the different ACL stages may be configured by different
> >>> users, for example an application deployer managing security
> >>> groups and a network or security admin configuring network ACLs
> >>> or firewall rules.
> >>>
> >>> Each ACL stage is self contained.  The "action" for the
> >>> highest-"priority" matching row in an ACL stage determines a
> >>> packet's treatment.  A separate "action" will be determined in
> >>> each ACL stage, according to the ACL rules configured for that
> >>> ACL stage.  The "priority" values are only relevant within the
> >>> context of an ACL stage.
> >>>
> >>> ACL rules that do not specify an ACL stage are applied to the
> >>> default "acl" stage.
> >>>
> >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
> >>
> >>
> >> Could you expand on why priorities in a single stage aren't enough to
> >> satisfy the use case?
> >
> > If two features are configured independently with a mix of
> > prioritized allow and drop rules, then with a single stage, a
> > new set of ACL rules must be produced that achieves the same
> > behavior.  This is sometimes referred to as an "ACL merge"
> > algorithm, for example:
> >
> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
> >
> > In the worst case, for example when the features act on different
> > packet fields (e.g. one on IP address and another on L4 port),
> > the number of rules required can approach
> > (# of ACL1 rules) * (# of ACL2 rules).
> >
> > While it is possible to code up such an algorithm, it adds
> > significant complexity and complicates whichever layer
> > implements the merge algorithm, either OVN or the CMS above.
> >
> > By using multiple independent pipeline stages, all of this
> > software complexity is avoided, achieving the proper result
> > in a simple and straightforward manner.
> >
> > Recent network hardware ASICs tend to have around 8 or 10 ACL
> > stages, though they tend to evaluate these in parallel given
> > all the emphasis on low latency these days.
>
> Throwing in an example to illustrate the difference between one
> ACL stage and two ACL stages:
>
> If two separate ACL stages:
> Feature 1
> acl  from-lport  100 (tcp == 80) allow-related
> acl  from-lport  100 (tcp == 8080) allow-related
> acl  from-lport  100 (udp) allow-related
> acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
>
> Feature 2
> acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
> acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
> acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
> acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
> acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
>
> Combined in one stage, to get the equivalent behavior, this would require:
> from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80) allow-related
> from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080) allow-related
> from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
> from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24 &&
> tcp) allow-related
> from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80) allow-related
> from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080) allow-related
> from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
> from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src == 10.1.1.0/24 &&
> tcp) allow-related
> from-lport  200 (ip4.dst == 172.16.0.0/20) drop
> from-lport  200 (ip4.dst == 192.168.0.0/16) drop
> from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80) allow-related
> from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080) allow-related
> from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
> from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24 &&
> tcp) allow-related
>

Or have an address set, "addrset1", which contains {172.16.10.0/24,
192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.

acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80, 8080})
allow-related
acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24 &&
tcp) allow-related


>
> If there are more IP addresses in feature 2, then the number
> of ACL rules will climb geometrically:
> (4 feature 1 rules * # feature 2 allow-related rules + # feature 2 drop
> rules)
>
> With 2 separate ACL stages, the rules just go straight into
> the corresponding ACL table, no merge required:
> (# feature 1 rules + # feature 2 rules)
>

Thanks for elaborating.  I'm not opposed.  It seems harmless if not being
used.

Can you update the docs to indicate the specific accepted values for
"stage"?  It currently sounds like you can use as many stages as you want
to me.
Darrell Ball Aug. 2, 2016, 4:26 p.m. UTC | #5
On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell@ovn.org> wrote:

> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <mickeys.dev@gmail.com>
> wrote:
>
> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <emspiege@us.ibm.com>
> > wrote:
> > >
> > > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
> > >> To: Mickey Spiegel <mickeys.dev@gmail.com>
> > >> From: Russell Bryant
> > >> Sent by: "dev"
> > >> Date: 07/29/2016 10:02AM
> > >> Cc: ovs dev <dev@openvswitch.org>
> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
> > >>
> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
> mickeys.dev@gmail.com
> > >
> > >> wrote:
> > >>
> > >>>
> > >>> This patch adds a second logical switch ingress ACL stage, and
> > >>> correspondingly a second logical switch egress ACL stage.  This
> > >>> allows for more than one ACL-based feature to be applied in the
> > >>> ingress and egress logical switch pipelines.  The features
> > >>> driving the different ACL stages may be configured by different
> > >>> users, for example an application deployer managing security
> > >>> groups and a network or security admin configuring network ACLs
> > >>> or firewall rules.
> > >>>
> > >>> Each ACL stage is self contained.  The "action" for the
> > >>> highest-"priority" matching row in an ACL stage determines a
> > >>> packet's treatment.  A separate "action" will be determined in
> > >>> each ACL stage, according to the ACL rules configured for that
> > >>> ACL stage.  The "priority" values are only relevant within the
> > >>> context of an ACL stage.
> > >>>
> > >>> ACL rules that do not specify an ACL stage are applied to the
> > >>> default "acl" stage.
> > >>>
> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
> > >>
> > >>
> > >> Could you expand on why priorities in a single stage aren't enough to
> > >> satisfy the use case?
> > >
> > > If two features are configured independently with a mix of
> > > prioritized allow and drop rules, then with a single stage, a
> > > new set of ACL rules must be produced that achieves the same
> > > behavior.  This is sometimes referred to as an "ACL merge"
> > > algorithm, for example:
> > >
> >
> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
> > >
> > > In the worst case, for example when the features act on different
> > > packet fields (e.g. one on IP address and another on L4 port),
> > > the number of rules required can approach
> > > (# of ACL1 rules) * (# of ACL2 rules).
> > >
> > > While it is possible to code up such an algorithm, it adds
> > > significant complexity and complicates whichever layer
> > > implements the merge algorithm, either OVN or the CMS above.
> > >
> > > By using multiple independent pipeline stages, all of this
> > > software complexity is avoided, achieving the proper result
> > > in a simple and straightforward manner.
> > >
> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
> > > stages, though they tend to evaluate these in parallel given
> > > all the emphasis on low latency these days.
> >
> > Throwing in an example to illustrate the difference between one
> > ACL stage and two ACL stages:
> >
> > If two separate ACL stages:
> > Feature 1
> > acl  from-lport  100 (tcp == 80) allow-related
> > acl  from-lport  100 (tcp == 8080) allow-related
> > acl  from-lport  100 (udp) allow-related
> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
> >
> > Feature 2
> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
> >
> > Combined in one stage, to get the equivalent behavior, this would
> require:
> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80) allow-related
> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080) allow-related
> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24 &&
> > tcp) allow-related
> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80) allow-related
> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080)
> allow-related
> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src == 10.1.1.0/24 &&
> > tcp) allow-related
> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80) allow-related
> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080) allow-related
> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24 &&
> > tcp) allow-related
> >
>
> Or have an address set, "addrset1", which contains {172.16.10.0/24,
> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
>
> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80, 8080})
> allow-related
> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24 &&
> tcp) allow-related
>
>
> >
> > If there are more IP addresses in feature 2, then the number
> > of ACL rules will climb geometrically:
> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2 drop
> > rules)
> >
> > With 2 separate ACL stages, the rules just go straight into
> > the corresponding ACL table, no merge required:
> > (# feature 1 rules + # feature 2 rules)
> >
>
> Thanks for elaborating.  I'm not opposed.  It seems harmless if not being
> used.
>


There are presently no unit tests for ACLs in the system tests
(system-ovn.at).
The first step should be to add unit tests for single stage ACLs.
and then add a delta of tests if other stages are desired.

It will be good to test the coordination between multiple stages
coming directly from northbound APIs and check what happens when
multistage ACLs are setup and torn down stage by stage, particularly
when the datapath ends up in a more permissive state for some period of
time.



>
> Can you update the docs to indicate the specific accepted values for
> "stage"?



This would significantly complicate the usage of northbound ACL APIs,
since multi-staging would be exposed at the top (northbound) OVN layer.

This would need a clear set of guidelines how northbound
multistage ACLs would be used by a CMS, at the user level.



> It currently sounds like you can use as many stages as you want
> to me.
>
> --
> Russell Bryant
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
Mickey Spiegel Aug. 2, 2016, 5:23 p.m. UTC | #6
On Tue, Aug 2, 2016 at 9:26 AM, Darrell Ball <dlu998@gmail.com> wrote:

>
>
> On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell@ovn.org> wrote:
>
>> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <mickeys.dev@gmail.com>
>> wrote:
>>
>> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <emspiege@us.ibm.com>
>> > wrote:
>> > >
>> > > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
>> > >> To: Mickey Spiegel <mickeys.dev@gmail.com>
>> > >> From: Russell Bryant
>> > >> Sent by: "dev"
>> > >> Date: 07/29/2016 10:02AM
>> > >> Cc: ovs dev <dev@openvswitch.org>
>> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
>> > >>
>> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
>> mickeys.dev@gmail.com
>> > >
>> > >> wrote:
>> > >>
>> > >>>
>> > >>> This patch adds a second logical switch ingress ACL stage, and
>> > >>> correspondingly a second logical switch egress ACL stage.  This
>> > >>> allows for more than one ACL-based feature to be applied in the
>> > >>> ingress and egress logical switch pipelines.  The features
>> > >>> driving the different ACL stages may be configured by different
>> > >>> users, for example an application deployer managing security
>> > >>> groups and a network or security admin configuring network ACLs
>> > >>> or firewall rules.
>> > >>>
>> > >>> Each ACL stage is self contained.  The "action" for the
>> > >>> highest-"priority" matching row in an ACL stage determines a
>> > >>> packet's treatment.  A separate "action" will be determined in
>> > >>> each ACL stage, according to the ACL rules configured for that
>> > >>> ACL stage.  The "priority" values are only relevant within the
>> > >>> context of an ACL stage.
>> > >>>
>> > >>> ACL rules that do not specify an ACL stage are applied to the
>> > >>> default "acl" stage.
>> > >>>
>> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
>> > >>
>> > >>
>> > >> Could you expand on why priorities in a single stage aren't enough to
>> > >> satisfy the use case?
>> > >
>> > > If two features are configured independently with a mix of
>> > > prioritized allow and drop rules, then with a single stage, a
>> > > new set of ACL rules must be produced that achieves the same
>> > > behavior.  This is sometimes referred to as an "ACL merge"
>> > > algorithm, for example:
>> > >
>> >
>> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>> > >
>> > > In the worst case, for example when the features act on different
>> > > packet fields (e.g. one on IP address and another on L4 port),
>> > > the number of rules required can approach
>> > > (# of ACL1 rules) * (# of ACL2 rules).
>> > >
>> > > While it is possible to code up such an algorithm, it adds
>> > > significant complexity and complicates whichever layer
>> > > implements the merge algorithm, either OVN or the CMS above.
>> > >
>> > > By using multiple independent pipeline stages, all of this
>> > > software complexity is avoided, achieving the proper result
>> > > in a simple and straightforward manner.
>> > >
>> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
>> > > stages, though they tend to evaluate these in parallel given
>> > > all the emphasis on low latency these days.
>> >
>> > Throwing in an example to illustrate the difference between one
>> > ACL stage and two ACL stages:
>> >
>> > If two separate ACL stages:
>> > Feature 1
>> > acl  from-lport  100 (tcp == 80) allow-related
>> > acl  from-lport  100 (tcp == 8080) allow-related
>> > acl  from-lport  100 (udp) allow-related
>> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
>> >
>> > Feature 2
>> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
>> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
>> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
>> >
>> > Combined in one stage, to get the equivalent behavior, this would
>> require:
>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80) allow-related
>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080)
>> allow-related
>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24 &&
>> > tcp) allow-related
>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80) allow-related
>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080)
>> allow-related
>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src == 10.1.1.0/24
>> &&
>> > tcp) allow-related
>> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80) allow-related
>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080) allow-related
>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24 &&
>> > tcp) allow-related
>> >
>>
>> Or have an address set, "addrset1", which contains {172.16.10.0/24,
>> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
>>
>> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80,
>> 8080})
>> allow-related
>> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
>> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24 &&
>> tcp) allow-related
>>
>>
>> >
>> > If there are more IP addresses in feature 2, then the number
>> > of ACL rules will climb geometrically:
>> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2 drop
>> > rules)
>> >
>> > With 2 separate ACL stages, the rules just go straight into
>> > the corresponding ACL table, no merge required:
>> > (# feature 1 rules + # feature 2 rules)
>> >
>>
>> Thanks for elaborating.  I'm not opposed.  It seems harmless if not being
>> used.
>>
>
>
> There are presently no unit tests for ACLs in the system tests
> (system-ovn.at).
> The first step should be to add unit tests for single stage ACLs.
> and then add a delta of tests if other stages are desired.
>
> It will be good to test the coordination between multiple stages
> coming directly from northbound APIs and check what happens when
> multistage ACLs are setup and torn down stage by stage, particularly
> when the datapath ends up in a more permissive state for some period of
> time.
>
>
>
>>
>> Can you update the docs to indicate the specific accepted values for
>> "stage"?
>
>
>
> This would significantly complicate the usage of northbound ACL APIs,
> since multi-staging would be exposed at the top (northbound) OVN layer.
>

The default behavior when "stage" is not specified is to apply the ACL to
the
existing "acl" stage. If you don't care about the second ACL stage, continue
to use ACLs as you do today and it will work. There is no complication.


> This would need a clear set of guidelines how northbound
> multistage ACLs would be used by a CMS, at the user level.
>

The CMS typically does not expose ACLs directly to the user. For example,
with OpenStack, Security Groups use the default "acl" stage. OpenStack
FWaaS v2 would use the "acl2" stage. These are two separate OpenStack
features with separate OpenStack northbound APIs to the user.

Mickey


>
>> It currently sounds like you can use as many stages as you want
>> to me.
>>
>> --
>> Russell Bryant
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
>>
>
>
Gurucharan Shetty Aug. 2, 2016, 5:29 p.m. UTC | #7
On 2 August 2016 at 10:23, Mickey Spiegel <mickeys.dev@gmail.com> wrote:

> On Tue, Aug 2, 2016 at 9:26 AM, Darrell Ball <dlu998@gmail.com> wrote:
>
> >
> >
> > On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell@ovn.org> wrote:
> >
> >> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <mickeys.dev@gmail.com>
> >> wrote:
> >>
> >> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <emspiege@us.ibm.com
> >
> >> > wrote:
> >> > >
> >> > > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
> >> > >> To: Mickey Spiegel <mickeys.dev@gmail.com>
> >> > >> From: Russell Bryant
> >> > >> Sent by: "dev"
> >> > >> Date: 07/29/2016 10:02AM
> >> > >> Cc: ovs dev <dev@openvswitch.org>
> >> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
> >> > >>
> >> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
> >> mickeys.dev@gmail.com
> >> > >
> >> > >> wrote:
> >> > >>
> >> > >>>
> >> > >>> This patch adds a second logical switch ingress ACL stage, and
> >> > >>> correspondingly a second logical switch egress ACL stage.  This
> >> > >>> allows for more than one ACL-based feature to be applied in the
> >> > >>> ingress and egress logical switch pipelines.  The features
> >> > >>> driving the different ACL stages may be configured by different
> >> > >>> users, for example an application deployer managing security
> >> > >>> groups and a network or security admin configuring network ACLs
> >> > >>> or firewall rules.
> >> > >>>
> >> > >>> Each ACL stage is self contained.  The "action" for the
> >> > >>> highest-"priority" matching row in an ACL stage determines a
> >> > >>> packet's treatment.  A separate "action" will be determined in
> >> > >>> each ACL stage, according to the ACL rules configured for that
> >> > >>> ACL stage.  The "priority" values are only relevant within the
> >> > >>> context of an ACL stage.
> >> > >>>
> >> > >>> ACL rules that do not specify an ACL stage are applied to the
> >> > >>> default "acl" stage.
> >> > >>>
> >> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
> >> > >>
> >> > >>
> >> > >> Could you expand on why priorities in a single stage aren't enough
> to
> >> > >> satisfy the use case?
> >> > >
> >> > > If two features are configured independently with a mix of
> >> > > prioritized allow and drop rules, then with a single stage, a
> >> > > new set of ACL rules must be produced that achieves the same
> >> > > behavior.  This is sometimes referred to as an "ACL merge"
> >> > > algorithm, for example:
> >> > >
> >> >
> >>
> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
> >> > >
> >> > > In the worst case, for example when the features act on different
> >> > > packet fields (e.g. one on IP address and another on L4 port),
> >> > > the number of rules required can approach
> >> > > (# of ACL1 rules) * (# of ACL2 rules).
> >> > >
> >> > > While it is possible to code up such an algorithm, it adds
> >> > > significant complexity and complicates whichever layer
> >> > > implements the merge algorithm, either OVN or the CMS above.
> >> > >
> >> > > By using multiple independent pipeline stages, all of this
> >> > > software complexity is avoided, achieving the proper result
> >> > > in a simple and straightforward manner.
> >> > >
> >> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
> >> > > stages, though they tend to evaluate these in parallel given
> >> > > all the emphasis on low latency these days.
> >> >
> >> > Throwing in an example to illustrate the difference between one
> >> > ACL stage and two ACL stages:
> >> >
> >> > If two separate ACL stages:
> >> > Feature 1
> >> > acl  from-lport  100 (tcp == 80) allow-related
> >> > acl  from-lport  100 (tcp == 8080) allow-related
> >> > acl  from-lport  100 (udp) allow-related
> >> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
> >> >
> >> > Feature 2
> >> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
> >> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
> >> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
> >> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
> >> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
> >> >
> >> > Combined in one stage, to get the equivalent behavior, this would
> >> require:
> >> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80)
> allow-related
> >> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080)
> >> allow-related
> >> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
> >> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24
> &&
> >> > tcp) allow-related
> >> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80)
> allow-related
> >> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080)
> >> allow-related
> >> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
> >> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src == 10.1.1.0/24
> >> &&
> >> > tcp) allow-related
> >> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
> >> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
> >> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80) allow-related
> >> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080)
> allow-related
> >> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
> >> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24
> &&
> >> > tcp) allow-related
> >> >
> >>
> >> Or have an address set, "addrset1", which contains {172.16.10.0/24,
> >> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
> >>
> >> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80,
> >> 8080})
> >> allow-related
> >> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
> >> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24 &&
> >> tcp) allow-related
> >>
> >>
> >> >
> >> > If there are more IP addresses in feature 2, then the number
> >> > of ACL rules will climb geometrically:
> >> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2
> drop
> >> > rules)
> >> >
> >> > With 2 separate ACL stages, the rules just go straight into
> >> > the corresponding ACL table, no merge required:
> >> > (# feature 1 rules + # feature 2 rules)
> >> >
> >>
> >> Thanks for elaborating.  I'm not opposed.  It seems harmless if not
> being
> >> used.
> >>
> >
> >
> > There are presently no unit tests for ACLs in the system tests
> > (system-ovn.at).
> > The first step should be to add unit tests for single stage ACLs.
> > and then add a delta of tests if other stages are desired.
> >
> > It will be good to test the coordination between multiple stages
> > coming directly from northbound APIs and check what happens when
> > multistage ACLs are setup and torn down stage by stage, particularly
> > when the datapath ends up in a more permissive state for some period of
> > time.
> >
> >
> >
> >>
> >> Can you update the docs to indicate the specific accepted values for
> >> "stage"?
> >
> >
> >
> > This would significantly complicate the usage of northbound ACL APIs,
> > since multi-staging would be exposed at the top (northbound) OVN layer.
> >
>
> The default behavior when "stage" is not specified is to apply the ACL to
> the
> existing "acl" stage. If you don't care about the second ACL stage,
> continue
> to use ACLs as you do today and it will work. There is no complication.
>

The 2 ct_commit for deletion of firewall rules will likely be tricky. This
will need unit tests.


>
>
> > This would need a clear set of guidelines how northbound
> > multistage ACLs would be used by a CMS, at the user level.
> >
>
> The CMS typically does not expose ACLs directly to the user. For example,
> with OpenStack, Security Groups use the default "acl" stage. OpenStack
> FWaaS v2 would use the "acl2" stage. These are two separate OpenStack
> features with separate OpenStack northbound APIs to the user.
>
> Mickey
>
>
> >
> >> It currently sounds like you can use as many stages as you want
> >> to me.
> >>
> >> --
> >> Russell Bryant
> >> _______________________________________________
> >> dev mailing list
> >> dev@openvswitch.org
> >> http://openvswitch.org/mailman/listinfo/dev
> >>
> >
> >
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
Russell Bryant Aug. 2, 2016, 7:01 p.m. UTC | #8
On Tue, Aug 2, 2016 at 1:29 PM, Guru Shetty <guru@ovn.org> wrote:

> The 2 ct_commit for deletion of firewall rules will likely be tricky. This
> will need unit tests.
>

I don't think I understand the concern.  Can you expand a bit on what you
mean by "2 ct_commit for deletion of firewall rules"?
Darrell Ball Aug. 2, 2016, 7:02 p.m. UTC | #9
On Tue, Aug 2, 2016 at 10:23 AM, Mickey Spiegel <mickeys.dev@gmail.com>
wrote:

> On Tue, Aug 2, 2016 at 9:26 AM, Darrell Ball <dlu998@gmail.com> wrote:
>
>>
>>
>> On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell@ovn.org> wrote:
>>
>>> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <mickeys.dev@gmail.com>
>>> wrote:
>>>
>>> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <emspiege@us.ibm.com>
>>> > wrote:
>>> > >
>>> > > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
>>> > >> To: Mickey Spiegel <mickeys.dev@gmail.com>
>>> > >> From: Russell Bryant
>>> > >> Sent by: "dev"
>>> > >> Date: 07/29/2016 10:02AM
>>> > >> Cc: ovs dev <dev@openvswitch.org>
>>> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
>>> > >>
>>> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
>>> mickeys.dev@gmail.com
>>> > >
>>> > >> wrote:
>>> > >>
>>> > >>>
>>> > >>> This patch adds a second logical switch ingress ACL stage, and
>>> > >>> correspondingly a second logical switch egress ACL stage.  This
>>> > >>> allows for more than one ACL-based feature to be applied in the
>>> > >>> ingress and egress logical switch pipelines.  The features
>>> > >>> driving the different ACL stages may be configured by different
>>> > >>> users, for example an application deployer managing security
>>> > >>> groups and a network or security admin configuring network ACLs
>>> > >>> or firewall rules.
>>> > >>>
>>> > >>> Each ACL stage is self contained.  The "action" for the
>>> > >>> highest-"priority" matching row in an ACL stage determines a
>>> > >>> packet's treatment.  A separate "action" will be determined in
>>> > >>> each ACL stage, according to the ACL rules configured for that
>>> > >>> ACL stage.  The "priority" values are only relevant within the
>>> > >>> context of an ACL stage.
>>> > >>>
>>> > >>> ACL rules that do not specify an ACL stage are applied to the
>>> > >>> default "acl" stage.
>>> > >>>
>>> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
>>> > >>
>>> > >>
>>> > >> Could you expand on why priorities in a single stage aren't enough
>>> to
>>> > >> satisfy the use case?
>>> > >
>>> > > If two features are configured independently with a mix of
>>> > > prioritized allow and drop rules, then with a single stage, a
>>> > > new set of ACL rules must be produced that achieves the same
>>> > > behavior.  This is sometimes referred to as an "ACL merge"
>>> > > algorithm, for example:
>>> > >
>>> >
>>> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>>> > >
>>> > > In the worst case, for example when the features act on different
>>> > > packet fields (e.g. one on IP address and another on L4 port),
>>> > > the number of rules required can approach
>>> > > (# of ACL1 rules) * (# of ACL2 rules).
>>> > >
>>> > > While it is possible to code up such an algorithm, it adds
>>> > > significant complexity and complicates whichever layer
>>> > > implements the merge algorithm, either OVN or the CMS above.
>>> > >
>>> > > By using multiple independent pipeline stages, all of this
>>> > > software complexity is avoided, achieving the proper result
>>> > > in a simple and straightforward manner.
>>> > >
>>> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
>>> > > stages, though they tend to evaluate these in parallel given
>>> > > all the emphasis on low latency these days.
>>> >
>>> > Throwing in an example to illustrate the difference between one
>>> > ACL stage and two ACL stages:
>>> >
>>> > If two separate ACL stages:
>>> > Feature 1
>>> > acl  from-lport  100 (tcp == 80) allow-related
>>> > acl  from-lport  100 (tcp == 8080) allow-related
>>> > acl  from-lport  100 (udp) allow-related
>>> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
>>> >
>>> > Feature 2
>>> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
>>> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
>>> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
>>> >
>>> > Combined in one stage, to get the equivalent behavior, this would
>>> require:
>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80) allow-related
>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080)
>>> allow-related
>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24
>>> &&
>>> > tcp) allow-related
>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80)
>>> allow-related
>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080)
>>> allow-related
>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src == 10.1.1.0/24
>>> &&
>>> > tcp) allow-related
>>> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80) allow-related
>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080)
>>> allow-related
>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24 &&
>>> > tcp) allow-related
>>> >
>>>
>>> Or have an address set, "addrset1", which contains {172.16.10.0/24,
>>> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
>>>
>>> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80,
>>> 8080})
>>> allow-related
>>> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
>>> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24 &&
>>> tcp) allow-related
>>>
>>>
>>> >
>>> > If there are more IP addresses in feature 2, then the number
>>> > of ACL rules will climb geometrically:
>>> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2 drop
>>> > rules)
>>> >
>>> > With 2 separate ACL stages, the rules just go straight into
>>> > the corresponding ACL table, no merge required:
>>> > (# feature 1 rules + # feature 2 rules)
>>> >
>>>
>>> Thanks for elaborating.  I'm not opposed.  It seems harmless if not being
>>> used.
>>>
>>
>>
>> There are presently no unit tests for ACLs in the system tests
>> (system-ovn.at).
>> The first step should be to add unit tests for single stage ACLs.
>> and then add a delta of tests if other stages are desired.
>>
>> It will be good to test the coordination between multiple stages
>> coming directly from northbound APIs and check what happens when
>> multistage ACLs are setup and torn down stage by stage, particularly
>> when the datapath ends up in a more permissive state for some period of
>> time.
>>
>
This feature proposal has a problem for both setup and teardown where
the staging will result in a more permissive state for periods of time.

Here is a simple example based on your example above:
If one only wants to allow TCP and src IP 20.20.20.20 and the stage with
TCP is
added first with the stage with src IP 20.20.20.20 lagging, one will have
the
following

200 TCP permit
100 DROP ALL

which permits all TCP - not what we want.

We cannot enforce a transaction across multiple databases (NB, SB,
ovn-controller)



>
>>
>>
>>>
>>> Can you update the docs to indicate the specific accepted values for
>>> "stage"?
>>
>>
>>
>> This would significantly complicate the usage of northbound ACL APIs,
>> since multi-staging would be exposed at the top (northbound) OVN layer.
>>
>
> The default behavior when "stage" is not specified is to apply the ACL to
> the
> existing "acl" stage. If you don't care about the second ACL stage,
> continue
> to use ACLs as you do today and it will work. There is no complication.
>

You need a set of guidelines.
You just cannot assume the northbound API usage will avoid this feature.
How does one know this feature should be avoided or when to use it.
Assuming one decides to use it, how does one know how to use it.



>
>
>> This would need a clear set of guidelines how northbound
>> multistage ACLs would be used by a CMS, at the user level.
>>
>
> The CMS typically does not expose ACLs directly to the user. For example,
> with OpenStack, Security Groups use the default "acl" stage. OpenStack
> FWaaS v2 would use the "acl2" stage. These are two separate OpenStack
> features with separate OpenStack northbound APIs to the user.
>


First of all, every OVN feature should not be tied to Openstack.



>
> Mickey
>
>
>>
>>> It currently sounds like you can use as many stages as you want
>>> to me.
>>>
>>> --
>>> Russell Bryant
>>> _______________________________________________
>>> dev mailing list
>>> dev@openvswitch.org
>>> http://openvswitch.org/mailman/listinfo/dev
>>>
>>
>>
>
Russell Bryant Aug. 2, 2016, 7:05 p.m. UTC | #10
On Tue, Aug 2, 2016 at 3:02 PM, Darrell Ball <dlu998@gmail.com> wrote:

>
>
> On Tue, Aug 2, 2016 at 10:23 AM, Mickey Spiegel <mickeys.dev@gmail.com>
> wrote:
>
>> On Tue, Aug 2, 2016 at 9:26 AM, Darrell Ball <dlu998@gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell@ovn.org> wrote:
>>>
>>>> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <mickeys.dev@gmail.com>
>>>> wrote:
>>>>
>>>> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <emspiege@us.ibm.com
>>>> >
>>>> > wrote:
>>>> > >
>>>> > > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
>>>> > >> To: Mickey Spiegel <mickeys.dev@gmail.com>
>>>> > >> From: Russell Bryant
>>>> > >> Sent by: "dev"
>>>> > >> Date: 07/29/2016 10:02AM
>>>> > >> Cc: ovs dev <dev@openvswitch.org>
>>>> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
>>>> > >>
>>>> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
>>>> mickeys.dev@gmail.com
>>>> > >
>>>> > >> wrote:
>>>> > >>
>>>> > >>>
>>>> > >>> This patch adds a second logical switch ingress ACL stage, and
>>>> > >>> correspondingly a second logical switch egress ACL stage.  This
>>>> > >>> allows for more than one ACL-based feature to be applied in the
>>>> > >>> ingress and egress logical switch pipelines.  The features
>>>> > >>> driving the different ACL stages may be configured by different
>>>> > >>> users, for example an application deployer managing security
>>>> > >>> groups and a network or security admin configuring network ACLs
>>>> > >>> or firewall rules.
>>>> > >>>
>>>> > >>> Each ACL stage is self contained.  The "action" for the
>>>> > >>> highest-"priority" matching row in an ACL stage determines a
>>>> > >>> packet's treatment.  A separate "action" will be determined in
>>>> > >>> each ACL stage, according to the ACL rules configured for that
>>>> > >>> ACL stage.  The "priority" values are only relevant within the
>>>> > >>> context of an ACL stage.
>>>> > >>>
>>>> > >>> ACL rules that do not specify an ACL stage are applied to the
>>>> > >>> default "acl" stage.
>>>> > >>>
>>>> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
>>>> > >>
>>>> > >>
>>>> > >> Could you expand on why priorities in a single stage aren't enough
>>>> to
>>>> > >> satisfy the use case?
>>>> > >
>>>> > > If two features are configured independently with a mix of
>>>> > > prioritized allow and drop rules, then with a single stage, a
>>>> > > new set of ACL rules must be produced that achieves the same
>>>> > > behavior.  This is sometimes referred to as an "ACL merge"
>>>> > > algorithm, for example:
>>>> > >
>>>> >
>>>> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>>>> > >
>>>> > > In the worst case, for example when the features act on different
>>>> > > packet fields (e.g. one on IP address and another on L4 port),
>>>> > > the number of rules required can approach
>>>> > > (# of ACL1 rules) * (# of ACL2 rules).
>>>> > >
>>>> > > While it is possible to code up such an algorithm, it adds
>>>> > > significant complexity and complicates whichever layer
>>>> > > implements the merge algorithm, either OVN or the CMS above.
>>>> > >
>>>> > > By using multiple independent pipeline stages, all of this
>>>> > > software complexity is avoided, achieving the proper result
>>>> > > in a simple and straightforward manner.
>>>> > >
>>>> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
>>>> > > stages, though they tend to evaluate these in parallel given
>>>> > > all the emphasis on low latency these days.
>>>> >
>>>> > Throwing in an example to illustrate the difference between one
>>>> > ACL stage and two ACL stages:
>>>> >
>>>> > If two separate ACL stages:
>>>> > Feature 1
>>>> > acl  from-lport  100 (tcp == 80) allow-related
>>>> > acl  from-lport  100 (tcp == 8080) allow-related
>>>> > acl  from-lport  100 (udp) allow-related
>>>> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>> >
>>>> > Feature 2
>>>> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
>>>> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
>>>> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
>>>> >
>>>> > Combined in one stage, to get the equivalent behavior, this would
>>>> require:
>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80)
>>>> allow-related
>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080)
>>>> allow-related
>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24
>>>> &&
>>>> > tcp) allow-related
>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80)
>>>> allow-related
>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080)
>>>> allow-related
>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src == 10.1.1.0/24
>>>> &&
>>>> > tcp) allow-related
>>>> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80) allow-related
>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080)
>>>> allow-related
>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24
>>>> &&
>>>> > tcp) allow-related
>>>> >
>>>>
>>>> Or have an address set, "addrset1", which contains {172.16.10.0/24,
>>>> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
>>>>
>>>> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80,
>>>> 8080})
>>>> allow-related
>>>> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
>>>> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24 &&
>>>> tcp) allow-related
>>>>
>>>>
>>>> >
>>>> > If there are more IP addresses in feature 2, then the number
>>>> > of ACL rules will climb geometrically:
>>>> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2
>>>> drop
>>>> > rules)
>>>> >
>>>> > With 2 separate ACL stages, the rules just go straight into
>>>> > the corresponding ACL table, no merge required:
>>>> > (# feature 1 rules + # feature 2 rules)
>>>> >
>>>>
>>>> Thanks for elaborating.  I'm not opposed.  It seems harmless if not
>>>> being
>>>> used.
>>>>
>>>
>>>
>>> There are presently no unit tests for ACLs in the system tests
>>> (system-ovn.at).
>>> The first step should be to add unit tests for single stage ACLs.
>>> and then add a delta of tests if other stages are desired.
>>>
>>> It will be good to test the coordination between multiple stages
>>> coming directly from northbound APIs and check what happens when
>>> multistage ACLs are setup and torn down stage by stage, particularly
>>> when the datapath ends up in a more permissive state for some period of
>>> time.
>>>
>>
> This feature proposal has a problem for both setup and teardown where
> the staging will result in a more permissive state for periods of time.
>
> Here is a simple example based on your example above:
> If one only wants to allow TCP and src IP 20.20.20.20 and the stage with
> TCP is
> added first with the stage with src IP 20.20.20.20 lagging, one will have
> the
> following
>
> 200 TCP permit
> 100 DROP ALL
>
> which permits all TCP - not what we want.
>
> We cannot enforce a transaction across multiple databases (NB, SB,
> ovn-controller)
>

I don't understand this.  Rules for both stages could be added in the same
transaction.  It's all in the same table of the northbound database.


>
>
>
>>
>>>
>>>
>>>>
>>>> Can you update the docs to indicate the specific accepted values for
>>>> "stage"?
>>>
>>>
>>>
>>> This would significantly complicate the usage of northbound ACL APIs,
>>> since multi-staging would be exposed at the top (northbound) OVN layer.
>>>
>>
>> The default behavior when "stage" is not specified is to apply the ACL to
>> the
>> existing "acl" stage. If you don't care about the second ACL stage,
>> continue
>> to use ACLs as you do today and it will work. There is no complication.
>>
>
> You need a set of guidelines.
> You just cannot assume the northbound API usage will avoid this feature.
> How does one know this feature should be avoided or when to use it.
> Assuming one decides to use it, how does one know how to use it.
>
>
>
>>
>>
>>> This would need a clear set of guidelines how northbound
>>> multistage ACLs would be used by a CMS, at the user level.
>>>
>>
>> The CMS typically does not expose ACLs directly to the user. For example,
>> with OpenStack, Security Groups use the default "acl" stage. OpenStack
>> FWaaS v2 would use the "acl2" stage. These are two separate OpenStack
>> features with separate OpenStack northbound APIs to the user.
>>
>
>
> First of all, every OVN feature should not be tied to Openstack.]
>

It was just used as an example of how it would be used ...
Gurucharan Shetty Aug. 2, 2016, 7:17 p.m. UTC | #11
On 2 August 2016 at 12:01, Russell Bryant <russell@ovn.org> wrote:

>
> On Tue, Aug 2, 2016 at 1:29 PM, Guru Shetty <guru@ovn.org> wrote:
>
>> The 2 ct_commit for deletion of firewall rules will likely be tricky. This
>> will need unit tests.
>>
>
> I don't think I understand the concern.  Can you expand a bit on what you
> mean by "2 ct_commit for deletion of firewall rules"?
>

My memory on how ct_commit(ct_label=1) works is a little hazy. There are 2
stages now. So whenever a firewall rule is deleted for an established
connection, the default ct_commit(ct_label=1) will get hit and the
connection is dropped. The same thing happens in the second stage for any
removed firewall rule. In the second stage when a firewall rule is deleted
ct_label is also set which will reflect in the first stage. Does not this
cause confusion with the logic?


>
>
> --
> Russell Bryant
>
Russell Bryant Aug. 2, 2016, 7:27 p.m. UTC | #12
On Tue, Aug 2, 2016 at 3:17 PM, Guru Shetty <guru@ovn.org> wrote:

>
>
> On 2 August 2016 at 12:01, Russell Bryant <russell@ovn.org> wrote:
>
>>
>> On Tue, Aug 2, 2016 at 1:29 PM, Guru Shetty <guru@ovn.org> wrote:
>>
>>> The 2 ct_commit for deletion of firewall rules will likely be tricky.
>>> This
>>> will need unit tests.
>>>
>>
>> I don't think I understand the concern.  Can you expand a bit on what you
>> mean by "2 ct_commit for deletion of firewall rules"?
>>
>
> My memory on how ct_commit(ct_label=1) works is a little hazy. There are 2
> stages now. So whenever a firewall rule is deleted for an established
> connection, the default ct_commit(ct_label=1) will get hit and the
> connection is dropped. The same thing happens in the second stage for any
> removed firewall rule. In the second stage when a firewall rule is deleted
> ct_label is also set which will reflect in the first stage. Does not this
> cause confusion with the logic?
>

Setting ct_label back to 0 only happens in the stateful table.  That
ct_commit will only occur if none of the ACL stages think the packet should
be dropped.  I think it's OK.
Gurucharan Shetty Aug. 2, 2016, 7:35 p.m. UTC | #13
On 2 August 2016 at 12:27, Russell Bryant <russell@ovn.org> wrote:

>
>
> On Tue, Aug 2, 2016 at 3:17 PM, Guru Shetty <guru@ovn.org> wrote:
>
>>
>>
>> On 2 August 2016 at 12:01, Russell Bryant <russell@ovn.org> wrote:
>>
>>>
>>> On Tue, Aug 2, 2016 at 1:29 PM, Guru Shetty <guru@ovn.org> wrote:
>>>
>>>> The 2 ct_commit for deletion of firewall rules will likely be tricky.
>>>> This
>>>> will need unit tests.
>>>>
>>>
>>> I don't think I understand the concern.  Can you expand a bit on what
>>> you mean by "2 ct_commit for deletion of firewall rules"?
>>>
>>
>> My memory on how ct_commit(ct_label=1) works is a little hazy. There are
>> 2 stages now. So whenever a firewall rule is deleted for an established
>> connection, the default ct_commit(ct_label=1) will get hit and the
>> connection is dropped. The same thing happens in the second stage for any
>> removed firewall rule. In the second stage when a firewall rule is deleted
>> ct_label is also set which will reflect in the first stage. Does not this
>> cause confusion with the logic?
>>
>
> Setting ct_label back to 0 only happens in the stateful table.  That
> ct_commit will only occur if none of the ACL stages think the packet should
> be dropped.  I think it's OK.
>

I see. I think we should still consider unit tests now. Userspace datapath
has ct_commit now (it still can't do NAT). That should ideally work. If
that does not work, we should consider adding tests to system-ovn.at




>
> --
> Russell Bryant
>
Russell Bryant Aug. 2, 2016, 8:14 p.m. UTC | #14
On Tue, Aug 2, 2016 at 3:35 PM, Guru Shetty <guru@ovn.org> wrote:

>
>
> On 2 August 2016 at 12:27, Russell Bryant <russell@ovn.org> wrote:
>
>>
>>
>> On Tue, Aug 2, 2016 at 3:17 PM, Guru Shetty <guru@ovn.org> wrote:
>>
>>>
>>>
>>> On 2 August 2016 at 12:01, Russell Bryant <russell@ovn.org> wrote:
>>>
>>>>
>>>> On Tue, Aug 2, 2016 at 1:29 PM, Guru Shetty <guru@ovn.org> wrote:
>>>>
>>>>> The 2 ct_commit for deletion of firewall rules will likely be tricky.
>>>>> This
>>>>> will need unit tests.
>>>>>
>>>>
>>>> I don't think I understand the concern.  Can you expand a bit on what
>>>> you mean by "2 ct_commit for deletion of firewall rules"?
>>>>
>>>
>>> My memory on how ct_commit(ct_label=1) works is a little hazy. There are
>>> 2 stages now. So whenever a firewall rule is deleted for an established
>>> connection, the default ct_commit(ct_label=1) will get hit and the
>>> connection is dropped. The same thing happens in the second stage for any
>>> removed firewall rule. In the second stage when a firewall rule is deleted
>>> ct_label is also set which will reflect in the first stage. Does not this
>>> cause confusion with the logic?
>>>
>>
>> Setting ct_label back to 0 only happens in the stateful table.  That
>> ct_commit will only occur if none of the ACL stages think the packet should
>> be dropped.  I think it's OK.
>>
>
> I see. I think we should still consider unit tests now. Userspace datapath
> has ct_commit now (it still can't do NAT). That should ideally work. If
> that does not work, we should consider adding tests to system-ovn.at
>

Yes, I agree that this area is sorely lacking in test coverage.
Darrell Ball Aug. 2, 2016, 8:39 p.m. UTC | #15
On Tue, Aug 2, 2016 at 12:05 PM, Russell Bryant <russell@ovn.org> wrote:

>
>
> On Tue, Aug 2, 2016 at 3:02 PM, Darrell Ball <dlu998@gmail.com> wrote:
>
>>
>>
>> On Tue, Aug 2, 2016 at 10:23 AM, Mickey Spiegel <mickeys.dev@gmail.com>
>> wrote:
>>
>>> On Tue, Aug 2, 2016 at 9:26 AM, Darrell Ball <dlu998@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell@ovn.org> wrote:
>>>>
>>>>> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <mickeys.dev@gmail.com
>>>>> >
>>>>> wrote:
>>>>>
>>>>> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <
>>>>> emspiege@us.ibm.com>
>>>>> > wrote:
>>>>> > >
>>>>> > > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
>>>>> > >> To: Mickey Spiegel <mickeys.dev@gmail.com>
>>>>> > >> From: Russell Bryant
>>>>> > >> Sent by: "dev"
>>>>> > >> Date: 07/29/2016 10:02AM
>>>>> > >> Cc: ovs dev <dev@openvswitch.org>
>>>>> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
>>>>> > >>
>>>>> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
>>>>> mickeys.dev@gmail.com
>>>>> > >
>>>>> > >> wrote:
>>>>> > >>
>>>>> > >>>
>>>>> > >>> This patch adds a second logical switch ingress ACL stage, and
>>>>> > >>> correspondingly a second logical switch egress ACL stage.  This
>>>>> > >>> allows for more than one ACL-based feature to be applied in the
>>>>> > >>> ingress and egress logical switch pipelines.  The features
>>>>> > >>> driving the different ACL stages may be configured by different
>>>>> > >>> users, for example an application deployer managing security
>>>>> > >>> groups and a network or security admin configuring network ACLs
>>>>> > >>> or firewall rules.
>>>>> > >>>
>>>>> > >>> Each ACL stage is self contained.  The "action" for the
>>>>> > >>> highest-"priority" matching row in an ACL stage determines a
>>>>> > >>> packet's treatment.  A separate "action" will be determined in
>>>>> > >>> each ACL stage, according to the ACL rules configured for that
>>>>> > >>> ACL stage.  The "priority" values are only relevant within the
>>>>> > >>> context of an ACL stage.
>>>>> > >>>
>>>>> > >>> ACL rules that do not specify an ACL stage are applied to the
>>>>> > >>> default "acl" stage.
>>>>> > >>>
>>>>> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
>>>>> > >>
>>>>> > >>
>>>>> > >> Could you expand on why priorities in a single stage aren't
>>>>> enough to
>>>>> > >> satisfy the use case?
>>>>> > >
>>>>> > > If two features are configured independently with a mix of
>>>>> > > prioritized allow and drop rules, then with a single stage, a
>>>>> > > new set of ACL rules must be produced that achieves the same
>>>>> > > behavior.  This is sometimes referred to as an "ACL merge"
>>>>> > > algorithm, for example:
>>>>> > >
>>>>> >
>>>>> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>>>>> > >
>>>>> > > In the worst case, for example when the features act on different
>>>>> > > packet fields (e.g. one on IP address and another on L4 port),
>>>>> > > the number of rules required can approach
>>>>> > > (# of ACL1 rules) * (# of ACL2 rules).
>>>>> > >
>>>>> > > While it is possible to code up such an algorithm, it adds
>>>>> > > significant complexity and complicates whichever layer
>>>>> > > implements the merge algorithm, either OVN or the CMS above.
>>>>> > >
>>>>> > > By using multiple independent pipeline stages, all of this
>>>>> > > software complexity is avoided, achieving the proper result
>>>>> > > in a simple and straightforward manner.
>>>>> > >
>>>>> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
>>>>> > > stages, though they tend to evaluate these in parallel given
>>>>> > > all the emphasis on low latency these days.
>>>>> >
>>>>> > Throwing in an example to illustrate the difference between one
>>>>> > ACL stage and two ACL stages:
>>>>> >
>>>>> > If two separate ACL stages:
>>>>> > Feature 1
>>>>> > acl  from-lport  100 (tcp == 80) allow-related
>>>>> > acl  from-lport  100 (tcp == 8080) allow-related
>>>>> > acl  from-lport  100 (udp) allow-related
>>>>> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>>> >
>>>>> > Feature 2
>>>>> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
>>>>> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
>>>>> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>>> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>>> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
>>>>> >
>>>>> > Combined in one stage, to get the equivalent behavior, this would
>>>>> require:
>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80)
>>>>> allow-related
>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080)
>>>>> allow-related
>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src == 10.1.1.0/24
>>>>> &&
>>>>> > tcp) allow-related
>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80)
>>>>> allow-related
>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080)
>>>>> allow-related
>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src ==
>>>>> 10.1.1.0/24 &&
>>>>> > tcp) allow-related
>>>>> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>>> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80)
>>>>> allow-related
>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080)
>>>>> allow-related
>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24
>>>>> &&
>>>>> > tcp) allow-related
>>>>> >
>>>>>
>>>>> Or have an address set, "addrset1", which contains {172.16.10.0/24,
>>>>> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
>>>>>
>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80,
>>>>> 8080})
>>>>> allow-related
>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24
>>>>> &&
>>>>> tcp) allow-related
>>>>>
>>>>>
>>>>> >
>>>>> > If there are more IP addresses in feature 2, then the number
>>>>> > of ACL rules will climb geometrically:
>>>>> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2
>>>>> drop
>>>>> > rules)
>>>>> >
>>>>> > With 2 separate ACL stages, the rules just go straight into
>>>>> > the corresponding ACL table, no merge required:
>>>>> > (# feature 1 rules + # feature 2 rules)
>>>>> >
>>>>>
>>>>> Thanks for elaborating.  I'm not opposed.  It seems harmless if not
>>>>> being
>>>>> used.
>>>>>
>>>>
>>>>
>>>> There are presently no unit tests for ACLs in the system tests
>>>> (system-ovn.at).
>>>> The first step should be to add unit tests for single stage ACLs.
>>>> and then add a delta of tests if other stages are desired.
>>>>
>>>> It will be good to test the coordination between multiple stages
>>>> coming directly from northbound APIs and check what happens when
>>>> multistage ACLs are setup and torn down stage by stage, particularly
>>>> when the datapath ends up in a more permissive state for some period of
>>>> time.
>>>>
>>>
>> This feature proposal has a problem for both setup and teardown where
>> the staging will result in a more permissive state for periods of time.
>>
>> Here is a simple example based on your example above:
>> If one only wants to allow TCP and src IP 20.20.20.20 and the stage with
>> TCP is
>> added first with the stage with src IP 20.20.20.20 lagging, one will have
>> the
>> following
>>
>> 200 TCP permit
>> 100 DROP ALL
>>
>> which permits all TCP - not what we want.
>>
>> We cannot enforce a transaction across multiple databases (NB, SB,
>> ovn-controller)
>>
>
> I don't understand this.  Rules for both stages could be added in the same
> transaction.  It's all in the same table of the northbound database.
>
>

I am assuming that the rules would be entered into the Northbound database
in the same
transaction. That part is fine.

However, there is no enforcement of a transaction across multiple databases
in
OVN. So there is no requirement that northd and ovn-controller maintain
that NB DB transaction
across different tables which generating their respective output (i.e. SB
DB and openflow rules).





>
>>
>>
>>>
>>>>
>>>>
>>>>>
>>>>> Can you update the docs to indicate the specific accepted values for
>>>>> "stage"?
>>>>
>>>>
>>>>
>>>> This would significantly complicate the usage of northbound ACL APIs,
>>>> since multi-staging would be exposed at the top (northbound) OVN layer.
>>>>
>>>
>>> The default behavior when "stage" is not specified is to apply the ACL
>>> to the
>>> existing "acl" stage. If you don't care about the second ACL stage,
>>> continue
>>> to use ACLs as you do today and it will work. There is no complication.
>>>
>>
>> You need a set of guidelines.
>> You just cannot assume the northbound API usage will avoid this feature.
>> How does one know this feature should be avoided or when to use it.
>> Assuming one decides to use it, how does one know how to use it.
>>
>>
>>
>>>
>>>
>>>> This would need a clear set of guidelines how northbound
>>>> multistage ACLs would be used by a CMS, at the user level.
>>>>
>>>
>>> The CMS typically does not expose ACLs directly to the user. For example,
>>> with OpenStack, Security Groups use the default "acl" stage. OpenStack
>>> FWaaS v2 would use the "acl2" stage. These are two separate OpenStack
>>> features with separate OpenStack northbound APIs to the user.
>>>
>>
>>
>> First of all, every OVN feature should not be tied to Openstack.]
>>
>
> It was just used as an example of how it would be used ...
>
> --
> Russell Bryant
>
Mickey Spiegel Aug. 2, 2016, 9:38 p.m. UTC | #16
On Tue, Aug 2, 2016 at 1:39 PM, Darrell Ball <dlu998@gmail.com> wrote:

>
>
> On Tue, Aug 2, 2016 at 12:05 PM, Russell Bryant <russell@ovn.org> wrote:
>
>>
>>
>> On Tue, Aug 2, 2016 at 3:02 PM, Darrell Ball <dlu998@gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, Aug 2, 2016 at 10:23 AM, Mickey Spiegel <mickeys.dev@gmail.com>
>>> wrote:
>>>
>>>> On Tue, Aug 2, 2016 at 9:26 AM, Darrell Ball <dlu998@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell@ovn.org>
>>>>> wrote:
>>>>>
>>>>>> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <
>>>>>> mickeys.dev@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <
>>>>>> emspiege@us.ibm.com>
>>>>>> > wrote:
>>>>>> > >
>>>>>> > > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
>>>>>> > >> To: Mickey Spiegel <mickeys.dev@gmail.com>
>>>>>> > >> From: Russell Bryant
>>>>>> > >> Sent by: "dev"
>>>>>> > >> Date: 07/29/2016 10:02AM
>>>>>> > >> Cc: ovs dev <dev@openvswitch.org>
>>>>>> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
>>>>>> > >>
>>>>>> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
>>>>>> mickeys.dev@gmail.com
>>>>>> > >
>>>>>> > >> wrote:
>>>>>> > >>
>>>>>> > >>>
>>>>>> > >>> This patch adds a second logical switch ingress ACL stage, and
>>>>>> > >>> correspondingly a second logical switch egress ACL stage.  This
>>>>>> > >>> allows for more than one ACL-based feature to be applied in the
>>>>>> > >>> ingress and egress logical switch pipelines.  The features
>>>>>> > >>> driving the different ACL stages may be configured by different
>>>>>> > >>> users, for example an application deployer managing security
>>>>>> > >>> groups and a network or security admin configuring network ACLs
>>>>>> > >>> or firewall rules.
>>>>>> > >>>
>>>>>> > >>> Each ACL stage is self contained.  The "action" for the
>>>>>> > >>> highest-"priority" matching row in an ACL stage determines a
>>>>>> > >>> packet's treatment.  A separate "action" will be determined in
>>>>>> > >>> each ACL stage, according to the ACL rules configured for that
>>>>>> > >>> ACL stage.  The "priority" values are only relevant within the
>>>>>> > >>> context of an ACL stage.
>>>>>> > >>>
>>>>>> > >>> ACL rules that do not specify an ACL stage are applied to the
>>>>>> > >>> default "acl" stage.
>>>>>> > >>>
>>>>>> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
>>>>>> > >>
>>>>>> > >>
>>>>>> > >> Could you expand on why priorities in a single stage aren't
>>>>>> enough to
>>>>>> > >> satisfy the use case?
>>>>>> > >
>>>>>> > > If two features are configured independently with a mix of
>>>>>> > > prioritized allow and drop rules, then with a single stage, a
>>>>>> > > new set of ACL rules must be produced that achieves the same
>>>>>> > > behavior.  This is sometimes referred to as an "ACL merge"
>>>>>> > > algorithm, for example:
>>>>>> > >
>>>>>> >
>>>>>> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>>>>>> > >
>>>>>> > > In the worst case, for example when the features act on different
>>>>>> > > packet fields (e.g. one on IP address and another on L4 port),
>>>>>> > > the number of rules required can approach
>>>>>> > > (# of ACL1 rules) * (# of ACL2 rules).
>>>>>> > >
>>>>>> > > While it is possible to code up such an algorithm, it adds
>>>>>> > > significant complexity and complicates whichever layer
>>>>>> > > implements the merge algorithm, either OVN or the CMS above.
>>>>>> > >
>>>>>> > > By using multiple independent pipeline stages, all of this
>>>>>> > > software complexity is avoided, achieving the proper result
>>>>>> > > in a simple and straightforward manner.
>>>>>> > >
>>>>>> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
>>>>>> > > stages, though they tend to evaluate these in parallel given
>>>>>> > > all the emphasis on low latency these days.
>>>>>> >
>>>>>> > Throwing in an example to illustrate the difference between one
>>>>>> > ACL stage and two ACL stages:
>>>>>> >
>>>>>> > If two separate ACL stages:
>>>>>> > Feature 1
>>>>>> > acl  from-lport  100 (tcp == 80) allow-related
>>>>>> > acl  from-lport  100 (tcp == 8080) allow-related
>>>>>> > acl  from-lport  100 (udp) allow-related
>>>>>> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>>>> >
>>>>>> > Feature 2
>>>>>> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
>>>>>> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
>>>>>> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>>>> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>>>> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
>>>>>> >
>>>>>> > Combined in one stage, to get the equivalent behavior, this would
>>>>>> require:
>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80)
>>>>>> allow-related
>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080)
>>>>>> allow-related
>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src ==
>>>>>> 10.1.1.0/24 &&
>>>>>> > tcp) allow-related
>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80)
>>>>>> allow-related
>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080)
>>>>>> allow-related
>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src ==
>>>>>> 10.1.1.0/24 &&
>>>>>> > tcp) allow-related
>>>>>> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>>>> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80)
>>>>>> allow-related
>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080)
>>>>>> allow-related
>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src == 10.1.1.0/24
>>>>>> &&
>>>>>> > tcp) allow-related
>>>>>> >
>>>>>>
>>>>>> Or have an address set, "addrset1", which contains {172.16.10.0/24,
>>>>>> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
>>>>>>
>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80,
>>>>>> 8080})
>>>>>> allow-related
>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24
>>>>>> &&
>>>>>> tcp) allow-related
>>>>>>
>>>>>>
>>>>>> >
>>>>>> > If there are more IP addresses in feature 2, then the number
>>>>>> > of ACL rules will climb geometrically:
>>>>>> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2
>>>>>> drop
>>>>>> > rules)
>>>>>> >
>>>>>> > With 2 separate ACL stages, the rules just go straight into
>>>>>> > the corresponding ACL table, no merge required:
>>>>>> > (# feature 1 rules + # feature 2 rules)
>>>>>> >
>>>>>>
>>>>>> Thanks for elaborating.  I'm not opposed.  It seems harmless if not
>>>>>> being
>>>>>> used.
>>>>>>
>>>>>
>>>>>
>>>>> There are presently no unit tests for ACLs in the system tests
>>>>> (system-ovn.at).
>>>>> The first step should be to add unit tests for single stage ACLs.
>>>>> and then add a delta of tests if other stages are desired.
>>>>>
>>>>> It will be good to test the coordination between multiple stages
>>>>> coming directly from northbound APIs and check what happens when
>>>>> multistage ACLs are setup and torn down stage by stage, particularly
>>>>> when the datapath ends up in a more permissive state for some period
>>>>> of time.
>>>>>
>>>>
>>> This feature proposal has a problem for both setup and teardown where
>>> the staging will result in a more permissive state for periods of time.
>>>
>>> Here is a simple example based on your example above:
>>> If one only wants to allow TCP and src IP 20.20.20.20 and the stage with
>>> TCP is
>>> added first with the stage with src IP 20.20.20.20 lagging, one will
>>> have the
>>> following
>>>
>>> 200 TCP permit
>>> 100 DROP ALL
>>>
>>> which permits all TCP - not what we want.
>>>
>>> We cannot enforce a transaction across multiple databases (NB, SB,
>>> ovn-controller)
>>>
>>
That is not how this is meant to be used. I used one stage for IP addresses
and
one stage for L4 port as a worst case example of expansion due to ACL merge.
That is not the motivation for using two stages. The motivation is for two
different
features that are configured separately, with one example being OpenStack
Security Groups versus OpenStack FWaaS v2, another example being Security
Groups versus Network ACLs as in a rather popular public cloud.

If you have correlated intent, with TCP and src IP 20.20.20.20 belonging
together,
they should absolutely be put together in one rule in one common ACL stage.

I don't understand this.  Rules for both stages could be added in the same
>> transaction.  It's all in the same table of the northbound database.
>>
>>
>
> I am assuming that the rules would be entered into the Northbound database
> in the same
> transaction. That part is fine.
>
> However, there is no enforcement of a transaction across multiple
> databases in
> OVN. So there is no requirement that northd and ovn-controller maintain
> that NB DB transaction
> across different tables which generating their respective output (i.e. SB
> DB and openflow rules).
>
>
>
>
>
>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Can you update the docs to indicate the specific accepted values for
>>>>>> "stage"?
>>>>>
>>>>>
>>>>>
>>>>> This would significantly complicate the usage of northbound ACL APIs,
>>>>> since multi-staging would be exposed at the top (northbound) OVN layer.
>>>>>
>>>>
>>>> The default behavior when "stage" is not specified is to apply the ACL
>>>> to the
>>>> existing "acl" stage. If you don't care about the second ACL stage,
>>>> continue
>>>> to use ACLs as you do today and it will work. There is no complication.
>>>>
>>>
>>> You need a set of guidelines.
>>> You just cannot assume the northbound API usage will avoid this feature.
>>> How does one know this feature should be avoided or when to use it.
>>> Assuming one decides to use it, how does one know how to use it.
>>>
>>
If you are exposing the OVN northbound API directly, then you have two
options:
1. Hide stage, and just program everything in the default "acl" stage.
2. Expose stage and try to explain how it works.

Hardware switches have had multiple ACL tables for many many years.
As far as I can remember, they are always used for different features that
are configured separately:
Security based on VLANs
Security based on L3 interface
QoS
Service Function Chaining
Control Plane Protection

This would need a clear set of guidelines how northbound
>>>>> multistage ACLs would be used by a CMS, at the user level.
>>>>>
>>>>
>>>> The CMS typically does not expose ACLs directly to the user. For
>>>> example,
>>>> with OpenStack, Security Groups use the default "acl" stage. OpenStack
>>>> FWaaS v2 would use the "acl2" stage. These are two separate OpenStack
>>>> features with separate OpenStack northbound APIs to the user.
>>>>
>>>
>>>
>>> First of all, every OVN feature should not be tied to Openstack.]
>>>
>>
>> It was just used as an example of how it would be used ...
>>
>
As I said above and Russell reiterated, OpenStack FWaaS is just one example.
That is why I went with the generic stage names of "acl" and "acl2" rather
than something like "fw".

Mickey


> --
>> Russell Bryant
>>
>
>
Darrell Ball Aug. 2, 2016, 10:38 p.m. UTC | #17
On Tue, Aug 2, 2016 at 2:38 PM, Mickey Spiegel <mickeys.dev@gmail.com>
wrote:

> On Tue, Aug 2, 2016 at 1:39 PM, Darrell Ball <dlu998@gmail.com> wrote:
>
>>
>>
>> On Tue, Aug 2, 2016 at 12:05 PM, Russell Bryant <russell@ovn.org> wrote:
>>
>>>
>>>
>>> On Tue, Aug 2, 2016 at 3:02 PM, Darrell Ball <dlu998@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Aug 2, 2016 at 10:23 AM, Mickey Spiegel <mickeys.dev@gmail.com>
>>>> wrote:
>>>>
>>>>> On Tue, Aug 2, 2016 at 9:26 AM, Darrell Ball <dlu998@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 2, 2016 at 4:52 AM, Russell Bryant <russell@ovn.org>
>>>>>> wrote:
>>>>>>
>>>>>>> On Sat, Jul 30, 2016 at 4:19 PM, Mickey Spiegel <
>>>>>>> mickeys.dev@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> > On Fri, Jul 29, 2016 at 10:28 AM, Mickey Spiegel <
>>>>>>> emspiege@us.ibm.com>
>>>>>>> > wrote:
>>>>>>> > >
>>>>>>> > > -----"dev" <dev-bounces@openvswitch.org> wrote: -----
>>>>>>> > >> To: Mickey Spiegel <mickeys.dev@gmail.com>
>>>>>>> > >> From: Russell Bryant
>>>>>>> > >> Sent by: "dev"
>>>>>>> > >> Date: 07/29/2016 10:02AM
>>>>>>> > >> Cc: ovs dev <dev@openvswitch.org>
>>>>>>> > >> Subject: Re: [ovs-dev] [PATCH] ovn: Add second ACL stage
>>>>>>> > >>
>>>>>>> > >> On Fri, Jul 29, 2016 at 12:47 AM, Mickey Spiegel <
>>>>>>> mickeys.dev@gmail.com
>>>>>>> > >
>>>>>>> > >> wrote:
>>>>>>> > >>
>>>>>>> > >>>
>>>>>>> > >>> This patch adds a second logical switch ingress ACL stage, and
>>>>>>> > >>> correspondingly a second logical switch egress ACL stage.  This
>>>>>>> > >>> allows for more than one ACL-based feature to be applied in the
>>>>>>> > >>> ingress and egress logical switch pipelines.  The features
>>>>>>> > >>> driving the different ACL stages may be configured by different
>>>>>>> > >>> users, for example an application deployer managing security
>>>>>>> > >>> groups and a network or security admin configuring network ACLs
>>>>>>> > >>> or firewall rules.
>>>>>>> > >>>
>>>>>>> > >>> Each ACL stage is self contained.  The "action" for the
>>>>>>> > >>> highest-"priority" matching row in an ACL stage determines a
>>>>>>> > >>> packet's treatment.  A separate "action" will be determined in
>>>>>>> > >>> each ACL stage, according to the ACL rules configured for that
>>>>>>> > >>> ACL stage.  The "priority" values are only relevant within the
>>>>>>> > >>> context of an ACL stage.
>>>>>>> > >>>
>>>>>>> > >>> ACL rules that do not specify an ACL stage are applied to the
>>>>>>> > >>> default "acl" stage.
>>>>>>> > >>>
>>>>>>> > >>> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
>>>>>>> > >>
>>>>>>> > >>
>>>>>>> > >> Could you expand on why priorities in a single stage aren't
>>>>>>> enough to
>>>>>>> > >> satisfy the use case?
>>>>>>> > >
>>>>>>> > > If two features are configured independently with a mix of
>>>>>>> > > prioritized allow and drop rules, then with a single stage, a
>>>>>>> > > new set of ACL rules must be produced that achieves the same
>>>>>>> > > behavior.  This is sometimes referred to as an "ACL merge"
>>>>>>> > > algorithm, for example:
>>>>>>> > >
>>>>>>> >
>>>>>>> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>>>>>>> > >
>>>>>>> > > In the worst case, for example when the features act on different
>>>>>>> > > packet fields (e.g. one on IP address and another on L4 port),
>>>>>>> > > the number of rules required can approach
>>>>>>> > > (# of ACL1 rules) * (# of ACL2 rules).
>>>>>>> > >
>>>>>>> > > While it is possible to code up such an algorithm, it adds
>>>>>>> > > significant complexity and complicates whichever layer
>>>>>>> > > implements the merge algorithm, either OVN or the CMS above.
>>>>>>> > >
>>>>>>> > > By using multiple independent pipeline stages, all of this
>>>>>>> > > software complexity is avoided, achieving the proper result
>>>>>>> > > in a simple and straightforward manner.
>>>>>>> > >
>>>>>>> > > Recent network hardware ASICs tend to have around 8 or 10 ACL
>>>>>>> > > stages, though they tend to evaluate these in parallel given
>>>>>>> > > all the emphasis on low latency these days.
>>>>>>> >
>>>>>>> > Throwing in an example to illustrate the difference between one
>>>>>>> > ACL stage and two ACL stages:
>>>>>>> >
>>>>>>> > If two separate ACL stages:
>>>>>>> > Feature 1
>>>>>>> > acl  from-lport  100 (tcp == 80) allow-related
>>>>>>> > acl  from-lport  100 (tcp == 8080) allow-related
>>>>>>> > acl  from-lport  100 (udp) allow-related
>>>>>>> > acl  from-lport  100 (ip4.src == 10.1.1.0/24 && tcp) allow-related
>>>>>>> >
>>>>>>> > Feature 2
>>>>>>> > acl2 from-lport  300 (ip4.dst == 172.16.10.0/24) allow-related
>>>>>>> > acl2 from-lport  300 (ip4.dst == 192.168.20.0/24) allow-related
>>>>>>> > acl2 from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>>>>> > acl2 from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>>>>> > acl2 from-lport  100 (ip4.dst == 172.16.0.0/16) allow-related
>>>>>>> >
>>>>>>> > Combined in one stage, to get the equivalent behavior, this would
>>>>>>> require:
>>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 80)
>>>>>>> allow-related
>>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && tcp == 8080)
>>>>>>> allow-related
>>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && udp) allow-related
>>>>>>> > from-lport  300 (ip4.dst == 172.16.10.0/24 && ip4.src ==
>>>>>>> 10.1.1.0/24 &&
>>>>>>> > tcp) allow-related
>>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 80)
>>>>>>> allow-related
>>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && tcp == 8080)
>>>>>>> allow-related
>>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && udp) allow-related
>>>>>>> > from-lport  300 (ip4.dst == 192.168.20.0/24 && ip4.src ==
>>>>>>> 10.1.1.0/24 &&
>>>>>>> > tcp) allow-related
>>>>>>> > from-lport  200 (ip4.dst == 172.16.0.0/20) drop
>>>>>>> > from-lport  200 (ip4.dst == 192.168.0.0/16) drop
>>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 80)
>>>>>>> allow-related
>>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && tcp == 8080)
>>>>>>> allow-related
>>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && udp) allow-related
>>>>>>> > from-lport  100 (ip4.dst == 172.16.0.0/16 && ip4.src ==
>>>>>>> 10.1.1.0/24 &&
>>>>>>> > tcp) allow-related
>>>>>>> >
>>>>>>>
>>>>>>> Or have an address set, "addrset1", which contains {172.16.10.0/24,
>>>>>>> 192.168.20.0/24, 172.16.0.0/20, 192.168.0.0/16, 172.16.0.0/16}.
>>>>>>>
>>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && tcp && tcp.dst == {80,
>>>>>>> 8080})
>>>>>>> allow-related
>>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && udp) allow-related
>>>>>>> acl  from-lport  100 (ip4.dst == $addrset1 && ip4.src == 10.1.1.0/24
>>>>>>> &&
>>>>>>> tcp) allow-related
>>>>>>>
>>>>>>>
>>>>>>> >
>>>>>>> > If there are more IP addresses in feature 2, then the number
>>>>>>> > of ACL rules will climb geometrically:
>>>>>>> > (4 feature 1 rules * # feature 2 allow-related rules + # feature 2
>>>>>>> drop
>>>>>>> > rules)
>>>>>>> >
>>>>>>> > With 2 separate ACL stages, the rules just go straight into
>>>>>>> > the corresponding ACL table, no merge required:
>>>>>>> > (# feature 1 rules + # feature 2 rules)
>>>>>>> >
>>>>>>>
>>>>>>> Thanks for elaborating.  I'm not opposed.  It seems harmless if not
>>>>>>> being
>>>>>>> used.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> There are presently no unit tests for ACLs in the system tests
>>>>>> (system-ovn.at).
>>>>>> The first step should be to add unit tests for single stage ACLs.
>>>>>> and then add a delta of tests if other stages are desired.
>>>>>>
>>>>>> It will be good to test the coordination between multiple stages
>>>>>> coming directly from northbound APIs and check what happens when
>>>>>> multistage ACLs are setup and torn down stage by stage, particularly
>>>>>> when the datapath ends up in a more permissive state for some period
>>>>>> of time.
>>>>>>
>>>>>
>>>> This feature proposal has a problem for both setup and teardown where
>>>> the staging will result in a more permissive state for periods of time.
>>>>
>>>> Here is a simple example based on your example above:
>>>> If one only wants to allow TCP and src IP 20.20.20.20 and the stage
>>>> with TCP is
>>>> added first with the stage with src IP 20.20.20.20 lagging, one will
>>>> have the
>>>> following
>>>>
>>>> 200 TCP permit
>>>> 100 DROP ALL
>>>>
>>>> which permits all TCP - not what we want.
>>>>
>>>> We cannot enforce a transaction across multiple databases (NB, SB,
>>>> ovn-controller)
>>>>
>>>
> That is not how this is meant to be used. I used one stage for IP
> addresses and
> one stage for L4 port as a worst case example of expansion due to ACL
> merge.
> That is not the motivation for using two stages. The motivation is for two
> different
> features that are configured separately, with one example being OpenStack
> Security Groups versus OpenStack FWaaS v2, another example being Security
> Groups versus Network ACLs as in a rather popular public cloud.
>
> If you have correlated intent, with TCP and src IP 20.20.20.20 belonging
> together,
> they should absolutely be put together in one rule in one common ACL stage.
>

Good - then this is part of the "OVN documentation" aspect I mentioned.
A CMS, such as Openstack or something else would need to know the details
of what is the recommended usage of NB APIs. Openstack is the user/client
in this case.



>
> I don't understand this.  Rules for both stages could be added in the same
>>> transaction.  It's all in the same table of the northbound database.
>>>
>>>
>>
>> I am assuming that the rules would be entered into the Northbound
>> database in the same
>> transaction. That part is fine.
>>
>> However, there is no enforcement of a transaction across multiple
>> databases in
>> OVN. So there is no requirement that northd and ovn-controller maintain
>> that NB DB transaction
>> across different tables which generating their respective output (i.e. SB
>> DB and openflow rules).
>>
>>
>>
>>
>>
>>>
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Can you update the docs to indicate the specific accepted values for
>>>>>>> "stage"?
>>>>>>
>>>>>>
>>>>>>
>>>>>> This would significantly complicate the usage of northbound ACL APIs,
>>>>>> since multi-staging would be exposed at the top (northbound) OVN
>>>>>> layer.
>>>>>>
>>>>>
>>>>> The default behavior when "stage" is not specified is to apply the ACL
>>>>> to the
>>>>> existing "acl" stage. If you don't care about the second ACL stage,
>>>>> continue
>>>>> to use ACLs as you do today and it will work. There is no complication.
>>>>>
>>>>
>>>> You need a set of guidelines.
>>>> You just cannot assume the northbound API usage will avoid this feature.
>>>> How does one know this feature should be avoided or when to use it.
>>>> Assuming one decides to use it, how does one know how to use it.
>>>>
>>>
> If you are exposing the OVN northbound API directly, then you have two
> options:
> 1. Hide stage, and just program everything in the default "acl" stage.
> 2. Expose stage and try to explain how it works.
>

I see ACL stages defined in the NB schema in this proposed patch.
So, this patch exposes stages to OVN clients, of which Openstack is just one
such possible client or "user of OVN".

Some tests along with use case documentation/recommedations is needed.



>
> Hardware switches have had multiple ACL tables for many many years.
> As far as I can remember, they are always used for different features that
> are configured separately:
> Security based on VLANs
> Security based on L3 interface
> QoS
> Service Function Chaining
> Control Plane Protection
>


There are multiple match/action capabilities in HW of different types.
Hardware switches and routers typically don't expose implementation details,
such as number of ACL stages, type of hardware resource etc at the
northbound
interface.




>
> This would need a clear set of guidelines how northbound
>>>>>> multistage ACLs would be used by a CMS, at the user level.
>>>>>>
>>>>>
>>>>> The CMS typically does not expose ACLs directly to the user. For
>>>>> example,
>>>>> with OpenStack, Security Groups use the default "acl" stage. OpenStack
>>>>> FWaaS v2 would use the "acl2" stage. These are two separate OpenStack
>>>>> features with separate OpenStack northbound APIs to the user.
>>>>>
>>>>
>>>>
>>>> First of all, every OVN feature should not be tied to Openstack.]
>>>>
>>>
>>> It was just used as an example of how it would be used ...
>>>
>>
> As I said above and Russell reiterated, OpenStack FWaaS is just one
> example.
> That is why I went with the generic stage names of "acl" and "acl2" rather
> than something like "fw".
>


Alright, perhaps some tests and use case documentation will be useful here.



>
> Mickey
>
>
>> --
>>> Russell Bryant
>>>
>>
>>
>
Ben Pfaff Aug. 14, 2016, 5:02 a.m. UTC | #18
On Fri, Jul 29, 2016 at 05:28:26PM +0000, Mickey Spiegel wrote:
> Could you expand on why priorities in a single stage aren't enough to
> satisfy the use case?
> 
> <Mickey>
> If two features are configured independently with a mix of
> prioritized allow and drop rules, then with a single stage, a
> new set of ACL rules must be produced that achieves the same
> behavior.  This is sometimes referred to as an "ACL merge"
> algorithm, for example:
> http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
> 
> In the worst case, for example when the features act on different
> packet fields (e.g. one on IP address and another on L4 port),
> the number of rules required can approach
> (# of ACL1 rules) * (# of ACL2 rules).
> 
> While it is possible to code up such an algorithm, it adds
> significant complexity and complicates whichever layer
> implements the merge algorithm, either OVN or the CMS above.
> 
> By using multiple independent pipeline stages, all of this
> software complexity is avoided, achieving the proper result
> in a simple and straightforward manner.
> 
> Recent network hardware ASICs tend to have around 8 or 10 ACL
> stages, though they tend to evaluate these in parallel given
> all the emphasis on low latency these days.

I guess that, in software, if there's a need for 2 of something, there's
usually a need for N of it, so I'd tend to prefer that instead of
hard-coding 2 stages of ACLs, we make N of them available (for perhaps N
== 8), especially given that you say hardware tends to work that way.
It's not really more expensive for OVS, and definitely not if only a few
of them are used.  We might need to expand the number of logical tables,
since currently there are only 16 ingress tables and 16 egress tables,
but doubling them to 32 each wouldn't be a big deal.
Mickey Spiegel Aug. 14, 2016, 10:21 p.m. UTC | #19
On Sat, Aug 13, 2016 at 10:02 PM, Ben Pfaff <blp@ovn.org> wrote:

> On Fri, Jul 29, 2016 at 05:28:26PM +0000, Mickey Spiegel wrote:
> > Could you expand on why priorities in a single stage aren't enough to
> > satisfy the use case?
> >
> > <Mickey>
> > If two features are configured independently with a mix of
> > prioritized allow and drop rules, then with a single stage, a
> > new set of ACL rules must be produced that achieves the same
> > behavior.  This is sometimes referred to as an "ACL merge"
> > algorithm, for example:
> > http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_
> paper09186a00800c9470.shtml#wp39514
> >
> > In the worst case, for example when the features act on different
> > packet fields (e.g. one on IP address and another on L4 port),
> > the number of rules required can approach
> > (# of ACL1 rules) * (# of ACL2 rules).
> >
> > While it is possible to code up such an algorithm, it adds
> > significant complexity and complicates whichever layer
> > implements the merge algorithm, either OVN or the CMS above.
> >
> > By using multiple independent pipeline stages, all of this
> > software complexity is avoided, achieving the proper result
> > in a simple and straightforward manner.
> >
> > Recent network hardware ASICs tend to have around 8 or 10 ACL
> > stages, though they tend to evaluate these in parallel given
> > all the emphasis on low latency these days.
>
> I guess that, in software, if there's a need for 2 of something, there's
> usually a need for N of it, so I'd tend to prefer that instead of
> hard-coding 2 stages of ACLs, we make N of them available (for perhaps N
> == 8), especially given that you say hardware tends to work that way.
> It's not really more expensive for OVS, and definitely not if only a few
> of them are used.  We might need to expand the number of logical tables,
> since currently there are only 16 ingress tables and 16 egress tables,
> but doubling them to 32 each wouldn't be a big deal.
>

I did try to code the core part of the changes so that more ACL stages
could be easily added in the future, but the code having to do with
definition of the pipeline stages, associated functions, and nbctl is only
coded for 2 stages at the moment. Let me think about the best way to
generalize this.

As far as need and usage, I guess the key question is whether features
such as service function chaining and QoS marking will use generic ACL
stages, or pipeline stages specifically coded for those features?
In hardware switches, those type of features use many of the multiple
ACL stages.

The way I coded the patch, the fixed rules allowing and dropping
certain flows regardless of user-defined ACL rules are duplicated in
each ACL stage. However, I am not sure if those rules are necessary
or make sense if the actions for that pipeline stage are redirect (for SFC)
or QoS marking, rather than allow and drop. I need to think about it.

I have moved on to other things temporarily, will come back to this patch
if/when I have time to work on ACL tests, or if someone else adds ACL
tests.

Mickey


> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
diff mbox

Patch

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 3047b76..2e1d314 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -97,12 +97,13 @@  enum ovn_stage {
     PIPELINE_STAGE(SWITCH, IN,  PRE_LB,         4, "ls_in_pre_lb")         \
     PIPELINE_STAGE(SWITCH, IN,  PRE_STATEFUL,   5, "ls_in_pre_stateful")    \
     PIPELINE_STAGE(SWITCH, IN,  ACL,            6, "ls_in_acl")          \
-    PIPELINE_STAGE(SWITCH, IN,  LB,             7, "ls_in_lb")           \
-    PIPELINE_STAGE(SWITCH, IN,  STATEFUL,       8, "ls_in_stateful")     \
-    PIPELINE_STAGE(SWITCH, IN,  ARP_ND_RSP,     9, "ls_in_arp_rsp")      \
-    PIPELINE_STAGE(SWITCH, IN,  DHCP_OPTIONS,   10, "ls_in_dhcp_options") \
-    PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE,  11, "ls_in_dhcp_response") \
-    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,        12, "ls_in_l2_lkup")      \
+    PIPELINE_STAGE(SWITCH, IN,  ACL2,           7, "ls_in_acl2")          \
+    PIPELINE_STAGE(SWITCH, IN,  LB,             8, "ls_in_lb")           \
+    PIPELINE_STAGE(SWITCH, IN,  STATEFUL,       9, "ls_in_stateful")     \
+    PIPELINE_STAGE(SWITCH, IN,  ARP_ND_RSP,     10, "ls_in_arp_rsp")      \
+    PIPELINE_STAGE(SWITCH, IN,  DHCP_OPTIONS,   11, "ls_in_dhcp_options") \
+    PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE,  12, "ls_in_dhcp_response") \
+    PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,        13, "ls_in_l2_lkup")      \
                                                                       \
     /* Logical switch egress stages. */                               \
     PIPELINE_STAGE(SWITCH, OUT, PRE_LB,       0, "ls_out_pre_lb")     \
@@ -110,9 +111,10 @@  enum ovn_stage {
     PIPELINE_STAGE(SWITCH, OUT, PRE_STATEFUL, 2, "ls_out_pre_stateful")  \
     PIPELINE_STAGE(SWITCH, OUT, LB,           3, "ls_out_lb")            \
     PIPELINE_STAGE(SWITCH, OUT, ACL,          4, "ls_out_acl")            \
-    PIPELINE_STAGE(SWITCH, OUT, STATEFUL,     5, "ls_out_stateful")       \
-    PIPELINE_STAGE(SWITCH, OUT, PORT_SEC_IP,  6, "ls_out_port_sec_ip")    \
-    PIPELINE_STAGE(SWITCH, OUT, PORT_SEC_L2,  7, "ls_out_port_sec_l2")    \
+    PIPELINE_STAGE(SWITCH, OUT, ACL2,         5, "ls_out_acl2")            \
+    PIPELINE_STAGE(SWITCH, OUT, STATEFUL,     6, "ls_out_stateful")       \
+    PIPELINE_STAGE(SWITCH, OUT, PORT_SEC_IP,  7, "ls_out_port_sec_ip")    \
+    PIPELINE_STAGE(SWITCH, OUT, PORT_SEC_L2,  8, "ls_out_port_sec_l2")    \
                                                                       \
     /* Logical router ingress stages. */                              \
     PIPELINE_STAGE(ROUTER, IN,  ADMISSION,   0, "lr_in_admission")    \
@@ -193,6 +195,48 @@  ovn_stage_to_datapath_type(enum ovn_stage stage)
     default: OVS_NOT_REACHED();
     }
 }
+
+static enum ovn_stage
+ovn_stage_from_acl_stage(const char *acl_stage, enum ovn_pipeline pipeline) {
+    if ((pipeline == P_IN) && !acl_stage) {
+        return S_SWITCH_IN_ACL;
+    } else if ((pipeline == P_OUT) && !acl_stage) {
+        return S_SWITCH_OUT_ACL;
+    } else if ((pipeline == P_IN) && !strcmp(acl_stage, "acl")) {
+        return S_SWITCH_IN_ACL;
+    } else if ((pipeline == P_OUT) && !strcmp(acl_stage, "acl")) {
+        return S_SWITCH_OUT_ACL;
+    } else if ((pipeline == P_IN) && !strcmp(acl_stage, "acl2")) {
+        return S_SWITCH_IN_ACL2;
+    } else if ((pipeline == P_OUT) && !strcmp(acl_stage, "acl2")) {
+        return S_SWITCH_OUT_ACL2;
+    } else if (pipeline == P_IN) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+        VLOG_WARN_RL(&rl, "Bad configuration: invalid ACL stage %s",
+                     acl_stage);
+        return S_SWITCH_IN_ACL;
+    } else {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+        VLOG_WARN_RL(&rl, "Bad configuration: invalid ACL stage %s",
+                     acl_stage);
+        return S_SWITCH_OUT_ACL;
+    }
+}
+
+enum acl_stage_enum {
+    acl,
+    acl2
+};
+
+static const char *
+acl_stage_str_from_enum(enum acl_stage_enum stage)
+{
+    switch (stage) {
+        case acl: return "acl";
+        case acl2: return "acl2";
+        default: return "<unknown>";
+    }
+}
 
 static void
 usage(void)
@@ -1645,105 +1689,149 @@  build_acls(struct ovn_datapath *od, struct hmap *lflows)
 {
     bool has_stateful = has_stateful_acl(od);
 
-    /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by
-     * default.  A related rule at priority 1 is added below if there
-     * are any stateful ACLs in this datapath. */
-    ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, 0, "1", "next;");
-    ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, 0, "1", "next;");
+    for (int i = acl; i <= acl2; i++) {
+        const char *acl_stage = acl_stage_str_from_enum(i);
+        enum ovn_stage in_stage = ovn_stage_from_acl_stage(acl_stage, P_IN);
+        enum ovn_stage out_stage = ovn_stage_from_acl_stage(acl_stage, P_OUT);
 
-    if (has_stateful) {
-        /* Ingress and Egress ACL Table (Priority 1).
-         *
-         * By default, traffic is allowed.  This is partially handled by
-         * the Priority 0 ACL flows added earlier, but we also need to
-         * commit IP flows.  This is because, while the initiater's
-         * direction may not have any stateful rules, the server's may
-         * and then its return traffic would not have an associated
-         * conntrack entry and would return "+invalid".
-         *
-         * We use "ct_commit" for a connection that is not already known
-         * by the connection tracker.  Once a connection is committed,
-         * subsequent packets will hit the flow at priority 0 that just
-         * uses "next;"
-         *
-         * We also check for established connections that have ct_label[0]
-         * set on them.  That's a connection that was disallowed, but is
-         * now allowed by policy again since it hit this default-allow flow.
-         * We need to set ct_label[0]=0 to let the connection continue,
-         * which will be done by ct_commit() in the "stateful" stage.
-         * Subsequent packets will hit the flow at priority 0 that just
-         * uses "next;". */
-        ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, 1,
-                      "ip && (!ct.est || (ct.est && ct_label[0] == 1))",
-                       REGBIT_CONNTRACK_COMMIT" = 1; next;");
-        ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, 1,
-                      "ip && (!ct.est || (ct.est && ct_label[0] == 1))",
-                       REGBIT_CONNTRACK_COMMIT" = 1; next;");
-
-        /* Ingress and Egress ACL Table (Priority 65535).
-         *
-         * Always drop traffic that's in an invalid state.  Also drop
-         * reply direction packets for connections that have been marked
-         * for deletion (bit 0 of ct_label is set).
-         *
-         * This is enforced at a higher priority than ACLs can be defined. */
-        ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
-                      "ct.inv || (ct.est && ct.rpl && ct_label[0] == 1)",
-                      "drop;");
-        ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
-                      "ct.inv || (ct.est && ct.rpl && ct_label[0] == 1)",
-                      "drop;");
+        /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by
+         * default.  A related rule at priority 1 is added below if there
+         * are any stateful ACLs in this datapath. */
+        ovn_lflow_add(lflows, od, in_stage, 0, "1", "next;");
+        ovn_lflow_add(lflows, od, out_stage, 0, "1", "next;");
 
-        /* Ingress and Egress ACL Table (Priority 65535).
-         *
-         * Allow reply traffic that is part of an established
-         * conntrack entry that has not been marked for deletion
-         * (bit 0 of ct_label).  We only match traffic in the
-         * reply direction because we want traffic in the request
-         * direction to hit the currently defined policy from ACLs.
-         *
-         * This is enforced at a higher priority than ACLs can be defined. */
-        ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
-                      "ct.est && !ct.rel && !ct.new && !ct.inv "
-                      "&& ct.rpl && ct_label[0] == 0",
-                      "next;");
-        ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
-                      "ct.est && !ct.rel && !ct.new && !ct.inv "
-                      "&& ct.rpl && ct_label[0] == 0",
-                      "next;");
-
-        /* Ingress and Egress ACL Table (Priority 65535).
-         *
-         * Allow traffic that is related to an existing conntrack entry that
-         * has not been marked for deletion (bit 0 of ct_label).
-         *
-         * This is enforced at a higher priority than ACLs can be defined.
-         *
-         * NOTE: This does not support related data sessions (eg,
-         * a dynamically negotiated FTP data channel), but will allow
-         * related traffic such as an ICMP Port Unreachable through
-         * that's generated from a non-listening UDP port.  */
-        ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
-                      "!ct.est && ct.rel && !ct.new && !ct.inv "
-                      "&& ct_label[0] == 0",
-                      "next;");
-        ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
-                      "!ct.est && ct.rel && !ct.new && !ct.inv "
-                      "&& ct_label[0] == 0",
-                      "next;");
-
-        /* Ingress and Egress ACL Table (Priority 65535).
-         *
-         * Not to do conntrack on ND packets. */
-        ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, "nd", "next;");
-        ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, "nd", "next;");
+        if (has_stateful) {
+            /* Ingress and Egress ACL Table (Priority 1).
+             *
+             * By default, traffic is allowed.  This is partially handled by
+             * the Priority 0 ACL flows added earlier, but we also need to
+             * commit IP flows.  This is because, while the initiater's
+             * direction may not have any stateful rules, the server's may
+             * and then its return traffic would not have an associated
+             * conntrack entry and would return "+invalid".
+             *
+             * We use "ct_commit" for a connection that is not already known
+             * by the connection tracker.  Once a connection is committed,
+             * subsequent packets will hit the flow at priority 0 that just
+             * uses "next;"
+             *
+             * We also check for established connections that have ct_label[0]
+             * set on them.  That's a connection that was disallowed, but is
+             * now allowed by policy again since it hit this default-allow
+             * flow.  We need to set ct_label[0]=0 to let the connection
+             * continue, which will be done by ct_commit() in the "stateful"
+             * stage.  Subsequent packets will hit the flow at priority 0 that
+             * just uses "next;". */
+            ovn_lflow_add(lflows, od, in_stage, 1,
+                          "ip && (!ct.est || (ct.est && ct_label[0] == 1))",
+                           REGBIT_CONNTRACK_COMMIT" = 1; next;");
+            ovn_lflow_add(lflows, od, out_stage, 1,
+                          "ip && (!ct.est || (ct.est && ct_label[0] == 1))",
+                           REGBIT_CONNTRACK_COMMIT" = 1; next;");
+
+            /* Ingress and Egress ACL Table (Priority 65535).
+             *
+             * Always drop traffic that's in an invalid state.  Also drop
+             * reply direction packets for connections that have been marked
+             * for deletion (bit 0 of ct_label is set).
+             *
+             * This is enforced at a higher priority than ACLs can be
+             * defined. */
+            ovn_lflow_add(lflows, od, in_stage, UINT16_MAX,
+                          "ct.inv || (ct.est && ct.rpl && ct_label[0] == 1)",
+                          "drop;");
+            ovn_lflow_add(lflows, od, out_stage, UINT16_MAX,
+                          "ct.inv || (ct.est && ct.rpl && ct_label[0] == 1)",
+                          "drop;");
+
+            /* Ingress and Egress ACL Table (Priority 65535).
+             *
+             * Allow reply traffic that is part of an established
+             * conntrack entry that has not been marked for deletion
+             * (bit 0 of ct_label).  We only match traffic in the
+             * reply direction because we want traffic in the request
+             * direction to hit the currently defined policy from ACLs.
+             *
+             * This is enforced at a higher priority than ACLs can be
+             * defined. */
+            ovn_lflow_add(lflows, od, in_stage, UINT16_MAX,
+                          "ct.est && !ct.rel && !ct.new && !ct.inv "
+                          "&& ct.rpl && ct_label[0] == 0",
+                          "next;");
+            ovn_lflow_add(lflows, od, out_stage, UINT16_MAX,
+                          "ct.est && !ct.rel && !ct.new && !ct.inv "
+                          "&& ct.rpl && ct_label[0] == 0",
+                          "next;");
+
+            /* Ingress and Egress ACL Table (Priority 65535).
+             *
+             * Allow traffic that is related to an existing conntrack entry
+             * that has not been marked for deletion (bit 0 of ct_label).
+             *
+             * This is enforced at a higher priority than ACLs can be defined.
+             *
+             * NOTE: This does not support related data sessions (eg,
+             * a dynamically negotiated FTP data channel), but will allow
+             * related traffic such as an ICMP Port Unreachable through
+             * that's generated from a non-listening UDP port.  */
+            ovn_lflow_add(lflows, od, in_stage, UINT16_MAX,
+                          "!ct.est && ct.rel && !ct.new && !ct.inv "
+                          "&& ct_label[0] == 0",
+                          "next;");
+            ovn_lflow_add(lflows, od, out_stage, UINT16_MAX,
+                          "!ct.est && ct.rel && !ct.new && !ct.inv "
+                          "&& ct_label[0] == 0",
+                          "next;");
+
+            /* Ingress and Egress ACL Table (Priority 65535).
+             *
+             * Not to do conntrack on ND packets. */
+            ovn_lflow_add(lflows, od, in_stage, UINT16_MAX, "nd", "next;");
+            ovn_lflow_add(lflows, od, out_stage, UINT16_MAX, "nd", "next;");
+        }
+
+        /* Add 34000 priority flow to allow DHCP reply from ovn-controller to
+         * all logical ports of the datapath if the CMS has configured DHCPv4
+         * options*/
+        if (od->nbs && od->nbs->n_ports) {
+            for (size_t i = 0; i < od->nbs->n_ports; i++) {
+                if (od->nbs->ports[i]->dhcpv4_options) {
+                    const char *server_id = smap_get(
+                        &od->nbs->ports[i]->dhcpv4_options->options,
+                        "server_id");
+                    const char *server_mac = smap_get(
+                        &od->nbs->ports[i]->dhcpv4_options->options,
+                        "server_mac");
+                    const char *lease_time = smap_get(
+                        &od->nbs->ports[i]->dhcpv4_options->options,
+                        "lease_time");
+                    const char *router = smap_get(
+                        &od->nbs->ports[i]->dhcpv4_options->options, "router");
+                    if (server_id && server_mac && lease_time && router) {
+                        struct ds match = DS_EMPTY_INITIALIZER;
+                        const char *actions =
+                            has_stateful ? "ct_commit; next;" : "next;";
+                        ds_put_format(&match,
+                                      "outport == \"%s\" && eth.src == %s "
+                                      "&& ip4.src == %s && udp "
+                                      "&& udp.src == 67 && udp.dst == 68",
+                                      od->nbs->ports[i]->name,
+                                      server_mac, server_id);
+                        ovn_lflow_add(
+                            lflows, od, out_stage, 34000, ds_cstr(&match),
+                            actions);
+                    }
+                }
+            }
+        }
     }
 
     /* Ingress or Egress ACL Table (Various priorities). */
     for (size_t i = 0; i < od->nbs->n_acls; i++) {
         struct nbrec_acl *acl = od->nbs->acls[i];
-        bool ingress = !strcmp(acl->direction, "from-lport") ? true :false;
-        enum ovn_stage stage = ingress ? S_SWITCH_IN_ACL : S_SWITCH_OUT_ACL;
+        enum ovn_pipeline pipeline
+            = !strcmp(acl->direction, "from-lport") ? P_IN : P_OUT;
+        enum ovn_stage stage = ovn_stage_from_acl_stage(acl->stage, pipeline);
 
         if (!strcmp(acl->action, "allow")
             || !strcmp(acl->action, "allow-related")) {
@@ -1849,35 +1937,6 @@  build_acls(struct ovn_datapath *od, struct hmap *lflows)
             }
         }
     }
-
-    /* Add 34000 priority flow to allow DHCP reply from ovn-controller to all
-     * logical ports of the datapath if the CMS has configured DHCPv4 options*/
-    if (od->nbs && od->nbs->n_ports) {
-        for (size_t i = 0; i < od->nbs->n_ports; i++) {
-            if (od->nbs->ports[i]->dhcpv4_options) {
-                const char *server_id = smap_get(
-                    &od->nbs->ports[i]->dhcpv4_options->options, "server_id");
-                const char *server_mac = smap_get(
-                    &od->nbs->ports[i]->dhcpv4_options->options, "server_mac");
-                const char *lease_time = smap_get(
-                    &od->nbs->ports[i]->dhcpv4_options->options, "lease_time");
-                const char *router = smap_get(
-                    &od->nbs->ports[i]->dhcpv4_options->options, "router");
-                if (server_id && server_mac && lease_time && router) {
-                    struct ds match = DS_EMPTY_INITIALIZER;
-                    const char *actions =
-                        has_stateful ? "ct_commit; next;" : "next;";
-                    ds_put_format(&match, "outport == \"%s\" && eth.src == %s "
-                                  "&& ip4.src == %s && udp && udp.src == 67 "
-                                  "&& udp.dst == 68", od->nbs->ports[i]->name,
-                                  server_mac, server_id);
-                    ovn_lflow_add(
-                        lflows, od, S_SWITCH_OUT_ACL, 34000, ds_cstr(&match),
-                        actions);
-                }
-            }
-        }
-    }
 }
 
 static void
diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index a5dc669..4de531e 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Northbound",
-    "version": "5.2.0",
-    "cksum": "650844440 8727",
+    "version": "5.3.0",
+    "cksum": "3860252638 8923",
     "tables": {
         "NB_Global": {
             "columns": {
@@ -102,6 +102,9 @@ 
                 "match": {"type": "string"},
                 "action": {"type": {"key": {"type": "string",
                                             "enum": ["set", ["allow", "allow-related", "drop", "reject"]]}}},
+                "stage": {"type": {"key": {"type": "string",
+                                           "enum": ["set", ["acl", "acl2"]]},
+                                   "min": 0, "max": 1}},
                 "log": {"type": "boolean"},
                 "external_ids": {
                     "type": {"key": "string", "value": "string",
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 4b61bbc..574b589 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -735,6 +735,31 @@ 
       </ul>
     </column>
 
+    <column name="stage">
+      <p>
+        OVN supports the use of more than one ACL stage.  This allows for more
+        than one ACL-based feature to be applied in the ingress and egress
+        logical switch pipelines.  The features driving the different ACL
+        stages may be configured by different users, for example an application
+        deployer managing security groups and a network or security admin
+        configuring network ACLs or firewall rules.
+      </p>
+
+      <p>
+        Each ACL stage is self contained.  The <ref column="action"/> column
+        for the highest-<ref column="priority"/> matching row in an ACL stage
+        determines a packet's treatment.  A separate <ref column="action"/>
+        will be determined in each ACL stage, according to the ACL rules
+        configured for that ACL stage.  The <ref column="priority"/> values
+        are only relevant within the context of an ACL stage.
+      </p>
+
+      <p>
+        ACL rules that do not specify an ACL stage are applied to the default
+        <code>acl</code> stage. 
+      </p>
+    </column>
+
     <column name="log">
       <p>
         If set to <code>true</code>, packets that match the ACL will trigger a
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index 947e3a1..20baf06 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -334,9 +334,9 @@  Logical switch commands:\n\
   ls-list                   print the names of all logical switches\n\
 \n\
 ACL commands:\n\
-  acl-add SWITCH DIRECTION PRIORITY MATCH ACTION [log]\n\
+  acl-add SWITCH DIRECTION PRIORITY MATCH ACTION [STAGE] [log]\n\
                             add an ACL to SWITCH\n\
-  acl-del SWITCH [DIRECTION [PRIORITY MATCH]]\n\
+  acl-del SWITCH [DIRECTION [PRIORITY MATCH [STAGE]]]\n\
                             remove ACLs from SWITCH\n\
   acl-list SWITCH           print ACLs for SWITCH\n\
 \n\
@@ -1150,7 +1150,14 @@  acl_cmp(const void *acl1_, const void *acl2_)
     int dir1 = dir_encode(acl1->direction);
     int dir2 = dir_encode(acl2->direction);
 
-    if (dir1 != dir2) {
+    /* Normalize stage==NULL and stage=="acl". */
+    const char *stage1 = (acl1->stage) ? acl1->stage : "acl";
+    const char *stage2 = (acl2->stage) ? acl2->stage : "acl";
+
+    const int stage_cmp = strcmp(stage1, stage2);
+    if (stage_cmp) {
+        return stage_cmp;
+    } else if (dir1 != dir2) {
         return dir1 < dir2 ? -1 : 1;
     } else if (acl1->priority != acl2->priority) {
         return acl1->priority > acl2->priority ? -1 : 1;
@@ -1177,7 +1184,8 @@  nbctl_acl_list(struct ctl_context *ctx)
 
     for (i = 0; i < ls->n_acls; i++) {
         const struct nbrec_acl *acl = acls[i];
-        ds_put_format(&ctx->output, "%10s %5"PRId64" (%s) %s%s\n",
+        ds_put_format(&ctx->output, "%-4.4s %10s %5"PRId64" (%s) %s%s\n",
+                      acl->stage ? acl->stage : "",
                       acl->direction, acl->priority,
                       acl->match, acl->action, acl->log ? " log" : "");
     }
@@ -1229,12 +1237,22 @@  nbctl_acl_add(struct ctl_context *ctx)
         return;
     }
 
+    /* Validate stage. */
+    const char *stage = ctx->argc == 7 ? ctx->argv[6] : NULL;
+    if (stage && strcmp(stage, "acl") && strcmp(stage, "acl2")) {
+        ctl_fatal("%s: stage must be one of \"acl\" and \"acl2\"", stage);
+        return;
+    }
+
     /* Create the acl. */
     struct nbrec_acl *acl = nbrec_acl_insert(ctx->txn);
     nbrec_acl_set_priority(acl, priority);
     nbrec_acl_set_direction(acl, direction);
     nbrec_acl_set_match(acl, ctx->argv[4]);
     nbrec_acl_set_action(acl, action);
+    if (stage) {
+        nbrec_acl_set_stage(acl, stage);
+    }
     if (shash_find(&ctx->options, "--log") != NULL) {
         nbrec_acl_set_log(acl, true);
     }
@@ -1254,7 +1272,7 @@  nbctl_acl_del(struct ctl_context *ctx)
     const struct nbrec_logical_switch *ls;
     ls = ls_by_name_or_uuid(ctx, ctx->argv[1], true);
 
-    if (ctx->argc != 2 && ctx->argc != 3 && ctx->argc != 5) {
+    if (ctx->argc != 2 && ctx->argc != 3 && ctx->argc != 5 && ctx->argc != 6) {
         ctl_fatal("cannot specify priority without match");
     }
 
@@ -1293,7 +1311,8 @@  nbctl_acl_del(struct ctl_context *ctx)
         struct nbrec_acl *acl = ls->acls[i];
 
         if (priority == acl->priority && !strcmp(ctx->argv[4], acl->match) &&
-             !strcmp(direction, acl->direction)) {
+             !strcmp(direction, acl->direction) && (ctx->argc == 5 ||
+             !strcmp(ctx->argv[5], acl->stage))) {
             struct nbrec_acl **new_acls
                 = xmemdup(ls->acls, sizeof *new_acls * ls->n_acls);
             new_acls[i] = ls->acls[ls->n_acls - 1];
@@ -2403,9 +2422,9 @@  static const struct ctl_command_syntax nbctl_commands[] = {
     { "ls-list", 0, 0, "", NULL, nbctl_ls_list, NULL, "", RO },
 
     /* acl commands. */
-    { "acl-add", 5, 5, "SWITCH DIRECTION PRIORITY MATCH ACTION", NULL,
+    { "acl-add", 5, 6, "SWITCH DIRECTION PRIORITY MATCH ACTION [STAGE]", NULL,
       nbctl_acl_add, NULL, "--log", RW },
-    { "acl-del", 1, 4, "SWITCH [DIRECTION [PRIORITY MATCH]]", NULL,
+    { "acl-del", 1, 5, "SWITCH [DIRECTION [PRIORITY MATCH [STAGE]]]", NULL,
       nbctl_acl_del, NULL, "", RW },
     { "acl-list", 1, 1, "SWITCH", NULL, nbctl_acl_list, NULL, "", RO },
 
diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
index 5357ced..13d55e7 100644
--- a/tests/ovn-nbctl.at
+++ b/tests/ovn-nbctl.at
@@ -196,26 +196,32 @@  OVN_NBCTL_TEST_START
 AT_CHECK([ovn-nbctl ls-add ls0])
 AT_CHECK([ovn-nbctl --log acl-add ls0 from-lport 600 udp drop])
 AT_CHECK([ovn-nbctl --log acl-add ls0 to-lport 500 udp drop])
-AT_CHECK([ovn-nbctl acl-add ls0 from-lport 400 tcp drop])
+AT_CHECK([ovn-nbctl acl-add ls0 from-lport 400 tcp drop acl])
 AT_CHECK([ovn-nbctl acl-add ls0 to-lport 300 tcp drop])
+AT_CHECK([ovn-nbctl acl-add ls0 from-lport 300 'ip4.dst == 10.0.0.0/24' allow acl2])
 AT_CHECK([ovn-nbctl acl-add ls0 from-lport 200 ip drop])
 AT_CHECK([ovn-nbctl acl-add ls0 to-lport 100 ip drop])
+AT_CHECK([ovn-nbctl acl-add ls0 from-lport 0 ip drop acl2])
 
 AT_CHECK([ovn-nbctl acl-list ls0], [0], [dnl
-from-lport   600 (udp) drop log
-from-lport   400 (tcp) drop
-from-lport   200 (ip) drop
-  to-lport   500 (udp) drop log
-  to-lport   300 (tcp) drop
-  to-lport   100 (ip) drop
+     from-lport   600 (udp) drop log
+acl  from-lport   400 (tcp) drop
+     from-lport   200 (ip) drop
+       to-lport   500 (udp) drop log
+       to-lport   300 (tcp) drop
+       to-lport   100 (ip) drop
+acl2 from-lport   300 (ip4.dst == 10.0.0.0/24) allow
+acl2 from-lport     0 (ip) drop
 ])
 
 dnl Delete in one direction.
 AT_CHECK([ovn-nbctl acl-del ls0 to-lport])
 AT_CHECK([ovn-nbctl acl-list ls0], [0], [dnl
-from-lport   600 (udp) drop log
-from-lport   400 (tcp) drop
-from-lport   200 (ip) drop
+     from-lport   600 (udp) drop log
+acl  from-lport   400 (tcp) drop
+     from-lport   200 (ip) drop
+acl2 from-lport   300 (ip4.dst == 10.0.0.0/24) allow
+acl2 from-lport     0 (ip) drop
 ])
 
 dnl Delete all ACLs.
@@ -230,8 +236,8 @@  AT_CHECK([ovn-nbctl acl-add ls0 from-lport 200 ip drop])
 dnl Delete a single flow.
 AT_CHECK([ovn-nbctl acl-del ls0 from-lport 400 tcp])
 AT_CHECK([ovn-nbctl acl-list ls0], [0], [dnl
-from-lport   600 (udp) drop
-from-lport   200 (ip) drop
+     from-lport   600 (udp) drop
+     from-lport   200 (ip) drop
 ])
 
 OVN_NBCTL_TEST_STOP