diff mbox

[ovs-dev] dpif: execute ct together with recirc when slow path

Message ID e99796ef-ecfc-44e5-aca4-465f935aa525@DESKTOP-LBPO59B.local
State Changes Requested
Headers show

Commit Message

Wei Li June 14, 2017, 12:04 p.m. UTC
ct(table=X,zone=X) will xlate 2 actions (ct and recirc), so they should be executed together.
---
 lib/dpif.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

Comments

Wei Li June 14, 2017, 12:12 p.m. UTC | #1
that is for this problem
[ovs-dev] [ovs-discuss] ovn: unsnat handling error for Distributed Gateway
https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/330536.html

Previous patch
[PATCH] ovn-northd: Add logical flows to reply ICMP echo requests for 
all the other router ports connected to one switch

pls recheck, thks

在 2017/6/14 20:04, wei 写道:
> ct(table=X,zone=X) will xlate 2 actions (ct and recirc), so they should be executed together.
> ---
>   lib/dpif.c | 18 ++++++++++++++++--
>   1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/lib/dpif.c b/lib/dpif.c
> index fe6a986c5..72274731a 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -1109,6 +1109,7 @@ struct dpif_execute_helper_aux {
>       const struct flow *flow;
>       int error;
>       const struct nlattr *meter_action; /* Non-NULL, if have a meter action. */
> +    const struct nlattr *ct_action; /* stort ct action when it need to be execute together with recirc action. */
>   };
>   
>   /* This is called for actions that need the context of the datapath to be
> @@ -1132,6 +1133,13 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
>   	break;
>   
>       case OVS_ACTION_ATTR_CT:
> +		if (nl_attr_type(nl_attr_next(action)) == OVS_ACTION_ATTR_RECIRC) {
> +		    /* OVS_ACTION_ATTR_CT and the following OVS_ACTION_ATTR_RECIRC should execute together.
> +		     * store first and execute in the next loop which process OVS_ACTION_ATTR_RECIRC). */
> +			aux->ct_action = action;
> +			break;
> +		}
> +
>       case OVS_ACTION_ATTR_OUTPUT:
>       case OVS_ACTION_ATTR_TUNNEL_PUSH:
>       case OVS_ACTION_ATTR_TUNNEL_POP:
> @@ -1142,7 +1150,7 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
>           uint64_t stub[256 / 8];
>           struct pkt_metadata *md = &packet->md;
>   
> -        if (flow_tnl_dst_is_set(&md->tunnel) || aux->meter_action) {
> +        if (flow_tnl_dst_is_set(&md->tunnel) || aux->meter_action || aux->ct_action) {
>               ofpbuf_use_stub(&execute_actions, stub, sizeof stub);
>   
>               if (aux->meter_action) {
> @@ -1171,6 +1179,12 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
>               if (md->tunnel.ip_dst) {
>                   odp_put_tunnel_action(&md->tunnel, &execute_actions);
>               }
> +
> +			if (aux->ct_action) {
> +				/* current action definitely is OVS_ACTION_ATTR_RECIRC. */
> +				ofpbuf_put(&execute_actions, aux->ct_action, NLA_ALIGN(aux->ct_action->nla_len));
> +				aux->ct_action = NULL;
> +			}
>               ofpbuf_put(&execute_actions, action, NLA_ALIGN(action->nla_len));
>   
>               execute.actions = execute_actions.data;
> @@ -1238,7 +1252,7 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
>   static int
>   dpif_execute_with_help(struct dpif *dpif, struct dpif_execute *execute)
>   {
> -    struct dpif_execute_helper_aux aux = {dpif, execute->flow, 0, NULL};
> +    struct dpif_execute_helper_aux aux = {dpif, execute->flow, 0, NULL, NULL};
>       struct dp_packet_batch pb;
>   
>       COVERAGE_INC(dpif_execute_with_help);
Joe Stringer June 14, 2017, 6:43 p.m. UTC | #2
On 14 June 2017 at 05:04, wei <liw@dtdream.com> wrote:
> ct(table=X,zone=X) will xlate 2 actions (ct and recirc), so they should be executed together.

Hi, thanks for the contribution.

There's a few things in this patch that do not match standard OVS
contribution guidelines. Using utilities/checkpatch.py, you can pick
up on most of these and address them before submission. Would you mind
fixing these up? The following website goes into more detail about
contributing to OVS:

http://docs.openvswitch.org/en/latest/internals/contributing/submitting-patches/

At minimum, all patches need to provide signoff before they can be
applied to OVS. The use of whitespace in this patch also does not
match the usual OVS style.

Regarding the patch details, it seems to follow a similar workaround
used for meters in dpif_execute_helper_cb(). I think that this will
likely work, as the only copy of the packet that requires the results
of ct() is the copy that is recirculated. That is to say, if the
datapath actions looked like "...,ct(...),recirc(...),output(...)", it
would be correct to assemble two execute calls to the datapath, one
for the ct+recirc, and one for the output.

That said, at some point we should consider addressing the core
problem which is that dpif_execute_helper_cb() is written to try to
execute one action at a time. I don't expect you to implement this,
but I want to jot these thoughts down while they're fresh in my mind.
Rather than executing one action at a time and building up specific
solutions to group ct+recirc or meter+subsequent actions, we should
probably attempt to group together as many actions as can be executed
in the datapath and execute in one transaction. What I could imagine
is something like the current meters loop logic existing in a generic
way in odp_execute_actions(), taking into account which actions *must*
be executed in userspace and which actions *must* be executed in the
datapath.

Let's say for example that actions of type A must be executed in the
datapath, actions type B could be executed anywhere, and actions type
C must be executed in userspace. A could represent meter, ct, recirc,
tunnel_{pop,push}, userspace, or output. B includes the majority of
actions. C could include ARP modification and perhaps a short list of
other actions. I'll use "U" to indicate a userspace action that we
insert to make things work correctly (more description below).

Actions list "BBBABBC" could be executed "BBB" in userspace, then
assemble a datapath execute for "A", then run "BBC" in userspace again
(so long as 'A' does not impact running of BBC). However, consider if
"A" is actually a meter action. Then it could potentially drop the
packet, which means that "BBC" should not be run. Therefore, perhaps
to execute this list, "BBB" should be executed in userspace, then
assemble a datapath execute for "ABBU", where "U" action should
contain enough context that the executed packet can come back up and
finally execute the "C" action later on.

With an actions list like "CBBBABBABA", I would expect that "CBBB"
could be executed in userspace, then "ABBABA" could be executed as one
list in kernel. Equally, "C" could be executed in userspace and
"BBBABBABA" could be executed in kernel.

Now, maybe in practice these situations are not hit that often so
there is little impact on correctness or performance. In that case
maybe it's not that important. However there is some potential that if
some of these cases are hit regularly, a more general solution could
improve correctness and performance and even tidy up the code while
we're at it.

Cheers,
Joe
Joe Stringer June 14, 2017, 8:17 p.m. UTC | #3
On 14 June 2017 at 11:43, Joe Stringer <joe@ovn.org> wrote:
> On 14 June 2017 at 05:04, wei <liw@dtdream.com> wrote:
>> ct(table=X,zone=X) will xlate 2 actions (ct and recirc), so they should be executed together.
>
> Hi, thanks for the contribution.
>
> There's a few things in this patch that do not match standard OVS
> contribution guidelines. Using utilities/checkpatch.py, you can pick
> up on most of these and address them before submission. Would you mind
> fixing these up? The following website goes into more detail about
> contributing to OVS:
>
> http://docs.openvswitch.org/en/latest/internals/contributing/submitting-patches/
>
> At minimum, all patches need to provide signoff before they can be
> applied to OVS. The use of whitespace in this patch also does not
> match the usual OVS style.
>
> Regarding the patch details, it seems to follow a similar workaround
> used for meters in dpif_execute_helper_cb(). I think that this will
> likely work, as the only copy of the packet that requires the results
> of ct() is the copy that is recirculated. That is to say, if the
> datapath actions looked like "...,ct(...),recirc(...),output(...)", it
> would be correct to assemble two execute calls to the datapath, one
> for the ct+recirc, and one for the output.

I should note that with this proposal, if the ct action has nat, then
the actions will be incorrect for actions following the recirc(...) -
for example output (which should see a NATed packet).
Wei Li June 15, 2017, 2:56 a.m. UTC | #4
Great idea, this can process actions like 
"ct(...nat),recirc(...),output(...)" correctly, output packet will be a 
NATed packet

Before implementing your idea, how deal with my patch?

1: drop it

2:accept it for fixing limited scenario (i will do checkpatch)

What's your suggestion?

在 2017/6/15 2:43, Joe Stringer 写道:
> On 14 June 2017 at 05:04, wei <liw@dtdream.com> wrote:
>> ct(table=X,zone=X) will xlate 2 actions (ct and recirc), so they should be executed together.
> Hi, thanks for the contribution.
>
> There's a few things in this patch that do not match standard OVS
> contribution guidelines. Using utilities/checkpatch.py, you can pick
> up on most of these and address them before submission. Would you mind
> fixing these up? The following website goes into more detail about
> contributing to OVS:
>
> http://docs.openvswitch.org/en/latest/internals/contributing/submitting-patches/
>
> At minimum, all patches need to provide signoff before they can be
> applied to OVS. The use of whitespace in this patch also does not
> match the usual OVS style.
>
> Regarding the patch details, it seems to follow a similar workaround
> used for meters in dpif_execute_helper_cb(). I think that this will
> likely work, as the only copy of the packet that requires the results
> of ct() is the copy that is recirculated. That is to say, if the
> datapath actions looked like "...,ct(...),recirc(...),output(...)", it
> would be correct to assemble two execute calls to the datapath, one
> for the ct+recirc, and one for the output.
>
> That said, at some point we should consider addressing the core
> problem which is that dpif_execute_helper_cb() is written to try to
> execute one action at a time. I don't expect you to implement this,
> but I want to jot these thoughts down while they're fresh in my mind.
> Rather than executing one action at a time and building up specific
> solutions to group ct+recirc or meter+subsequent actions, we should
> probably attempt to group together as many actions as can be executed
> in the datapath and execute in one transaction. What I could imagine
> is something like the current meters loop logic existing in a generic
> way in odp_execute_actions(), taking into account which actions *must*
> be executed in userspace and which actions *must* be executed in the
> datapath.
>
> Let's say for example that actions of type A must be executed in the
> datapath, actions type B could be executed anywhere, and actions type
> C must be executed in userspace. A could represent meter, ct, recirc,
> tunnel_{pop,push}, userspace, or output. B includes the majority of
> actions. C could include ARP modification and perhaps a short list of
> other actions. I'll use "U" to indicate a userspace action that we
> insert to make things work correctly (more description below).
>
> Actions list "BBBABBC" could be executed "BBB" in userspace, then
> assemble a datapath execute for "A", then run "BBC" in userspace again
> (so long as 'A' does not impact running of BBC). However, consider if
> "A" is actually a meter action. Then it could potentially drop the
> packet, which means that "BBC" should not be run. Therefore, perhaps
> to execute this list, "BBB" should be executed in userspace, then
> assemble a datapath execute for "ABBU", where "U" action should
> contain enough context that the executed packet can come back up and
> finally execute the "C" action later on.
>
> With an actions list like "CBBBABBABA", I would expect that "CBBB"
> could be executed in userspace, then "ABBABA" could be executed as one
> list in kernel. Equally, "C" could be executed in userspace and
> "BBBABBABA" could be executed in kernel.
>
> Now, maybe in practice these situations are not hit that often so
> there is little impact on correctness or performance. In that case
> maybe it's not that important. However there is some potential that if
> some of these cases are hit regularly, a more general solution could
> improve correctness and performance and even tidy up the code while
> we're at it.
>
> Cheers,
> Joe
Joe Stringer June 15, 2017, 11:45 p.m. UTC | #5
On 14 June 2017 at 19:56, Wei Li <liw@dtdream.com> wrote:
> Great idea, this can process actions like
> "ct(...nat),recirc(...),output(...)" correctly, output packet will be a
> NATed packet
>
> Before implementing your idea, how deal with my patch?
>
> 1: drop it
>
> 2:accept it for fixing limited scenario (i will do checkpatch)
>
> What's your suggestion?

I think that we should probably work towards a proper solution, this
only addresses a subset of the currently-broken cases.
diff mbox

Patch

diff --git a/lib/dpif.c b/lib/dpif.c
index fe6a986c5..72274731a 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -1109,6 +1109,7 @@  struct dpif_execute_helper_aux {
     const struct flow *flow;
     int error;
     const struct nlattr *meter_action; /* Non-NULL, if have a meter action. */
+    const struct nlattr *ct_action; /* stort ct action when it need to be execute together with recirc action. */
 };
 
 /* This is called for actions that need the context of the datapath to be
@@ -1132,6 +1133,13 @@  dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
 	break;
 
     case OVS_ACTION_ATTR_CT:
+		if (nl_attr_type(nl_attr_next(action)) == OVS_ACTION_ATTR_RECIRC) {
+		    /* OVS_ACTION_ATTR_CT and the following OVS_ACTION_ATTR_RECIRC should execute together.
+		     * store first and execute in the next loop which process OVS_ACTION_ATTR_RECIRC). */
+			aux->ct_action = action;
+			break;
+		}
+
     case OVS_ACTION_ATTR_OUTPUT:
     case OVS_ACTION_ATTR_TUNNEL_PUSH:
     case OVS_ACTION_ATTR_TUNNEL_POP:
@@ -1142,7 +1150,7 @@  dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
         uint64_t stub[256 / 8];
         struct pkt_metadata *md = &packet->md;
 
-        if (flow_tnl_dst_is_set(&md->tunnel) || aux->meter_action) {
+        if (flow_tnl_dst_is_set(&md->tunnel) || aux->meter_action || aux->ct_action) {
             ofpbuf_use_stub(&execute_actions, stub, sizeof stub);
 
             if (aux->meter_action) {
@@ -1171,6 +1179,12 @@  dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
             if (md->tunnel.ip_dst) {
                 odp_put_tunnel_action(&md->tunnel, &execute_actions);
             }
+
+			if (aux->ct_action) {
+				/* current action definitely is OVS_ACTION_ATTR_RECIRC. */
+				ofpbuf_put(&execute_actions, aux->ct_action, NLA_ALIGN(aux->ct_action->nla_len));
+				aux->ct_action = NULL;
+			}
             ofpbuf_put(&execute_actions, action, NLA_ALIGN(action->nla_len));
 
             execute.actions = execute_actions.data;
@@ -1238,7 +1252,7 @@  dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
 static int
 dpif_execute_with_help(struct dpif *dpif, struct dpif_execute *execute)
 {
-    struct dpif_execute_helper_aux aux = {dpif, execute->flow, 0, NULL};
+    struct dpif_execute_helper_aux aux = {dpif, execute->flow, 0, NULL, NULL};
     struct dp_packet_batch pb;
 
     COVERAGE_INC(dpif_execute_with_help);