diff mbox

[ovs-dev,PATCHv2] ofproto-dpif: Mark packets as "untracked" after call to ct().

Message ID 1502433498-29205-1-git-send-email-jpettit@ovn.org
State Rejected
Headers show

Commit Message

Justin Pettit Aug. 11, 2017, 6:38 a.m. UTC
Packet and Connection state is only available to the processing path
that follows the "recirc_table" argument of the ct() action.  The
previous behavior made these states available until the end of the
pipeline.  This commit changes the behavior so that the Packet and
Connection state are cleared for the current processing path whenever
ct() is called (in addition to reaching the end of the pipeline.)

A future commit will remove the behavior that a "send to controller"
action causes all packets for that flow to be handled via the slow-path.
The current behavior of connection tracking state makes that difficult
due to datapath actions containing multiple OpenFlow rules that may
contain different connection tracking states.  This change will make
that future commit possible.

Signed-off-by: Justin Pettit <jpettit@ovn.org>
---
v1->v2: Fix system tests.
---
 lib/ofp-actions.c            | 27 +++++++++++++--------------
 ofproto/ofproto-dpif-xlate.c | 21 +++++++--------------
 tests/ofproto-dpif.at        | 10 +++++-----
 tests/system-traffic.at      |  4 ++--
 utilities/ovs-ofctl.8.in     | 10 ++++++----
 5 files changed, 33 insertions(+), 39 deletions(-)

Comments

Justin Pettit Aug. 11, 2017, 6:58 a.m. UTC | #1
I just wanted to specially call out this patch for testing.  My hope is that we can merge this into 2.8, but wanted to give people a heads up that they may want to try it with their applications.  The most visible change is that a call to ct() will clear the ct_state for any actions that follow it.  As before, the ct_state will be accessible to flows rooted in the table specified by the "table" argument to ct(), but now another ct() action will clear the state for the current processing path (but the new ct_state will be available if the "table" argument is given).

My hope is that this won't affect people's pipelines, but it's worth checking if it does.  I would recommend that anyone that uses the conntrack action in their applications give this patch a try (on top of branch-2.8).  If there are any problems, I can try to help you reconstruct your pipeline so it's not affected.  (The changes should be minimal, and there are a couple of examples in the patch itself.)  I've done some light testing with OVN, and it didn't require any changes, and I hope it's the same for everyone else.

The reason I'd like to get this change into 2.8 is that it's a requirement for a performance optimization I'd like to get into 2.9.  Currently, if a controller action is specified, all traffic is slow-pathed to userspace.  As we start making heavier use of local controllers, this will have a serious negative effect on forwarding performance and increase in CPU usage.  I've made a few other (more minor, corner-casey) changes to the OpenFlow processing in 2.8, and I'd like to get all potentially breaking issues into a single release.

If you have an application that uses conntrack, please give it a try and let me know if you run into any issues.

Thanks!

--Justin


> On Aug 10, 2017, at 11:38 PM, Justin Pettit <jpettit@ovn.org> wrote:
> 
> Packet and Connection state is only available to the processing path
> that follows the "recirc_table" argument of the ct() action.  The
> previous behavior made these states available until the end of the
> pipeline.  This commit changes the behavior so that the Packet and
> Connection state are cleared for the current processing path whenever
> ct() is called (in addition to reaching the end of the pipeline.)
> 
> A future commit will remove the behavior that a "send to controller"
> action causes all packets for that flow to be handled via the slow-path.
> The current behavior of connection tracking state makes that difficult
> due to datapath actions containing multiple OpenFlow rules that may
> contain different connection tracking states.  This change will make
> that future commit possible.
> 
> Signed-off-by: Justin Pettit <jpettit@ovn.org>
> ---
> v1->v2: Fix system tests.
> ---
> lib/ofp-actions.c            | 27 +++++++++++++--------------
> ofproto/ofproto-dpif-xlate.c | 21 +++++++--------------
> tests/ofproto-dpif.at        | 10 +++++-----
> tests/system-traffic.at      |  4 ++--
> utilities/ovs-ofctl.8.in     | 10 ++++++----
> 5 files changed, 33 insertions(+), 39 deletions(-)
> 
> diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
> index bfc8a805ffd5..71eb70c3c239 100644
> --- a/lib/ofp-actions.c
> +++ b/lib/ofp-actions.c
> @@ -5858,20 +5858,19 @@ format_DEBUG_RECIRC(const struct ofpact_null *a OVS_UNUSED,
>  *
>  *   - Packet State:
>  *
> - *      Untracked packets have not yet passed through the connection tracker,
> - *      and the connection state for such packets is unknown. In most cases,
> - *      packets entering the OpenFlow pipeline will initially be in the
> - *      untracked state. Untracked packets may become tracked by executing
> - *      NXAST_CT with a "recirc_table" specified. This makes various aspects
> - *      about the connection available, in particular the connection state.
> - *
> - *      Tracked packets have previously passed through the connection tracker.
> - *      These packets will remain tracked through until the end of the OpenFlow
> - *      pipeline. Tracked packets which have NXAST_CT executed with a
> - *      "recirc_table" specified will return to the tracked state.
> - *
> - *      The packet state is only significant for the duration of packet
> - *      processing within the OpenFlow pipeline.
> + *      Untracked packets have an unknown connection state.  In most
> + *      cases, packets entering the OpenFlow pipeline will initially be
> + *      in the untracked state. Untracked packets may become tracked by
> + *      executing NXAST_CT with a "recirc_table" specified. This makes
> + *      various aspects about the connection available, in particular
> + *      the connection state.
> + *
> + *      An NXAST_CT action always puts the packet into an untracked
> + *      state for the current processing path.  If "recirc_table" is
> + *      set, execution is forked and the packet passes through the
> + *      connection tracker.  The specified table's processing path is
> + *      able to match on Connection state until the end of the OpenFlow
> + *      pipeline or NXAST_CT is called again.
>  *
>  *   - Connection State:
>  *
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 973e760547fa..9e1f837cb23e 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -5721,9 +5721,7 @@ put_ct_nat(struct xlate_ctx *ctx)
> static void
> compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
> {
> -    ovs_u128 old_ct_label = ctx->xin->flow.ct_label;
>     ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label;
> -    uint32_t old_ct_mark = ctx->xin->flow.ct_mark;
>     uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark;
>     size_t ct_offset;
>     uint16_t zone;
> @@ -5735,7 +5733,7 @@ compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
>     /* Process nested actions first, to populate the key. */
>     ctx->ct_nat_action = NULL;
>     ctx->wc->masks.ct_mark = 0;
> -    ctx->wc->masks.ct_label.u64.hi = ctx->wc->masks.ct_label.u64.lo = 0;
> +    ctx->wc->masks.ct_label = OVS_U128_ZERO;
>     do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx);
> 
>     if (ofc->zone_src.field) {
> @@ -5761,23 +5759,18 @@ compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
>     ctx->ct_nat_action = NULL;
>     nl_msg_end_nested(ctx->odp_actions, ct_offset);
> 
> -    /* Restore the original ct fields in the key. These should only be exposed
> -     * after recirculation to another table. */
> -    ctx->xin->flow.ct_mark = old_ct_mark;
>     ctx->wc->masks.ct_mark = old_ct_mark_mask;
> -    ctx->xin->flow.ct_label = old_ct_label;
>     ctx->wc->masks.ct_label = old_ct_label_mask;
> 
> -    if (ofc->recirc_table == NX_CT_RECIRC_NONE) {
> -        /* If we do not recirculate as part of this action, hide the results of
> -         * connection tracking from subsequent recirculations. */
> -        ctx->conntracked = false;
> -    } else {
> -        /* Use ct_* fields from datapath during recirculation upcall. */
> +    if (ofc->recirc_table != NX_CT_RECIRC_NONE) {
>         ctx->conntracked = true;
>         compose_recirculate_and_fork(ctx, ofc->recirc_table);
> -        ctx->conntracked = false;
>     }
> +
> +    /* The ct_* fields are only available in the scope of the 'recirc_table'
> +     * call chain. */
> +    flow_clear_conntrack(&ctx->xin->flow);
> +    ctx->conntracked = false;
> }
> 
> static void
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index 284a65ec6524..28a7e827cac2 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -8949,7 +8949,7 @@ OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
> 
> dnl Check this output. We only see the latter two packets, not the first.
> AT_CHECK([cat ofctl_monitor.log], [0], [dnl
> -NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=new|trk,ct_zone=1,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
> +NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
> dnl
> NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_zone=1,ct_mark=0x1,ct_label=0x4d2000000000000000000000000,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x2,reg4=0x1,in_port=2 (via action) data_len=42 (unbuffered)
> @@ -8970,7 +8970,7 @@ OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
> 
> dnl Check this output. We should see both packets
> AT_CHECK([cat ofctl_monitor.log], [0], [dnl
> -NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=new|trk,ct_zone=1,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
> +NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=3,tp_dst=2 udp_csum:e9d4
> dnl
> NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_zone=1,ct_mark=0x1,ct_label=0x4d2000000000000000000000000,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x2,reg4=0x1,in_port=2 (via action) data_len=42 (unbuffered)
> @@ -9025,7 +9025,7 @@ AT_CHECK([cat ofctl_monitor.log], [0], [dnl
> NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
> dnl
> -NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=2 (via action) data_len=42 (unbuffered)
> +NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=2 (via action) data_len=42 (unbuffered)
> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:0a,dl_dst=50:54:00:00:00:09,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=2,tp_dst=1 udp_csum:e9d6
> ])
> 
> @@ -9047,7 +9047,7 @@ AT_CHECK([cat ofctl_monitor.log], [0], [dnl
> NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=3,tp_dst=4 udp_csum:e9d2
> dnl
> -NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=4,in_port=2 (via action) data_len=42 (unbuffered)
> +NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=2 (via action) data_len=42 (unbuffered)
> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:0a,dl_dst=50:54:00:00:00:09,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=4,tp_dst=3 udp_csum:e9d2
> ])
> 
> @@ -9362,7 +9362,7 @@ OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
> 
> dnl Check this output. We only see the latter two packets, not the first.
> AT_CHECK([cat ofctl_monitor.log], [0], [dnl
> -NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=new|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=1 (via action) data_len=42 (unbuffered)
> +NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
> dnl
> NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=2 (via action) data_len=42 (unbuffered)
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index 798dd2cbd2c2..522eaa615834 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -2287,7 +2287,7 @@ dnl Ingress pipeline
> dnl - Allow all connections from LOCAL port (commit and proceed to egress)
> dnl - All other connections go through conntracker using the input port as
> dnl   a connection tracking zone.
> -table=1,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,zone=OXM_OF_IN_PORT[[0..15]]),goto_table:2
> +table=1,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,table=2,zone=OXM_OF_IN_PORT[[0..15]])
> table=1,priority=100,ip,action=ct(table=2,zone=OXM_OF_IN_PORT[[0..15]])
> table=1,priority=1,action=drop
> 
> @@ -2295,7 +2295,7 @@ dnl Egress pipeline
> dnl - Allow all connections from LOCAL port (commit and skip to output)
> dnl - Allow other established connections to go through conntracker using
> dnl   output port as a connection tracking zone.
> -table=2,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,zone=NXM_NX_REG0[[0..15]]),goto_table:4
> +table=2,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,table=4,zone=NXM_NX_REG0[[0..15]])
> table=2,priority=100,ip,ct_state=+trk+est,action=ct(table=3,zone=NXM_NX_REG0[[0..15]])
> table=2,priority=1,action=drop
> 
> diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
> index f6bd90374a18..c65de97f5e2e 100644
> --- a/utilities/ovs-ofctl.8.in
> +++ b/utilities/ovs-ofctl.8.in
> @@ -1031,11 +1031,13 @@ Restores the queue to the value it was before any \fBset_queue\fR
> actions were applied.
> .
> .IP \fBct\fR
> -.IQ \fBct\fB(\fR[\fIargument\fR][\fB,\fIargument\fR...]\fB)
> +.IQ \fBct(\fR[\fIargument\fR][\fB,\fIargument\fR...]\fB)
> Send the packet through the connection tracker.  Refer to the \fBct_state\fR
> -documentation above for possible packet and connection states. The following
> -arguments are supported:
> -
> +documentation above for possible packet and connection states. A \fBct\fR
> +action always sets the packet to an untracked state and clears out the
> +\fBct_state\fR fields for the current processing path.  Those fields are
> +only available for the processing path pointed to by the \fBtable\fR
> +argument.  The following arguments are supported:
> .RS
> .IP \fBcommit\fR
> .RS
> -- 
> 2.7.4
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Russell Bryant Aug. 15, 2017, 2:57 p.m. UTC | #2
I've reached out to 3 other users of OVS connection tracking for
feedback and will report back to this thread once I hear anything.

On Fri, Aug 11, 2017 at 2:58 AM, Justin Pettit <jpettit@ovn.org> wrote:
> I just wanted to specially call out this patch for testing.  My hope is that we can merge this into 2.8, but wanted to give people a heads up that they may want to try it with their applications.  The most visible change is that a call to ct() will clear the ct_state for any actions that follow it.  As before, the ct_state will be accessible to flows rooted in the table specified by the "table" argument to ct(), but now another ct() action will clear the state for the current processing path (but the new ct_state will be available if the "table" argument is given).
>
> My hope is that this won't affect people's pipelines, but it's worth checking if it does.  I would recommend that anyone that uses the conntrack action in their applications give this patch a try (on top of branch-2.8).  If there are any problems, I can try to help you reconstruct your pipeline so it's not affected.  (The changes should be minimal, and there are a couple of examples in the patch itself.)  I've done some light testing with OVN, and it didn't require any changes, and I hope it's the same for everyone else.
>
> The reason I'd like to get this change into 2.8 is that it's a requirement for a performance optimization I'd like to get into 2.9.  Currently, if a controller action is specified, all traffic is slow-pathed to userspace.  As we start making heavier use of local controllers, this will have a serious negative effect on forwarding performance and increase in CPU usage.  I've made a few other (more minor, corner-casey) changes to the OpenFlow processing in 2.8, and I'd like to get all potentially breaking issues into a single release.
>
> If you have an application that uses conntrack, please give it a try and let me know if you run into any issues.
>
> Thanks!
>
> --Justin
>
>
>> On Aug 10, 2017, at 11:38 PM, Justin Pettit <jpettit@ovn.org> wrote:
>>
>> Packet and Connection state is only available to the processing path
>> that follows the "recirc_table" argument of the ct() action.  The
>> previous behavior made these states available until the end of the
>> pipeline.  This commit changes the behavior so that the Packet and
>> Connection state are cleared for the current processing path whenever
>> ct() is called (in addition to reaching the end of the pipeline.)
>>
>> A future commit will remove the behavior that a "send to controller"
>> action causes all packets for that flow to be handled via the slow-path.
>> The current behavior of connection tracking state makes that difficult
>> due to datapath actions containing multiple OpenFlow rules that may
>> contain different connection tracking states.  This change will make
>> that future commit possible.
>>
>> Signed-off-by: Justin Pettit <jpettit@ovn.org>
>> ---
>> v1->v2: Fix system tests.
>> ---
>> lib/ofp-actions.c            | 27 +++++++++++++--------------
>> ofproto/ofproto-dpif-xlate.c | 21 +++++++--------------
>> tests/ofproto-dpif.at        | 10 +++++-----
>> tests/system-traffic.at      |  4 ++--
>> utilities/ovs-ofctl.8.in     | 10 ++++++----
>> 5 files changed, 33 insertions(+), 39 deletions(-)
>>
>> diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
>> index bfc8a805ffd5..71eb70c3c239 100644
>> --- a/lib/ofp-actions.c
>> +++ b/lib/ofp-actions.c
>> @@ -5858,20 +5858,19 @@ format_DEBUG_RECIRC(const struct ofpact_null *a OVS_UNUSED,
>>  *
>>  *   - Packet State:
>>  *
>> - *      Untracked packets have not yet passed through the connection tracker,
>> - *      and the connection state for such packets is unknown. In most cases,
>> - *      packets entering the OpenFlow pipeline will initially be in the
>> - *      untracked state. Untracked packets may become tracked by executing
>> - *      NXAST_CT with a "recirc_table" specified. This makes various aspects
>> - *      about the connection available, in particular the connection state.
>> - *
>> - *      Tracked packets have previously passed through the connection tracker.
>> - *      These packets will remain tracked through until the end of the OpenFlow
>> - *      pipeline. Tracked packets which have NXAST_CT executed with a
>> - *      "recirc_table" specified will return to the tracked state.
>> - *
>> - *      The packet state is only significant for the duration of packet
>> - *      processing within the OpenFlow pipeline.
>> + *      Untracked packets have an unknown connection state.  In most
>> + *      cases, packets entering the OpenFlow pipeline will initially be
>> + *      in the untracked state. Untracked packets may become tracked by
>> + *      executing NXAST_CT with a "recirc_table" specified. This makes
>> + *      various aspects about the connection available, in particular
>> + *      the connection state.
>> + *
>> + *      An NXAST_CT action always puts the packet into an untracked
>> + *      state for the current processing path.  If "recirc_table" is
>> + *      set, execution is forked and the packet passes through the
>> + *      connection tracker.  The specified table's processing path is
>> + *      able to match on Connection state until the end of the OpenFlow
>> + *      pipeline or NXAST_CT is called again.
>>  *
>>  *   - Connection State:
>>  *
>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
>> index 973e760547fa..9e1f837cb23e 100644
>> --- a/ofproto/ofproto-dpif-xlate.c
>> +++ b/ofproto/ofproto-dpif-xlate.c
>> @@ -5721,9 +5721,7 @@ put_ct_nat(struct xlate_ctx *ctx)
>> static void
>> compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
>> {
>> -    ovs_u128 old_ct_label = ctx->xin->flow.ct_label;
>>     ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label;
>> -    uint32_t old_ct_mark = ctx->xin->flow.ct_mark;
>>     uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark;
>>     size_t ct_offset;
>>     uint16_t zone;
>> @@ -5735,7 +5733,7 @@ compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
>>     /* Process nested actions first, to populate the key. */
>>     ctx->ct_nat_action = NULL;
>>     ctx->wc->masks.ct_mark = 0;
>> -    ctx->wc->masks.ct_label.u64.hi = ctx->wc->masks.ct_label.u64.lo = 0;
>> +    ctx->wc->masks.ct_label = OVS_U128_ZERO;
>>     do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx);
>>
>>     if (ofc->zone_src.field) {
>> @@ -5761,23 +5759,18 @@ compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
>>     ctx->ct_nat_action = NULL;
>>     nl_msg_end_nested(ctx->odp_actions, ct_offset);
>>
>> -    /* Restore the original ct fields in the key. These should only be exposed
>> -     * after recirculation to another table. */
>> -    ctx->xin->flow.ct_mark = old_ct_mark;
>>     ctx->wc->masks.ct_mark = old_ct_mark_mask;
>> -    ctx->xin->flow.ct_label = old_ct_label;
>>     ctx->wc->masks.ct_label = old_ct_label_mask;
>>
>> -    if (ofc->recirc_table == NX_CT_RECIRC_NONE) {
>> -        /* If we do not recirculate as part of this action, hide the results of
>> -         * connection tracking from subsequent recirculations. */
>> -        ctx->conntracked = false;
>> -    } else {
>> -        /* Use ct_* fields from datapath during recirculation upcall. */
>> +    if (ofc->recirc_table != NX_CT_RECIRC_NONE) {
>>         ctx->conntracked = true;
>>         compose_recirculate_and_fork(ctx, ofc->recirc_table);
>> -        ctx->conntracked = false;
>>     }
>> +
>> +    /* The ct_* fields are only available in the scope of the 'recirc_table'
>> +     * call chain. */
>> +    flow_clear_conntrack(&ctx->xin->flow);
>> +    ctx->conntracked = false;
>> }
>>
>> static void
>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
>> index 284a65ec6524..28a7e827cac2 100644
>> --- a/tests/ofproto-dpif.at
>> +++ b/tests/ofproto-dpif.at
>> @@ -8949,7 +8949,7 @@ OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
>>
>> dnl Check this output. We only see the latter two packets, not the first.
>> AT_CHECK([cat ofctl_monitor.log], [0], [dnl
>> -NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=new|trk,ct_zone=1,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
>> +NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
>> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
>> dnl
>> NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_zone=1,ct_mark=0x1,ct_label=0x4d2000000000000000000000000,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x2,reg4=0x1,in_port=2 (via action) data_len=42 (unbuffered)
>> @@ -8970,7 +8970,7 @@ OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
>>
>> dnl Check this output. We should see both packets
>> AT_CHECK([cat ofctl_monitor.log], [0], [dnl
>> -NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=new|trk,ct_zone=1,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
>> +NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
>> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=3,tp_dst=2 udp_csum:e9d4
>> dnl
>> NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_zone=1,ct_mark=0x1,ct_label=0x4d2000000000000000000000000,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x2,reg4=0x1,in_port=2 (via action) data_len=42 (unbuffered)
>> @@ -9025,7 +9025,7 @@ AT_CHECK([cat ofctl_monitor.log], [0], [dnl
>> NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
>> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
>> dnl
>> -NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=2 (via action) data_len=42 (unbuffered)
>> +NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=2 (via action) data_len=42 (unbuffered)
>> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:0a,dl_dst=50:54:00:00:00:09,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=2,tp_dst=1 udp_csum:e9d6
>> ])
>>
>> @@ -9047,7 +9047,7 @@ AT_CHECK([cat ofctl_monitor.log], [0], [dnl
>> NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
>> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=3,tp_dst=4 udp_csum:e9d2
>> dnl
>> -NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=4,in_port=2 (via action) data_len=42 (unbuffered)
>> +NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=2 (via action) data_len=42 (unbuffered)
>> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:0a,dl_dst=50:54:00:00:00:09,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=4,tp_dst=3 udp_csum:e9d2
>> ])
>>
>> @@ -9362,7 +9362,7 @@ OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
>>
>> dnl Check this output. We only see the latter two packets, not the first.
>> AT_CHECK([cat ofctl_monitor.log], [0], [dnl
>> -NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=new|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=1 (via action) data_len=42 (unbuffered)
>> +NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
>> udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
>> dnl
>> NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=2 (via action) data_len=42 (unbuffered)
>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>> index 798dd2cbd2c2..522eaa615834 100644
>> --- a/tests/system-traffic.at
>> +++ b/tests/system-traffic.at
>> @@ -2287,7 +2287,7 @@ dnl Ingress pipeline
>> dnl - Allow all connections from LOCAL port (commit and proceed to egress)
>> dnl - All other connections go through conntracker using the input port as
>> dnl   a connection tracking zone.
>> -table=1,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,zone=OXM_OF_IN_PORT[[0..15]]),goto_table:2
>> +table=1,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,table=2,zone=OXM_OF_IN_PORT[[0..15]])
>> table=1,priority=100,ip,action=ct(table=2,zone=OXM_OF_IN_PORT[[0..15]])
>> table=1,priority=1,action=drop
>>
>> @@ -2295,7 +2295,7 @@ dnl Egress pipeline
>> dnl - Allow all connections from LOCAL port (commit and skip to output)
>> dnl - Allow other established connections to go through conntracker using
>> dnl   output port as a connection tracking zone.
>> -table=2,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,zone=NXM_NX_REG0[[0..15]]),goto_table:4
>> +table=2,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,table=4,zone=NXM_NX_REG0[[0..15]])
>> table=2,priority=100,ip,ct_state=+trk+est,action=ct(table=3,zone=NXM_NX_REG0[[0..15]])
>> table=2,priority=1,action=drop
>>
>> diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
>> index f6bd90374a18..c65de97f5e2e 100644
>> --- a/utilities/ovs-ofctl.8.in
>> +++ b/utilities/ovs-ofctl.8.in
>> @@ -1031,11 +1031,13 @@ Restores the queue to the value it was before any \fBset_queue\fR
>> actions were applied.
>> .
>> .IP \fBct\fR
>> -.IQ \fBct\fB(\fR[\fIargument\fR][\fB,\fIargument\fR...]\fB)
>> +.IQ \fBct(\fR[\fIargument\fR][\fB,\fIargument\fR...]\fB)
>> Send the packet through the connection tracker.  Refer to the \fBct_state\fR
>> -documentation above for possible packet and connection states. The following
>> -arguments are supported:
>> -
>> +documentation above for possible packet and connection states. A \fBct\fR
>> +action always sets the packet to an untracked state and clears out the
>> +\fBct_state\fR fields for the current processing path.  Those fields are
>> +only available for the processing path pointed to by the \fBtable\fR
>> +argument.  The following arguments are supported:
>> .RS
>> .IP \fBcommit\fR
>> .RS
>> --
>> 2.7.4
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Joe Stringer Aug. 21, 2017, 7:38 p.m. UTC | #3
On 10 August 2017 at 23:38, Justin Pettit <jpettit@ovn.org> wrote:
> Packet and Connection state is only available to the processing path
> that follows the "recirc_table" argument of the ct() action.  The
> previous behavior made these states available until the end of the
> pipeline.  This commit changes the behavior so that the Packet and
> Connection state are cleared for the current processing path whenever
> ct() is called (in addition to reaching the end of the pipeline.)
>
> A future commit will remove the behavior that a "send to controller"
> action causes all packets for that flow to be handled via the slow-path.
> The current behavior of connection tracking state makes that difficult
> due to datapath actions containing multiple OpenFlow rules that may
> contain different connection tracking states.  This change will make
> that future commit possible.
>
> Signed-off-by: Justin Pettit <jpettit@ovn.org>
> ---

Overall I think that this is an improvement to consistency of what
connection tracking metadata is accessible from different points in
the OpenFlow pipeline. Although this will restrict the availability of
ct_state following the ct() action execution, controller writers who
wish to preserve access to this content across a CT action execution
can do so using registers. In practice I'm not aware of any controller
that is currently operating this way though.

Do we need a NEWS item for this?

Acked-by: Joe Stringer <joe@ovn.org>
Justin Pettit Aug. 21, 2017, 7:56 p.m. UTC | #4
> On Aug 21, 2017, at 12:38 PM, Joe Stringer <joe@ovn.org> wrote:
> 
> Overall I think that this is an improvement to consistency of what
> connection tracking metadata is accessible from different points in
> the OpenFlow pipeline. Although this will restrict the availability of
> ct_state following the ct() action execution, controller writers who
> wish to preserve access to this content across a CT action execution
> can do so using registers. In practice I'm not aware of any controller
> that is currently operating this way though.
> 
> Do we need a NEWS item for this?

Good point I added something.

> Acked-by: Joe Stringer <joe@ovn.org>

Thanks for all your suggestions and testing on these patches leading up to this no-slowpath work.  I've pushed this change to master and branch-2.8.

--Justin
Flavio Leitner Aug. 30, 2017, 4:13 a.m. UTC | #5
On Mon, 21 Aug 2017 12:56:33 -0700
Justin Pettit <jpettit@ovn.org> wrote:

> > On Aug 21, 2017, at 12:38 PM, Joe Stringer <joe@ovn.org> wrote:
> > 
> > Overall I think that this is an improvement to consistency of what
> > connection tracking metadata is accessible from different points in
> > the OpenFlow pipeline. Although this will restrict the availability of
> > ct_state following the ct() action execution, controller writers who
> > wish to preserve access to this content across a CT action execution
> > can do so using registers. In practice I'm not aware of any controller
> > that is currently operating this way though.
> > 
> > Do we need a NEWS item for this?  
> 
> Good point I added something.
> 
> > Acked-by: Joe Stringer <joe@ovn.org>  
> 
> Thanks for all your suggestions and testing on these patches leading up to this no-slowpath work.  I've pushed this change to master and branch-2.8.

We found that at least OpenShift 3.6.0 breaks after this change.

It does:
"table=30, priority=300, ip, nw_dst=%s, ct_state=+rpl, actions=ct(nat),goto_table:70"
and then there is a rule that checks if ct_state=+rpl.

We are going to fix it to "actions=ct(nat,table=70)" in the newer
releases, but of course we can't change the past, and that prevents
us to provide newer OVS as updates.

Thanks,
Justin Pettit Aug. 31, 2017, 8:17 p.m. UTC | #6
> On Aug 29, 2017, at 9:13 PM, Flavio Leitner <fbl@sysclose.org> wrote:
> 
> On Mon, 21 Aug 2017 12:56:33 -0700
> Justin Pettit <jpettit@ovn.org> wrote:
> 
>>> On Aug 21, 2017, at 12:38 PM, Joe Stringer <joe@ovn.org> wrote:
>>> 
>>> Overall I think that this is an improvement to consistency of what
>>> connection tracking metadata is accessible from different points in
>>> the OpenFlow pipeline. Although this will restrict the availability of
>>> ct_state following the ct() action execution, controller writers who
>>> wish to preserve access to this content across a CT action execution
>>> can do so using registers. In practice I'm not aware of any controller
>>> that is currently operating this way though.
>>> 
>>> Do we need a NEWS item for this?  
>> 
>> Good point I added something.
>> 
>>> Acked-by: Joe Stringer <joe@ovn.org>  
>> 
>> Thanks for all your suggestions and testing on these patches leading up to this no-slowpath work.  I've pushed this change to master and branch-2.8.
> 
> We found that at least OpenShift 3.6.0 breaks after this change.
> 
> It does:
> "table=30, priority=300, ip, nw_dst=%s, ct_state=+rpl, actions=ct(nat),goto_table:70"
> and then there is a rule that checks if ct_state=+rpl.
> 
> We are going to fix it to "actions=ct(nat,table=70)" in the newer
> releases, but of course we can't change the past, and that prevents
> us to provide newer OVS as updates.

I sent out a revert patch for branch-2.8:

	https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/338154.html

It's unfortunate to introduce an API change, and I believe we have a good history of rarely doing that.  However, I think this change will provide a big long-term win when we can avoid slow-pathing controller actions.  As such, I'd still like to include this change in the 2.9 release (or whatever it's called).

Thanks,

--Justin
Flavio Leitner Aug. 31, 2017, 10:07 p.m. UTC | #7
On Thu, 31 Aug 2017 13:17:28 -0700
Justin Pettit <jpettit@ovn.org> wrote:

> 
> > On Aug 29, 2017, at 9:13 PM, Flavio Leitner <fbl@sysclose.org> wrote:
> > 
> > On Mon, 21 Aug 2017 12:56:33 -0700
> > Justin Pettit <jpettit@ovn.org> wrote:
> > 
> >>> On Aug 21, 2017, at 12:38 PM, Joe Stringer <joe@ovn.org> wrote:
> >>> 
> >>> Overall I think that this is an improvement to consistency of what
> >>> connection tracking metadata is accessible from different points in
> >>> the OpenFlow pipeline. Although this will restrict the availability of
> >>> ct_state following the ct() action execution, controller writers who
> >>> wish to preserve access to this content across a CT action execution
> >>> can do so using registers. In practice I'm not aware of any controller
> >>> that is currently operating this way though.
> >>> 
> >>> Do we need a NEWS item for this?  
> >> 
> >> Good point I added something.
> >> 
> >>> Acked-by: Joe Stringer <joe@ovn.org>  
> >> 
> >> Thanks for all your suggestions and testing on these patches leading up to this no-slowpath work.  I've pushed this change to master and branch-2.8.
> > 
> > We found that at least OpenShift 3.6.0 breaks after this change.
> > 
> > It does:
> > "table=30, priority=300, ip, nw_dst=%s, ct_state=+rpl, actions=ct(nat),goto_table:70"
> > and then there is a rule that checks if ct_state=+rpl.
> > 
> > We are going to fix it to "actions=ct(nat,table=70)" in the newer
> > releases, but of course we can't change the past, and that prevents
> > us to provide newer OVS as updates.
> 
> I sent out a revert patch for branch-2.8:
> 
> 	https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/338154.html
> 
> It's unfortunate to introduce an API change, and I believe we have a good history of rarely doing that.  However, I think this change will provide a big long-term win when we can avoid slow-pathing controller actions.  As such, I'd still like to include this change in the 2.9 release (or whatever it's called).

It's unfortunate, indeed. We should strive to avoid that as much as
possible.  I hope we can find a way to support that change until it
comes as part of the next release though.  Let's see during the next
development cycle.

Thanks!
diff mbox

Patch

diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index bfc8a805ffd5..71eb70c3c239 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -5858,20 +5858,19 @@  format_DEBUG_RECIRC(const struct ofpact_null *a OVS_UNUSED,
  *
  *   - Packet State:
  *
- *      Untracked packets have not yet passed through the connection tracker,
- *      and the connection state for such packets is unknown. In most cases,
- *      packets entering the OpenFlow pipeline will initially be in the
- *      untracked state. Untracked packets may become tracked by executing
- *      NXAST_CT with a "recirc_table" specified. This makes various aspects
- *      about the connection available, in particular the connection state.
- *
- *      Tracked packets have previously passed through the connection tracker.
- *      These packets will remain tracked through until the end of the OpenFlow
- *      pipeline. Tracked packets which have NXAST_CT executed with a
- *      "recirc_table" specified will return to the tracked state.
- *
- *      The packet state is only significant for the duration of packet
- *      processing within the OpenFlow pipeline.
+ *      Untracked packets have an unknown connection state.  In most
+ *      cases, packets entering the OpenFlow pipeline will initially be
+ *      in the untracked state. Untracked packets may become tracked by
+ *      executing NXAST_CT with a "recirc_table" specified. This makes
+ *      various aspects about the connection available, in particular
+ *      the connection state.
+ *
+ *      An NXAST_CT action always puts the packet into an untracked
+ *      state for the current processing path.  If "recirc_table" is
+ *      set, execution is forked and the packet passes through the
+ *      connection tracker.  The specified table's processing path is
+ *      able to match on Connection state until the end of the OpenFlow
+ *      pipeline or NXAST_CT is called again.
  *
  *   - Connection State:
  *
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 973e760547fa..9e1f837cb23e 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -5721,9 +5721,7 @@  put_ct_nat(struct xlate_ctx *ctx)
 static void
 compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
 {
-    ovs_u128 old_ct_label = ctx->xin->flow.ct_label;
     ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label;
-    uint32_t old_ct_mark = ctx->xin->flow.ct_mark;
     uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark;
     size_t ct_offset;
     uint16_t zone;
@@ -5735,7 +5733,7 @@  compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
     /* Process nested actions first, to populate the key. */
     ctx->ct_nat_action = NULL;
     ctx->wc->masks.ct_mark = 0;
-    ctx->wc->masks.ct_label.u64.hi = ctx->wc->masks.ct_label.u64.lo = 0;
+    ctx->wc->masks.ct_label = OVS_U128_ZERO;
     do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx);
 
     if (ofc->zone_src.field) {
@@ -5761,23 +5759,18 @@  compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc)
     ctx->ct_nat_action = NULL;
     nl_msg_end_nested(ctx->odp_actions, ct_offset);
 
-    /* Restore the original ct fields in the key. These should only be exposed
-     * after recirculation to another table. */
-    ctx->xin->flow.ct_mark = old_ct_mark;
     ctx->wc->masks.ct_mark = old_ct_mark_mask;
-    ctx->xin->flow.ct_label = old_ct_label;
     ctx->wc->masks.ct_label = old_ct_label_mask;
 
-    if (ofc->recirc_table == NX_CT_RECIRC_NONE) {
-        /* If we do not recirculate as part of this action, hide the results of
-         * connection tracking from subsequent recirculations. */
-        ctx->conntracked = false;
-    } else {
-        /* Use ct_* fields from datapath during recirculation upcall. */
+    if (ofc->recirc_table != NX_CT_RECIRC_NONE) {
         ctx->conntracked = true;
         compose_recirculate_and_fork(ctx, ofc->recirc_table);
-        ctx->conntracked = false;
     }
+
+    /* The ct_* fields are only available in the scope of the 'recirc_table'
+     * call chain. */
+    flow_clear_conntrack(&ctx->xin->flow);
+    ctx->conntracked = false;
 }
 
 static void
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 284a65ec6524..28a7e827cac2 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -8949,7 +8949,7 @@  OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
 
 dnl Check this output. We only see the latter two packets, not the first.
 AT_CHECK([cat ofctl_monitor.log], [0], [dnl
-NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=new|trk,ct_zone=1,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
+NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
 udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
 dnl
 NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_zone=1,ct_mark=0x1,ct_label=0x4d2000000000000000000000000,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x2,reg4=0x1,in_port=2 (via action) data_len=42 (unbuffered)
@@ -8970,7 +8970,7 @@  OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
 
 dnl Check this output. We should see both packets
 AT_CHECK([cat ofctl_monitor.log], [0], [dnl
-NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=new|trk,ct_zone=1,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
+NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x1,reg4=0x1,in_port=1 (via action) data_len=42 (unbuffered)
 udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=3,tp_dst=2 udp_csum:e9d4
 dnl
 NXT_PACKET_IN (xid=0x0): table_id=6 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_zone=1,ct_mark=0x1,ct_label=0x4d2000000000000000000000000,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=2,reg0=0x1,reg1=0x4d2,reg2=0x1,reg3=0x2,reg4=0x1,in_port=2 (via action) data_len=42 (unbuffered)
@@ -9025,7 +9025,7 @@  AT_CHECK([cat ofctl_monitor.log], [0], [dnl
 NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
 udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
 dnl
-NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=2 (via action) data_len=42 (unbuffered)
+NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=2 (via action) data_len=42 (unbuffered)
 udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:0a,dl_dst=50:54:00:00:00:09,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=2,tp_dst=1 udp_csum:e9d6
 ])
 
@@ -9047,7 +9047,7 @@  AT_CHECK([cat ofctl_monitor.log], [0], [dnl
 NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
 udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=3,tp_dst=4 udp_csum:e9d2
 dnl
-NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=3,ct_tp_dst=4,in_port=2 (via action) data_len=42 (unbuffered)
+NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=2 (via action) data_len=42 (unbuffered)
 udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:0a,dl_dst=50:54:00:00:00:09,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=4,tp_dst=3 udp_csum:e9d2
 ])
 
@@ -9362,7 +9362,7 @@  OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
 
 dnl Check this output. We only see the latter two packets, not the first.
 AT_CHECK([cat ofctl_monitor.log], [0], [dnl
-NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=new|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=1 (via action) data_len=42 (unbuffered)
+NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
 udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2 udp_csum:e9d6
 dnl
 NXT_PACKET_IN (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,in_port=2 (via action) data_len=42 (unbuffered)
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 798dd2cbd2c2..522eaa615834 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2287,7 +2287,7 @@  dnl Ingress pipeline
 dnl - Allow all connections from LOCAL port (commit and proceed to egress)
 dnl - All other connections go through conntracker using the input port as
 dnl   a connection tracking zone.
-table=1,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,zone=OXM_OF_IN_PORT[[0..15]]),goto_table:2
+table=1,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,table=2,zone=OXM_OF_IN_PORT[[0..15]])
 table=1,priority=100,ip,action=ct(table=2,zone=OXM_OF_IN_PORT[[0..15]])
 table=1,priority=1,action=drop
 
@@ -2295,7 +2295,7 @@  dnl Egress pipeline
 dnl - Allow all connections from LOCAL port (commit and skip to output)
 dnl - Allow other established connections to go through conntracker using
 dnl   output port as a connection tracking zone.
-table=2,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,zone=NXM_NX_REG0[[0..15]]),goto_table:4
+table=2,priority=150,in_port=LOCAL,ip,ct_state=+trk+new,action=ct(commit,table=4,zone=NXM_NX_REG0[[0..15]])
 table=2,priority=100,ip,ct_state=+trk+est,action=ct(table=3,zone=NXM_NX_REG0[[0..15]])
 table=2,priority=1,action=drop
 
diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
index f6bd90374a18..c65de97f5e2e 100644
--- a/utilities/ovs-ofctl.8.in
+++ b/utilities/ovs-ofctl.8.in
@@ -1031,11 +1031,13 @@  Restores the queue to the value it was before any \fBset_queue\fR
 actions were applied.
 .
 .IP \fBct\fR
-.IQ \fBct\fB(\fR[\fIargument\fR][\fB,\fIargument\fR...]\fB)
+.IQ \fBct(\fR[\fIargument\fR][\fB,\fIargument\fR...]\fB)
 Send the packet through the connection tracker.  Refer to the \fBct_state\fR
-documentation above for possible packet and connection states. The following
-arguments are supported:
-
+documentation above for possible packet and connection states. A \fBct\fR
+action always sets the packet to an untracked state and clears out the
+\fBct_state\fR fields for the current processing path.  Those fields are
+only available for the processing path pointed to by the \fBtable\fR
+argument.  The following arguments are supported:
 .RS
 .IP \fBcommit\fR
 .RS