diff mbox series

[ovs-dev,V2] ofproto-dpif-xlate: Incorrect handling of errors in group action processing

Message ID AM5PR0701MB29616BA2024887F7D105235FA4370@AM5PR0701MB2961.eurprd07.prod.outlook.com
State Not Applicable
Headers show
Series [ovs-dev,V2] ofproto-dpif-xlate: Incorrect handling of errors in group action processing | expand

Commit Message

Vishal Deep Ajmera Dec. 11, 2017, 11:54 a.m. UTC
As per OpenFlow v1.3 specification, when an action list contains a group
action a copy of the packet is passed to the group for processing by the
group. This means that if there is an error encountered during group
processing, only the copy of packet should be dropped, but subsequent
actions in the action list should be executed on the original packet.

Additionally, if the group type is "ALL", each action bucket of the group
should process a copy of the packet. If there is an error while processing
one bucket other buckets should still be processed.

Example 1:
table=0,in_port=tap0 actions=output:tap1,group:10,output:tap2

Even if any error is encountered while processing the group action, the
packet should still be forwarded to ports tap1 and tap2.

Example 2:
group_id=1,type=all,bucket=actions=output:tap1,bucket=actions=encap(eth)

Even if processing the action in the second bucket fails because the
packet already has an Ethernet header, the other copy of the packet should
still be processed by the first bucket and output to port tap1.

Currently the error handling in OVS does not comply with those rules. When
any group bucket execution fails the error is recorded in the so-called
"translation context" which is global for the processing of the original
packet. Once an error is recorded, OVS skips processing subsequent buckets
and installs a drop action in the datapath even if parts of the action list
were previously processed successfully.

This patch clears the error flag after any bucket of a group is executed.
This way the error encountered in processing any bucket of the group will
not impact other actions of action-list or other buckets of the group.

Errors which are system limits to protect translation from taking too long
time or too much space are not cleared. Instead drop action is installed
for them.

Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Keshav Gupta <keshav.gupta@ericsson.com>
---
ofproto/ofproto-dpif-xlate.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

--
1.9.1

Comments

Vishal Deep Ajmera Dec. 18, 2017, 4:43 p.m. UTC | #1
Hi,

Do anyone sees any issue with the patch below ?

Warm Regards,
Vishal

-----Original Message-----
From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-bounces@openvswitch.org] On Behalf Of Vishal Deep Ajmera
Sent: Monday, December 11, 2017 5:25 PM
To: dev@openvswitch.org
Subject: [ovs-dev] [PATCH V2] ofproto-dpif-xlate: Incorrect handling of errors in group action processing

As per OpenFlow v1.3 specification, when an action list contains a group action a copy of the packet is passed to the group for processing by the group. This means that if there is an error encountered during group processing, only the copy of packet should be dropped, but subsequent actions in the action list should be executed on the original packet.

Additionally, if the group type is "ALL", each action bucket of the group should process a copy of the packet. If there is an error while processing one bucket other buckets should still be processed.

Example 1:
table=0,in_port=tap0 actions=output:tap1,group:10,output:tap2

Even if any error is encountered while processing the group action, the packet should still be forwarded to ports tap1 and tap2.

Example 2:
group_id=1,type=all,bucket=actions=output:tap1,bucket=actions=encap(eth)

Even if processing the action in the second bucket fails because the packet already has an Ethernet header, the other copy of the packet should still be processed by the first bucket and output to port tap1.

Currently the error handling in OVS does not comply with those rules. When any group bucket execution fails the error is recorded in the so-called "translation context" which is global for the processing of the original packet. Once an error is recorded, OVS skips processing subsequent buckets and installs a drop action in the datapath even if parts of the action list were previously processed successfully.

This patch clears the error flag after any bucket of a group is executed.
This way the error encountered in processing any bucket of the group will not impact other actions of action-list or other buckets of the group.

Errors which are system limits to protect translation from taking too long time or too much space are not cleared. Instead drop action is installed for them.

Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Keshav Gupta <keshav.gupta@ericsson.com>
---
ofproto/ofproto-dpif-xlate.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 9b3a2f2..7893c0e 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4065,6 +4065,23 @@ xlate_group_bucket(struct xlate_ctx *ctx, struct ofputil_bucket *bucket,
      * the group bucket freezes translation, the actions after the group action
      * must continue processing with the original, not the frozen packet! */
    ctx->exit = false;
+
+    /* Context error in a bucket should not impact processing of other buckets
+     * or actions. This is similar to cloning a packet for group buckets.
+     * There is no need to restore the error back to old value due to the fact
+     * that we actually processed group action which can happen only when there
+     * is no previous context error.
+     *
+     * Exception to above is errors which are system limits to protect translation
+     * from running too long or occupy too much space. These errors should not be
+     * masked. XLATE_RECURSION_TOO_DEEP, XLATE_TOO_MANY_RESUBMITS and
+     * XLATE_STACK_TOO_DEEP fall in this category.
+     */
+    if (ctx->error == XLATE_TOO_MANY_MPLS_LABELS ||
+        ctx->error == XLATE_UNSUPPORTED_PACKET_TYPE) {
+        /* reset the error and continue processing other buckets */
+        ctx->error = XLATE_OK;
+    }
}

static void
--
1.9.1
Ben Pfaff Jan. 4, 2018, 8:58 p.m. UTC | #2
On Mon, Dec 11, 2017 at 11:54:33AM +0000, Vishal Deep Ajmera wrote:
> As per OpenFlow v1.3 specification, when an action list contains a group
> action a copy of the packet is passed to the group for processing by the
> group. This means that if there is an error encountered during group
> processing, only the copy of packet should be dropped, but subsequent
> actions in the action list should be executed on the original packet.
> 
> Additionally, if the group type is "ALL", each action bucket of the group
> should process a copy of the packet. If there is an error while processing
> one bucket other buckets should still be processed.
> 
> Example 1:
> table=0,in_port=tap0 actions=output:tap1,group:10,output:tap2
> 
> Even if any error is encountered while processing the group action, the
> packet should still be forwarded to ports tap1 and tap2.
> 
> Example 2:
> group_id=1,type=all,bucket=actions=output:tap1,bucket=actions=encap(eth)
> 
> Even if processing the action in the second bucket fails because the
> packet already has an Ethernet header, the other copy of the packet should
> still be processed by the first bucket and output to port tap1.
> 
> Currently the error handling in OVS does not comply with those rules. When
> any group bucket execution fails the error is recorded in the so-called
> "translation context" which is global for the processing of the original
> packet. Once an error is recorded, OVS skips processing subsequent buckets
> and installs a drop action in the datapath even if parts of the action list
> were previously processed successfully.
> 
> This patch clears the error flag after any bucket of a group is executed.
> This way the error encountered in processing any bucket of the group will
> not impact other actions of action-list or other buckets of the group.
> 
> Errors which are system limits to protect translation from taking too long
> time or too much space are not cleared. Instead drop action is installed
> for them.
> 
> Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
> Signed-off-by: Keshav Gupta <keshav.gupta@ericsson.com>

Seems reasonable to me, but it fails to apply due to white space
damage.  Can you resubmit a clean copy?

Thanks,

Ben.
Vishal Deep Ajmera Jan. 5, 2018, 11:24 a.m. UTC | #3
Thanks Ben for reviewing the patch. I have sent V3 patch after rebasing with master.

Warm Regards,
Vishal Ajmera

-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org] 
Sent: Friday, January 05, 2018 2:29 AM
To: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Cc: dev@openvswitch.org; Keshav Gupta <keshav.gupta@ericsson.com>
Subject: Re: [PATCH V2] ofproto-dpif-xlate: Incorrect handling of errors in group action processing

On Mon, Dec 11, 2017 at 11:54:33AM +0000, Vishal Deep Ajmera wrote:
> As per OpenFlow v1.3 specification, when an action list contains a 
> group action a copy of the packet is passed to the group for 
> processing by the group. This means that if there is an error 
> encountered during group processing, only the copy of packet should be 
> dropped, but subsequent actions in the action list should be executed on the original packet.
> 
> Additionally, if the group type is "ALL", each action bucket of the 
> group should process a copy of the packet. If there is an error while 
> processing one bucket other buckets should still be processed.
> 
> Example 1:
> table=0,in_port=tap0 actions=output:tap1,group:10,output:tap2
> 
> Even if any error is encountered while processing the group action, 
> the packet should still be forwarded to ports tap1 and tap2.
> 
> Example 2:
> group_id=1,type=all,bucket=actions=output:tap1,bucket=actions=encap(et
> h)
> 
> Even if processing the action in the second bucket fails because the 
> packet already has an Ethernet header, the other copy of the packet 
> should still be processed by the first bucket and output to port tap1.
> 
> Currently the error handling in OVS does not comply with those rules. 
> When any group bucket execution fails the error is recorded in the 
> so-called "translation context" which is global for the processing of 
> the original packet. Once an error is recorded, OVS skips processing 
> subsequent buckets and installs a drop action in the datapath even if 
> parts of the action list were previously processed successfully.
> 
> This patch clears the error flag after any bucket of a group is executed.
> This way the error encountered in processing any bucket of the group 
> will not impact other actions of action-list or other buckets of the group.
> 
> Errors which are system limits to protect translation from taking too 
> long time or too much space are not cleared. Instead drop action is 
> installed for them.
> 
> Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
> Signed-off-by: Keshav Gupta <keshav.gupta@ericsson.com>

Seems reasonable to me, but it fails to apply due to white space damage.  Can you resubmit a clean copy?

Thanks,

Ben.
diff mbox series

Patch

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 9b3a2f2..7893c0e 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4065,6 +4065,23 @@  xlate_group_bucket(struct xlate_ctx *ctx, struct ofputil_bucket *bucket,
      * the group bucket freezes translation, the actions after the group action
      * must continue processing with the original, not the frozen packet! */
    ctx->exit = false;
+
+    /* Context error in a bucket should not impact processing of other buckets
+     * or actions. This is similar to cloning a packet for group buckets.
+     * There is no need to restore the error back to old value due to the fact
+     * that we actually processed group action which can happen only when there
+     * is no previous context error.
+     *
+     * Exception to above is errors which are system limits to protect translation
+     * from running too long or occupy too much space. These errors should not be
+     * masked. XLATE_RECURSION_TOO_DEEP, XLATE_TOO_MANY_RESUBMITS and
+     * XLATE_STACK_TOO_DEEP fall in this category.
+     */
+    if (ctx->error == XLATE_TOO_MANY_MPLS_LABELS ||
+        ctx->error == XLATE_UNSUPPORTED_PACKET_TYPE) {
+        /* reset the error and continue processing other buckets */
+        ctx->error = XLATE_OK;
+    }
}

static void