From patchwork Thu May 11 16:04:01 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zoltan Balogh X-Patchwork-Id: 761230 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3wNybK29jlz9s7k for ; Fri, 12 May 2017 02:04:12 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=ericsson.onmicrosoft.com header.i=@ericsson.onmicrosoft.com header.b="OlC+8xNc"; dkim-atps=neutral Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 4445BB64; Thu, 11 May 2017 16:04:10 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 155BAB5F for ; Thu, 11 May 2017 16:04:09 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from sesbmg22.ericsson.net (sesbmg22.ericsson.net [193.180.251.48]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 911C1152 for ; Thu, 11 May 2017 16:04:07 +0000 (UTC) X-AuditID: c1b4fb30-29dff7000000015f-c9-59148b74e1d0 Received: from ESESSHC011.ericsson.se (Unknown_Domain [153.88.183.51]) by sesbmg22.ericsson.net (Symantec Mail Security) with SMTP id 94.5A.00351.47B84195; Thu, 11 May 2017 18:04:05 +0200 (CEST) Received: from EUR01-HE1-obe.outbound.protection.outlook.com (153.88.183.145) by oa.msg.ericsson.com (153.88.183.51) with Microsoft SMTP Server (TLS) id 14.3.339.0; Thu, 11 May 2017 18:04:03 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ericsson.onmicrosoft.com; s=selector1-ericsson-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=wZqEMgY0z+LR1p5HI/rw8QgPk1v9jvJTlWMdsyRZTWg=; b=OlC+8xNcwQTM2wKPtjbKV8zwIiID7/WH+7bGXSaXhDNN+joIhr66XWfSaidzzsH7ozR9whiMuGh7Sr+DEppjnMjpcJbPupJytqGP09EzCeGcu1QRdKai/50mspjDCEV+1fJU4+pmk7ly32gaGtebyoT9MXIZ2CwJYC28KezGSP4= Received: from AM2PR07MB1042.eurprd07.prod.outlook.com (10.162.37.27) by AM2PR07MB1042.eurprd07.prod.outlook.com (10.162.37.27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1084.7; Thu, 11 May 2017 16:04:01 +0000 Received: from AM2PR07MB1042.eurprd07.prod.outlook.com ([fe80::25ea:238d:aff1:8466]) by AM2PR07MB1042.eurprd07.prod.outlook.com ([fe80::25ea:238d:aff1:8466%14]) with mapi id 15.01.1084.015; Thu, 11 May 2017 16:04:01 +0000 From: =?utf-8?B?Wm9sdMOhbiBCYWxvZ2g=?= To: Joe Stringer , Sugesh Chandran Thread-Topic: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time. Thread-Index: AQHSsqyCZdRitG6X502axzxZcdrMDaHjcTiAgAC28qCAAOgjgIABGJoQgAASvICAAC8YAIAC7SoggAES56CAAEXgAIAAO9AwgAA06YCAAOfi8IACFGsAgAFU79A= Date: Thu, 11 May 2017 16:04:01 +0000 Message-ID: References: <1491905641-77996-1-git-send-email-sugesh.chandran@intel.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: ovn.org; dkim=none (message not signed) header.d=none; ovn.org; dmarc=none action=none header.from=ericsson.com; x-originating-ip: [91.82.100.59] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; AM2PR07MB1042; 7:TeZz41Q+gTcOOJmnMymGcENkH8daQZKt7vy7a6i4DKq0qLWU3eqtSNZYVZj+yznF8oSr8l5vZFLAhHDnIZ2JRdfUuJMz4wHI9naxQJ9F5MTN3XY2tQTGrr2zLOioDn9r20J9X4f3v5fUyBTbrjeP/X1+aGIOlXEZ0GynhUf+gU8aTId85YcEKFRn/ycmf6R8ozOPbJM1RgsoCzzd5vEDLS3u0HjxlsKlxLsVzI+Z4YrEOx06WRbxk1j/TT64CBetNJXqWrm3tXNHA2YlYzx7xVeWt5dxctRahqhzFOGC2l/+npkGVJQQssxVX6XHheHfMbkihV8d52nkidGIGLpIqw== x-ld-processed: 92e84ceb-fbfd-47ab-be52-080c6b87953f,ExtAddr x-ms-office365-filtering-correlation-id: f14757b5-f5e5-4aa4-764c-08d498875742 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(22001)(2017030254075)(201703131423075)(201703031133081)(201702281549075); SRVR:AM2PR07MB1042; x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:; x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(6040450)(601004)(2401047)(5005006)(8121501046)(3002001)(93006095)(93001095)(10201501046)(6041248)(20161123562025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123558100)(20161123555025)(20161123560025)(20161123564025)(6072148); SRVR:AM2PR07MB1042; BCL:0; PCL:0; RULEID:; SRVR:AM2PR07MB1042; x-forefront-prvs: 0304E36CA3 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(979002)(6009001)(39850400002)(39410400002)(39400400002)(39860400002)(39450400003)(39840400002)(57704003)(189002)(51414003)(377424004)(199003)(99286003)(55016002)(305945005)(7696004)(86362001)(7736002)(6436002)(189998001)(102836003)(38730400002)(6506006)(39060400002)(74316002)(4326008)(6246003)(25786009)(561944003)(33656002)(85202003)(345774005)(66066001)(9686003)(2950100002)(54906002)(5660300001)(478600001)(229853002)(53936002)(2900100001)(54356999)(93886004)(85182001)(3846002)(76176999)(50986999)(81166006)(8676002)(6116002)(3280700002)(2906002)(3660700001)(5250100002)(8936002)(21314002)(969003)(989001)(999001)(1009001)(1019001); DIR:OUT; SFP:1101; SCL:1; SRVR:AM2PR07MB1042; H:AM2PR07MB1042.eurprd07.prod.outlook.com; FPR:; SPF:None; MLV:ovrnspm; PTR:InfoNoRecords; MX:1; A:1; LANG:en; spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM MIME-Version: 1.0 X-MS-Exchange-CrossTenant-originalarrivaltime: 11 May 2017 16:04:01.6161 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 92e84ceb-fbfd-47ab-be52-080c6b87953f X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM2PR07MB1042 X-OriginatorOrg: ericsson.com X-Brightmail-Tracker: H4sIAAAAAAAAA02SbUhTURjHOffebdfp4LSWPpm9DYpcpDYlbmAv4hdBiiKwjMyWXXTkW9uU 7I2lMt2mJWWUq6ZllMlwOUotR6BOKjU0iZmWpM2UiVEZIeXSdncN+vZ7zu85/+ecw6FJqV0Q TqtzdawmV5UtF4qpmkOtsVsKzLLUmBmPgrF+dSBm+qoeMd29TpKZNnQg5md3C8FcLbGKmM9v fpPM0GDIbjrpqWVUlFTv9BJJk8OLKGnKXUHtow6L40+w2epCVhO985g461q3k8r/pken7foq Uo9enjehIBpwHCzetFEcS7ELgf69gueXCNp7lCYkpilcScILp1PIi2oCHE3JnJBiD4KL5T9E nBDiRLjk6Q00yXAyNE+MBJjEiwh+lKZxG5ZjA4LBOQPBFTJchsD9rmWpMCGoHhhF3BYKb4Bi V38gVoKPwMfWDxQ/z0aDoao1IILwfqj0GQUcIxwKcz02gp8XBiMTtQR/Owz3nP0kzyvA61kQ cEEIGxE036+leLEOHHVfRTyvhsFaM+K5goQSYz7P8XDDfH0paA+0txgQFwT4MoIFy4A/lfYX J6Gh4wDfo4Byl4Xge2YJePBzVsCLCCgd61sSXUKY9L0QVKHNlv9ObvFnkTgS7M+i+eX1UG0e F1kCr7EMXtVMUHWIakQrtKz2eE6mUhnFatQZWm1eblQuq3Mg/zfqeDwf04a8UwmdCNNIHiIZ i1ieKhWoCrVFOZ0IaFIuk+SVyVKlkhOqojOsJi9dU5DNajvRKpqSh0kSng8ckuJMlY49ybL5 rOafJeigcD1a2Rna8PBol9GUqPi+1/HLmt5oZO4WZwTbmryHXe5H83u/Bx80pUS0xUXojp5b z/pKb79W27Pcw3F2pX1fZJ+DDXt6byCtST00bvuzRuacfTKWEhvintm0/cq1U/WjdQ132Pmz og2fdm1UZ2xz3SpXXvCubUmxFn2JeitKqN/ha5dT2izVVgWp0ar+Ank6JmNCAwAA X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: ovs dev Subject: Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org Hi Joe, Thank you for your comments! Please, find my answers below. > I had trouble trying to review this, in part because I felt like > changes to fix multiple issues are being placed into one patch. If > there are separate issues being addressed, each issue should be fixed > in a separate patch, each with a clear description of the intent of > the change (with links to patches that introduced the breakage if > possible!). If there is refactoring to do, consider separating the > non-functional changes into a separate patch---when looking at the > original patch that started this discussion, I found it difficult to > review post-merge because the refactoring was included and I couldn't > tell if there were actually functional changes. > > Given that this is taking a bit longer than I thought and we need to > take a holistic approach to how all of these interactions should work, > I plan to go ahead and apply the revert patch. I look forward to > seeing a fresh series proposal that will bring back the benefits of > the original patch. > > More feedback below. > > > Here comes the patch: > > > > > > Since tunneling and forwarding via patch port uses clone action, the tunneled > > Where does forwarding via patch port use clone action? Sorry, that was not correct. Patch port does not use clone action. Both patch port and native tunneling use the apply_netsed_clone_action(). That's what I wanted to express. > apply_nested_clone_action() also takes copies of the flow and > base_flow, do we need to do it twice in this path? I suspect that this > work could benefit from some further refactoring, taking note to have > separate patches to refactor code and different patches for functional > changes. Thank you for indicating this. There is no need to create a backup copy of flow. We need to store some fields of base_flow. I could use a function like this to create backup: static void copy_flow_L2L3_data(struct flow *dst, const struct flow *src) { dst->packet_type = src->packet_type; dst->dl_dst = src->dl_dst; dst->dl_src = src->dl_src; dst->dl_type = src->dl_type; dst->nw_dst = src->nw_dst; dst->nw_src = src->nw_src; dst->ipv6_dst = src->ipv6_dst; dst->ipv6_src = src->ipv6_src; dst->nw_tos = src->nw_tos; dst->nw_ttl = src->nw_ttl; dst->nw_proto = src->nw_proto; dst->tp_dst = src->tp_dst; dst->tp_src = src->tp_src; } ... /* Backup. */ copy_flow_L2L3_data(&old_base_flow, &ctx->base_flow); ... /* Restore. */ copy_flow_L2L3_data(&ctx->base_flow, &old_base_flow); > > > > err = tnl_route_lookup_flow(flow, &d_ip6, &s_ip6, &out_dev); > > if (err) { > > @@ -3216,16 +3248,57 @@ build_tunnel_send(struct xlate_ctx *ctx, const struct xport *xport, > > tnl_push_data.tnl_port = odp_to_u32(tunnel_odp_port); > > tnl_push_data.out_port = odp_to_u32(out_dev->odp_port); > > > > + /* After tunnel header has been added, MAC and IP data of flow and > > + * base_flow need to be set properly, since there is not recirculation > > + * any more when sending packet to tunnel. */ > > + if (!eth_addr_is_zero(ctx->xin->flow.dl_dst)) { > > + ctx->xin->flow.dl_dst = dmac; > > + ctx->base_flow.dl_dst = ctx->xin->flow.dl_dst; > > + } > > What's the expectation if ctx->xin->flow.dl_dst is zero? Is that for > L3 tunneled packets? No, it's not for L3 tunneled packets. Without this push_pop_tunnel and push_pop_tunnel_ipv6 unit tests do fail. This needs to be fixed. 783: tunnel_push_pop - action FAILED (tunnel-push-pop.at:109) 785: tunnel_push_pop_ipv6 - action FAILED (tunnel-push-pop-ipv6.at:92) ./tunnel-push-pop.at:108: ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1. 3.112,proto=47,tos=0,ttl=64,frag=no)' stdout: Flow: ip,in_port=LOCAL,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,nw_src=1.1.3.88,nw_dst=1.1. 3.112,nw_proto=47,nw_tos=0,nw_ecn=0,nw_ttl=64 bridge("int-br") ---------------- 0. priority 32768 output:2 -> output to native tunnel -> tunneling to 1.1.2.92 via br0 -> tunneling from aa:55:aa:55:00:00 1.1.2.88 to f8:bc:12:44:34:b6 1.1.2.92 bridge("br0") ------------- 0. priority 32768 NORMAL -> learned port is input port, dropping Final flow: unchanged Megaflow: recirc_id=0,ip,in_port=LOCAL,vlan_tci=0x0000/0x1fff,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,nw_e cn=0,nw_frag=no Datapath actions: clone(drop),tnl_push(tnl_port(4789),header(size=50,type=4,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:5 5:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x 0),vxlan(flags=0x8000000,vni=0x7b)),out_port(100)) ./tunnel-push-pop.at:109: tail -1 stdout > > + > > + ctx->xin->flow.dl_src = smac; > > + ctx->base_flow.dl_src = ctx->xin->flow.dl_src; > > + > > + propagate_tunnel_data_to_flow(&ctx->xin->flow, &ctx->base_flow); > > + > > + if (!tnl_params.is_ipv6) { > > + ctx->xin->flow.dl_type = htons(ETH_TYPE_IP); > > + } else { > > + ctx->xin->flow.dl_type = htons(ETH_TYPE_IPV6); > > + } > > When writing if (!condition) { ... } else { ... }, please consider > whether you can swap the conditions and remove the negation. It's less > cognitive load if we don't have to negate a negative condition to > consider which condition the "else" applies to. (This otherwise looks > correct, so just a style request.) Ok. I'm going to change this. > > + > > + if (tnl_push_data.tnl_type == OVS_VPORT_TYPE_GRE) { > > + ctx->xin->flow.nw_proto = IPPROTO_GRE; > > + } else { > > + ctx->xin->flow.nw_proto = IPPROTO_UDP; > > + } > > + ctx->base_flow.nw_proto = ctx->xin->flow.nw_proto; > > Should we use a switch () statement here to ensure that newer tunnel > ports are properly handled in this case if/when they are added? What > about things like LISP or STT? I can use a switch case: switch (tnl_push_data.tnl_type) { case OVS_VPORT_TYPE_GRE: ctx->xin->flow.nw_proto = IPPROTO_GRE; break; case OVS_VPORT_TYPE_VXLAN: case OVS_VPORT_TYPE_GENEVE: ctx->xin->flow.nw_proto = IPPROTO_UDP; break; case OVS_VPORT_TYPE_LISP: /* XXX: UDP? */ case OVS_VPORT_TYPE_STT: /* XXX: TCP? */ default: break; } But, how to handle LISP and STT? I have not seen any build_header() function for LISP and STT in netdev-vport.c: static const struct vport_class vport_classes[] = { TUNNEL_CLASS("geneve", "genev_sys", netdev_geneve_build_header, netdev_tnl_push_udp_header, netdev_geneve_pop_header), TUNNEL_CLASS("gre", "gre_sys", netdev_gre_build_header, netdev_gre_push_header, netdev_gre_pop_header), TUNNEL_CLASS("vxlan", "vxlan_sys", netdev_vxlan_build_header, netdev_tnl_push_udp_header, netdev_vxlan_pop_header), TUNNEL_CLASS("lisp", "lisp_sys", NULL, NULL, NULL), TUNNEL_CLASS("stt", "stt_sys", NULL, NULL, NULL), }; Are LISP and STT valid tunnel types for native tunneling? If I'm not wrong, build_tunnel_send() is invoked when native tunneling is used. > > + > > size_t push_action_size = 0; > > size_t clone_ofs = nl_msg_start_nested(ctx->odp_actions, > > OVS_ACTION_ATTR_CLONE); > > odp_put_tnl_push_action(ctx->odp_actions, &tnl_push_data); > > push_action_size = ctx->odp_actions->size; > > + > > + if (ctx->rule) { > > + old_ths = ctx->rule->tunnel_header_size; > > + ctx->rule->tunnel_header_size = tnl_push_data.header_len; > > + } > > + > > apply_nested_clone_actions(ctx, xport, out_dev); > > if (ctx->odp_actions->size > push_action_size) { > > /* Update the CLONE action only when combined */ > > nl_msg_end_nested(ctx->odp_actions, clone_ofs); > > } > > If there are no actions on the underlay bridge (ie drop), the else > condition here should probably reset the nl_msg nesting so that we > don't generate any actions to the datapath at all. You may also > consider adding a xlate_report() call to explain why the drop is > occuring - this should help debugging later on using ofproto/trace. I'm not sure about this. Sugesh, how should this be resolved? > > if (credit_counts) { > > + uint64_t stats_n_bytes = 0; > > + > > + if (rule->truncated_packet_size) { > > + stats_n_bytes = MIN(rule->truncated_packet_size, stats->n_bytes); > > + } else { > > + stats_n_bytes = stats->n_bytes; > > + } > > Is this fixing a separate issue? I'm confused about whether truncated > packet stats are being correctly attributed today on master. If so, I > think this should split out into a separate patch. You might also > consider whether it makes more sense to modify 'stats' earlier in the > path so that each rule doesn't need to individually apply the stats > adjustment. I could imagine a store/restore of the stats plus > modifying for truncation during the xlate_output_trunc_action() > processing rather than pushing this complexity all the way into the > rule stats attribution. Incorrect bytes statistics on the underlay bridge is an unwanted side-effect of the original "Avoid recirculate" commit. By exluding recirculation, there is no upcall for the tunnelled packets, and the packet size is not set by upcall_xlate() again: static void upcall_xlate(struct udpif *udpif, struct upcall *upcall, struct ofpbuf *odp_actions, struct flow_wildcards *wc) { struct dpif_flow_stats stats; struct xlate_in xin; stats.n_packets = 1; stats.n_bytes = dp_packet_size(upcall->packet); stats.used = time_msec(); stats.tcp_flags = ntohs(upcall->flow->tcp_flags); xlate_in_init(&xin, upcall->ofproto, ofproto_dpif_get_tables_version(upcall->ofproto), upcall->flow, upcall->in_port, NULL, stats.tcp_flags, upcall->packet, wc, odp_actions); if (upcall->type == DPIF_UC_MISS) { xin.resubmit_stats = &stats; Here upcall_xlate() sets xin.resubmint_stats which will be used for calculating statistic. These are const values, that's why I have not updated them earlier. /* If nonnull, flow translation credits the specified statistics to each * rule reached through a resubmit or OFPP_TABLE action. * * This is normally null so the client has to set it manually after * calling xlate_in_init(). */ const struct dpif_flow_stats *resubmit_stats; If it does not harm, then the const modifier could be removed and stats updated at earlier stage. Best regards, Zoltan --- - 2017-05-11 14:40:42.205776583 +0200 +++ /home/ezolbal/work/general_L3_tunneling/ovs/tests/testsuite.dir/at-groups/783/stdout 2017-05-11 14:40:42.200387095 +0200 @@ -1,2 +1,2 @@ -Datapath actions: clone(tnl_push(tnl_port(4789),header(size=50,type=4,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),vxlan(flags=0x8000000,vni=0x7b)),out_port(100)),1) +Datapath actions: clone(drop),tnl_push(tnl_port(4789),header(size=50,type=4,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),vxlan(flags=0x8000000,vni=0x7b)),out_port(100))