From patchwork Fri Jun 24 18:49:52 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Lin
X-Patchwork-Id: 640377
From: Yin Lin
To: "dev@openvswitch.org"
Date: Fri, 24 Jun 2016 18:49:52 +0000
Subject: [ovs-dev] [PATCH v8][PATCH 1/2] datapath-windows: Add Geneve support

Signed-off-by: Yin Lin
---
 datapath-windows/automake.mk | 2 +
 datapath-windows/ovsext/Actions.c | 72 ++-----
 datapath-windows/ovsext/Debug.h | 1 +
 datapath-windows/ovsext/DpInternal.h | 29 ++-
 datapath-windows/ovsext/Flow.c | 179 +++++++++++++++--
 datapath-windows/ovsext/Flow.h | 7 +
 datapath-windows/ovsext/Geneve.c | 356 +++++++++++++++++++++++++++++++++
 datapath-windows/ovsext/Geneve.h | 122 +++++++++++
 datapath-windows/ovsext/Util.h | 1 +
 datapath-windows/ovsext/Vport.c | 20 +-
 datapath-windows/ovsext/ovsext.vcxproj | 2 +
 11 files changed, 716 insertions(+), 75 deletions(-)
 create mode 100644 datapath-windows/ovsext/Geneve.c
 create mode 100644 datapath-windows/ovsext/Geneve.h
--
2.8.0.windows.1

diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk index c9af806..53fb5c5 100644 --- a/datapath-windows/automake.mk +++ 
b/datapath-windows/automake.mk @@ -68,6 +68,8 @@ EXTRA_DIST += \ datapath-windows/ovsext/Vport.h \ datapath-windows/ovsext/Vxlan.c \ datapath-windows/ovsext/Vxlan.h \ +datapath-windows/ovsext/Geneve.c \ +datapath-windows/ovsext/Geneve.h \ datapath-windows/ovsext/ovsext.inf \ datapath-windows/ovsext/ovsext.rc \ datapath-windows/ovsext/ovsext.vcxproj \ diff --git a/datapath-windows/ovsext/Actions.c b/datapath-windows/ovsext/Actions.c index 7ac6bb7..722a2a8 100644 --- a/datapath-windows/ovsext/Actions.c +++ b/datapath-windows/ovsext/Actions.c @@ -33,6 +33,7 @@ #include "User.h" #include "Vport.h" #include "Vxlan.h" +#include "Geneve.h" #ifdef OVS_DBG_MOD #undef OVS_DBG_MOD @@ -48,6 +49,8 @@ typedef struct _OVS_ACTION_STATS { UINT64 txVxlan; UINT64 rxStt; UINT64 txStt; + UINT64 rxGeneve; + UINT64 txGeneve; UINT64 flowMiss; UINT64 flowUserspace; UINT64 txTcp; @@ -237,6 +240,9 @@ OvsDetectTunnelRxPkt(OvsForwardingContext *ovsFwdCtx, case OVS_VPORT_TYPE_VXLAN: ovsActionStats.rxVxlan++; break; + case OVS_VPORT_TYPE_GENEVE: + ovsActionStats.rxGeneve++; + break; case OVS_VPORT_TYPE_GRE: ovsActionStats.rxGre++; break; @@ -333,6 +339,9 @@ OvsDetectTunnelPkt(OvsForwardingContext *ovsFwdCtx, case OVS_VPORT_TYPE_STT: ovsActionStats.txStt++; break; + case OVS_VPORT_TYPE_GENEVE: + ovsActionStats.txGeneve++; + break; } ovsFwdCtx->tunnelTxNic = dstVport; } @@ -689,6 +698,11 @@ OvsTunnelPortTx(OvsForwardingContext *ovsFwdCtx) &ovsFwdCtx->tunKey, ovsFwdCtx->switchContext, &ovsFwdCtx->layers, &newNbl); break; + case OVS_VPORT_TYPE_GENEVE: + status = OvsEncapGeneve(ovsFwdCtx->tunnelTxNic, ovsFwdCtx->curNbl, + &ovsFwdCtx->tunKey, ovsFwdCtx->switchContext, + &ovsFwdCtx->layers, &newNbl); + break; default: ASSERT(! 
"Tx: Unhandled tunnel type"); } @@ -767,6 +781,10 @@ OvsTunnelPortRx(OvsForwardingContext *ovsFwdCtx) dropReason = L"OVS-STT segment is cached"; } break; + case OVS_VPORT_TYPE_GENEVE: + status = OvsDecapGeneve(ovsFwdCtx->switchContext, ovsFwdCtx->curNbl, + &ovsFwdCtx->tunKey, &newNbl); + break; default: OVS_LOG_ERROR("Rx: Unhandled tunnel type: %d\n", tunnelRxVport->ovsType); @@ -1233,57 +1251,6 @@ OvsActionMplsPush(OvsForwardingContext *ovsFwdCtx, } /* - * -------------------------------------------------------------------------- - * OvsTunnelAttrToIPv4TunnelKey -- - * Convert tunnel attribute to OvsIPv4TunnelKey. - * -------------------------------------------------------------------------- - */ -static __inline NDIS_STATUS -OvsTunnelAttrToIPv4TunnelKey(PNL_ATTR attr, - OvsIPv4TunnelKey *tunKey) -{ - PNL_ATTR a; - INT rem; - - tunKey->attr[0] = 0; - tunKey->attr[1] = 0; - tunKey->attr[2] = 0; - ASSERT(NlAttrType(attr) == OVS_KEY_ATTR_TUNNEL); - - NL_ATTR_FOR_EACH_UNSAFE (a, rem, NlAttrData(attr), - NlAttrGetSize(attr)) { - switch (NlAttrType(a)) { - case OVS_TUNNEL_KEY_ATTR_ID: - tunKey->tunnelId = NlAttrGetBe64(a); - tunKey->flags |= OVS_TNL_F_KEY; - break; - case OVS_TUNNEL_KEY_ATTR_IPV4_SRC: - tunKey->src = NlAttrGetBe32(a); - break; - case OVS_TUNNEL_KEY_ATTR_IPV4_DST: - tunKey->dst = NlAttrGetBe32(a); - break; - case OVS_TUNNEL_KEY_ATTR_TOS: - tunKey->tos = NlAttrGetU8(a); - break; - case OVS_TUNNEL_KEY_ATTR_TTL: - tunKey->ttl = NlAttrGetU8(a); - break; - case OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT: - tunKey->flags |= OVS_TNL_F_DONT_FRAGMENT; - break; - case OVS_TUNNEL_KEY_ATTR_CSUM: - tunKey->flags |= OVS_TNL_F_CSUM; - break; - default: - ASSERT(0); - } - } - - return NDIS_STATUS_SUCCESS; -} - -/* *---------------------------------------------------------------------------- * OvsUpdateEthHeader -- * Updates the ethernet header in ovsFwdCtx.curNbl inline based on the @@ -1511,7 +1478,8 @@ OvsExecuteSetAction(OvsForwardingContext *ovsFwdCtx, case 
OVS_KEY_ATTR_TUNNEL: { OvsIPv4TunnelKey tunKey; - status = OvsTunnelAttrToIPv4TunnelKey((PNL_ATTR)a, &tunKey); + NTSTATUS convertStatus = OvsTunnelAttrToIPv4TunnelKey((PNL_ATTR)a, &tunKey); + status = SUCCEEDED(convertStatus) ? NDIS_STATUS_SUCCESS : NDIS_STATUS_FAILURE; ASSERT(status == NDIS_STATUS_SUCCESS); tunKey.flow_hash = (uint16)(hash ? *hash : OvsHashFlow(key)); tunKey.dst_port = key->ipKey.l4.tpDst; diff --git a/datapath-windows/ovsext/Debug.h b/datapath-windows/ovsext/Debug.h index e5ed963..935f858 100644 --- a/datapath-windows/ovsext/Debug.h +++ b/datapath-windows/ovsext/Debug.h @@ -41,6 +41,7 @@ #define OVS_DBG_TUNFLT BIT32(21) #define OVS_DBG_STT BIT32(22) #define OVS_DBG_CONTRK BIT32(23) +#define OVS_DBG_GENEVE BIT32(24) #define OVS_DBG_RESERVED BIT32(31) //Please add above OVS_DBG_RESERVED. diff --git a/datapath-windows/ovsext/DpInternal.h b/datapath-windows/ovsext/DpInternal.h index 07bc180..42b5ec9 100644 --- a/datapath-windows/ovsext/DpInternal.h +++ b/datapath-windows/ovsext/DpInternal.h @@ -128,10 +128,18 @@ typedef struct L2Key { } L2Key; /* Size of 24 byte. */ /* Number of packet attributes required to store OVS tunnel key. */ -#define NUM_PKT_ATTR_REQUIRED 3 +#define NUM_PKT_ATTR_REQUIRED 35 +#define TUN_OPT_MAX_LEN 255 typedef union OvsIPv4TunnelKey { + /* Options should always be the first member of tunnel key. + * They are stored at the end of the array if they are less than the + * maximum size. This allows us to get the benefits of variable length + * matching for small options. + */ struct { + UINT8 tunOpts[TUN_OPT_MAX_LEN]; /* Tunnel options. */ + UINT8 tunOptLen; /* Tunnel option length in byte. */ ovs_be32 dst; ovs_be32 src; ovs_be64 tunnelId; @@ -147,7 +155,22 @@ typedef union OvsIPv4TunnelKey { }; }; uint64_t attr[NUM_PKT_ATTR_REQUIRED]; -} OvsIPv4TunnelKey; /* Size of 24 byte. */ +} OvsIPv4TunnelKey; /* Size of 280 byte. 
*/ + +__inline uint8_t TunnelKeyGetOptionsOffset(const OvsIPv4TunnelKey *key) +{ + return TUN_OPT_MAX_LEN - key->tunOptLen; +} + +__inline uint8_t* TunnelKeyGetOptions(OvsIPv4TunnelKey *key) +{ + return key->tunOpts + TunnelKeyGetOptionsOffset(key); +} + +__inline uint16_t TunnelKeyGetRealSize(OvsIPv4TunnelKey *key) +{ + return sizeof(OvsIPv4TunnelKey) - TunnelKeyGetOptionsOffset(key); +} typedef struct MplsKey { ovs_be32 lse; /* MPLS topmost label stack entry. */ @@ -155,7 +178,7 @@ typedef struct MplsKey { } MplsKey; /* Size of 8 bytes. */ typedef __declspec(align(8)) struct OvsFlowKey { - OvsIPv4TunnelKey tunKey; /* 24 bytes */ + OvsIPv4TunnelKey tunKey; /* 280 bytes */ L2Key l2; /* 24 bytes */ union { /* These headers are mutually exclusive. */ diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c index 595518f..bc0bb37 100644 --- a/datapath-windows/ovsext/Flow.c +++ b/datapath-windows/ovsext/Flow.c @@ -21,6 +21,7 @@ #include "Flow.h" #include "PacketParser.h" #include "Datapath.h" +#include "Geneve.h" #ifdef OVS_DBG_MOD #undef OVS_DBG_MOD @@ -85,7 +86,7 @@ static NTSTATUS OvsDoDumpFlows(OvsFlowDumpInput *dumpInput, UINT32 *replyLen); static NTSTATUS OvsProbeSupportedFeature(POVS_MESSAGE msgIn, PNL_ATTR keyAttr); - +static UINT16 OvsGetFlowL2Offset(const OvsIPv4TunnelKey *tunKey); #define OVS_FLOW_TABLE_SIZE 2048 #define OVS_FLOW_TABLE_MASK (OVS_FLOW_TABLE_SIZE -1) @@ -1029,6 +1030,14 @@ MapFlowTunKeyToNlKey(PNL_BUFFER nlBuf, goto done; } + if (tunKey->tunOptLen > 0 && + !NlMsgPutTailUnspec(nlBuf, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS, + (PCHAR)TunnelKeyGetOptions(tunKey), + tunKey->tunOptLen)) { + rc = STATUS_UNSUCCESSFUL; + goto done; + } + done: NlMsgEndNested(nlBuf, offset); error_nested_start: @@ -1638,6 +1647,120 @@ _MapKeyAttrToFlowPut(PNL_ATTR *keyAttrs, /* *---------------------------------------------------------------------------- + * OvsTunnelAttrToGeneveOptions -- + * Converts OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS attribute to 
tunKey->tunOpts. + *---------------------------------------------------------------------------- + */ +static __inline NTSTATUS +OvsTunnelAttrToGeneveOptions(PNL_ATTR attr, + OvsIPv4TunnelKey *tunKey) +{ + UINT32 optLen = NlAttrGetSize(attr); + GeneveOptionHdr *option; + BOOLEAN isCritical = FALSE; + if (optLen > TUN_OPT_MAX_LEN) { + OVS_LOG_ERROR("Geneve option length err (len %u, max %d).", + optLen, TUN_OPT_MAX_LEN); + return STATUS_INFO_LENGTH_MISMATCH; + } else if (optLen % 4 != 0) { + OVS_LOG_ERROR("Geneve opt len %d is not a multiple of 4.", optLen); + return STATUS_INFO_LENGTH_MISMATCH; + } + tunKey->tunOptLen = (UINT8)optLen; + option = (GeneveOptionHdr *)NlAttrData(attr); + while (optLen > 0) { + UINT32 len; + if (optLen < sizeof(*option)) { + return STATUS_INFO_LENGTH_MISMATCH; + } + len = sizeof(*option) + option->length * 4; + if (len > optLen) { + return STATUS_INFO_LENGTH_MISMATCH; + } + if (option->type & GENEVE_CRIT_OPT_TYPE) { + isCritical = TRUE; + } + option = (GeneveOptionHdr *)((UINT8 *)option + len); + optLen -= len; + } + memcpy(TunnelKeyGetOptions(tunKey), NlAttrData(attr), tunKey->tunOptLen); + if (isCritical) { + tunKey->flags |= OVS_TNL_F_CRT_OPT; + } + return STATUS_SUCCESS; +} + + +/* + *---------------------------------------------------------------------------- + * OvsTunnelAttrToIPv4TunnelKey -- + * Converts OVS_KEY_ATTR_TUNNEL attribute to tunKey. 
+ *---------------------------------------------------------------------------- + */ +NTSTATUS +OvsTunnelAttrToIPv4TunnelKey(PNL_ATTR attr, + OvsIPv4TunnelKey *tunKey) +{ + PNL_ATTR a; + INT rem; + INT hasOpt = 0; + NTSTATUS status; + + memset(tunKey, 0, OVS_WIN_TUNNEL_KEY_SIZE); + ASSERT(NlAttrType(attr) == OVS_KEY_ATTR_TUNNEL); + + NL_ATTR_FOR_EACH_UNSAFE(a, rem, NlAttrData(attr), + NlAttrGetSize(attr)) { + switch (NlAttrType(a)) { + case OVS_TUNNEL_KEY_ATTR_ID: + tunKey->tunnelId = NlAttrGetBe64(a); + tunKey->flags |= OVS_TNL_F_KEY; + break; + case OVS_TUNNEL_KEY_ATTR_IPV4_SRC: + tunKey->src = NlAttrGetBe32(a); + break; + case OVS_TUNNEL_KEY_ATTR_IPV4_DST: + tunKey->dst = NlAttrGetBe32(a); + break; + case OVS_TUNNEL_KEY_ATTR_TOS: + tunKey->tos = NlAttrGetU8(a); + break; + case OVS_TUNNEL_KEY_ATTR_TTL: + tunKey->ttl = NlAttrGetU8(a); + break; + case OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT: + tunKey->flags |= OVS_TNL_F_DONT_FRAGMENT; + break; + case OVS_TUNNEL_KEY_ATTR_CSUM: + tunKey->flags |= OVS_TNL_F_CSUM; + break; + case OVS_TUNNEL_KEY_ATTR_OAM: + tunKey->flags |= OVS_TNL_F_OAM; + break; + case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS: + if (hasOpt) { + /* Duplicate options attribute is not allowed. */ + return NDIS_STATUS_FAILURE; + } + status = OvsTunnelAttrToGeneveOptions(a, tunKey); + if (!SUCCEEDED(status)) { + return status; + } + tunKey->flags |= OVS_TNL_F_GENEVE_OPT; + hasOpt = 1; + break; + default: + // XXX: Support OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS + return STATUS_INVALID_PARAMETER; + } + } + + return STATUS_SUCCESS; +} + + +/* + *---------------------------------------------------------------------------- * MapTunAttrToFlowPut -- * Converts FLOW_TUNNEL_KEY attribute to OvsFlowKey->tunKey. 
*---------------------------------------------------------------------------- @@ -1647,8 +1770,10 @@ MapTunAttrToFlowPut(PNL_ATTR *keyAttrs, PNL_ATTR *tunAttrs, OvsFlowKey *destKey) { + memset(&destKey->tunKey, 0, OVS_WIN_TUNNEL_KEY_SIZE); if (keyAttrs[OVS_KEY_ATTR_TUNNEL]) { - + /* XXX: This block performs the same function as + OvsTunnelAttrToIPv4TunnelKey. Consider refactoring the code. */ if (tunAttrs[OVS_TUNNEL_KEY_ATTR_ID]) { destKey->tunKey.tunnelId = NlAttrGetU64(tunAttrs[OVS_TUNNEL_KEY_ATTR_ID]); @@ -1683,13 +1808,21 @@ MapTunAttrToFlowPut(PNL_ATTR *keyAttrs, NlAttrGetU8(tunAttrs[OVS_TUNNEL_KEY_ATTR_TTL]); } - destKey->tunKey.pad = 0; - destKey->l2.offset = 0; + if (tunAttrs[OVS_TUNNEL_KEY_ATTR_OAM]) { + destKey->tunKey.flags |= OVS_TNL_F_OAM; + } + + if (tunAttrs[OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS]) { + NTSTATUS status = OvsTunnelAttrToGeneveOptions( + tunAttrs[OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS], + &destKey->tunKey); + if (SUCCEEDED(status)) { + destKey->tunKey.flags |= OVS_TNL_F_GENEVE_OPT; + } + } + destKey->l2.offset = OvsGetFlowL2Offset(&destKey->tunKey); } else { - destKey->tunKey.attr[0] = 0; - destKey->tunKey.attr[1] = 0; - destKey->tunKey.attr[2] = 0; - destKey->l2.offset = sizeof destKey->tunKey; + destKey->l2.offset = OvsGetFlowL2Offset(NULL); } } @@ -1853,6 +1986,19 @@ OvsGetFlowMetadata(OvsFlowKey *key, return status; } +UINT16 +OvsGetFlowL2Offset(const OvsIPv4TunnelKey *tunKey) +{ + if (tunKey != NULL) { + // Round down to an 8-byte (int64) boundary + if (tunKey->tunOptLen == 0) { + return (TUN_OPT_MAX_LEN + 1) / 8 * 8; + } + return TunnelKeyGetOptionsOffset(tunKey) / 8 * 8; + } else { + return OVS_WIN_TUNNEL_KEY_SIZE; + } +} /* *---------------------------------------------------------------------------- @@ -2057,16 +2203,17 @@ OvsExtractFlow(const NET_BUFFER_LIST *packet, if (tunKey) { ASSERT(tunKey->dst != 0); - RtlMoveMemory(&flow->tunKey, tunKey, sizeof flow->tunKey); - flow->l2.offset = 0; + UINT8 optOffset = TunnelKeyGetOptionsOffset(tunKey); + 
RtlMoveMemory(((UINT8 *)&flow->tunKey) + optOffset, + ((UINT8 *)tunKey) + optOffset, + TunnelKeyGetRealSize(tunKey)); } else { flow->tunKey.dst = 0; - flow->l2.offset = OVS_WIN_TUNNEL_KEY_SIZE; } - + flow->l2.offset = OvsGetFlowL2Offset(tunKey); flow->l2.inPort = inPort; - if ( OvsPacketLenNBL(packet) < ETH_HEADER_LEN_DIX) { + if (OvsPacketLenNBL(packet) < ETH_HEADER_LEN_DIX) { flow->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + 8 - flow->l2.offset; return NDIS_STATUS_SUCCESS; } @@ -2390,8 +2537,8 @@ OvsLookupFlow(OVS_DATAPATH *datapath, UINT16 size = key->l2.keyLen; UINT8 *start; - ASSERT(key->tunKey.dst || offset == sizeof (OvsIPv4TunnelKey)); - ASSERT(!key->tunKey.dst || offset == 0); + ASSERT(key->tunKey.dst || offset == sizeof(OvsIPv4TunnelKey)); + ASSERT(!key->tunKey.dst || offset == OvsGetFlowL2Offset(&key->tunKey)); start = (UINT8 *)key + offset; @@ -2447,7 +2594,7 @@ OvsHashFlow(const OvsFlowKey *key) UINT16 size = key->l2.keyLen; UINT8 *start; - ASSERT(key->tunKey.dst || offset == sizeof (OvsIPv4TunnelKey)); - ASSERT(!key->tunKey.dst || offset == 0); + ASSERT(key->tunKey.dst || offset == sizeof(OvsIPv4TunnelKey)); + ASSERT(!key->tunKey.dst || offset == OvsGetFlowL2Offset(&key->tunKey)); start = (UINT8 *)key + offset; return OvsJhashBytes(start, size, 0); diff --git a/datapath-windows/ovsext/Flow.h b/datapath-windows/ovsext/Flow.h index d39db45..0744d30 100644 --- a/datapath-windows/ovsext/Flow.h +++ b/datapath-windows/ovsext/Flow.h @@ -87,10 +87,17 @@ VOID MapTunAttrToFlowPut(PNL_ATTR *keyAttrs, PNL_ATTR *tunAttrs, OvsFlowKey *destKey); UINT32 OvsFlowKeyAttrSize(void); UINT32 OvsTunKeyAttrSize(void); +NTSTATUS OvsTunnelAttrToIPv4TunnelKey(PNL_ATTR attr, OvsIPv4TunnelKey *tunKey); /* Flags for tunneling */ #define OVS_TNL_F_DONT_FRAGMENT (1 << 0) #define OVS_TNL_F_CSUM (1 << 1) #define OVS_TNL_F_KEY (1 << 2) +#define OVS_TNL_F_OAM (1 << 3) +#define OVS_TNL_F_CRT_OPT (1 << 4) +#define OVS_TNL_F_GENEVE_OPT (1 << 5) +#define OVS_TNL_F_VXLAN_OPT (1 << 6) + +#define OVS_TNL_HAS_OPTIONS (OVS_TNL_F_GENEVE_OPT | OVS_TNL_F_VXLAN_OPT) #endif /* 
__FLOW_H_ */ diff --git a/datapath-windows/ovsext/Geneve.c b/datapath-windows/ovsext/Geneve.c new file mode 100644 index 0000000..53a9bce --- /dev/null +++ b/datapath-windows/ovsext/Geneve.c @@ -0,0 +1,356 @@ +/* + * Copyright (c) 2016 VMware, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include "precomp.h" + +#include "Atomic.h" +#include "Debug.h" +#include "Flow.h" +#include "IpHelper.h" +#include "Jhash.h" +#include "NetProto.h" +#include "Offload.h" +#include "PacketIO.h" +#include "PacketParser.h" +#include "Geneve.h" +#include "Switch.h" +#include "User.h" +#include "Util.h" +#include "Vport.h" + +#ifdef OVS_DBG_MOD +#undef OVS_DBG_MOD +#endif +#define OVS_DBG_MOD OVS_DBG_GENEVE + + +NTSTATUS OvsInitGeneveTunnel(POVS_VPORT_ENTRY vport, + UINT16 udpDestPort) +{ + POVS_GENEVE_VPORT genevePort; + + genevePort = (POVS_GENEVE_VPORT) + OvsAllocateMemoryWithTag(sizeof(*genevePort), OVS_GENEVE_POOL_TAG); + if (!genevePort) { + OVS_LOG_ERROR("Insufficient memory, can't allocate GENEVE_VPORT"); + return STATUS_INSUFFICIENT_RESOURCES; + } + + RtlZeroMemory(genevePort, sizeof(*genevePort)); + genevePort->dstPort = udpDestPort; + vport->priv = (PVOID) genevePort; + return STATUS_SUCCESS; +} + +VOID +OvsCleanupGeneveTunnel(POVS_VPORT_ENTRY vport) +{ + if (vport->ovsType != OVS_VPORT_TYPE_GENEVE || + vport->priv == NULL) { + return; + } + + OvsFreeMemoryWithTag(vport->priv, OVS_GENEVE_POOL_TAG); + vport->priv = NULL; +} + +NDIS_STATUS 
OvsEncapGeneve(POVS_VPORT_ENTRY vport, + PNET_BUFFER_LIST curNbl, + OvsIPv4TunnelKey *tunKey, + POVS_SWITCH_CONTEXT switchContext, + POVS_PACKET_HDR_INFO layers, + PNET_BUFFER_LIST *newNbl) +{ + NTSTATUS status; + OVS_FWD_INFO fwdInfo; + PNET_BUFFER curNb; + PMDL curMdl; + PUINT8 bufferStart; + EthHdr *ethHdr; + IPHdr *ipHdr; + UDPHdr *udpHdr; + GeneveHdr *geneveHdr; + GeneveOptionHdr *optHdr; + POVS_GENEVE_VPORT vportGeneve; + UINT32 headRoom = OvsGetGeneveTunHdrMinSize() + tunKey->tunOptLen; + UINT32 packetLength; + ULONG mss = 0; + NDIS_TCP_IP_CHECKSUM_NET_BUFFER_LIST_INFO csumInfo; + + status = OvsLookupIPFwdInfo(tunKey->dst, &fwdInfo); + if (status != STATUS_SUCCESS) { + OvsFwdIPHelperRequest(NULL, 0, tunKey, NULL, NULL, NULL); + // return NDIS_STATUS_PENDING; + /* + * XXX: Don't know if the completionList will make any sense when + * accessed in the callback. Make sure the caveats are known. + * + * XXX: This code will work once we are able to grab locks in the + * callback. + */ + return NDIS_STATUS_FAILURE; + } + + curNb = NET_BUFFER_LIST_FIRST_NB(curNbl); + packetLength = NET_BUFFER_DATA_LENGTH(curNb); + + if (layers->isTcp) { + mss = OVSGetTcpMSS(curNbl); + + OVS_LOG_TRACE("MSS %u packet len %u", mss, + packetLength); + if (mss) { + OVS_LOG_TRACE("l4Offset %d", layers->l4Offset); + *newNbl = OvsTcpSegmentNBL(switchContext, curNbl, layers, + mss, headRoom); + if (*newNbl == NULL) { + OVS_LOG_ERROR("Unable to segment NBL"); + return NDIS_STATUS_FAILURE; + } + /* Clear out LSO flags after this point */ + NET_BUFFER_LIST_INFO(*newNbl, TcpLargeSendNetBufferListInfo) = 0; + } + } + + vportGeneve = (POVS_GENEVE_VPORT) GetOvsVportPriv(vport); + ASSERT(vportGeneve != NULL); + + /* If we didn't split the packet above, make a copy now */ + if (*newNbl == NULL) { + *newNbl = OvsPartialCopyNBL(switchContext, curNbl, 0, headRoom, + FALSE /*NBL info*/); + if (*newNbl == NULL) { + OVS_LOG_ERROR("Unable to copy NBL"); + return NDIS_STATUS_FAILURE; + } + csumInfo.Value = 
NET_BUFFER_LIST_INFO(curNbl, + TcpIpChecksumNetBufferListInfo); + status = OvsApplySWChecksumOnNB(layers, *newNbl, &csumInfo); + + if (status != NDIS_STATUS_SUCCESS) { + goto ret_error; + } + } + + curNbl = *newNbl; + for (curNb = NET_BUFFER_LIST_FIRST_NB(curNbl); curNb != NULL; + curNb = curNb->Next) { + status = NdisRetreatNetBufferDataStart(curNb, headRoom, 0, NULL); + if (status != NDIS_STATUS_SUCCESS) { + goto ret_error; + } + + curMdl = NET_BUFFER_CURRENT_MDL(curNb); + bufferStart = (PUINT8)MmGetSystemAddressForMdlSafe(curMdl, + LowPagePriority); + if (!bufferStart) { + status = NDIS_STATUS_RESOURCES; + goto ret_error; + } + + bufferStart += NET_BUFFER_CURRENT_MDL_OFFSET(curNb); + if (NET_BUFFER_NEXT_NB(curNb)) { + OVS_LOG_TRACE("nb length %u next %u", + NET_BUFFER_DATA_LENGTH(curNb), + NET_BUFFER_DATA_LENGTH(curNb->Next)); + } + + /* L2 header */ + ethHdr = (EthHdr *)bufferStart; + ASSERT(((PCHAR)&fwdInfo.dstMacAddr + sizeof fwdInfo.dstMacAddr) == + (PCHAR)&fwdInfo.srcMacAddr); + NdisMoveMemory(ethHdr->Destination, fwdInfo.dstMacAddr, + sizeof ethHdr->Destination + sizeof ethHdr->Source); + ethHdr->Type = htons(ETH_TYPE_IPV4); + + /* IP header */ + ipHdr = (IPHdr *)((PCHAR)ethHdr + sizeof *ethHdr); + + ipHdr->ihl = sizeof *ipHdr / 4; + ipHdr->version = IPPROTO_IPV4; + ipHdr->tos = tunKey->tos; + ipHdr->tot_len = htons(NET_BUFFER_DATA_LENGTH(curNb) - sizeof *ethHdr); + ipHdr->id = (uint16)atomic_add64(&vportGeneve->ipId, + NET_BUFFER_DATA_LENGTH(curNb)); + ipHdr->frag_off = (tunKey->flags & OVS_TNL_F_DONT_FRAGMENT) ? + IP_DF_NBO : 0; + ipHdr->ttl = tunKey->ttl ? 
tunKey->ttl : GENEVE_DEFAULT_TTL; + ipHdr->protocol = IPPROTO_UDP; + ASSERT(tunKey->dst == fwdInfo.dstIpAddr); + ASSERT(tunKey->src == fwdInfo.srcIpAddr || tunKey->src == 0); + ipHdr->saddr = fwdInfo.srcIpAddr; + ipHdr->daddr = fwdInfo.dstIpAddr; + ipHdr->check = 0; + + /* UDP header */ + udpHdr = (UDPHdr *)((PCHAR)ipHdr + sizeof *ipHdr); + udpHdr->source = htons(tunKey->flow_hash | MAXINT16); + udpHdr->dest = htons(vportGeneve->dstPort); + udpHdr->len = htons(NET_BUFFER_DATA_LENGTH(curNb) - headRoom + + sizeof *udpHdr + sizeof *geneveHdr + + tunKey->tunOptLen); + if (tunKey->flags & OVS_TNL_F_CSUM) { + UINT16 udpChksumLen = (UINT16) NET_BUFFER_DATA_LENGTH(curNb) - + sizeof *ipHdr - sizeof *ethHdr; + udpHdr->check = IPPseudoChecksum(&ipHdr->saddr, &ipHdr->daddr, + IPPROTO_UDP, udpChksumLen); + } else { + udpHdr->check = 0; + } + /* Geneve header */ + geneveHdr = (GeneveHdr *)((PCHAR)udpHdr + sizeof *udpHdr); + geneveHdr->version = GENEVE_VER; + geneveHdr->optLen = tunKey->tunOptLen / 4; + geneveHdr->oam = !!(tunKey->flags & OVS_TNL_F_OAM); + geneveHdr->critical = !!(tunKey->flags & OVS_TNL_F_CRT_OPT); + geneveHdr->reserved1 = 0; + geneveHdr->protocol = ETH_P_TEB_NBO; + geneveHdr->vni = GENEVE_TUNNELID_TO_VNI(tunKey->tunnelId); + geneveHdr->reserved2 = 0; + + /* Geneve header options */ + optHdr = (GeneveOptionHdr *)(geneveHdr + 1); + memcpy(optHdr, TunnelKeyGetOptions(tunKey), tunKey->tunOptLen); + + csumInfo.Value = 0; + csumInfo.Transmit.IpHeaderChecksum = 1; + csumInfo.Transmit.IsIPv4 = 1; + if (tunKey->flags & OVS_TNL_F_CSUM) { + csumInfo.Transmit.UdpChecksum = 1; + } + NET_BUFFER_LIST_INFO(curNbl, + TcpIpChecksumNetBufferListInfo) = csumInfo.Value; + } + return STATUS_SUCCESS; + +ret_error: + OvsCompleteNBL(switchContext, *newNbl, TRUE); + *newNbl = NULL; + return status; +} + +NDIS_STATUS OvsDecapGeneve(POVS_SWITCH_CONTEXT switchContext, + PNET_BUFFER_LIST curNbl, + OvsIPv4TunnelKey *tunKey, + PNET_BUFFER_LIST *newNbl) +{ + PNET_BUFFER curNb; + PMDL curMdl; + 
EthHdr *ethHdr; + IPHdr *ipHdr; + UDPHdr *udpHdr; + GeneveHdr *geneveHdr; + UINT32 tunnelSize; + UINT32 packetLength; + PUINT8 bufferStart; + PVOID optStart; + NDIS_STATUS status; + + /* Check the length of the UDP payload */ + curNb = NET_BUFFER_LIST_FIRST_NB(curNbl); + tunnelSize = OvsGetGeneveTunHdrMinSize(); + packetLength = NET_BUFFER_DATA_LENGTH(curNb); + if (packetLength <= tunnelSize) { + return NDIS_STATUS_INVALID_LENGTH; + } + + /* + * Create a copy of the NBL so that we have all the headers in one MDL. + */ + *newNbl = OvsPartialCopyNBL(switchContext, curNbl, + tunnelSize, 0, + TRUE /*copy NBL info */); + + if (*newNbl == NULL) { + return NDIS_STATUS_RESOURCES; + } + + /* XXX: Handle VLAN header. */ + curNbl = *newNbl; + curNb = NET_BUFFER_LIST_FIRST_NB(curNbl); + curMdl = NET_BUFFER_CURRENT_MDL(curNb); + bufferStart = (PUINT8)MmGetSystemAddressForMdlSafe(curMdl, LowPagePriority) + + NET_BUFFER_CURRENT_MDL_OFFSET(curNb); + if (!bufferStart) { + status = NDIS_STATUS_RESOURCES; + goto dropNbl; + } + + ethHdr = (EthHdr *)bufferStart; + /* XXX: Handle IP options. */ + ipHdr = (IPHdr *)((PCHAR)ethHdr + sizeof *ethHdr); + tunKey->src = ipHdr->saddr; + tunKey->dst = ipHdr->daddr; + tunKey->tos = ipHdr->tos; + tunKey->ttl = ipHdr->ttl; + tunKey->pad = 0; + udpHdr = (UDPHdr *)((PCHAR)ipHdr + sizeof *ipHdr); + + /* Validate if NIC has indicated checksum failure. */ + status = OvsValidateUDPChecksum(curNbl, udpHdr->check == 0); + if (status != NDIS_STATUS_SUCCESS) { + goto dropNbl; + } + + /* Calculate and verify UDP checksum if NIC didn't do it. 
*/ + if (udpHdr->check != 0) { + status = OvsCalculateUDPChecksum(curNbl, curNb, ipHdr, udpHdr, + packetLength); + tunKey->flags |= OVS_TNL_F_CSUM; + if (status != NDIS_STATUS_SUCCESS) { + goto dropNbl; + } + } + + geneveHdr = (GeneveHdr *)((PCHAR)udpHdr + sizeof *udpHdr); + if (geneveHdr->protocol != ETH_P_TEB_NBO) { + status = STATUS_NDIS_INVALID_PACKET; + goto dropNbl; + } + tunKey->flags = OVS_TNL_F_KEY; + if (geneveHdr->oam) { + tunKey->flags |= OVS_TNL_F_OAM; + } + tunKey->tunnelId = GENEVE_VNI_TO_TUNNELID(geneveHdr->vni); + tunKey->tunOptLen = (uint8)geneveHdr->optLen * 4; + if (tunKey->tunOptLen > TUN_OPT_MAX_LEN || + packetLength < tunnelSize + tunKey->tunOptLen) { + status = NDIS_STATUS_INVALID_LENGTH; + goto dropNbl; + } + /* Clear out the receive flag for the inner packet. */ + NET_BUFFER_LIST_INFO(curNbl, TcpIpChecksumNetBufferListInfo) = 0; + + NdisAdvanceNetBufferDataStart(curNb, tunnelSize, FALSE, NULL); + if (tunKey->tunOptLen > 0) { + optStart = NdisGetDataBuffer(curNb, tunKey->tunOptLen, + TunnelKeyGetOptions(tunKey), 1, 0); + + /* If data is contiguous in the buffer, NdisGetDataBuffer will not copy + data to the storage. Manual copy is needed. */ + if (optStart != TunnelKeyGetOptions(tunKey)) { + memcpy(TunnelKeyGetOptions(tunKey), optStart, tunKey->tunOptLen); + } + NdisAdvanceNetBufferDataStart(curNb, tunKey->tunOptLen, FALSE, NULL); + } + + return NDIS_STATUS_SUCCESS; + +dropNbl: + OvsCompleteNBL(switchContext, *newNbl, TRUE); + *newNbl = NULL; + return status; +} diff --git a/datapath-windows/ovsext/Geneve.h b/datapath-windows/ovsext/Geneve.h new file mode 100644 index 0000000..0535e79 --- /dev/null +++ b/datapath-windows/ovsext/Geneve.h @@ -0,0 +1,122 @@ +/* + * Copyright (c) 2016 VMware, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +#ifndef __GENEVE_H_ +#define __GENEVE_H_ 1 + +#include "NetProto.h" +typedef struct _OVS_GENEVE_VPORT { + UINT16 dstPort; + UINT64 filterID; + UINT64 ipId; + /* + * To be filled + */ +} OVS_GENEVE_VPORT, *POVS_GENEVE_VPORT; + +/* Geneve Header: + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + * |Ver| Opt Len |O|C| Rsvd. | Protocol Type | + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + * | Virtual Network Identifier (VNI) | Reserved | + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + * | Variable Length Options | + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + * + * Option Header: + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + * | Option Class | Type |R|R|R| Length | + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + * | Variable Option Data | + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + */ +typedef struct GeneveHdr { + /* Length of options fields in int32 excluding the common header */ + UINT32 optLen : 6; + /* Version. */ + UINT32 version:2; + /* Reserved. */ + UINT32 reserved1 : 6; + /* Critical options present */ + UINT32 critical : 1; + /* This packet contains a control message instead of a data payload */ + UINT32 oam:1; + /* Protocol Type. */ + UINT32 protocol:16; + /* VNI */ + UINT32 vni:24; + /* Reserved. */ + UINT32 reserved2:8; +} GeneveHdr; + +typedef struct GeneveOptionHdr { + /* Namespace for the 'type' field. 
*/ + UINT32 optionClass:16; + /* Format of data contained in the option. */ + UINT32 type:8; + /* Reserved. */ + UINT32 reserved:3; + /* Length of option in int32 excluding the option header. */ + UINT32 length:5; +} GeneveOptionHdr; + +#define GENEVE_CRIT_OPT_TYPE (1 << 7) + +NTSTATUS OvsInitGeneveTunnel(POVS_VPORT_ENTRY vport, + UINT16 udpDestPort); + +VOID OvsCleanupGeneveTunnel(POVS_VPORT_ENTRY vport); + + +NDIS_STATUS OvsEncapGeneve(POVS_VPORT_ENTRY vport, + PNET_BUFFER_LIST curNbl, + OvsIPv4TunnelKey *tunKey, + POVS_SWITCH_CONTEXT switchContext, + POVS_PACKET_HDR_INFO layers, + PNET_BUFFER_LIST *newNbl); + +NDIS_STATUS OvsDecapGeneve(POVS_SWITCH_CONTEXT switchContext, + PNET_BUFFER_LIST curNbl, + OvsIPv4TunnelKey *tunKey, + PNET_BUFFER_LIST *newNbl); + +static __inline UINT32 +OvsGetGeneveTunHdrMinSize(VOID) +{ + /* XXX: Can L2 include VLAN at all? */ + return sizeof (EthHdr) + sizeof (IPHdr) + sizeof (UDPHdr) + + sizeof (GeneveHdr); +} + +static __inline UINT32 +OvsGetGeneveTunHdrMaxSize(VOID) +{ + /* XXX: Can L2 include VLAN at all? 
*/ + return OvsGetGeneveTunHdrMinSize() + TUN_OPT_MAX_LEN; +} + +#define GENEVE_UDP_PORT 6081 +#define GENEVE_UDP_PORT_NBO 0xC117 +#define GENEVE_VER 0 +#define GENEVE_DEFAULT_TTL 64 +#define GENEVE_ID_IS_VALID(geneveID) (0 < (geneveID) && (geneveID) <= 0xffffff) +#define GENEVE_TUNNELID_TO_VNI(_tID) (UINT32)(((UINT64)(_tID)) >> 40) +#define GENEVE_VNI_TO_TUNNELID(_vni) (((UINT64)(_vni)) << 40) +#define ETH_P_TEB_NBO 0x5865 /* Trans Ether Bridging */ + +#endif /* __GENEVE_H_ */ + diff --git a/datapath-windows/ovsext/Util.h b/datapath-windows/ovsext/Util.h index bcd38dd..e666e74 100644 --- a/datapath-windows/ovsext/Util.h +++ b/datapath-windows/ovsext/Util.h @@ -38,6 +38,7 @@ #define OVS_TUNFLT_POOL_TAG 'WSVO' #define OVS_RECIRC_POOL_TAG 'CSVO' #define OVS_CT_POOL_TAG 'CTVO' +#define OVS_GENEVE_POOL_TAG 'GNVO' VOID *OvsAllocateMemory(size_t size); VOID *OvsAllocateMemoryWithTag(size_t size, ULONG tag); diff --git a/datapath-windows/ovsext/Vport.c b/datapath-windows/ovsext/Vport.c index b69360e..1462453 100644 --- a/datapath-windows/ovsext/Vport.c +++ b/datapath-windows/ovsext/Vport.c @@ -27,6 +27,7 @@ #include "User.h" #include "Vport.h" #include "Vxlan.h" +#include "Geneve.h" #ifdef OVS_DBG_MOD #undef OVS_DBG_MOD @@ -1075,6 +1076,9 @@ OvsInitTunnelVport(PVOID userContext, case OVS_VPORT_TYPE_STT: status = OvsInitSttTunnel(vport, dstPort); break; + case OVS_VPORT_TYPE_GENEVE: + status = OvsInitGeneveTunnel(vport, dstPort); + break; default: ASSERT(0); } @@ -1218,6 +1222,7 @@ InitOvsVportCommon(POVS_SWITCH_CONTEXT switchContext, case OVS_VPORT_TYPE_GRE: case OVS_VPORT_TYPE_VXLAN: case OVS_VPORT_TYPE_STT: + case OVS_VPORT_TYPE_GENEVE: { UINT16 dstPort = GetPortFromPriv(vport); hash = OvsJhashBytes(&dstPort, @@ -1301,6 +1306,9 @@ OvsRemoveAndDeleteVport(PVOID usrParamsContext, return status; } } + case OVS_VPORT_TYPE_GENEVE: + OvsCleanupGeneveTunnel(vport); + break; case OVS_VPORT_TYPE_STT: OvsCleanupSttTunnel(vport); break; @@ -1362,9 +1370,7 @@
OvsRemoveAndDeleteVport(PVOID usrParamsContext, InitializeListHead(&vport->ovsNameLink); RemoveEntryList(&vport->portNoLink); InitializeListHead(&vport->portNoLink); - if (OVS_VPORT_TYPE_VXLAN == vport->ovsType || - OVS_VPORT_TYPE_STT == vport->ovsType || - OVS_VPORT_TYPE_GRE == vport->ovsType) { + if (OvsIsTunnelVportType(vport->ovsType)) { RemoveEntryList(&vport->tunnelVportLink); InitializeListHead(&vport->tunnelVportLink); } @@ -2255,7 +2261,7 @@ OvsNewVportCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, if (OvsIsTunnelVportType(portType)) { UINT16 transportPortDest = 0; - UINT8 nwProto; + UINT8 nwProto = IPPROTO_NONE; POVS_VPORT_ENTRY dupVport; switch (portType) { @@ -2266,6 +2272,9 @@ OvsNewVportCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, transportPortDest = VXLAN_UDP_PORT; nwProto = IPPROTO_UDP; break; + case OVS_VPORT_TYPE_GENEVE: + transportPortDest = GENEVE_UDP_PORT; + break; case OVS_VPORT_TYPE_STT: transportPortDest = STT_TCP_PORT; nwProto = IPPROTO_TCP; @@ -2393,6 +2402,9 @@ Cleanup: case OVS_VPORT_TYPE_STT: OvsCleanupSttTunnel(vport); break; + case OVS_VPORT_TYPE_GENEVE: + OvsCleanupGeneveTunnel(vport); + break; default: ASSERT(!"Invalid tunnel port type"); } diff --git a/datapath-windows/ovsext/ovsext.vcxproj b/datapath-windows/ovsext/ovsext.vcxproj index 0356ddf..02fa60c 100644 --- a/datapath-windows/ovsext/ovsext.vcxproj +++ b/datapath-windows/ovsext/ovsext.vcxproj @@ -81,6 +81,7 @@ + @@ -182,6 +183,7 @@ +