From patchwork Sat Oct 24 20:40:21 2015
X-Patchwork-Submitter: Sairam Venugopal
X-Patchwork-Id: 535463
From: Sairam Venugopal
To: dev@openvswitch.org
Date: Sat, 24 Oct 2015 13:40:21 -0700
Message-Id: <1445719222-4664-3-git-send-email-vsairam@vmware.com>
In-Reply-To: <1445719222-4664-1-git-send-email-vsairam@vmware.com>
References: <1445719222-4664-1-git-send-email-vsairam@vmware.com>
Subject: [ovs-dev] [PATCH 2/3] datapath-windows: STT - Add support for TCP Segmentation Offload

Create and initialize the background thread and buffer that assists in
defragmenting and completing a TSO packet.

Signed-off-by: Sairam Venugopal
Acked-by: Nithin Raju
---
 datapath-windows/ovsext/Stt.c    | 128 ++++++++++++++++++++++++++++++++++++++-
 datapath-windows/ovsext/Stt.h    |  33 +++++++++-
 datapath-windows/ovsext/Switch.c |   7 +++
 3 files changed, 162 insertions(+), 6 deletions(-)

diff --git a/datapath-windows/ovsext/Stt.c b/datapath-windows/ovsext/Stt.c
index 4a5a4a6..b78ef95 100644
--- a/datapath-windows/ovsext/Stt.c
+++ b/datapath-windows/ovsext/Stt.c
@@ -35,6 +35,11 @@
 #define OVS_DBG_MOD OVS_DBG_STT
 #include "Debug.h"
 
+KSTART_ROUTINE OvsSttDefragCleaner;
+static PLIST_ENTRY OvsSttPktFragHash;
+static NDIS_SPIN_LOCK OvsSttSpinLock;
+static OVS_STT_THREAD_CTX sttDefragThreadCtx;
+
 static NDIS_STATUS
 OvsDoEncapStt(POVS_VPORT_ENTRY vport, PNET_BUFFER_LIST curNbl,
               const OvsIPv4TunnelKey *tunKey,
@@ -349,7 +354,7 @@ OvsCalculateTCPChecksum(PNET_BUFFER_LIST curNbl, PNET_BUFFER curNb)
     if (csumInfo.Receive.TcpChecksumSucceeded) {
         return NDIS_STATUS_SUCCESS;
     }
-    
+
     EthHdr *eth = (EthHdr *)NdisGetDataBuffer(curNb, sizeof(EthHdr),
                                               NULL, 1, 0);
 
@@ -379,6 +384,123 @@ OvsCalculateTCPChecksum(PNET_BUFFER_LIST curNbl, PNET_BUFFER curNb)
 }
 
 /*
+ *----------------------------------------------------------------------------
+ * OvsInitSttDefragmentation
+ *     Initialize the components used by the stt lso defragmentation
+ *----------------------------------------------------------------------------
+ */
+NTSTATUS
+OvsInitSttDefragmentation()
+{
+    NTSTATUS status;
+    HANDLE threadHandle = NULL;
+
+    /* Init the sync-lock */
+    NdisAllocateSpinLock(&OvsSttSpinLock);
+
+    /* Init the Hash Buffer */
+    OvsSttPktFragHash = (PLIST_ENTRY) OvsAllocateMemoryWithTag(
+                                                sizeof(LIST_ENTRY)
+                                                * STT_HASH_TABLE_SIZE,
+                                                OVS_STT_POOL_TAG);
+    if (OvsSttPktFragHash == NULL) {
+        NdisFreeSpinLock(&OvsSttSpinLock);
+        return STATUS_INSUFFICIENT_RESOURCES;
+    }
+
+    for (int i = 0; i < STT_HASH_TABLE_SIZE; i++) {
+        InitializeListHead(&OvsSttPktFragHash[i]);
+    }
+
+    /* Init Defrag Cleanup Thread */
+    KeInitializeEvent(&sttDefragThreadCtx.event, NotificationEvent, FALSE);
+    status = PsCreateSystemThread(&threadHandle, SYNCHRONIZE, NULL, NULL,
+                                  NULL, OvsSttDefragCleaner,
+                                  &sttDefragThreadCtx);
+
+    if (status != STATUS_SUCCESS) {
+        OvsCleanupSttDefragmentation();
+        return status;
+    }
+
+    ObReferenceObjectByHandle(threadHandle, SYNCHRONIZE, NULL, KernelMode,
+                              &sttDefragThreadCtx.threadObject, NULL);
+    ZwClose(threadHandle);
+    threadHandle = NULL;
+    return STATUS_SUCCESS;
+}
+
+/*
+ *----------------------------------------------------------------------------
+ * OvsCleanupSttDefragmentation
+ *     Cleanup memory and thread that were spawned for STT LSO defragmentation
+ *----------------------------------------------------------------------------
+ */
+VOID
+OvsCleanupSttDefragmentation(VOID)
+{
+    NdisAcquireSpinLock(&OvsSttSpinLock);
+    sttDefragThreadCtx.exit = 1;
+    KeSetEvent(&sttDefragThreadCtx.event, 0, FALSE);
+    NdisReleaseSpinLock(&OvsSttSpinLock);
+
+    KeWaitForSingleObject(sttDefragThreadCtx.threadObject, Executive,
+                          KernelMode, FALSE, NULL);
+    ObDereferenceObject(sttDefragThreadCtx.threadObject);
+
+    if (OvsSttPktFragHash) {
+        OvsFreeMemoryWithTag(OvsSttPktFragHash, OVS_STT_POOL_TAG);
+        OvsSttPktFragHash = NULL;
+    }
+
+    NdisFreeSpinLock(&OvsSttSpinLock);
+}
+
+/*
+ *----------------------------------------------------------------------------
+ * OvsSttDefragCleaner
+ *     Runs periodically and cleans up the buffer to remove expired segments
+ *----------------------------------------------------------------------------
+ */
+VOID
+OvsSttDefragCleaner(PVOID data)
+{
+    POVS_STT_THREAD_CTX context = (POVS_STT_THREAD_CTX)data;
+    PLIST_ENTRY link, next;
+    POVS_STT_PKT_ENTRY entry;
+    BOOLEAN success = TRUE;
+
+    while (success) {
+        NdisAcquireSpinLock(&OvsSttSpinLock);
+        if (context->exit) {
+            NdisReleaseSpinLock(&OvsSttSpinLock);
+            break;
+        }
+
+        /* Set the timeout for the thread and cleanup */
+        UINT64 currentTime, threadSleepTimeout;
+        NdisGetCurrentSystemTime((LARGE_INTEGER *)&currentTime);
+        threadSleepTimeout = currentTime + STT_CLEANUP_INTERVAL;
+
+        for (int i = 0; i < STT_HASH_TABLE_SIZE; i++) {
+            LIST_FORALL_SAFE(&OvsSttPktFragHash[i], link, next) {
+                entry = CONTAINING_RECORD(link, OVS_STT_PKT_ENTRY, link);
+                if (entry->timeout < currentTime) {
+                    RemoveEntryList(&entry->link);
+                    OvsFreeMemoryWithTag(entry, OVS_STT_POOL_TAG);
+                }
+            }
+        }
+
+        NdisReleaseSpinLock(&OvsSttSpinLock);
+        KeWaitForSingleObject(&context->event, Executive, KernelMode,
+                              FALSE, (LARGE_INTEGER *)&threadSleepTimeout);
+    }
+
+    PsTerminateSystemThread(STATUS_SUCCESS);
+}
+
+/*
  * --------------------------------------------------------------------------
  * OvsDecapStt --
  *     Decapsulates an STT packet.
@@ -416,7 +538,7 @@ OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
     if (csumInfo.Receive.TcpChecksumFailed) {
         return NDIS_STATUS_INVALID_PACKET;
     }
-    
+
     /* Calculate the TCP Checksum */
     status = OvsCalculateTCPChecksum(curNbl, curNb);
     if (status != NDIS_STATUS_SUCCESS) {
@@ -455,7 +577,7 @@ OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
     hdrLen = STT_HDR_LEN;
     NdisAdvanceNetBufferDataStart(curNb, hdrLen, FALSE, NULL);
     advanceCnt += hdrLen;
-    
+
     /* Verify checksum for inner packet if it's required */
     if (!(sttHdr->flags & STT_CSUM_VERIFIED)) {
         BOOLEAN innerChecksumPartial = sttHdr->flags & STT_CSUM_PARTIAL;
diff --git a/datapath-windows/ovsext/Stt.h b/datapath-windows/ovsext/Stt.h
index 38d721c..9a45379 100644
--- a/datapath-windows/ovsext/Stt.h
+++ b/datapath-windows/ovsext/Stt.h
@@ -34,6 +34,11 @@
 #define STT_PROTO_TCP (1 << 3)
 #define STT_PROTO_TYPES (STT_PROTO_IPV4 | STT_PROTO_TCP)
 
+#define STT_HASH_TABLE_SIZE ((UINT32)1 << 10)
+#define STT_HASH_TABLE_MASK (STT_HASH_TABLE_SIZE - 1)
+#define STT_ENTRY_TIMEOUT 300000000 // 30s
+#define STT_CLEANUP_INTERVAL 300000000 // 30s
+
 #define STT_ETH_PAD 2
 typedef struct SttHdr {
     UINT8 version;
@@ -58,14 +63,32 @@ typedef struct _OVS_STT_VPORT {
     UINT64 slowOutPkts;
 } OVS_STT_VPORT, *POVS_STT_VPORT;
 
+typedef struct _OVS_STT_PKT_KEY {
+    UINT32 sAddr;
+    UINT32 dAddr;
+    UINT32 ackSeq;
+} OVS_STT_PKT_KEY, *POVS_STT_PKT_KEY;
+
+typedef struct _OVS_STT_PKT_ENTRY {
+    OVS_STT_PKT_KEY ovsPktKey;
+    UINT64 timeout;
+    UINT32 recvdLen;
+    SttHdr sttHdr;
+    PCHAR packetBuf;
+    LIST_ENTRY link;
+} OVS_STT_PKT_ENTRY, *POVS_STT_PKT_ENTRY;
+
+typedef struct _OVS_STT_THREAD_CTX {
+    KEVENT event;
+    PVOID threadObject;
+    UINT32 exit;
+} OVS_STT_THREAD_CTX, *POVS_STT_THREAD_CTX;
+
 NTSTATUS OvsInitSttTunnel(POVS_VPORT_ENTRY vport,
                           UINT16 udpDestPort);
 
 VOID OvsCleanupSttTunnel(POVS_VPORT_ENTRY vport);
 
-
-void OvsCleanupSttTunnel(POVS_VPORT_ENTRY vport);
-
 NDIS_STATUS OvsEncapStt(POVS_VPORT_ENTRY vport,
                         PNET_BUFFER_LIST curNbl,
                         OvsIPv4TunnelKey *tunKey,
@@ -79,6 +102,10 @@ NDIS_STATUS OvsDecapStt(POVS_SWITCH_CONTEXT switchContext,
                         OvsIPv4TunnelKey *tunKey,
                         PNET_BUFFER_LIST *newNbl);
 
+NTSTATUS OvsInitSttDefragmentation();
+
+VOID OvsCleanupSttDefragmentation(VOID);
+
 static __inline UINT32
 OvsGetSttTunHdrSize(VOID)
 {
diff --git a/datapath-windows/ovsext/Switch.c b/datapath-windows/ovsext/Switch.c
index f176fa0..2878e91 100644
--- a/datapath-windows/ovsext/Switch.c
+++ b/datapath-windows/ovsext/Switch.c
@@ -212,6 +212,12 @@ OvsCreateSwitch(NDIS_HANDLE ndisFilterHandle,
         goto create_switch_done;
     }
 
+    status = OvsInitSttDefragmentation();
+    if (status != STATUS_SUCCESS) {
+        OVS_LOG_ERROR("Exit: Failed to initialize Stt Defragmentation");
+        goto create_switch_done;
+    }
+
     *switchContextOut = switchContext;
 
 create_switch_done:
@@ -242,6 +248,7 @@ OvsExtDetach(NDIS_HANDLE filterModuleContext)
     }
     OvsDeleteSwitch(switchContext);
     OvsCleanupIpHelper();
+    OvsCleanupSttDefragmentation();
     /* This completes the cleanup, and a new attach can be handled now. */
 
     OVS_LOG_TRACE("Exit: OvsDetach Successfully");
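
Note on the expiry sweep: the cleaner thread in this patch walks every hash bucket and frees entries whose absolute timeout has passed, with times expressed in 100 ns ticks (so 300000000 ticks is 30 s). The following is only a user-space sketch of that idea, not the driver code: `Node`, `Entry`, `table_create`, `table_add` and `table_sweep` are hypothetical names, `malloc`/`free` stand in for `OvsAllocateMemoryWithTag`/`OvsFreeMemoryWithTag`, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's LIST_ENTRY (circular, doubly linked). */
typedef struct Node {
    struct Node *next, *prev;
} Node;

/* Hypothetical, trimmed-down analogue of OVS_STT_PKT_ENTRY. */
typedef struct Entry {
    uint64_t timeout;   /* absolute expiry time in 100 ns ticks */
    Node link;
} Entry;

#define TABLE_SIZE       1024u                       /* mirrors 1 << 10 */
#define TICKS_PER_SECOND 10000000ull                 /* 100 ns units */
#define ENTRY_TIMEOUT    (30ull * TICKS_PER_SECOND)  /* 300000000 ticks */

/* Recover the Entry that embeds a given list node. */
#define ENTRY_OF(node) ((Entry *)((char *)(node) - offsetof(Entry, link)))

static void list_init(Node *head) { head->next = head->prev = head; }

static void list_insert_tail(Node *head, Node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_remove(Node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Allocate the bucket array and make every bucket an empty list. */
static Node *table_create(void)
{
    Node *table = malloc(sizeof(Node) * TABLE_SIZE);
    if (table != NULL) {
        for (size_t i = 0; i < TABLE_SIZE; i++) {
            list_init(&table[i]);
        }
    }
    return table;
}

/* Insert an entry that expires ENTRY_TIMEOUT ticks after 'now'. */
static Entry *table_add(Node *table, uint32_t hash, uint64_t now)
{
    Entry *e = malloc(sizeof *e);
    if (e == NULL) {
        return NULL;
    }
    e->timeout = now + ENTRY_TIMEOUT;
    list_insert_tail(&table[hash & (TABLE_SIZE - 1)], &e->link);
    return e;
}

/* Unlink and free every entry with timeout < now; return how many were freed. */
static size_t table_sweep(Node *table, uint64_t now)
{
    size_t freed = 0;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        Node *n = table[i].next;
        while (n != &table[i]) {
            Node *next = n->next;   /* saved first: n may be freed below */
            Entry *e = ENTRY_OF(n);
            if (e->timeout < now) {
                list_remove(n);
                free(e);
                freed++;
            }
            n = next;
        }
    }
    return freed;
}
```

Saving `next` before unlinking is what makes it safe to remove entries during the walk. The comparison is strict, as in the patch, so an entry whose timeout equals the current time survives until the following sweep.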