From patchwork Fri Apr 13 17:40:13 2018
X-Patchwork-Submitter: Kevin Traynor
X-Patchwork-Id: 898047
X-Patchwork-Delegate: ian.stokes@intel.com
From: Kevin Traynor <ktraynor@redhat.com>
To: dev@openvswitch.org
Date: Fri, 13 Apr 2018 18:40:13 +0100
Message-Id: <1523641213-28553-1-git-send-email-ktraynor@redhat.com>
In-Reply-To: <1522844392-10114-1-git-send-email-ktraynor@redhat.com>
References: <1522844392-10114-1-git-send-email-ktraynor@redhat.com>
Cc: Ilya Maximets, mark.b.kavanagh81@gmail.com
Subject: [ovs-dev] [PATCH v2] netdev-dpdk: Free mempool only when no in-use mbufs.

DPDK mempools are freed when they are no longer needed. This can happen
when a port is removed or when a port's MTU is reconfigured so that a
new mempool is used. An mbuf still held in a NIC Tx queue may then be
returned to the freed mempool, which can lead to a segfault.

In order to prevent this, only free mempools when they are no longer
needed and have no in-use mbufs. As this might not be possible
immediately, keep a free list of released mempools and sweep it each
time a port requests a mempool.
Fixes: 8d38823bdf8b ("netdev-dpdk: fix memory leak")
Cc: mark.b.kavanagh81@gmail.com
Cc: Ilya Maximets
Reported-by: Venkatesan Pradeep
Signed-off-by: Kevin Traynor
---
v2: Get number of entries on the mempool ring directly.

 lib/netdev-dpdk.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 81 insertions(+), 5 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ee39cbe..3306b19 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -297,4 +297,14 @@
 static struct ovs_mutex dpdk_mp_mutex OVS_ACQ_AFTER(dpdk_mutex)
     = OVS_MUTEX_INITIALIZER;
 
+/* Contains all 'struct dpdk_mp's. */
+static struct ovs_list dpdk_mp_free_list OVS_GUARDED_BY(dpdk_mp_mutex)
+    = OVS_LIST_INITIALIZER(&dpdk_mp_free_list);
+
+/* Wrapper for a mempool released but not yet freed. */
+struct dpdk_mp {
+    struct rte_mempool *mp;
+    struct ovs_list list_node OVS_GUARDED_BY(dpdk_mp_mutex);
+};
+
 /* There should be one 'struct dpdk_tx_queue' created for
  * each cpu core. */
@@ -512,4 +522,56 @@ ovs_rte_pktmbuf_init(struct rte_mempool *mp OVS_UNUSED,
 }
 
+static int
+dpdk_mp_full(const struct rte_mempool *mp) OVS_REQUIRES(dpdk_mp_mutex)
+{
+    unsigned ring_count;
+    /* This logic is needed because rte_mempool_full() is not guaranteed to
+     * be atomic and mbufs could be moved from mempool cache --> mempool ring
+     * during the call. However, as no mbufs will be taken from the mempool
+     * at this time, we can work around it by also checking the ring entries
+     * separately and ensuring that they have not changed.
+     */
+    ring_count = rte_mempool_ops_get_count(mp);
+    if (rte_mempool_full(mp) && rte_mempool_ops_get_count(mp) == ring_count) {
+        return 1;
+    }
+
+    return 0;
+}
+
+/* Free unused mempools. */
+static void
+dpdk_mp_sweep(void)
+{
+    struct dpdk_mp *dmp, *next;
+
+    ovs_mutex_lock(&dpdk_mp_mutex);
+    LIST_FOR_EACH_SAFE (dmp, next, list_node, &dpdk_mp_free_list) {
+        if (dpdk_mp_full(dmp->mp)) {
+            VLOG_DBG("Freeing mempool \"%s\"", dmp->mp->name);
+            ovs_list_remove(&dmp->list_node);
+            rte_mempool_free(dmp->mp);
+            rte_free(dmp);
+        }
+    }
+    ovs_mutex_unlock(&dpdk_mp_mutex);
+}
+
+/* Ensure a mempool will not be freed. */
+static void
+dpdk_mp_do_not_free(struct rte_mempool *mp) OVS_REQUIRES(dpdk_mp_mutex)
+{
+    struct dpdk_mp *dmp, *next;
+
+    LIST_FOR_EACH_SAFE (dmp, next, list_node, &dpdk_mp_free_list) {
+        if (dmp->mp == mp) {
+            VLOG_DBG("Removing mempool \"%s\" from free list", dmp->mp->name);
+            ovs_list_remove(&dmp->list_node);
+            rte_free(dmp);
+            break;
+        }
+    }
+}
+
 /* Returns a valid pointer when either of the following is true:
  * - a new mempool was just created;
@@ -578,4 +640,6 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
         VLOG_DBG("A mempool with name \"%s\" already exists at %p.",
                  mp_name, mp);
+        /* Ensure this reused mempool will not be freed. */
+        dpdk_mp_do_not_free(mp);
     } else {
         VLOG_ERR("Failed mempool \"%s\" create request of %u mbufs",
@@ -590,5 +654,5 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
 
 /* Release an existing mempool. */
 static void
-dpdk_mp_free(struct rte_mempool *mp)
+dpdk_mp_release(struct rte_mempool *mp)
 {
     if (!mp) {
@@ -597,6 +661,16 @@ dpdk_mp_free(struct rte_mempool *mp)
 
     ovs_mutex_lock(&dpdk_mp_mutex);
-    VLOG_DBG("Releasing \"%s\" mempool", mp->name);
-    rte_mempool_free(mp);
+    if (dpdk_mp_full(mp)) {
+        VLOG_DBG("Freeing mempool \"%s\"", mp->name);
+        rte_mempool_free(mp);
+    } else {
+        struct dpdk_mp *dmp;
+
+        dmp = dpdk_rte_mzalloc(sizeof *dmp);
+        if (dmp) {
+            dmp->mp = mp;
+            ovs_list_push_back(&dpdk_mp_free_list, &dmp->list_node);
+        }
+    }
     ovs_mutex_unlock(&dpdk_mp_mutex);
 }
@@ -616,4 +690,6 @@ netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
     int ret = 0;
 
+    dpdk_mp_sweep();
+
     mp = dpdk_mp_create(dev, FRAME_LEN_TO_MTU(buf_size));
     if (!mp) {
@@ -628,5 +704,5 @@ netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
     if (dev->mp != mp) {
         /* A new mempool was created, release the previous one. */
-        dpdk_mp_free(dev->mp);
+        dpdk_mp_release(dev->mp);
     } else {
         ret = EEXIST;
@@ -1083,5 +1159,5 @@ common_destruct(struct netdev_dpdk *dev)
 {
     rte_free(dev->tx_q);
-    dpdk_mp_free(dev->mp);
+    dpdk_mp_release(dev->mp);
     ovs_list_remove(&dev->list_node);