From patchwork Wed Oct 10 16:22:23 2018
X-Patchwork-Submitter: "Lam, Tiago"
X-Patchwork-Id: 981986
X-Patchwork-Delegate: ian.stokes@intel.com
From: Tiago Lam
To: ovs-dev@openvswitch.org
Cc: Marcin Ksiadz, Przemyslaw Lal, fbl@sysclose.org, Michael Qiu,
 i.maximets@samsung.com
Date: Wed, 10 Oct 2018 17:22:23 +0100
Message-Id: <1539188552-129083-6-git-send-email-tiago.lam@intel.com>
In-Reply-To: <1539188552-129083-1-git-send-email-tiago.lam@intel.com>
References: <1539188552-129083-1-git-send-email-tiago.lam@intel.com>
Subject: [ovs-dev] [PATCH v11 05/14] dp-packet: Fix data_len handling
 multi-seg mbufs.

When a dp_packet comes from a DPDK source and contains multi-segment
mbufs, data_len is not equal to the packet size, pkt_len. Instead, the
data_len of each mbuf in the chain has to be considered when
distributing the new (provided) size.

To account for this, dp_packet_set_size() has been changed so that, in
the multi-segment mbufs case, only the data_len of the last mbuf in the
chain and the total size of the packet, pkt_len, are changed. The
data_len of the intermediate mbufs preceding the last mbuf is not
changed by dp_packet_set_size(). Furthermore, in some cases
dp_packet_set_size() may be used to set a size smaller than the current
packet size, effectively trimming the end of the packet. In the
multi-segment mbufs case this may leave lingering mbufs that need
freeing.

__packet_set_data() now also updates an mbuf's data_len after setting
the data offset, so that both fields are always in sync for each mbuf
in a chain.

Co-authored-by: Michael Qiu
Co-authored-by: Mark Kavanagh
Co-authored-by: Przemyslaw Lal
Co-authored-by: Marcin Ksiadz
Co-authored-by: Yuanhan Liu
Signed-off-by: Michael Qiu
Signed-off-by: Mark Kavanagh
Signed-off-by: Przemyslaw Lal
Signed-off-by: Marcin Ksiadz
Signed-off-by: Yuanhan Liu
Signed-off-by: Tiago Lam
Acked-by: Eelco Chaudron
---
 lib/dp-packet.h | 83 ++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 12 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 6376039..223efe2 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -426,20 +426,60 @@ dp_packet_size(const struct dp_packet *b)
     return b->mbuf.pkt_len;
 }
 
+/* Sets the size of the packet 'b' to 'v'. For non-DPDK packets this only means
+ * setting b->size_, but if used in a DPDK packet it means adjusting the first
+ * mbuf pkt_len and last mbuf data_len, to reflect the real size, which can
+ * lead to free'ing tail mbufs that are no longer used.
+ *
+ * This function should be used for setting the size only, and if there's an
+ * assumption that the tail end of 'b' will be trimmed. For adjusting the head
+ * 'end' of 'b', dp_packet_pull() should be used instead. */
 static inline void
 dp_packet_set_size(struct dp_packet *b, uint32_t v)
 {
-    /* netdev-dpdk does not currently support segmentation; consequently, for
-     * all intents and purposes, 'data_len' (16 bit) and 'pkt_len' (32 bit) may
-     * be used interchangably.
-     *
-     * On the datapath, it is expected that the size of packets
-     * (and thus 'v') will always be <= UINT16_MAX; this means that there is no
-     * loss of accuracy in assigning 'v' to 'data_len'.
-     */
-    b->mbuf.data_len = (uint16_t)v;  /* Current seg length. */
-    b->mbuf.pkt_len = v;             /* Total length of all segments linked to
-                                      * this segment. */
+    if (b->source == DPBUF_DPDK) {
+        struct rte_mbuf *mbuf = &b->mbuf;
+        uint16_t new_len = v;
+        uint16_t data_len;
+        uint16_t nb_segs = 0;
+        uint16_t pkt_len = 0;
+
+        /* Trim 'v' length bytes from the end of the chained buffers, freeing
+           any buffers that may be left floating */
+        while (mbuf) {
+            data_len = MIN(new_len, mbuf->data_len);
+            mbuf->data_len = data_len;
+
+            if (new_len - data_len <= 0) {
+                /* Free the rest of chained mbufs */
+                free_dpdk_buf(CONTAINER_OF(mbuf->next, struct dp_packet,
+                                           mbuf));
+                mbuf->next = NULL;
+            } else if (!mbuf->next) {
+                /* Don't assign more than what we have available */
+                mbuf->data_len = MIN(new_len,
+                                     mbuf->buf_len - mbuf->data_off);
+            }
+
+            new_len -= data_len;
+            nb_segs += 1;
+            pkt_len += mbuf->data_len;
+            mbuf = mbuf->next;
+        }
+
+        /* pkt_len != v would effectively mean that pkt_len < than 'v' (as
+         * being bigger is logically impossible). Being < than 'v' would mean
+         * the 'v' provided was bigger than the available room, which is the
+         * responsibility of the caller to make sure there is enough room */
+        ovs_assert(pkt_len == v);
+
+        b->mbuf.nb_segs = nb_segs;
+        b->mbuf.pkt_len = pkt_len;
+    } else {
+        b->mbuf.data_len = v;
+        /* Total length of all segments linked to this segment. */
+        b->mbuf.pkt_len = v;
+    }
 }
 
 static inline uint16_t
@@ -451,7 +491,26 @@ __packet_data(const struct dp_packet *b)
 static inline void
 __packet_set_data(struct dp_packet *b, uint16_t v)
 {
-    b->mbuf.data_off = v;
+    if (b->source == DPBUF_DPDK) {
+        /* Moving data_off away from the first mbuf in the chain is not a
+         * possibility using DPBUF_DPDK dp_packets */
+        ovs_assert(v == UINT16_MAX || v <= b->mbuf.buf_len);
+
+        uint16_t prev_ofs = b->mbuf.data_off;
+        b->mbuf.data_off = v;
+        int16_t ofs_diff = prev_ofs - b->mbuf.data_off;
+
+        /* When dealing with DPDK mbufs, keep data_off and data_len in sync.
+         * Thus, update data_len if the length changes with the move of
+         * data_off. However, if data_len is 0, there's no data to move and
+         * data_len should remain 0. */
+
+        if (b->mbuf.data_len != 0) {
+            b->mbuf.data_len += ofs_diff;
+        }
+    } else {
+        b->mbuf.data_off = v;
+    }
 }
 
 static inline uint16_t
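
For readers without a DPDK build at hand, the behaviour described in the
commit message can be modelled with a small standalone sketch in plain C.
It is not OVS or DPDK code: struct seg, seg_new(), seg_free_chain(),
chain_pkt_len(), chain_nb_segs(), chain_set_size() and seg_set_data_off()
are hypothetical names made up for illustration, and struct seg only
mirrors the rte_mbuf fields the patch touches. Unlike the patch, the
sketch returns -1 instead of asserting when the requested size does not
fit, and it leaves out the UINT16_MAX value that __packet_set_data()
accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one segment of an mbuf chain (not rte_mbuf).
 * The sketch assumes data_off + data_len <= buf_len for every segment. */
struct seg {
    uint16_t buf_len;   /* Capacity of this segment's buffer. */
    uint16_t data_off;  /* Offset of the first data byte in the buffer. */
    uint16_t data_len;  /* Bytes of data held by this segment. */
    struct seg *next;   /* Next segment in the chain, or NULL. */
};

/* Allocates a single unlinked segment; returns NULL if malloc() fails. */
static struct seg *
seg_new(uint16_t buf_len, uint16_t data_off, uint16_t data_len)
{
    struct seg *s = malloc(sizeof *s);
    if (s) {
        s->buf_len = buf_len;
        s->data_off = data_off;
        s->data_len = data_len;
        s->next = NULL;
    }
    return s;
}

/* Frees 's' and every segment linked after it. */
static void
seg_free_chain(struct seg *s)
{
    while (s) {
        struct seg *next = s->next;
        free(s);
        s = next;
    }
}

/* Total packet length: the sum of data_len over the whole chain. */
static uint32_t
chain_pkt_len(const struct seg *head)
{
    uint32_t len = 0;
    for (; head; head = head->next) {
        len += head->data_len;
    }
    return len;
}

/* Number of segments in the chain. */
static uint16_t
chain_nb_segs(const struct seg *head)
{
    uint16_t n = 0;
    for (; head; head = head->next) {
        n++;
    }
    return n;
}

/* Sets the total size of the chain to 'size'.  Intermediate segments keep
 * their data_len; only the segment that ends up last is adjusted, and any
 * segments after it are freed.  Returns 0 on success, or -1 (leaving the
 * chain untouched) if 'size' exceeds the room available. */
static int
chain_set_size(struct seg *head, uint32_t size)
{
    uint32_t room = 0;
    uint32_t left = size;
    struct seg *s;

    /* Only the last segment may grow, up to buf_len - data_off. */
    for (s = head; s; s = s->next) {
        room += s->next ? s->data_len : (uint32_t) (s->buf_len - s->data_off);
    }
    if (size > room) {
        return -1;
    }

    for (s = head; s; s = s->next) {
        if (left <= s->data_len || !s->next) {
            s->data_len = (uint16_t) left;  /* 's' is now the last segment. */
            seg_free_chain(s->next);
            s->next = NULL;
            break;
        }
        left -= s->data_len;
    }
    return 0;
}

/* Moves the data offset of 's' to 'off' and adjusts data_len so that the
 * end of the data stays in place.  An empty segment stays empty. */
static void
seg_set_data_off(struct seg *s, uint16_t off)
{
    assert(off <= s->buf_len);
    if (s->data_len != 0) {
        assert(off <= s->data_off + s->data_len);
        s->data_len = (uint16_t) (s->data_len + s->data_off - off);
    }
    s->data_off = off;
}
```

With three 100-byte segments, chain_set_size(head, 150) would leave two
segments holding 100 and 50 bytes and free the third. Moving the first
segment's offset from 0 to 14 with seg_set_data_off() would shrink its
data_len from 100 to 86.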