Patch Detail

GET /api/patches/692500/?format=api
HTTP 200 OK
Allow: GET, PUT, PATCH, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
    "id": 692500,
    "url": "http://patchwork.ozlabs.org/api/patches/692500/?format=api",
    "web_url": "http://patchwork.ozlabs.org/project/intel-wired-lan/patch/1478639119-14656-11-git-send-email-bimmy.pujari@intel.com/",
    "project": {
        "id": 46,
        "url": "http://patchwork.ozlabs.org/api/projects/46/?format=api",
        "name": "Intel Wired Ethernet development",
        "link_name": "intel-wired-lan",
        "list_id": "intel-wired-lan.osuosl.org",
        "list_email": "intel-wired-lan@osuosl.org",
        "web_url": "",
        "scm_url": "",
        "webscm_url": "",
        "list_archive_url": "",
        "list_archive_url_format": "",
        "commit_url_format": ""
    },
    "msgid": "<1478639119-14656-11-git-send-email-bimmy.pujari@intel.com>",
    "list_archive_url": null,
    "date": "2016-11-08T21:05:14",
    "name": "[next,S52-V2,10/15] i40e: simplify txd use count calculation",
    "commit_ref": null,
    "pull_url": null,
    "state": "accepted",
    "archived": false,
    "hash": "d645457ad66e5b3d12ed4771255abe03117a5f4b",
    "submitter": {
        "id": 68919,
        "url": "http://patchwork.ozlabs.org/api/people/68919/?format=api",
        "name": "Pujari, Bimmy",
        "email": "bimmy.pujari@intel.com"
    },
    "delegate": {
        "id": 68,
        "url": "http://patchwork.ozlabs.org/api/users/68/?format=api",
        "username": "jtkirshe",
        "first_name": "Jeff",
        "last_name": "Kirsher",
        "email": "jeffrey.t.kirsher@intel.com"
    },
    "mbox": "http://patchwork.ozlabs.org/project/intel-wired-lan/patch/1478639119-14656-11-git-send-email-bimmy.pujari@intel.com/mbox/",
    "series": [],
    "comments": "http://patchwork.ozlabs.org/api/patches/692500/comments/",
    "check": "pending",
    "checks": "http://patchwork.ozlabs.org/api/patches/692500/checks/",
    "tags": {},
    "related": [],
    "headers": {
        "Return-Path": "<intel-wired-lan-bounces@lists.osuosl.org>",
        "X-Original-To": [
            "incoming@patchwork.ozlabs.org",
            "intel-wired-lan@lists.osuosl.org"
        ],
        "Delivered-To": [
            "patchwork-incoming@bilbo.ozlabs.org",
            "intel-wired-lan@lists.osuosl.org"
        ],
        "Received": [
            "from fraxinus.osuosl.org (smtp4.osuosl.org [140.211.166.137])\n\t(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3tD21p1GHsz9t1d\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed,  9 Nov 2016 08:07:09 +1100 (AEDT)",
            "from localhost (localhost [127.0.0.1])\n\tby fraxinus.osuosl.org (Postfix) with ESMTP id 0F7ABC1299;\n\tTue,  8 Nov 2016 21:07:08 +0000 (UTC)",
            "from fraxinus.osuosl.org ([127.0.0.1])\n\tby localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024)\n\twith ESMTP id WDQF5vSYgR-3; Tue,  8 Nov 2016 21:07:04 +0000 (UTC)",
            "from ash.osuosl.org (ash.osuosl.org [140.211.166.34])\n\tby fraxinus.osuosl.org (Postfix) with ESMTP id 7BF09C13E1;\n\tTue,  8 Nov 2016 21:07:03 +0000 (UTC)",
            "from silver.osuosl.org (smtp3.osuosl.org [140.211.166.136])\n\tby ash.osuosl.org (Postfix) with ESMTP id 588FB1C22F9\n\tfor <intel-wired-lan@lists.osuosl.org>;\n\tTue,  8 Nov 2016 21:07:00 +0000 (UTC)",
            "from localhost (localhost [127.0.0.1])\n\tby silver.osuosl.org (Postfix) with ESMTP id 4EF0F2E50D\n\tfor <intel-wired-lan@lists.osuosl.org>;\n\tTue,  8 Nov 2016 21:07:00 +0000 (UTC)",
            "from silver.osuosl.org ([127.0.0.1])\n\tby localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024)\n\twith ESMTP id Vf0Z+paE5ol9 for <intel-wired-lan@lists.osuosl.org>;\n\tTue,  8 Nov 2016 21:06:55 +0000 (UTC)",
            "from mga05.intel.com (mga05.intel.com [192.55.52.43])\n\tby silver.osuosl.org (Postfix) with ESMTPS id 38A9631BBA\n\tfor <intel-wired-lan@lists.osuosl.org>;\n\tTue,  8 Nov 2016 21:06:55 +0000 (UTC)",
            "from fmsmga002.fm.intel.com ([10.253.24.26])\n\tby fmsmga105.fm.intel.com with ESMTP; 08 Nov 2016 13:06:54 -0800",
            "from bimmy.jf.intel.com (HELO bimmy.linux1.jf.intel.com)\n\t([134.134.2.167])\n\tby fmsmga002.fm.intel.com with ESMTP; 08 Nov 2016 13:06:54 -0800"
        ],
        "X-Virus-Scanned": [
            "amavisd-new at osuosl.org",
            "amavisd-new at osuosl.org"
        ],
        "X-Greylist": "domain auto-whitelisted by SQLgrey-1.7.6",
        "X-ExtLoop1": "1",
        "X-IronPort-AV": "E=Sophos; i=\"5.31,611,1473145200\"; d=\"scan'208\";\n\ta=\"1082486418\"",
        "From": "Bimmy Pujari <bimmy.pujari@intel.com>",
        "To": "intel-wired-lan@lists.osuosl.org",
        "Date": "Tue,  8 Nov 2016 13:05:14 -0800",
        "Message-Id": "<1478639119-14656-11-git-send-email-bimmy.pujari@intel.com>",
        "X-Mailer": "git-send-email 2.4.11",
        "In-Reply-To": "<1478639119-14656-1-git-send-email-bimmy.pujari@intel.com>",
        "References": "<1478639119-14656-1-git-send-email-bimmy.pujari@intel.com>",
        "Subject": "[Intel-wired-lan] [next PATCH S52-V2 10/15] i40e: simplify txd use\n\tcount calculation",
        "X-BeenThere": "intel-wired-lan@lists.osuosl.org",
        "X-Mailman-Version": "2.1.18-1",
        "Precedence": "list",
        "List-Id": "Intel Wired Ethernet Linux Kernel Driver Development\n\t<intel-wired-lan.lists.osuosl.org>",
        "List-Unsubscribe": "<http://lists.osuosl.org/mailman/options/intel-wired-lan>, \n\t<mailto:intel-wired-lan-request@lists.osuosl.org?subject=unsubscribe>",
        "List-Archive": "<http://lists.osuosl.org/pipermail/intel-wired-lan/>",
        "List-Post": "<mailto:intel-wired-lan@lists.osuosl.org>",
        "List-Help": "<mailto:intel-wired-lan-request@lists.osuosl.org?subject=help>",
        "List-Subscribe": "<http://lists.osuosl.org/mailman/listinfo/intel-wired-lan>, \n\t<mailto:intel-wired-lan-request@lists.osuosl.org?subject=subscribe>",
        "MIME-Version": "1.0",
        "Content-Type": "text/plain; charset=\"us-ascii\"",
        "Content-Transfer-Encoding": "7bit",
        "Errors-To": "intel-wired-lan-bounces@lists.osuosl.org",
        "Sender": "\"Intel-wired-lan\" <intel-wired-lan-bounces@lists.osuosl.org>"
    },
    "content": "From: Mitch Williams <mitch.a.williams@intel.com>\n\nThe i40e_txd_use_count function was fast but confusing. In the comments,\nit even admits that it's ugly. So replace it with a new function that is\n(very) slightly faster and has extensive commenting to help the thicker\namong us (including the author, who will forget in a week) understand\nhow it works.\n\nSigned-off-by: Mitch Williams <mitch.a.williams@intel.com>\nSigned-off-by: Alexander Duyck <alexander.h.duyck@intel.com>\nChange-ID: Ifb533f13786a0bf39cb29f77969a5be2c83d9a87\n---\nTesting Hints : Lots and lots of TSO with various\ndata sizes. This should operate exactly as before, with no noticeable\nperformance change.\n\n drivers/net/ethernet/intel/i40e/i40e_txrx.h   | 45 +++++++++++++++++----------\n drivers/net/ethernet/intel/i40evf/i40e_txrx.h | 45 +++++++++++++++++----------\n 2 files changed, 56 insertions(+), 34 deletions(-)",
    "diff": "diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h\nindex de8550f..e065321 100644\n--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h\n+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h\n@@ -173,26 +173,37 @@ static inline bool i40e_test_staterr(union i40e_rx_desc *rx_desc,\n #define I40E_MAX_DATA_PER_TXD_ALIGNED \\\n \t(I40E_MAX_DATA_PER_TXD & ~(I40E_MAX_READ_REQ_SIZE - 1))\n \n-/* This ugly bit of math is equivalent to DIV_ROUNDUP(size, X) where X is\n- * the value I40E_MAX_DATA_PER_TXD_ALIGNED.  It is needed due to the fact\n- * that 12K is not a power of 2 and division is expensive.  It is used to\n- * approximate the number of descriptors used per linear buffer.  Note\n- * that this will overestimate in some cases as it doesn't account for the\n- * fact that we will add up to 4K - 1 in aligning the 12K buffer, however\n- * the error should not impact things much as large buffers usually mean\n- * we will use fewer descriptors then there are frags in an skb.\n+/**\n+ * i40e_txd_use_count  - estimate the number of descriptors needed for Tx\n+ * @size: transmit request size in bytes\n+ *\n+ * Due to hardware alignment restrictions (4K alignment), we need to\n+ * assume that we can have no more than 12K of data per descriptor, even\n+ * though each descriptor can take up to 16K - 1 bytes of aligned memory.\n+ * Thus, we need to divide by 12K. But division is slow! Instead,\n+ * we decompose the operation into shifts and one relatively cheap\n+ * multiply operation.\n+ *\n+ * To divide by 12K, we first divide by 4K, then divide by 3:\n+ *     To divide by 4K, shift right by 12 bits\n+ *     To divide by 3, multiply by 85, then divide by 256\n+ *     (Divide by 256 is done by shifting right by 8 bits)\n+ * Finally, we add one to round up. Because 256 isn't an exact multiple of\n+ * 3, we'll underestimate near each multiple of 12K. This is actually more\n+ * accurate as we have 4K - 1 of wiggle room that we can fit into the last\n+ * segment.  For our purposes this is accurate out to 1M which is orders of\n+ * magnitude greater than our largest possible GSO size.\n+ *\n+ * This would then be implemented as:\n+ *     return (((size >> 12) * 85) >> 8) + 1;\n+ *\n+ * Since multiplication and division are commutative, we can reorder\n+ * operations into:\n+ *     return ((size * 85) >> 20) + 1;\n  */\n static inline unsigned int i40e_txd_use_count(unsigned int size)\n {\n-\tconst unsigned int max = I40E_MAX_DATA_PER_TXD_ALIGNED;\n-\tconst unsigned int reciprocal = ((1ull << 32) - 1 + (max / 2)) / max;\n-\tunsigned int adjust = ~(u32)0;\n-\n-\t/* if we rounded up on the reciprocal pull down the adjustment */\n-\tif ((max * reciprocal) > adjust)\n-\t\tadjust = ~(u32)(reciprocal - 1);\n-\n-\treturn (u32)((((u64)size * reciprocal) + adjust) >> 32);\n+\treturn ((size * 85) >> 20) + 1;\n }\n \n /* Tx Descriptors needed, worst case */\ndiff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h\nindex a586e19..a5fc789 100644\n--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h\n+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h\n@@ -173,26 +173,37 @@ static inline bool i40e_test_staterr(union i40e_rx_desc *rx_desc,\n #define I40E_MAX_DATA_PER_TXD_ALIGNED \\\n \t(I40E_MAX_DATA_PER_TXD & ~(I40E_MAX_READ_REQ_SIZE - 1))\n \n-/* This ugly bit of math is equivalent to DIV_ROUNDUP(size, X) where X is\n- * the value I40E_MAX_DATA_PER_TXD_ALIGNED.  It is needed due to the fact\n- * that 12K is not a power of 2 and division is expensive.  It is used to\n- * approximate the number of descriptors used per linear buffer.  Note\n- * that this will overestimate in some cases as it doesn't account for the\n- * fact that we will add up to 4K - 1 in aligning the 12K buffer, however\n- * the error should not impact things much as large buffers usually mean\n- * we will use fewer descriptors then there are frags in an skb.\n+/**\n+ * i40e_txd_use_count  - estimate the number of descriptors needed for Tx\n+ * @size: transmit request size in bytes\n+ *\n+ * Due to hardware alignment restrictions (4K alignment), we need to\n+ * assume that we can have no more than 12K of data per descriptor, even\n+ * though each descriptor can take up to 16K - 1 bytes of aligned memory.\n+ * Thus, we need to divide by 12K. But division is slow! Instead,\n+ * we decompose the operation into shifts and one relatively cheap\n+ * multiply operation.\n+ *\n+ * To divide by 12K, we first divide by 4K, then divide by 3:\n+ *     To divide by 4K, shift right by 12 bits\n+ *     To divide by 3, multiply by 85, then divide by 256\n+ *     (Divide by 256 is done by shifting right by 8 bits)\n+ * Finally, we add one to round up. Because 256 isn't an exact multiple of\n+ * 3, we'll underestimate near each multiple of 12K. This is actually more\n+ * accurate as we have 4K - 1 of wiggle room that we can fit into the last\n+ * segment.  For our purposes this is accurate out to 1M which is orders of\n+ * magnitude greater than our largest possible GSO size.\n+ *\n+ * This would then be implemented as:\n+ *     return (((size >> 12) * 85) >> 8) + 1;\n+ *\n+ * Since multiplication and division are commutative, we can reorder\n+ * operations into:\n+ *     return ((size * 85) >> 20) + 1;\n  */\n static inline unsigned int i40e_txd_use_count(unsigned int size)\n {\n-\tconst unsigned int max = I40E_MAX_DATA_PER_TXD_ALIGNED;\n-\tconst unsigned int reciprocal = ((1ull << 32) - 1 + (max / 2)) / max;\n-\tunsigned int adjust = ~(u32)0;\n-\n-\t/* if we rounded up on the reciprocal pull down the adjustment */\n-\tif ((max * reciprocal) > adjust)\n-\t\tadjust = ~(u32)(reciprocal - 1);\n-\n-\treturn (u32)((((u64)size * reciprocal) + adjust) >> 32);\n+\treturn ((size * 85) >> 20) + 1;\n }\n \n /* Tx Descriptors needed, worst case */\n",
    "prefixes": [
        "next",
        "S52-V2",
        "10/15"
    ]
}
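
The kernel-doc comment in the diff above describes replacing a divide by 12K with a multiply by 85 and a right shift by 20. The following is a minimal user-space sketch of that arithmetic, not the driver code itself: the names txd_use_count, div_round_up_12k and MAX_DATA_PER_TXD_ALIGNED_SKETCH are hypothetical stand-ins, and the 12K constant is taken from the comment rather than from the driver headers. It may help to see where the estimate and an exact ceiling division agree and where they differ.

```c
#include <assert.h>

/* Assumption: 12K of data per descriptor, as stated in the patch comment. */
#define MAX_DATA_PER_TXD_ALIGNED_SKETCH (12u * 1024u)

/* Same expression as the patched function: (size * 85) >> 20, plus one. */
static unsigned int txd_use_count(unsigned int size)
{
	return ((size * 85) >> 20) + 1;
}

/* Exact ceiling division by 12K, included only for comparison. */
static unsigned int div_round_up_12k(unsigned int size)
{
	return (size + MAX_DATA_PER_TXD_ALIGNED_SKETCH - 1) /
	       MAX_DATA_PER_TXD_ALIGNED_SKETCH;
}

/* Spot checks worked out by hand for this sketch. */
static void check_examples(void)
{
	/* Up to 12K both methods agree on a single descriptor. */
	assert(txd_use_count(12288) == 1);
	assert(div_round_up_12k(12288) == 1);

	/* Just past 12K the estimate stays at 1 while the exact value is 2;
	 * this is the underestimate the comment attributes to 85/256 being
	 * slightly less than 1/3. */
	assert(txd_use_count(12336) == 1);
	assert(div_round_up_12k(12336) == 2);

	/* The estimate steps to 2 at 12337 bytes. */
	assert(txd_use_count(12337) == 2);

	/* A 64K request comes out as 6 descriptors either way. */
	assert(txd_use_count(65536) == 6);
	assert(div_round_up_12k(65536) == 6);
}
```

Calling check_examples() from a small main() is enough to exercise the sketch. The gap between 12289 and 12336 bytes is 48 bytes wide, well inside the 4K - 1 of slack the comment mentions for the last segment.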