Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
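A minimal sketch of the read-only `get` operation, assuming the API root `http://patchwork.ozlabs.org/api` seen in the sample response on this page. The helper names (`patch_url`, `build_request`, `show_patch`) are hypothetical and not part of Patchwork. The sketch asks for JSON through an `Accept` header rather than a `format` query parameter. The `patch` and `put` operations are not shown.

```python
import json
import urllib.request

# Assumption: API root taken from the "url" fields in the sample response.
BASE = "http://patchwork.ozlabs.org/api"


def patch_url(patch_id):
    """Return the detail URL for one patch (hypothetical helper)."""
    return f"{BASE}/patches/{patch_id}/"


def build_request(patch_id):
    """Build a GET request that asks the server for a JSON body."""
    return urllib.request.Request(
        patch_url(patch_id),
        headers={"Accept": "application/json"},
        method="GET",
    )


def show_patch(patch_id, timeout=10.0):
    """Fetch one patch and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(build_request(patch_id), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `show_patch(817322)` should return a dictionary with the same keys as the response shown on this page, such as `state`, `series`, and `diff`, provided the server is reachable.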
GET /api/patches/817322/?format=api
{ "id": 817322, "url": "http://patchwork.ozlabs.org/api/patches/817322/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linuxppc-dev/patch/1505950480-14830-3-git-send-email-wei.guo.simon@gmail.com/", "project": { "id": 2, "url": "http://patchwork.ozlabs.org/api/projects/2/?format=api", "name": "Linux PPC development", "link_name": "linuxppc-dev", "list_id": "linuxppc-dev.lists.ozlabs.org", "list_email": "linuxppc-dev@lists.ozlabs.org", "web_url": "https://github.com/linuxppc/wiki/wiki", "scm_url": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git", "webscm_url": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/", "list_archive_url": "https://lore.kernel.org/linuxppc-dev/", "list_archive_url_format": "https://lore.kernel.org/linuxppc-dev/{}/", "commit_url_format": "https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id={}" }, "msgid": "<1505950480-14830-3-git-send-email-wei.guo.simon@gmail.com>", "list_archive_url": "https://lore.kernel.org/linuxppc-dev/1505950480-14830-3-git-send-email-wei.guo.simon@gmail.com/", "date": "2017-09-20T23:34:39", "name": "[v2,2/3] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision", "commit_ref": null, "pull_url": null, "state": "superseded", "archived": true, "hash": "8452b4f4059370944a05d975c399c87f8ca40d23", "submitter": { "id": 68632, "url": "http://patchwork.ozlabs.org/api/people/68632/?format=api", "name": "Simon Guo", "email": "wei.guo.simon@gmail.com" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/linuxppc-dev/patch/1505950480-14830-3-git-send-email-wei.guo.simon@gmail.com/mbox/", "series": [ { "id": 4540, "url": "http://patchwork.ozlabs.org/api/series/4540/?format=api", "web_url": "http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=4540", "date": "2017-09-20T23:34:38", "name": "powerpc/64: memcmp() optimization", "version": 2, "mbox": "http://patchwork.ozlabs.org/series/4540/mbox/" } ], "comments": 
"http://patchwork.ozlabs.org/api/patches/817322/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/817322/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>", "X-Original-To": [ "patchwork-incoming@ozlabs.org", "linuxppc-dev@lists.ozlabs.org" ], "Delivered-To": [ "patchwork-incoming@ozlabs.org", "linuxppc-dev@lists.ozlabs.org" ], "Received": [ "from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xz2TB4vvsz9sRW\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 22 Sep 2017 15:43:26 +1000 (AEST)", "from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3xz2TB3nPwzDsMB\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 22 Sep 2017 15:43:26 +1000 (AEST)", "from mail-pg0-x244.google.com (mail-pg0-x244.google.com\n\t[IPv6:2607:f8b0:400e:c05::244])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3xz2My0QsVzDsN0\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tFri, 22 Sep 2017 15:38:53 +1000 (AEST)", "by mail-pg0-x244.google.com with SMTP id i130so86197pgc.0\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tThu, 21 Sep 2017 22:38:53 -0700 (PDT)", "from simonLocalRHEL7.x64 ([112.73.6.48])\n\tby smtp.gmail.com with ESMTPSA id\n\tr12sm6234639pfd.187.2017.09.21.22.38.48\n\t(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\n\tThu, 21 Sep 2017 22:38:51 -0700 (PDT)" ], "Authentication-Results": [ "ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"R4Nn8pPh\"; dkim-atps=neutral", "lists.ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" 
(2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"R4Nn8pPh\"; dkim-atps=neutral", "ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=gmail.com\n\t(client-ip=2607:f8b0:400e:c05::244; helo=mail-pg0-x244.google.com;\n\tenvelope-from=wei.guo.simon@gmail.com; receiver=<UNKNOWN>)", "lists.ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"R4Nn8pPh\"; dkim-atps=neutral" ], "DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;\n\th=from:to:cc:subject:date:message-id:in-reply-to:references;\n\tbh=RqGkfPYpj0WDlieHPH/XpVhhwhDfY/eypKzSzxABFys=;\n\tb=R4Nn8pPhsvrD6jZgKzjM8I0mocb83ZP4/R1yT5Egbo5jgPEHkrpKENumX99PvfkvTN\n\t/4c+eLFwnSYOnyTx7exKMdMDZ3am6khR95xL+JIMiDiqIYgLaT6Z+ExhhjmV84RS2tzG\n\tg1n9xBTQBpnIGEtwPWzvmD6GmBlF8TBojnLrLIej4E2Q22xVrnEtNqzcWyYlkuKAHwRt\n\tlTTl4pVxMnM6o0xC/uyW1dB0cawbrpHICtIFXN5H4sIm34tVgz7oCezimr2oJ6yDrbsi\n\t7OgmfEpjjq1eR5gwPC6GuX9FxdAJjiUsvn6BVfp8nz1HEhWLtGFi7f2lf8ZrIBU6b03Z\n\tHknQ==", "X-Google-DKIM-Signature": "v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to\n\t:references;\n\tbh=RqGkfPYpj0WDlieHPH/XpVhhwhDfY/eypKzSzxABFys=;\n\tb=hCAeVP4py/xf0WBD04yRqblivA0OuuHyStsES/RtFWWgxFnH5o//l0nM5fDvkTmnex\n\tgLlJzwux4n1Y1sZMKGsarKvCvpDyQrO343fiV5wh4HcKfw1bo008DwB585dduL1hy5H8\n\tiPaZy0R0kFjLOGhJKZVIdyj+zNQ8NFTbR3YzxNcYZhJMfPMOnP9S0rlNyRRiqkaT0wfc\n\tuyzO3nITID0RkS1aqIc9Czj/lWixT46kDbIj9sA9GTtETI+ibqlG1h4OIWn5nzBeEn/y\n\tRCs+Wmq7qG2qcmK/VGfOzqae2eK0k0D8OFqu3LN209Q9xnekEenLE//a4kfc2agHj88q\n\t0C3Q==", "X-Gm-Message-State": "AHPjjUhMZGvD+47V0rmF4Al0pprBvEUz/BFKpgIyM8BWBK5Z70/ueGN7\n\t+N+4+JGvWjk1VP9JgJKOyUtM3Q==", "X-Google-Smtp-Source": "AOwi7QBpfaqxxLjBmwZ5t7AmjVM1lBsg+aXfN1x/t2vtYz/dHVA2LKU9e3rtUUShXd2h4Qao9YZhVA==", "X-Received": "by 10.99.188.25 with SMTP id q25mr6076227pge.54.1506058731717;\n\tThu, 21 Sep 2017 22:38:51 -0700 (PDT)", "From": 
"wei.guo.simon@gmail.com", "To": "linuxppc-dev@lists.ozlabs.org", "Subject": "[PATCH v2 2/3] powerpc/64: enhance memcmp() with VMX instruction for\n\tlong bytes comparision", "Date": "Thu, 21 Sep 2017 07:34:39 +0800", "Message-Id": "<1505950480-14830-3-git-send-email-wei.guo.simon@gmail.com>", "X-Mailer": "git-send-email 1.8.3.1", "In-Reply-To": "<1505950480-14830-1-git-send-email-wei.guo.simon@gmail.com>", "References": "<1505950480-14830-1-git-send-email-wei.guo.simon@gmail.com>", "X-BeenThere": "linuxppc-dev@lists.ozlabs.org", "X-Mailman-Version": "2.1.24", "Precedence": "list", "List-Id": "Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>", "List-Unsubscribe": "<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>", "List-Archive": "<http://lists.ozlabs.org/pipermail/linuxppc-dev/>", "List-Post": "<mailto:linuxppc-dev@lists.ozlabs.org>", "List-Help": "<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>", "List-Subscribe": "<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>", "Cc": "Simon Guo <wei.guo.simon@gmail.com>,\n\tDavid Laight <David.Laight@ACULAB.COM>, \n\t\"Naveen N. 
Rao\" <naveen.n.rao@linux.vnet.ibm.com>", "Errors-To": "linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org", "Sender": "\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>" }, "content": "From: Simon Guo <wei.guo.simon@gmail.com>\n\nThis patch add VMX primitives to do memcmp() in case the compare size\nexceeds 4K bytes.\n\nTest result with following test program(replace the \"^>\" with \"\"):\n------\n># cat tools/testing/selftests/powerpc/stringloops/memcmp.c\n>#include <malloc.h>\n>#include <stdlib.h>\n>#include <string.h>\n>#include <time.h>\n>#include \"utils.h\"\n>#define SIZE (1024 * 1024 * 900)\n>#define ITERATIONS 40\n\nint test_memcmp(const void *s1, const void *s2, size_t n);\n\nstatic int testcase(void)\n{\n char *s1;\n char *s2;\n unsigned long i;\n\n s1 = memalign(128, SIZE);\n if (!s1) {\n perror(\"memalign\");\n exit(1);\n }\n\n s2 = memalign(128, SIZE);\n if (!s2) {\n perror(\"memalign\");\n exit(1);\n }\n\n for (i = 0; i < SIZE; i++) {\n s1[i] = i & 0xff;\n s2[i] = i & 0xff;\n }\n for (i = 0; i < ITERATIONS; i++) {\n\t\tint ret = test_memcmp(s1, s2, SIZE);\n\n\t\tif (ret) {\n\t\t\tprintf(\"return %d at[%ld]! 
should have returned zero\\n\", ret, i);\n\t\t\tabort();\n\t\t}\n\t}\n\n return 0;\n}\n\nint main(void)\n{\n return test_harness(testcase, \"memcmp\");\n}\n------\nWithout VMX patch:\n 7.435191479 seconds time elapsed ( +- 0.51% )\nWith VMX patch:\n 6.802038938 seconds time elapsed ( +- 0.56% )\n\t\tThere is ~+8% improvement.\n\nHowever I am not aware whether there is use case in kernel for memcmp on\nlarge size yet.\n\nSigned-off-by: Simon Guo <wei.guo.simon@gmail.com>\n---\n arch/powerpc/include/asm/asm-prototypes.h | 2 +-\n arch/powerpc/lib/copypage_power7.S | 2 +-\n arch/powerpc/lib/memcmp_64.S | 82 +++++++++++++++++++++++++++++++\n arch/powerpc/lib/memcpy_power7.S | 2 +-\n arch/powerpc/lib/vmx-helper.c | 2 +-\n 5 files changed, 86 insertions(+), 4 deletions(-)", "diff": "diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h\nindex 7330150..e6530d8 100644\n--- a/arch/powerpc/include/asm/asm-prototypes.h\n+++ b/arch/powerpc/include/asm/asm-prototypes.h\n@@ -49,7 +49,7 @@ void __trace_hcall_exit(long opcode, unsigned long retval,\n /* VMX copying */\n int enter_vmx_usercopy(void);\n int exit_vmx_usercopy(void);\n-int enter_vmx_copy(void);\n+int enter_vmx_ops(void);\n void * exit_vmx_copy(void *dest);\n \n /* Traps */\ndiff --git a/arch/powerpc/lib/copypage_power7.S b/arch/powerpc/lib/copypage_power7.S\nindex ca5fc8f..9e7729e 100644\n--- a/arch/powerpc/lib/copypage_power7.S\n+++ b/arch/powerpc/lib/copypage_power7.S\n@@ -60,7 +60,7 @@ _GLOBAL(copypage_power7)\n \tstd\tr4,-STACKFRAMESIZE+STK_REG(R30)(r1)\n \tstd\tr0,16(r1)\n \tstdu\tr1,-STACKFRAMESIZE(r1)\n-\tbl\tenter_vmx_copy\n+\tbl\tenter_vmx_ops\n \tcmpwi\tr3,0\n \tld\tr0,STACKFRAMESIZE+16(r1)\n \tld\tr3,STK_REG(R31)(r1)\ndiff --git a/arch/powerpc/lib/memcmp_64.S b/arch/powerpc/lib/memcmp_64.S\nindex 6dccfb8..40218fc 100644\n--- a/arch/powerpc/lib/memcmp_64.S\n+++ b/arch/powerpc/lib/memcmp_64.S\n@@ -162,6 +162,13 @@ _GLOBAL(memcmp)\n \tblr\n \n .Llong:\n+#ifdef 
CONFIG_ALTIVEC\n+\t/* Try to use vmx loop if length is larger than 4K */\n+\tcmpldi cr6,r5,4096\n+\tbgt\tcr6,.Lvmx_cmp\n+\n+.Llong_novmx_cmp:\n+#endif\n \tli\toff8,8\n \tli\toff16,16\n \tli\toff24,24\n@@ -319,4 +326,79 @@ _GLOBAL(memcmp)\n 8:\n \tblr\n \n+#ifdef CONFIG_ALTIVEC\n+.Lvmx_cmp:\n+\tmflr r0\n+\tstd r3,-STACKFRAMESIZE+STK_REG(R31)(r1)\n+\tstd r4,-STACKFRAMESIZE+STK_REG(R30)(r1)\n+\tstd r5,-STACKFRAMESIZE+STK_REG(R29)(r1)\n+\tstd r0,16(r1)\n+\tstdu r1,-STACKFRAMESIZE(r1)\n+\tbl enter_vmx_ops\n+\tcmpwi cr1,r3,0\n+\tld r0,STACKFRAMESIZE+16(r1)\n+\tld r3,STK_REG(R31)(r1)\n+\tld r4,STK_REG(R30)(r1)\n+\tld r5,STK_REG(R29)(r1)\n+\taddi\tr1,r1,STACKFRAMESIZE\n+\tmtlr r0\n+\tbeq cr1,.Llong_novmx_cmp\n+\n+3:\n+\t/* Enter with src/dst address 8 bytes aligned, and len is\n+\t * no less than 4KB. Need to align with 16 bytes further.\n+\t */\n+\tandi.\trA,r3,8\n+\tbeq\t4f\n+\tLD\trA,0,r3\n+\tLD\trB,0,r4\n+\tcmpld\tcr0,rA,rB\n+\tbne\tcr0,.LcmpAB_lightweight\n+\n+\taddi\tr3,r3,8\n+\taddi\tr4,r4,8\n+\taddi\tr5,r5,-8\n+\n+4:\n+\t/* compare 32 bytes for each loop */\n+\tsrdi\tr0,r5,5\n+\tmtctr\tr0\n+\tandi.\tr5,r5,31\n+\tli\toff16,16\n+\n+.balign 16\n+5:\n+\tlvx \tv0,0,r3\n+\tlvx \tv1,0,r4\n+\tvcmpequd. v0,v0,v1\n+\tbf\t24,7f\n+\tlvx \tv0,off16,r3\n+\tlvx \tv1,off16,r4\n+\tvcmpequd. 
v0,v0,v1\n+\tbf\t24,6f\n+\taddi\tr3,r3,32\n+\taddi\tr4,r4,32\n+\tbdnz\t5b\n+\n+\tcmpdi\tr5,0\n+\tbeq\t.Lzero\n+\tb\t.L8bytes_aligned\n+\n+6:\n+\taddi\tr3,r3,16\n+\taddi\tr4,r4,16\n+\n+7:\n+\tLD\trA,0,r3\n+\tLD\trB,0,r4\n+\tcmpld\tcr0,rA,rB\n+\tbne\tcr0,.LcmpAB_lightweight\n+\n+\tli\toff8,8\n+\tLD\trA,off8,r3\n+\tLD\trB,off8,r4\n+\tcmpld\tcr0,rA,rB\n+\tbne\tcr0,.LcmpAB_lightweight\n+\tb\t.Lzero\n+#endif\n EXPORT_SYMBOL(memcmp)\ndiff --git a/arch/powerpc/lib/memcpy_power7.S b/arch/powerpc/lib/memcpy_power7.S\nindex 193909a..682e386 100644\n--- a/arch/powerpc/lib/memcpy_power7.S\n+++ b/arch/powerpc/lib/memcpy_power7.S\n@@ -230,7 +230,7 @@ _GLOBAL(memcpy_power7)\n \tstd\tr5,-STACKFRAMESIZE+STK_REG(R29)(r1)\n \tstd\tr0,16(r1)\n \tstdu\tr1,-STACKFRAMESIZE(r1)\n-\tbl\tenter_vmx_copy\n+\tbl\tenter_vmx_ops\n \tcmpwi\tcr1,r3,0\n \tld\tr0,STACKFRAMESIZE+16(r1)\n \tld\tr3,STK_REG(R31)(r1)\ndiff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c\nindex bf925cd..923a9ab 100644\n--- a/arch/powerpc/lib/vmx-helper.c\n+++ b/arch/powerpc/lib/vmx-helper.c\n@@ -53,7 +53,7 @@ int exit_vmx_usercopy(void)\n \treturn 0;\n }\n \n-int enter_vmx_copy(void)\n+int enter_vmx_ops(void)\n {\n \tif (in_interrupt())\n \t\treturn 0;\n", "prefixes": [ "v2", "2/3" ] }
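The response above mixes tracking metadata (`state`, `archived`, `check`, `series`, `prefixes`) with the raw email. As a rough illustration of reading the metadata, the sketch below works on a trimmed copy of that response. `summarize_patch` is a hypothetical helper, and treating a prefix of the form `N/M` as the patch's position in its series is an assumption based on the `"2/3"` value seen here.

```python
import json
import re

# Trimmed copy of the response above; only the fields used below are kept.
SAMPLE = json.loads("""
{
  "id": 817322,
  "state": "superseded",
  "archived": true,
  "check": "pending",
  "series": [
    {"id": 4540, "name": "powerpc/64: memcmp() optimization", "version": 2}
  ],
  "prefixes": ["v2", "2/3"]
}
""")


def summarize_patch(patch):
    """Collect the tracking fields of a patch response (hypothetical helper)."""
    series = patch["series"][0] if patch["series"] else None
    position = None
    for prefix in patch["prefixes"]:
        # Assumption: a prefix shaped like "2/3" means patch 2 of 3.
        match = re.fullmatch(r"(\d+)/(\d+)", prefix)
        if match:
            position = (int(match.group(1)), int(match.group(2)))
            break
    return {
        "id": patch["id"],
        "state": patch["state"],
        "archived": patch["archived"],
        "check": patch["check"],
        "series_id": series["id"] if series else None,
        "series_version": series["version"] if series else None,
        "position": position,
    }


summary = summarize_patch(SAMPLE)
print(summary)
```

For this response the summary reports a superseded, archived patch at position 2 of 3 in version 2 of series 4540.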