Patch Detail
get:
Show a patch.
patch:
Update a patch.
put:
Update a patch.
GET /api/patches/810508/?format=api
{ "id": 810508, "url": "http://patchwork.ozlabs.org/api/patches/810508/?format=api", "web_url": "http://patchwork.ozlabs.org/project/gcc/patch/07fc21e1-8747-eca0-4dde-4f364ef1a414@foss.arm.com/", "project": { "id": 17, "url": "http://patchwork.ozlabs.org/api/projects/17/?format=api", "name": "GNU Compiler Collection", "link_name": "gcc", "list_id": "gcc-patches.gcc.gnu.org", "list_email": "gcc-patches@gcc.gnu.org", "web_url": null, "scm_url": null, "webscm_url": null, "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<07fc21e1-8747-eca0-4dde-4f364ef1a414@foss.arm.com>", "list_archive_url": null, "date": "2017-09-06T10:52:31", "name": "[AArch64] Merge stores of D register values of different modes", "commit_ref": null, "pull_url": null, "state": "new", "archived": false, "hash": "d58e9842462812bbade719095c684badd15799e5", "submitter": { "id": 71950, "url": "http://patchwork.ozlabs.org/api/people/71950/?format=api", "name": "Jackson Woodruff", "email": "jackson.woodruff@foss.arm.com" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/gcc/patch/07fc21e1-8747-eca0-4dde-4f364ef1a414@foss.arm.com/mbox/", "series": [ { "id": 1763, "url": "http://patchwork.ozlabs.org/api/series/1763/?format=api", "web_url": "http://patchwork.ozlabs.org/project/gcc/list/?series=1763", "date": "2017-09-06T10:52:31", "name": "[AArch64] Merge stores of D register values of different modes", "version": 1, "mbox": "http://patchwork.ozlabs.org/series/1763/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/patches/810508/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/810508/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<gcc-patches-return-461584-incoming=patchwork.ozlabs.org@gcc.gnu.org>", "X-Original-To": "incoming@patchwork.ozlabs.org", "Delivered-To": [ "patchwork-incoming@bilbo.ozlabs.org", "mailing list gcc-patches@gcc.gnu.org" ], "Authentication-Results": [ 
"ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=gcc-patches-return-461584-incoming=patchwork.ozlabs.org@gcc.gnu.org;\n\treceiver=<UNKNOWN>)", "ozlabs.org; dkim=pass (1024-bit key;\n\tunprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org\n\theader.b=\"NV0FSKAI\"; dkim-atps=neutral", "sourceware.org; auth=none" ], "Received": [ "from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xnL5Y44g9z9s3T\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 6 Sep 2017 20:52:48 +1000 (AEST)", "(qmail 61090 invoked by alias); 6 Sep 2017 10:52:40 -0000", "(qmail 61059 invoked by uid 89); 6 Sep 2017 10:52:38 -0000", "from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by\n\tsourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;\n\tWed, 06 Sep 2017 10:52:35 +0000", "from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])\tby\n\tusa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id\n\t12AD080D; Wed, 6 Sep 2017 03:52:34 -0700 (PDT)", "from [10.2.206.195] (e112997-lin.cambridge.arm.com\n\t[10.2.206.195])\tby usa-sjc-imap-foss1.foss.arm.com (Postfix)\n\twith ESMTPSA id 2F1A23F3E1; Wed, 6 Sep 2017 03:52:33 -0700 (PDT)" ], "DomainKey-Signature": "a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id\n\t:list-unsubscribe:list-archive:list-post:list-help:sender\n\t:subject:from:to:references:message-id:date:mime-version\n\t:in-reply-to:content-type; q=dns; s=default; b=EmoNMyyCeOKeDz9y6\n\ttCPN84wT/ww9Jyrx1+BX6miE5GInNkxHF7bFtr34MUVPdWjlSBtjCY5Cwyydbtfx\n\tUDEF2GpWJj0holdwKeVzbTFDA1SP43dqVaID6POCSf9/gySsAh8W9frkGV1cpEok\n\t5U1MFz2Wvs2WoexvYUbaOU0JsU=", "DKIM-Signature": "v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; 
h=list-id\n\t:list-unsubscribe:list-archive:list-post:list-help:sender\n\t:subject:from:to:references:message-id:date:mime-version\n\t:in-reply-to:content-type; s=default; bh=RcNYC62j544z9u+KJFmMy4y\n\tRnxE=; b=NV0FSKAIH1sgVHe+NJlez5NRDTl54pjenFoj0tD977l6+rM+6xRLSLF\n\toTK/ffxZRTy5UR8tb3E0zLiH+ePBCLyjajvqd/VZAzLsefEMC4u25eIdcN4DU1Ta\n\t+imhYnUiOywnKfAjSrMyBNgWK0A+Uewjp6UJv7iBQj465S79MGEw=", "Mailing-List": "contact gcc-patches-help@gcc.gnu.org; run by ezmlm", "Precedence": "bulk", "List-Id": "<gcc-patches.gcc.gnu.org>", "List-Unsubscribe": "<mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>", "List-Archive": "<http://gcc.gnu.org/ml/gcc-patches/>", "List-Post": "<mailto:gcc-patches@gcc.gnu.org>", "List-Help": "<mailto:gcc-patches-help@gcc.gnu.org>", "Sender": "gcc-patches-owner@gcc.gnu.org", "X-Virus-Found": "No", "X-Spam-SWARE-Status": "No, score=-25.7 required=5.0 tests=BAYES_00, GIT_PATCH_0,\n\tGIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,\n\tKAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH,\n\tRP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=", "X-HELO": "foss.arm.com", "Subject": "[AArch64] Merge stores of D register values of different modes", "From": "Jackson Woodruff <jackson.woodruff@foss.arm.com>", "To": "GCC Patches <gcc-patches@gcc.gnu.org>, James.Greenhalgh@arm.com,\n\tRichard.Earnshaw@arm.com", "References": "<93501be4-3408-a2d9-f5e1-56a5569bda96@foss.arm.com>", "Message-ID": "<07fc21e1-8747-eca0-4dde-4f364ef1a414@foss.arm.com>", "Date": "Wed, 6 Sep 2017 11:52:31 +0100", "User-Agent": "Mozilla/5.0 (X11; Linux x86_64;\n\trv:52.0) Gecko/20100101 Thunderbird/52.3.0", "MIME-Version": "1.0", "In-Reply-To": "<93501be4-3408-a2d9-f5e1-56a5569bda96@foss.arm.com>", "Content-Type": "multipart/mixed;\n\tboundary=\"------------BB4D599AF5246DA13FCABCAB\"", "X-IsSubscribed": "yes" }, "content": "Hi all,\n\nThis patch merges loads and stores from D-registers that are of \ndifferent modes.\n\nCode like this:\n\n typedef int __attribute__((vector_size(8))) 
vec;\n struct pair\n {\n vec v;\n double d;\n }\n\n void\n assign (struct pair *p, vec v)\n {\n p->v = v;\n p->d = 1.0;\n }\n\nNow generates a stp instruction whereas previously it generated two \n`str` instructions. Likewise for loads.\n\nI have taken the opportunity to merge some of the patterns into a single \npattern. Previously, we had different patterns for DI, DF, SI, SF modes. \nThe patch uses the new iterators to reduce these to two patterns.\n\n\nThis patch also merges storing of double zero values with\nlong integer values:\n\n struct pair\n {\n long long l;\n double d;\n }\n\n void\n foo (struct pair *p)\n {\n p->l = 10;\n p->d = 0.0;\n }\n\nNow generates a single store pair instruction rather than two `str` \ninstructions.\n\nBootstrap and testsuite run OK. OK for trunk?\n\nJackson\n\ngcc/\n\n2017-07-21 Jackson Woodruff <jackson.woodruff@arm.com>\n\n\t* config/aarch64/aarch64.md: New patterns to generate stp\n\tand ldp.\n\t* config/aarch64/aarch64-ldpstp.md: Modified peephole\n\tfor different mode ldpstp and added peephole for merge zero\n\tstores. 
Likewise for loads.\n\t* config/aarch64/aarch64.c (aarch64_operands_ok_for_ldpstp):\n\tAdded size check.\n\t(aarch64_gen_store_pair): Rename calls to match new patterns.\n\t(aarch64_gen_load_pair): Rename calls to match new patterns.\n\t* config/aarch64/aarch64-simd.md (store_pair<mode>): Updated\n\tpattern to match two modes.\n\t(store_pair_sw, store_pair_dw): New patterns to generate stp for\n\tsingle words and double words.\n\t(load_pair_sw, load_pair_dw): Likewise.\n\t(store_pair_sf, store_pair_df, store_pair_si, store_pair_di):\n\tRemoved.\n\t(load_pair_sf, load_pair_df, load_pair_si, load_pair_di):\n\tRemoved.\n\t* config/aarch64/iterators.md: New mode iterators for\n\ttypes in d registers and duplicate DX and SX modes.\n\tNew iterator for DI, DF, SI, SF.\n\t* config/aarch64/predicates.md (aarch64_reg_zero_or_fp_zero):\n\tNew.\n\n\ngcc/testsuite/\n\n2017-07-21 Jackson Woodruff <jackson.woodruff@arm.com>\n\n\t* gcc.target/aarch64/ldp_stp_6.c: New.\n\t* gcc.target/aarch64/ldp_stp_7.c: New.\n\t* gcc.target/aarch64/ldp_stp_8.c: New.", "diff": "diff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md\nindex e8dda42c2dd1e30c4607c67a2156ff7813bd89ea..14e860d258e548d4118d957675f8bdbb74615337 100644\n--- a/gcc/config/aarch64/aarch64-ldpstp.md\n+++ b/gcc/config/aarch64/aarch64-ldpstp.md\n@@ -99,10 +99,10 @@\n })\n \n (define_peephole2\n- [(set (match_operand:VD 0 \"register_operand\" \"\")\n-\t(match_operand:VD 1 \"aarch64_mem_pair_operand\" \"\"))\n- (set (match_operand:VD 2 \"register_operand\" \"\")\n-\t(match_operand:VD 3 \"memory_operand\" \"\"))]\n+ [(set (match_operand:DREG 0 \"register_operand\" \"\")\n+\t(match_operand:DREG 1 \"aarch64_mem_pair_operand\" \"\"))\n+ (set (match_operand:DREG2 2 \"register_operand\" \"\")\n+\t(match_operand:DREG2 3 \"memory_operand\" \"\"))]\n \"aarch64_operands_ok_for_ldpstp (operands, true, <MODE>mode)\"\n [(parallel [(set (match_dup 0) (match_dup 1))\n \t (set (match_dup 2) (match_dup 3))])]\n@@ 
-119,11 +119,12 @@\n })\n \n (define_peephole2\n- [(set (match_operand:VD 0 \"aarch64_mem_pair_operand\" \"\")\n-\t(match_operand:VD 1 \"register_operand\" \"\"))\n- (set (match_operand:VD 2 \"memory_operand\" \"\")\n-\t(match_operand:VD 3 \"register_operand\" \"\"))]\n- \"TARGET_SIMD && aarch64_operands_ok_for_ldpstp (operands, false, <MODE>mode)\"\n+ [(set (match_operand:DREG 0 \"aarch64_mem_pair_operand\" \"\")\n+\t(match_operand:DREG 1 \"register_operand\" \"\"))\n+ (set (match_operand:DREG2 2 \"memory_operand\" \"\")\n+\t(match_operand:DREG2 3 \"register_operand\" \"\"))]\n+ \"TARGET_SIMD\n+ && aarch64_operands_ok_for_ldpstp (operands, false, <DREG:MODE>mode)\"\n [(parallel [(set (match_dup 0) (match_dup 1))\n \t (set (match_dup 2) (match_dup 3))])]\n {\n@@ -138,7 +139,6 @@\n }\n })\n \n-\n ;; Handle sign/zero extended consecutive load/store.\n \n (define_peephole2\n@@ -181,6 +181,30 @@\n }\n })\n \n+;; Handle storing of a floating point zero.\n+;; We can match modes that won't work for a stp instruction\n+;; as aarch64_operands_ok_for_ldpstp checks that the modes are\n+;; compatible.\n+(define_peephole2\n+ [(set (match_operand:DSX 0 \"aarch64_mem_pair_operand\" \"\")\n+\t(match_operand:DSX 1 \"aarch64_reg_zero_or_fp_zero\" \"\"))\n+ (set (match_operand:<FCVT_TARGET> 2 \"memory_operand\" \"\")\n+\t(match_operand:<FCVT_TARGET> 3 \"aarch64_reg_zero_or_fp_zero\" \"\"))]\n+ \"aarch64_operands_ok_for_ldpstp (operands, false, DImode)\"\n+ [(parallel [(set (match_dup 0) (match_dup 1))\n+\t (set (match_dup 2) (match_dup 3))])]\n+{\n+ rtx base, offset_1, offset_2;\n+\n+ extract_base_offset_in_addr (operands[0], &base, &offset_1);\n+ extract_base_offset_in_addr (operands[2], &base, &offset_2);\n+ if (INTVAL (offset_1) > INTVAL (offset_2))\n+ {\n+ std::swap (operands[0], operands[2]);\n+ std::swap (operands[1], operands[3]);\n+ }\n+})\n+\n ;; Handle consecutive load/store whose offset is out of the range\n ;; supported by ldp/ldpsw/stp. 
We firstly adjust offset in a scratch\n ;; register, then merge them into ldp/ldpsw/stp by using the adjusted\ndiff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md\nindex f3e084f8778d70c82823b92fa80ff96021ad26db..34f321a117cb96211a69119939fc518504bbf1a4 100644\n--- a/gcc/config/aarch64/aarch64-simd.md\n+++ b/gcc/config/aarch64/aarch64-simd.md\n@@ -172,11 +172,11 @@\n [(set_attr \"type\" \"neon_store1_1reg<q>\")]\n )\n \n-(define_insn \"load_pair<mode>\"\n- [(set (match_operand:VD 0 \"register_operand\" \"=w\")\n-\t(match_operand:VD 1 \"aarch64_mem_pair_operand\" \"Ump\"))\n- (set (match_operand:VD 2 \"register_operand\" \"=w\")\n-\t(match_operand:VD 3 \"memory_operand\" \"m\"))]\n+(define_insn \"load_pair<DREG:mode><DREG2:mode>\"\n+ [(set (match_operand:DREG 0 \"register_operand\" \"=w\")\n+\t(match_operand:DREG 1 \"aarch64_mem_pair_operand\" \"Ump\"))\n+ (set (match_operand:DREG2 2 \"register_operand\" \"=w\")\n+\t(match_operand:DREG2 3 \"memory_operand\" \"m\"))]\n \"TARGET_SIMD\n && rtx_equal_p (XEXP (operands[3], 0),\n \t\t plus_constant (Pmode,\n@@ -186,11 +186,11 @@\n [(set_attr \"type\" \"neon_ldp\")]\n )\n \n-(define_insn \"store_pair<mode>\"\n- [(set (match_operand:VD 0 \"aarch64_mem_pair_operand\" \"=Ump\")\n-\t(match_operand:VD 1 \"register_operand\" \"w\"))\n- (set (match_operand:VD 2 \"memory_operand\" \"=m\")\n-\t(match_operand:VD 3 \"register_operand\" \"w\"))]\n+(define_insn \"vec_store_pair<DREG:mode><DREG2:mode>\"\n+ [(set (match_operand:DREG 0 \"aarch64_mem_pair_operand\" \"=Ump\")\n+\t(match_operand:DREG 1 \"register_operand\" \"w\"))\n+ (set (match_operand:DREG2 2 \"memory_operand\" \"=m\")\n+\t(match_operand:DREG2 3 \"register_operand\" \"w\"))]\n \"TARGET_SIMD\n && rtx_equal_p (XEXP (operands[2], 0),\n \t\t plus_constant (Pmode,\ndiff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c\nindex 28c4e0e64766060851c0c7cd6b86995fae25353d..a3bd1b1180903703d33ca822d06afc74f1748c44 100644\n--- 
a/gcc/config/aarch64/aarch64.c\n+++ b/gcc/config/aarch64/aarch64.c\n@@ -3179,10 +3179,10 @@ aarch64_gen_store_pair (machine_mode mode, rtx mem1, rtx reg1, rtx mem2,\n switch (mode)\n {\n case DImode:\n- return gen_store_pairdi (mem1, reg1, mem2, reg2);\n+ return gen_store_pair_dw_DIDI (mem1, reg1, mem2, reg2);\n \n case DFmode:\n- return gen_store_pairdf (mem1, reg1, mem2, reg2);\n+ return gen_store_pair_dw_DFDF (mem1, reg1, mem2, reg2);\n \n default:\n gcc_unreachable ();\n@@ -3199,10 +3199,10 @@ aarch64_gen_load_pair (machine_mode mode, rtx reg1, rtx mem1, rtx reg2,\n switch (mode)\n {\n case DImode:\n- return gen_load_pairdi (reg1, mem1, reg2, mem2);\n+ return gen_load_pair_dw_DIDI (reg1, mem1, reg2, mem2);\n \n case DFmode:\n- return gen_load_pairdf (reg1, mem1, reg2, mem2);\n+ return gen_load_pair_dw_DFDF (reg1, mem1, reg2, mem2);\n \n default:\n gcc_unreachable ();\n@@ -14712,6 +14712,11 @@ aarch64_operands_ok_for_ldpstp (rtx *operands, bool load,\n if (!rtx_equal_p (base_1, base_2))\n return false;\n \n+ /* Check that the operands are of the same size. 
*/\n+ if (GET_MODE_SIZE (GET_MODE (mem_1))\n+ != GET_MODE_SIZE (GET_MODE (mem_2)))\n+ return false;\n+\n offval_1 = INTVAL (offset_1);\n offval_2 = INTVAL (offset_2);\n msize = GET_MODE_SIZE (mode);\ndiff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md\nindex c1bca07308d84f50a6fa5af116f0fa20589882db..46affe8c63a58bd60b993349555e81c4c5008113 100644\n--- a/gcc/config/aarch64/aarch64.md\n+++ b/gcc/config/aarch64/aarch64.md\n@@ -1220,141 +1220,76 @@\n \n ;; Operands 1 and 3 are tied together by the final condition; so we allow\n ;; fairly lax checking on the second memory operation.\n-(define_insn \"load_pairsi\"\n- [(set (match_operand:SI 0 \"register_operand\" \"=r,*w\")\n-\t(match_operand:SI 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n- (set (match_operand:SI 2 \"register_operand\" \"=r,*w\")\n-\t(match_operand:SI 3 \"memory_operand\" \"m,m\"))]\n- \"rtx_equal_p (XEXP (operands[3], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t XEXP (operands[1], 0),\n-\t\t\t GET_MODE_SIZE (SImode)))\"\n+(define_insn \"load_pair_sw_<SX:MODE><SX2:MODE>\"\n+ [(set (match_operand:SX 0 \"register_operand\" \"=r,w\")\n+\t(match_operand:SX 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n+ (set (match_operand:SX2 2 \"register_operand\" \"=r,w\")\n+\t(match_operand:SX2 3 \"memory_operand\" \"m,m\"))]\n+ \"rtx_equal_p (XEXP (operands[3], 0),\n+\t\t plus_constant (Pmode,\n+\t\t\t\tXEXP (operands[1], 0),\n+\t\t\t\tGET_MODE_SIZE (<MODE>mode)))\"\n \"@\n- ldp\\\\t%w0, %w2, %1\n- ldp\\\\t%s0, %s2, %1\"\n+ ldp\\t%w0, %w2, %1\n+ ldp\\t%s0, %s2, %1\"\n [(set_attr \"type\" \"load2,neon_load1_2reg\")\n (set_attr \"fp\" \"*,yes\")]\n )\n \n-(define_insn \"load_pairdi\"\n- [(set (match_operand:DI 0 \"register_operand\" \"=r,*w\")\n-\t(match_operand:DI 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n- (set (match_operand:DI 2 \"register_operand\" \"=r,*w\")\n-\t(match_operand:DI 3 \"memory_operand\" \"m,m\"))]\n- \"rtx_equal_p (XEXP (operands[3], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t 
XEXP (operands[1], 0),\n-\t\t\t GET_MODE_SIZE (DImode)))\"\n+;; Storing different modes that can still be merged\n+(define_insn \"load_pair_dw_<DX:MODE><DX2:MODE>\"\n+ [(set (match_operand:DX 0 \"register_operand\" \"=r,w\")\n+\t(match_operand:DX 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n+ (set (match_operand:DX2 2 \"register_operand\" \"=r,w\")\n+\t(match_operand:DX2 3 \"memory_operand\" \"m,m\"))]\n+ \"rtx_equal_p (XEXP (operands[3], 0),\n+\t\t plus_constant (Pmode,\n+\t\t\t\tXEXP (operands[1], 0),\n+\t\t\t\tGET_MODE_SIZE (<MODE>mode)))\"\n \"@\n- ldp\\\\t%x0, %x2, %1\n- ldp\\\\t%d0, %d2, %1\"\n+ ldp\\t%x0, %x2, %1\n+ ldp\\t%d0, %d2, %1\"\n [(set_attr \"type\" \"load2,neon_load1_2reg\")\n (set_attr \"fp\" \"*,yes\")]\n )\n \n \n+\n ;; Operands 0 and 2 are tied together by the final condition; so we allow\n ;; fairly lax checking on the second memory operation.\n-(define_insn \"store_pairsi\"\n- [(set (match_operand:SI 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n-\t(match_operand:SI 1 \"aarch64_reg_or_zero\" \"rZ,*w\"))\n- (set (match_operand:SI 2 \"memory_operand\" \"=m,m\")\n-\t(match_operand:SI 3 \"aarch64_reg_or_zero\" \"rZ,*w\"))]\n- \"rtx_equal_p (XEXP (operands[2], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t XEXP (operands[0], 0),\n-\t\t\t GET_MODE_SIZE (SImode)))\"\n+(define_insn \"store_pair_sw_<SX:MODE><SX2:MODE>\"\n+ [(set (match_operand:SX 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n+\t(match_operand:SX 1 \"aarch64_reg_zero_or_fp_zero\" \"rYZ,w\"))\n+ (set (match_operand:SX2 2 \"memory_operand\" \"=m,m\")\n+\t(match_operand:SX2 3 \"aarch64_reg_zero_or_fp_zero\" \"rYZ,w\"))]\n+ \"rtx_equal_p (XEXP (operands[2], 0),\n+\t\t plus_constant (Pmode,\n+\t\t\t\tXEXP (operands[0], 0),\n+\t\t\t\tGET_MODE_SIZE (<MODE>mode)))\"\n \"@\n- stp\\\\t%w1, %w3, %0\n- stp\\\\t%s1, %s3, %0\"\n+ stp\\t%w1, %w3, %0\n+ stp\\t%s1, %s3, %0\"\n [(set_attr \"type\" \"store2,neon_store1_2reg\")\n (set_attr \"fp\" \"*,yes\")]\n )\n \n-(define_insn \"store_pairdi\"\n- [(set 
(match_operand:DI 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n-\t(match_operand:DI 1 \"aarch64_reg_or_zero\" \"rZ,*w\"))\n- (set (match_operand:DI 2 \"memory_operand\" \"=m,m\")\n-\t(match_operand:DI 3 \"aarch64_reg_or_zero\" \"rZ,*w\"))]\n- \"rtx_equal_p (XEXP (operands[2], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t XEXP (operands[0], 0),\n-\t\t\t GET_MODE_SIZE (DImode)))\"\n+;; Storing different modes that can still be merged\n+(define_insn \"store_pair_dw_<DX:MODE><DX2:MODE>\"\n+ [(set (match_operand:DX 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n+\t(match_operand:DX 1 \"aarch64_reg_zero_or_fp_zero\" \"rYZ,w\"))\n+ (set (match_operand:DX2 2 \"memory_operand\" \"=m,m\")\n+\t(match_operand:DX2 3 \"aarch64_reg_zero_or_fp_zero\" \"rYZ,w\"))]\n+ \"rtx_equal_p (XEXP (operands[2], 0),\n+\t\t plus_constant (Pmode,\n+\t\t\t\tXEXP (operands[0], 0),\n+\t\t\t\tGET_MODE_SIZE (<MODE>mode)))\"\n \"@\n- stp\\\\t%x1, %x3, %0\n- stp\\\\t%d1, %d3, %0\"\n+ stp\\t%x1, %x3, %0\n+ stp\\t%d1, %d3, %0\"\n [(set_attr \"type\" \"store2,neon_store1_2reg\")\n (set_attr \"fp\" \"*,yes\")]\n )\n \n-;; Operands 1 and 3 are tied together by the final condition; so we allow\n-;; fairly lax checking on the second memory operation.\n-(define_insn \"load_pairsf\"\n- [(set (match_operand:SF 0 \"register_operand\" \"=w,*r\")\n-\t(match_operand:SF 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n- (set (match_operand:SF 2 \"register_operand\" \"=w,*r\")\n-\t(match_operand:SF 3 \"memory_operand\" \"m,m\"))]\n- \"rtx_equal_p (XEXP (operands[3], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t XEXP (operands[1], 0),\n-\t\t\t GET_MODE_SIZE (SFmode)))\"\n- \"@\n- ldp\\\\t%s0, %s2, %1\n- ldp\\\\t%w0, %w2, %1\"\n- [(set_attr \"type\" \"neon_load1_2reg,load2\")\n- (set_attr \"fp\" \"yes,*\")]\n-)\n-\n-(define_insn \"load_pairdf\"\n- [(set (match_operand:DF 0 \"register_operand\" \"=w,*r\")\n-\t(match_operand:DF 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n- (set (match_operand:DF 2 \"register_operand\" 
\"=w,*r\")\n-\t(match_operand:DF 3 \"memory_operand\" \"m,m\"))]\n- \"rtx_equal_p (XEXP (operands[3], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t XEXP (operands[1], 0),\n-\t\t\t GET_MODE_SIZE (DFmode)))\"\n- \"@\n- ldp\\\\t%d0, %d2, %1\n- ldp\\\\t%x0, %x2, %1\"\n- [(set_attr \"type\" \"neon_load1_2reg,load2\")\n- (set_attr \"fp\" \"yes,*\")]\n-)\n-\n-;; Operands 0 and 2 are tied together by the final condition; so we allow\n-;; fairly lax checking on the second memory operation.\n-(define_insn \"store_pairsf\"\n- [(set (match_operand:SF 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n-\t(match_operand:SF 1 \"aarch64_reg_or_fp_zero\" \"w,*rY\"))\n- (set (match_operand:SF 2 \"memory_operand\" \"=m,m\")\n-\t(match_operand:SF 3 \"aarch64_reg_or_fp_zero\" \"w,*rY\"))]\n- \"rtx_equal_p (XEXP (operands[2], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t XEXP (operands[0], 0),\n-\t\t\t GET_MODE_SIZE (SFmode)))\"\n- \"@\n- stp\\\\t%s1, %s3, %0\n- stp\\\\t%w1, %w3, %0\"\n- [(set_attr \"type\" \"neon_store1_2reg,store2\")\n- (set_attr \"fp\" \"yes,*\")]\n-)\n-\n-(define_insn \"store_pairdf\"\n- [(set (match_operand:DF 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n-\t(match_operand:DF 1 \"aarch64_reg_or_fp_zero\" \"w,*rY\"))\n- (set (match_operand:DF 2 \"memory_operand\" \"=m,m\")\n-\t(match_operand:DF 3 \"aarch64_reg_or_fp_zero\" \"w,*rY\"))]\n- \"rtx_equal_p (XEXP (operands[2], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t XEXP (operands[0], 0),\n-\t\t\t GET_MODE_SIZE (DFmode)))\"\n- \"@\n- stp\\\\t%d1, %d3, %0\n- stp\\\\t%x1, %x3, %0\"\n- [(set_attr \"type\" \"neon_store1_2reg,store2\")\n- (set_attr \"fp\" \"yes,*\")]\n-)\n-\n ;; Load pair with post-index writeback. 
This is primarily used in function\n ;; epilogues.\n (define_insn \"loadwb_pair<GPI:mode>_<P:mode>\"\ndiff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md\nindex cceb57525c7aa44933419bd317b1f03a7b76f4c4..6147d93f56649cbc9fe577a433bca610e476ab2c 100644\n--- a/gcc/config/aarch64/iterators.md\n+++ b/gcc/config/aarch64/iterators.md\n@@ -69,6 +69,12 @@\n ;; Double vector modes.\n (define_mode_iterator VD [V8QI V4HI V4HF V2SI V2SF])\n \n+;; All modes stored in registers d0-d31.\n+(define_mode_iterator DREG [V8QI V4HI V4HF V2SI V2SF DF])\n+\n+;; Copy of the above.\n+(define_mode_iterator DREG2 [V8QI V4HI V4HF V2SI V2SF DF])\n+\n ;; vector, 64-bit container, all integer modes\n (define_mode_iterator VD_BHSI [V8QI V4HI V2SI])\n \n@@ -235,6 +241,18 @@\n ;; Double scalar modes\n (define_mode_iterator DX [DI DF])\n \n+;; Duplicate of the above\n+(define_mode_iterator DX2 [DI DF])\n+\n+;; Single scalar modes\n+(define_mode_iterator SX [SI SF])\n+\n+;; Duplicate of the above\n+(define_mode_iterator SX2 [SI SF])\n+\n+;; Single and double integer and float modes\n+(define_mode_iterator DSX [DF DI SF SI])\n+\n ;; Modes available for <f>mul lane operations.\n (define_mode_iterator VMUL [V4HI V8HI V2SI V4SI\n \t\t\t (V4HF \"TARGET_SIMD_F16INST\")\ndiff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md\nindex 11243c4ce00aa7d16a886bb24b01180801c68f4e..ee6e050dd839c329baa05bdfe878b786f1def969 100644\n--- a/gcc/config/aarch64/predicates.md\n+++ b/gcc/config/aarch64/predicates.md\n@@ -62,6 +62,10 @@\n \t(and (match_code \"const_double\")\n \t (match_test \"aarch64_float_const_zero_rtx_p (op)\"))))\n \n+(define_predicate \"aarch64_reg_zero_or_fp_zero\"\n+ (ior (match_operand 0 \"aarch64_reg_or_fp_zero\")\n+ (match_operand 0 \"aarch64_reg_or_zero\")))\n+\n (define_predicate \"aarch64_reg_zero_or_m1_or_1\"\n (and (match_code \"reg,subreg,const_int\")\n (ior (match_operand 0 \"register_operand\")\ndiff --git 
a/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..2d982f3389b668f2042d48ba3db04e619fd999f3\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c\n@@ -0,0 +1,20 @@\n+/* { dg-options \"-O2\" } */\n+\n+typedef float __attribute__ ((vector_size (8))) vec;\n+\n+struct pair\n+{\n+ vec e1;\n+ double e2;\n+};\n+\n+vec tmp;\n+\n+void\n+stp (struct pair *p)\n+{\n+ p->e1 = tmp;\n+ p->e2 = 1.0;\n+\n+ /* { dg-final { scan-assembler \"stp\\td\\[0-9\\]+, d\\[0-9\\]+, \\\\\\[x\\[0-9\\]+\\\\\\]\" } } */\n+}\ndiff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..06607de6b3e36a4d759d915a9f7880284391aa08\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c\n@@ -0,0 +1,47 @@\n+/* { dg-options \"-O2\" } */\n+\n+struct pair\n+{\n+ double a;\n+ long int b;\n+};\n+\n+void\n+stp (struct pair *p)\n+{\n+ p->a = 0.0;\n+ p->b = 1;\n+}\n+\n+/* { dg-final { scan-assembler \"stp\\txzr, x\\[0-9\\]+, \\\\\\[x\\[0-9\\]+\\\\\\]\" } } */\n+\n+void\n+stp2 (struct pair *p)\n+{\n+ p->a = 0.0;\n+ p->b = 0;\n+}\n+\n+struct reverse_pair\n+{\n+ long int a;\n+ double b;\n+};\n+\n+void\n+stp_reverse (struct reverse_pair *p)\n+{\n+ p->a = 1;\n+ p->b = 0.0;\n+}\n+\n+/* { dg-final { scan-assembler \"stp\\tx\\[0-9\\]+, xzr, \\\\\\[x\\[0-9\\]+\\\\\\]\" } } */\n+\n+void\n+stp_reverse2 (struct reverse_pair *p)\n+{\n+ p->a = 0;\n+ p->b = 0.0;\n+}\n+\n+/* { dg-final { scan-assembler-times \"stp\\txzr, xzr, \\\\\\[x\\[0-9\\]+\\\\\\]\" 2 } } */\ndiff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..1a47e233814e564d549245683a4e59fdb422bdad\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c\n@@ -0,0 +1,30 @@\n+/* { dg-options 
\"-O2\" } */\n+\n+typedef float __attribute__ ((vector_size (8))) fvec;\n+typedef int __attribute__ ((vector_size (8))) ivec;\n+\n+struct pair\n+{\n+ double a;\n+ fvec b;\n+};\n+\n+void ldp (double *a, fvec *b, struct pair *p)\n+{\n+ *a = p->a;\n+ *b = p->b;\n+}\n+\n+struct vec_pair\n+{\n+ fvec a;\n+ ivec b;\n+};\n+\n+void ldp2 (fvec *a, ivec *b, struct vec_pair *p)\n+{\n+ *a = p->a;\n+ *b = p->b;\n+}\n+\n+/* { dg-final { scan-assembler-times \"ldp\\td\\[0-9\\], d\\[0-9\\]+, \\\\\\[x\\[0-9\\]+\\\\\\]\" 2 } } */\n", "prefixes": [ "AArch64" ] }
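A client consuming this response will usually want a few top-level fields plus the list of files named in the "diff" string. The following is a minimal sketch of that, not part of Patchwork itself: the helper names (touched_files, summarize) are hypothetical, and SAMPLE is a heavily trimmed stand-in for the response above that keeps only a handful of fields and two of the "diff --git" headers. It does no network access; it only assumes the field names visible in the response ("id", "name", "state", "submitter", "diff").

```python
import json
import re

# Trimmed stand-in for the response above (assumption: a real client would
# obtain the full JSON from the API instead of building it by hand).
SAMPLE = json.dumps({
    "id": 810508,
    "name": "[AArch64] Merge stores of D register values of different modes",
    "state": "new",
    "submitter": {"name": "Jackson Woodruff"},
    "diff": (
        "diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c\n"
        "--- a/gcc/config/aarch64/aarch64.c\n"
        "+++ b/gcc/config/aarch64/aarch64.c\n"
        "diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md\n"
        "--- a/gcc/config/aarch64/iterators.md\n"
        "+++ b/gcc/config/aarch64/iterators.md\n"
    ),
})

# Matches one "diff --git a/<path> b/<path>" header per line of the diff.
DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$", re.MULTILINE)


def touched_files(patch):
    """Return the post-image paths named in the patch's 'diff --git' lines."""
    # "diff" may be missing or null, so fall back to an empty string.
    return [m.group(2) for m in DIFF_HEADER.finditer(patch.get("diff") or "")]


def summarize(raw):
    """Parse a patch-detail JSON document and keep a few fields of interest."""
    patch = json.loads(raw)
    return {
        "id": patch["id"],
        "name": patch["name"],
        "state": patch["state"],
        "submitter": patch["submitter"]["name"],
        "files": touched_files(patch),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The header regex assumes paths without spaces, which holds for every file in this patch; a diff with quoted or space-containing paths would need a more careful parser.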