[{"id":1767285,"web_url":"http://patchwork.ozlabs.org/comment/1767285/","msgid":"<871snbbwdo.fsf@linaro.org>","list_archive_url":null,"date":"2017-09-12T18:32:51","subject":"Re: [AArch64] Merge stores of D register values of different modes","submitter":{"id":5450,"url":"http://patchwork.ozlabs.org/api/people/5450/","name":"Richard Sandiford","email":"richard.sandiford@linaro.org"},"content":"Thanks for doing this, looks good to me FWIW.  I was just wondering:\n\nJackson Woodruff <jackson.woodruff@foss.arm.com> writes:\n> @@ -14712,6 +14712,11 @@ aarch64_operands_ok_for_ldpstp (rtx *operands, bool load,\n>    if (!rtx_equal_p (base_1, base_2))\n>      return false;\n>  \n> +  /* Check that the operands are of the same size.  */\n> +  if (GET_MODE_SIZE (GET_MODE (mem_1))\n> +      != GET_MODE_SIZE (GET_MODE (mem_2)))\n> +    return false;\n> +\n>    offval_1 = INTVAL (offset_1);\n>    offval_2 = INTVAL (offset_2);\n>    msize = GET_MODE_SIZE (mode);\n\nwhen can this trigger?  Your iterators always seem to enforce correct\npairings, so maybe this should be an assert instead.\n\nThanks,\nRichard","headers":{"Return-Path":"<gcc-patches-return-461982-incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list gcc-patches@gcc.gnu.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=gcc-patches-return-461982-incoming=patchwork.ozlabs.org@gcc.gnu.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tunprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org\n\theader.b=\"sodixtjT\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xsD1s0j7Jz9s81\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 13 Sep 2017 04:33:04 +1000 (AEST)","(qmail 105737 invoked by alias); 12 Sep 2017 18:32:56 -0000","(qmail 105723 invoked by uid 89); 12 Sep 2017 18:32:56 -0000","from mail-wm0-f52.google.com (HELO mail-wm0-f52.google.com)\n\t(74.125.82.52) by sourceware.org\n\t(qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;\n\tTue, 12 Sep 2017 18:32:54 +0000","by mail-wm0-f52.google.com with SMTP id i189so1509092wmf.1 for\n\t<gcc-patches@gcc.gnu.org>; Tue, 12 Sep 2017 11:32:54 -0700 (PDT)","from localhost ([2.25.234.0]) by smtp.gmail.com with ESMTPSA id\n\t77sm18557661wmx.10.2017.09.12.11.32.51 (version=TLS1_2\n\tcipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);\n\tTue, 12 Sep 2017 11:32:51 -0700 (PDT)"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id\n\t:list-unsubscribe:list-archive:list-post:list-help:sender:from\n\t:to:cc:subject:references:date:in-reply-to:message-id\n\t:mime-version:content-type; q=dns; s=default; b=PGr1JqN/+xwroYCG\n\tujB3rqNO0BBbHNDISE0godEmnJwzon0L7/Qlhta+DVD1QAnb85VuFyxU4tZ+H/sD\n\tJ61EiWHIHB3Il2frbx1gucsvNKSHxrG1O2UDUy6PsJ+jGVvqRjiQ9zS/usXr19nY\n\tEKQVqMm3JRmfRLW9GG+hVUaSohg=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id\n\t:list-unsubscribe:list-archive:list-post:list-help:sender:from\n\t:to:cc:subject:references:date:in-reply-to:message-id\n\t:mime-version:content-type; s=default; bh=o58ozuW5TurRGdHqTTatYN\n\tL3Hdc=; b=sodixtjTLUXA0rhJzScIc5V1ryUUSrEl5GXbrhgNLdQ6PN8O9XTwv4\n\tn300ZqcdWQBKP7Ed0lFflbm/by8bjlX6vgQE24jFkpNpgz6r4PZlieGOjNfLrUlC\n\tIEf0CirWGVn7zCaq9kG/d/3YhB/NXZ4Vr3EjnsVW+m2yHeNF1QbEY=","Mailing-List":"contact gcc-patches-help@gcc.gnu.org; run by ezmlm","Precedence":"bulk","List-Id":"<gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>","List-Archive":"<http://gcc.gnu.org/ml/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-help@gcc.gnu.org>","Sender":"gcc-patches-owner@gcc.gnu.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-2.7 required=5.0 tests=AWL, BAYES_00,\n\tRCVD_IN_DNSWL_NONE,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=","X-HELO":"mail-wm0-f52.google.com","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net;\n\ts=20161025;\n\th=x-gm-message-state:from:to:mail-followup-to:cc:subject:references\n\t:date:in-reply-to:message-id:user-agent:mime-version;\n\tbh=oefSEChPN6MX6hP8g+RW+QmS5BDEcSy2i6P17Zc3UnM=;\n\tb=dwg+XMPMDQ3oylyH+7kBCSIpwulHlnTZo7gRwlod8o+6wiJiIlRMyAIvKO83ffgNnY\n\t0TiDIJKr07QGQ5yjODbasTZX7GhEaGqydRomfJZucLnDxBmkNYfyNk6xSQNRa3FWxBrr\n\tI7atM73EGSiN5cGiPGWKaia9fdpvdVHc965UPWEejZ0rmSmkEHvMBTcL93ZUrE6/twmL\n\thxpVu1uc0h/KalXaPk7999Llf7HvjG1XgnteCZbDPega+GOjCctCKAgVbILHKnC5hM+6\n\tULXMmpGWTWeLTXvZKr3/3/O9e70Yw7IqK2fkw6b7JU7PK8AzHDiyQsFXPXLM8jGqBMKA\n\t1T+A==","X-Gm-Message-State":"AHPjjUg6C0WaXhOJUL7bFQGA9KVuT4pigssdQerafRXrArz/+tY4SEH8\tTi5lbE8JcdH7Eu/jOnhyw3iaUw==","X-Google-Smtp-Source":"AOwi7QAAuy/x39xNFtdnHAgWU3prWNw0PDfLA2Q6VA9a2yTrupyvYJxP9BSdyXY+N0t6Ajgcvh+I3g==","X-Received":"by 10.28.183.85 with SMTP id h82mr400433wmf.24.1505241172759;\n\tTue, 12 Sep 2017 11:32:52 -0700 (PDT)","From":"Richard Sandiford <richard.sandiford@linaro.org>","To":"Jackson Woodruff <jackson.woodruff@foss.arm.com>","Mail-Followup-To":"Jackson Woodruff <jackson.woodruff@foss.arm.com>,\n\tGCC Patches <gcc-patches@gcc.gnu.org>,\n\tJames.Greenhalgh@arm.com, Richard.Earnshaw@arm.com,\n\trichard.sandiford@linaro.org","Cc":"GCC Patches <gcc-patches@gcc.gnu.org>, James.Greenhalgh@arm.com,\n\tRichard.Earnshaw@arm.com","Subject":"Re: [AArch64] Merge stores of D register values of different modes","References":"<93501be4-3408-a2d9-f5e1-56a5569bda96@foss.arm.com>\t<07fc21e1-8747-eca0-4dde-4f364ef1a414@foss.arm.com>","Date":"Tue, 12 Sep 2017 19:32:51 +0100","In-Reply-To":"<07fc21e1-8747-eca0-4dde-4f364ef1a414@foss.arm.com>\n\t(Jackson\tWoodruff's message of \"Wed, 6 Sep 2017 11:52:31 +0100\")","Message-ID":"<871snbbwdo.fsf@linaro.org>","User-Agent":"Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)","MIME-Version":"1.0","Content-Type":"text/plain"}},{"id":1767888,"web_url":"http://patchwork.ozlabs.org/comment/1767888/","msgid":"<020fd264-3996-5cd1-ebc9-9f1aa82543ec@foss.arm.com>","list_archive_url":null,"date":"2017-09-13T13:35:52","subject":"Re: [AArch64] Merge stores of D register values of different modes","submitter":{"id":71950,"url":"http://patchwork.ozlabs.org/api/people/71950/","name":"Jackson Woodruff","email":"jackson.woodruff@foss.arm.com"},"content":"On 09/12/2017 07:32 PM, Richard Sandiford wrote:\n> Thanks for doing this, looks good to me FWIW.  I was just wondering:\n> \n> Jackson Woodruff <jackson.woodruff@foss.arm.com> writes:\n>> @@ -14712,6 +14712,11 @@ aarch64_operands_ok_for_ldpstp (rtx *operands, bool load,\n>>     if (!rtx_equal_p (base_1, base_2))\n>>       return false;\n>>   \n>> +  /* Check that the operands are of the same size.  */\n>> +  if (GET_MODE_SIZE (GET_MODE (mem_1))\n>> +      != GET_MODE_SIZE (GET_MODE (mem_2)))\n>> +    return false;\n>> +\n>>     offval_1 = INTVAL (offset_1);\n>>     offval_2 = INTVAL (offset_2);\n>>     msize = GET_MODE_SIZE (mode);\n> \n> when can this trigger?  Your iterators always seem to enforce correct\n> pairings, so maybe this should be an assert instead.\n\nYes, it's true that this should never be triggered. I've changed it to \nan assert.\n\nI have also rebased on top of the renaming of load/store attributes \npatch https://gcc.gnu.org/ml/gcc-patches/2017-09/msg00702.html which had \nsome conflicts with this.\n\nIs the updated patch OK for trunk?\n\nThanks,\nJackson.\n\n> \n> Thanks,\n> Richard\n>\ndiff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md\nindex e8dda42c2dd1e30c4607c67a2156ff7813bd89ea..14e860d258e548d4118d957675f8bdbb74615337 100644\n--- a/gcc/config/aarch64/aarch64-ldpstp.md\n+++ b/gcc/config/aarch64/aarch64-ldpstp.md\n@@ -99,10 +99,10 @@\n })\n \n (define_peephole2\n-  [(set (match_operand:VD 0 \"register_operand\" \"\")\n-\t(match_operand:VD 1 \"aarch64_mem_pair_operand\" \"\"))\n-   (set (match_operand:VD 2 \"register_operand\" \"\")\n-\t(match_operand:VD 3 \"memory_operand\" \"\"))]\n+  [(set (match_operand:DREG 0 \"register_operand\" \"\")\n+\t(match_operand:DREG 1 \"aarch64_mem_pair_operand\" \"\"))\n+   (set (match_operand:DREG2 2 \"register_operand\" \"\")\n+\t(match_operand:DREG2 3 \"memory_operand\" \"\"))]\n   \"aarch64_operands_ok_for_ldpstp (operands, true, <MODE>mode)\"\n   [(parallel [(set (match_dup 0) (match_dup 1))\n \t      (set (match_dup 2) (match_dup 3))])]\n@@ -119,11 +119,12 @@\n })\n \n (define_peephole2\n-  [(set (match_operand:VD 0 \"aarch64_mem_pair_operand\" \"\")\n-\t(match_operand:VD 1 \"register_operand\" \"\"))\n-   (set (match_operand:VD 2 \"memory_operand\" \"\")\n-\t(match_operand:VD 3 \"register_operand\" \"\"))]\n-  \"TARGET_SIMD && aarch64_operands_ok_for_ldpstp (operands, false, <MODE>mode)\"\n+  [(set (match_operand:DREG 0 \"aarch64_mem_pair_operand\" \"\")\n+\t(match_operand:DREG 1 \"register_operand\" \"\"))\n+   (set (match_operand:DREG2 2 \"memory_operand\" \"\")\n+\t(match_operand:DREG2 3 \"register_operand\" \"\"))]\n+  \"TARGET_SIMD\n+   && aarch64_operands_ok_for_ldpstp (operands, false, <DREG:MODE>mode)\"\n   [(parallel [(set (match_dup 0) (match_dup 1))\n \t      (set (match_dup 2) (match_dup 3))])]\n {\n@@ -138,7 +139,6 @@\n     }\n })\n \n-\n ;; Handle sign/zero extended consecutive load/store.\n \n (define_peephole2\n@@ -181,6 +181,30 @@\n     }\n })\n \n+;; Handle storing of a floating point zero.\n+;; We can match modes that won't work for a stp instruction\n+;; as aarch64_operands_ok_for_ldpstp checks that the modes are\n+;; compatible.\n+(define_peephole2\n+  [(set (match_operand:DSX 0 \"aarch64_mem_pair_operand\" \"\")\n+\t(match_operand:DSX 1 \"aarch64_reg_zero_or_fp_zero\" \"\"))\n+   (set (match_operand:<FCVT_TARGET> 2 \"memory_operand\" \"\")\n+\t(match_operand:<FCVT_TARGET> 3 \"aarch64_reg_zero_or_fp_zero\" \"\"))]\n+  \"aarch64_operands_ok_for_ldpstp (operands, false, DImode)\"\n+  [(parallel [(set (match_dup 0) (match_dup 1))\n+\t      (set (match_dup 2) (match_dup 3))])]\n+{\n+  rtx base, offset_1, offset_2;\n+\n+  extract_base_offset_in_addr (operands[0], &base, &offset_1);\n+  extract_base_offset_in_addr (operands[2], &base, &offset_2);\n+  if (INTVAL (offset_1) > INTVAL (offset_2))\n+    {\n+      std::swap (operands[0], operands[2]);\n+      std::swap (operands[1], operands[3]);\n+    }\n+})\n+\n ;; Handle consecutive load/store whose offset is out of the range\n ;; supported by ldp/ldpsw/stp.  We firstly adjust offset in a scratch\n ;; register, then merge them into ldp/ldpsw/stp by using the adjusted\ndiff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md\nindex 8f045c210502330af9d47f6adfd46a9e36328b74..90f9415b3986eb737ecdfeed43fe798cdbb8334e 100644\n--- a/gcc/config/aarch64/aarch64-simd.md\n+++ b/gcc/config/aarch64/aarch64-simd.md\n@@ -172,11 +172,11 @@\n   [(set_attr \"type\" \"neon_store1_1reg<q>\")]\n )\n \n-(define_insn \"load_pair<mode>\"\n-  [(set (match_operand:VD 0 \"register_operand\" \"=w\")\n-\t(match_operand:VD 1 \"aarch64_mem_pair_operand\" \"Ump\"))\n-   (set (match_operand:VD 2 \"register_operand\" \"=w\")\n-\t(match_operand:VD 3 \"memory_operand\" \"m\"))]\n+(define_insn \"load_pair<DREG:mode><DREG2:mode>\"\n+  [(set (match_operand:DREG 0 \"register_operand\" \"=w\")\n+\t(match_operand:DREG 1 \"aarch64_mem_pair_operand\" \"Ump\"))\n+   (set (match_operand:DREG2 2 \"register_operand\" \"=w\")\n+\t(match_operand:DREG2 3 \"memory_operand\" \"m\"))]\n   \"TARGET_SIMD\n    && rtx_equal_p (XEXP (operands[3], 0),\n \t\t   plus_constant (Pmode,\n@@ -186,11 +186,11 @@\n   [(set_attr \"type\" \"neon_ldp\")]\n )\n \n-(define_insn \"store_pair<mode>\"\n-  [(set (match_operand:VD 0 \"aarch64_mem_pair_operand\" \"=Ump\")\n-\t(match_operand:VD 1 \"register_operand\" \"w\"))\n-   (set (match_operand:VD 2 \"memory_operand\" \"=m\")\n-\t(match_operand:VD 3 \"register_operand\" \"w\"))]\n+(define_insn \"vec_store_pair<DREG:mode><DREG2:mode>\"\n+  [(set (match_operand:DREG 0 \"aarch64_mem_pair_operand\" \"=Ump\")\n+\t(match_operand:DREG 1 \"register_operand\" \"w\"))\n+   (set (match_operand:DREG2 2 \"memory_operand\" \"=m\")\n+\t(match_operand:DREG2 3 \"register_operand\" \"w\"))]\n   \"TARGET_SIMD\n    && rtx_equal_p (XEXP (operands[2], 0),\n \t\t   plus_constant (Pmode,\ndiff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c\nindex de1fbdca950b452f5616f37eb0ea719ee793cfdc..ea2ff88f91a18b3fcd43bd0dcafd9ebdcc0b2366 100644\n--- a/gcc/config/aarch64/aarch64.c\n+++ b/gcc/config/aarch64/aarch64.c\n@@ -3191,10 +3191,10 @@ aarch64_gen_store_pair (machine_mode mode, rtx mem1, rtx reg1, rtx mem2,\n   switch (mode)\n     {\n     case E_DImode:\n-      return gen_store_pairdi (mem1, reg1, mem2, reg2);\n+      return gen_store_pair_dw_DIDI (mem1, reg1, mem2, reg2);\n \n     case E_DFmode:\n-      return gen_store_pairdf (mem1, reg1, mem2, reg2);\n+      return gen_store_pair_dw_DFDF (mem1, reg1, mem2, reg2);\n \n     default:\n       gcc_unreachable ();\n@@ -3211,10 +3211,10 @@ aarch64_gen_load_pair (machine_mode mode, rtx reg1, rtx mem1, rtx reg2,\n   switch (mode)\n     {\n     case E_DImode:\n-      return gen_load_pairdi (reg1, mem1, reg2, mem2);\n+      return gen_load_pair_dw_DIDI (reg1, mem1, reg2, mem2);\n \n     case E_DFmode:\n-      return gen_load_pairdf (reg1, mem1, reg2, mem2);\n+      return gen_load_pair_dw_DFDF (reg1, mem1, reg2, mem2);\n \n     default:\n       gcc_unreachable ();\n@@ -14751,6 +14751,10 @@ aarch64_operands_ok_for_ldpstp (rtx *operands, bool load,\n   if (!rtx_equal_p (base_1, base_2))\n     return false;\n \n+  /* The operands must be of the same size.  */\n+  gcc_assert (GET_MODE_SIZE (GET_MODE (mem_1))\n+\t      == GET_MODE_SIZE (GET_MODE (mem_2)));\n+\n   offval_1 = INTVAL (offset_1);\n   offval_2 = INTVAL (offset_2);\n   msize = GET_MODE_SIZE (mode);\ndiff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md\nindex f8cdb063546afaf3ca977d078da6417729af88a6..46af41379621927ab54835c7adc4cd2b5057fbfe 100644\n--- a/gcc/config/aarch64/aarch64.md\n+++ b/gcc/config/aarch64/aarch64.md\n@@ -1224,15 +1224,15 @@\n \n ;; Operands 1 and 3 are tied together by the final condition; so we allow\n ;; fairly lax checking on the second memory operation.\n-(define_insn \"load_pairsi\"\n-  [(set (match_operand:SI 0 \"register_operand\" \"=r,*w\")\n-\t(match_operand:SI 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n-   (set (match_operand:SI 2 \"register_operand\" \"=r,*w\")\n-\t(match_operand:SI 3 \"memory_operand\" \"m,m\"))]\n-  \"rtx_equal_p (XEXP (operands[3], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t       XEXP (operands[1], 0),\n-\t\t\t       GET_MODE_SIZE (SImode)))\"\n+(define_insn \"load_pair_sw_<SX:MODE><SX2:MODE>\"\n+  [(set (match_operand:SX 0 \"register_operand\" \"=r,w\")\n+\t(match_operand:SX 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n+   (set (match_operand:SX2 2 \"register_operand\" \"=r,w\")\n+\t(match_operand:SX2 3 \"memory_operand\" \"m,m\"))]\n+   \"rtx_equal_p (XEXP (operands[3], 0),\n+\t\t plus_constant (Pmode,\n+\t\t\t\tXEXP (operands[1], 0),\n+\t\t\t\tGET_MODE_SIZE (<MODE>mode)))\"\n   \"@\n    ldp\\\\t%w0, %w2, %1\n    ldp\\\\t%s0, %s2, %1\"\n@@ -1240,15 +1240,16 @@\n    (set_attr \"fp\" \"*,yes\")]\n )\n \n-(define_insn \"load_pairdi\"\n-  [(set (match_operand:DI 0 \"register_operand\" \"=r,*w\")\n-\t(match_operand:DI 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n-   (set (match_operand:DI 2 \"register_operand\" \"=r,*w\")\n-\t(match_operand:DI 3 \"memory_operand\" \"m,m\"))]\n-  \"rtx_equal_p (XEXP (operands[3], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t       XEXP (operands[1], 0),\n-\t\t\t       GET_MODE_SIZE (DImode)))\"\n+;; Storing different modes that can still be merged\n+(define_insn \"load_pair_dw_<DX:MODE><DX2:MODE>\"\n+  [(set (match_operand:DX 0 \"register_operand\" \"=r,w\")\n+\t(match_operand:DX 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n+   (set (match_operand:DX2 2 \"register_operand\" \"=r,w\")\n+\t(match_operand:DX2 3 \"memory_operand\" \"m,m\"))]\n+   \"rtx_equal_p (XEXP (operands[3], 0),\n+\t\t plus_constant (Pmode,\n+\t\t\t\tXEXP (operands[1], 0),\n+\t\t\t\tGET_MODE_SIZE (<MODE>mode)))\"\n   \"@\n    ldp\\\\t%x0, %x2, %1\n    ldp\\\\t%d0, %d2, %1\"\n@@ -1257,17 +1258,18 @@\n )\n \n \n+\n ;; Operands 0 and 2 are tied together by the final condition; so we allow\n ;; fairly lax checking on the second memory operation.\n-(define_insn \"store_pairsi\"\n-  [(set (match_operand:SI 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n-\t(match_operand:SI 1 \"aarch64_reg_or_zero\" \"rZ,*w\"))\n-   (set (match_operand:SI 2 \"memory_operand\" \"=m,m\")\n-\t(match_operand:SI 3 \"aarch64_reg_or_zero\" \"rZ,*w\"))]\n-  \"rtx_equal_p (XEXP (operands[2], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t       XEXP (operands[0], 0),\n-\t\t\t       GET_MODE_SIZE (SImode)))\"\n+(define_insn \"store_pair_sw_<SX:MODE><SX2:MODE>\"\n+  [(set (match_operand:SX 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n+\t(match_operand:SX 1 \"aarch64_reg_zero_or_fp_zero\" \"rYZ,w\"))\n+   (set (match_operand:SX2 2 \"memory_operand\" \"=m,m\")\n+\t(match_operand:SX2 3 \"aarch64_reg_zero_or_fp_zero\" \"rYZ,w\"))]\n+   \"rtx_equal_p (XEXP (operands[2], 0),\n+\t\t plus_constant (Pmode,\n+\t\t\t\tXEXP (operands[0], 0),\n+\t\t\t\tGET_MODE_SIZE (<MODE>mode)))\"\n   \"@\n    stp\\\\t%w1, %w3, %0\n    stp\\\\t%s1, %s3, %0\"\n@@ -1275,15 +1277,16 @@\n    (set_attr \"fp\" \"*,yes\")]\n )\n \n-(define_insn \"store_pairdi\"\n-  [(set (match_operand:DI 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n-\t(match_operand:DI 1 \"aarch64_reg_or_zero\" \"rZ,*w\"))\n-   (set (match_operand:DI 2 \"memory_operand\" \"=m,m\")\n-\t(match_operand:DI 3 \"aarch64_reg_or_zero\" \"rZ,*w\"))]\n-  \"rtx_equal_p (XEXP (operands[2], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t       XEXP (operands[0], 0),\n-\t\t\t       GET_MODE_SIZE (DImode)))\"\n+;; Storing different modes that can still be merged\n+(define_insn \"store_pair_dw_<DX:MODE><DX2:MODE>\"\n+  [(set (match_operand:DX 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n+\t(match_operand:DX 1 \"aarch64_reg_zero_or_fp_zero\" \"rYZ,w\"))\n+   (set (match_operand:DX2 2 \"memory_operand\" \"=m,m\")\n+\t(match_operand:DX2 3 \"aarch64_reg_zero_or_fp_zero\" \"rYZ,w\"))]\n+   \"rtx_equal_p (XEXP (operands[2], 0),\n+\t\t plus_constant (Pmode,\n+\t\t\t\tXEXP (operands[0], 0),\n+\t\t\t\tGET_MODE_SIZE (<MODE>mode)))\"\n   \"@\n    stp\\\\t%x1, %x3, %0\n    stp\\\\t%d1, %d3, %0\"\n@@ -1291,74 +1294,6 @@\n    (set_attr \"fp\" \"*,yes\")]\n )\n \n-;; Operands 1 and 3 are tied together by the final condition; so we allow\n-;; fairly lax checking on the second memory operation.\n-(define_insn \"load_pairsf\"\n-  [(set (match_operand:SF 0 \"register_operand\" \"=w,*r\")\n-\t(match_operand:SF 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n-   (set (match_operand:SF 2 \"register_operand\" \"=w,*r\")\n-\t(match_operand:SF 3 \"memory_operand\" \"m,m\"))]\n-  \"rtx_equal_p (XEXP (operands[3], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t       XEXP (operands[1], 0),\n-\t\t\t       GET_MODE_SIZE (SFmode)))\"\n-  \"@\n-   ldp\\\\t%s0, %s2, %1\n-   ldp\\\\t%w0, %w2, %1\"\n-  [(set_attr \"type\" \"neon_load1_2reg,load_8\")\n-   (set_attr \"fp\" \"yes,*\")]\n-)\n-\n-(define_insn \"load_pairdf\"\n-  [(set (match_operand:DF 0 \"register_operand\" \"=w,*r\")\n-\t(match_operand:DF 1 \"aarch64_mem_pair_operand\" \"Ump,Ump\"))\n-   (set (match_operand:DF 2 \"register_operand\" \"=w,*r\")\n-\t(match_operand:DF 3 \"memory_operand\" \"m,m\"))]\n-  \"rtx_equal_p (XEXP (operands[3], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t       XEXP (operands[1], 0),\n-\t\t\t       GET_MODE_SIZE (DFmode)))\"\n-  \"@\n-   ldp\\\\t%d0, %d2, %1\n-   ldp\\\\t%x0, %x2, %1\"\n-  [(set_attr \"type\" \"neon_load1_2reg,load_16\")\n-   (set_attr \"fp\" \"yes,*\")]\n-)\n-\n-;; Operands 0 and 2 are tied together by the final condition; so we allow\n-;; fairly lax checking on the second memory operation.\n-(define_insn \"store_pairsf\"\n-  [(set (match_operand:SF 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n-\t(match_operand:SF 1 \"aarch64_reg_or_fp_zero\" \"w,*rY\"))\n-   (set (match_operand:SF 2 \"memory_operand\" \"=m,m\")\n-\t(match_operand:SF 3 \"aarch64_reg_or_fp_zero\" \"w,*rY\"))]\n-  \"rtx_equal_p (XEXP (operands[2], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t       XEXP (operands[0], 0),\n-\t\t\t       GET_MODE_SIZE (SFmode)))\"\n-  \"@\n-   stp\\\\t%s1, %s3, %0\n-   stp\\\\t%w1, %w3, %0\"\n-  [(set_attr \"type\" \"neon_store1_2reg,store_8\")\n-   (set_attr \"fp\" \"yes,*\")]\n-)\n-\n-(define_insn \"store_pairdf\"\n-  [(set (match_operand:DF 0 \"aarch64_mem_pair_operand\" \"=Ump,Ump\")\n-\t(match_operand:DF 1 \"aarch64_reg_or_fp_zero\" \"w,*rY\"))\n-   (set (match_operand:DF 2 \"memory_operand\" \"=m,m\")\n-\t(match_operand:DF 3 \"aarch64_reg_or_fp_zero\" \"w,*rY\"))]\n-  \"rtx_equal_p (XEXP (operands[2], 0),\n-\t\tplus_constant (Pmode,\n-\t\t\t       XEXP (operands[0], 0),\n-\t\t\t       GET_MODE_SIZE (DFmode)))\"\n-  \"@\n-   stp\\\\t%d1, %d3, %0\n-   stp\\\\t%x1, %x3, %0\"\n-  [(set_attr \"type\" \"neon_store1_2reg,store_16\")\n-   (set_attr \"fp\" \"yes,*\")]\n-)\n-\n ;; Load pair with post-index writeback.  This is primarily used in function\n ;; epilogues.\n (define_insn \"loadwb_pair<GPI:mode>_<P:mode>\"\ndiff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md\nindex 477dc35daf6a1184be15d942c62a111604f62f3c..0e1e9704e3866136959c08ba10077fa31c72c0ce 100644\n--- a/gcc/config/aarch64/iterators.md\n+++ b/gcc/config/aarch64/iterators.md\n@@ -69,6 +69,12 @@\n ;; Double vector modes.\n (define_mode_iterator VD [V8QI V4HI V4HF V2SI V2SF])\n \n+;; All modes stored in registers d0-d31.\n+(define_mode_iterator DREG [V8QI V4HI V4HF V2SI V2SF DF])\n+\n+;; Copy of the above.\n+(define_mode_iterator DREG2 [V8QI V4HI V4HF V2SI V2SF DF])\n+\n ;; vector, 64-bit container, all integer modes\n (define_mode_iterator VD_BHSI [V8QI V4HI V2SI])\n \n@@ -235,6 +241,18 @@\n ;; Double scalar modes\n (define_mode_iterator DX [DI DF])\n \n+;; Duplicate of the above\n+(define_mode_iterator DX2 [DI DF])\n+\n+;; Single scalar modes\n+(define_mode_iterator SX [SI SF])\n+\n+;; Duplicate of the above\n+(define_mode_iterator SX2 [SI SF])\n+\n+;; Single and double integer and float modes\n+(define_mode_iterator DSX [DF DI SF SI])\n+\n ;; Modes available for <f>mul lane operations.\n (define_mode_iterator VMUL [V4HI V8HI V2SI V4SI\n \t\t\t    (V4HF \"TARGET_SIMD_F16INST\")\ndiff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md\nindex 11243c4ce00aa7d16a886bb24b01180801c68f4e..ee6e050dd839c329baa05bdfe878b786f1def969 100644\n--- a/gcc/config/aarch64/predicates.md\n+++ b/gcc/config/aarch64/predicates.md\n@@ -62,6 +62,10 @@\n \t(and (match_code \"const_double\")\n \t     (match_test \"aarch64_float_const_zero_rtx_p (op)\"))))\n \n+(define_predicate \"aarch64_reg_zero_or_fp_zero\"\n+  (ior (match_operand 0 \"aarch64_reg_or_fp_zero\")\n+       (match_operand 0 \"aarch64_reg_or_zero\")))\n+\n (define_predicate \"aarch64_reg_zero_or_m1_or_1\"\n   (and (match_code \"reg,subreg,const_int\")\n        (ior (match_operand 0 \"register_operand\")\ndiff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..2d982f3389b668f2042d48ba3db04e619fd999f3\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_6.c\n@@ -0,0 +1,20 @@\n+/* { dg-options \"-O2\" } */\n+\n+typedef float __attribute__ ((vector_size (8))) vec;\n+\n+struct pair\n+{\n+  vec e1;\n+  double e2;\n+};\n+\n+vec tmp;\n+\n+void\n+stp (struct pair *p)\n+{\n+  p->e1 = tmp;\n+  p->e2 = 1.0;\n+\n+  /* { dg-final { scan-assembler \"stp\\td\\[0-9\\]+, d\\[0-9\\]+, \\\\\\[x\\[0-9\\]+\\\\\\]\" } } */\n+}\ndiff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..06607de6b3e36a4d759d915a9f7880284391aa08\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_7.c\n@@ -0,0 +1,47 @@\n+/* { dg-options \"-O2\" } */\n+\n+struct pair\n+{\n+  double a;\n+  long int b;\n+};\n+\n+void\n+stp (struct pair *p)\n+{\n+  p->a = 0.0;\n+  p->b = 1;\n+}\n+\n+/* { dg-final { scan-assembler \"stp\\txzr, x\\[0-9\\]+, \\\\\\[x\\[0-9\\]+\\\\\\]\" } } */\n+\n+void\n+stp2 (struct pair *p)\n+{\n+  p->a = 0.0;\n+  p->b = 0;\n+}\n+\n+struct reverse_pair\n+{\n+  long int a;\n+  double b;\n+};\n+\n+void\n+stp_reverse (struct reverse_pair *p)\n+{\n+  p->a = 1;\n+  p->b = 0.0;\n+}\n+\n+/* { dg-final { scan-assembler \"stp\\tx\\[0-9\\]+, xzr, \\\\\\[x\\[0-9\\]+\\\\\\]\" } } */\n+\n+void\n+stp_reverse2 (struct reverse_pair *p)\n+{\n+  p->a = 0;\n+  p->b = 0.0;\n+}\n+\n+/* { dg-final { scan-assembler-times \"stp\\txzr, xzr, \\\\\\[x\\[0-9\\]+\\\\\\]\" 2 } } */\ndiff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..1a47e233814e564d549245683a4e59fdb422bdad\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c\n@@ -0,0 +1,30 @@\n+/* { dg-options \"-O2\" } */\n+\n+typedef float __attribute__ ((vector_size (8))) fvec;\n+typedef int __attribute__ ((vector_size (8))) ivec;\n+\n+struct pair\n+{\n+  double a;\n+  fvec b;\n+};\n+\n+void ldp (double *a, fvec *b, struct pair *p)\n+{\n+  *a = p->a;\n+  *b = p->b;\n+}\n+\n+struct vec_pair\n+{\n+  fvec a;\n+  ivec b;\n+};\n+\n+void ldp2 (fvec *a, ivec *b, struct vec_pair *p)\n+{\n+  *a = p->a;\n+  *b = p->b;\n+}\n+\n+/* { dg-final { scan-assembler-times \"ldp\\td\\[0-9\\], d\\[0-9\\]+, \\\\\\[x\\[0-9\\]+\\\\\\]\" 2 } } */","headers":{"Return-Path":"<gcc-patches-return-462033-incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list gcc-patches@gcc.gnu.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=gcc-patches-return-462033-incoming=patchwork.ozlabs.org@gcc.gnu.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tunprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org\n\theader.b=\"JIn5gcCV\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xsjNm55J1z9sNV\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 13 Sep 2017 23:36:07 +1000 (AEST)","(qmail 7883 invoked by alias); 13 Sep 2017 13:36:00 -0000","(qmail 7355 invoked by uid 89); 13 Sep 2017 13:36:00 -0000","from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by\n\tsourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;\n\tWed, 13 Sep 2017 13:35:57 +0000","from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])\tby\n\tusa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id\n\t625081596; Wed, 13 Sep 2017 06:35:55 -0700 (PDT)","from [10.2.206.195] (e112997-lin.cambridge.arm.com\n\t[10.2.206.195])\tby usa-sjc-imap-foss1.foss.arm.com (Postfix)\n\twith ESMTPSA id 7326B3F58C; Wed, 13 Sep 2017 06:35:54 -0700 (PDT)"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id\n\t:list-unsubscribe:list-archive:list-post:list-help:sender\n\t:subject:to:references:from:message-id:date:mime-version\n\t:in-reply-to:content-type; q=dns; s=default; b=KyTqqFjCwoB5fntvR\n\tFqw0vcyvBFtUqzLmqklgXZSJ1sW2UtzyImy8s0/uod843v79qHsDVC2dND921pQ1\n\te6fMsFvTYiweZO4dDredGO8pQUhQ/Wgq1fp4l4VQYeQ02vjzijFwOZVdXjBVdeFy\n\tQZo5/UBqotWIiSBB1bi3JYCisE=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id\n\t:list-unsubscribe:list-archive:list-post:list-help:sender\n\t:subject:to:references:from:message-id:date:mime-version\n\t:in-reply-to:content-type; s=default; bh=njms3wECv5/A/hZdi8dVSNN\n\td5/8=; b=JIn5gcCVJe6bKaG7Vlb7ld7faRP2R1KcOdUzFO3IRq9ku0AfdAh/npQ\n\tJhj9jbXrtJo4bTC4c6IAUYcW/QzyuXg2jDD+flNaSMhwWOSAAR7bEAEEo8tPO1N2\n\tTUPVdEoMqmqWxMR6IPQSiNKYWEzVgUZBaVZTo5yaQWzUV687zG8M=","Mailing-List":"contact gcc-patches-help@gcc.gnu.org; run by ezmlm","Precedence":"bulk","List-Id":"<gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>","List-Archive":"<http://gcc.gnu.org/ml/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-help@gcc.gnu.org>","Sender":"gcc-patches-owner@gcc.gnu.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-25.7 required=5.0 tests=BAYES_00, GIT_PATCH_0,\n\tGIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,\n\tKAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH,\n\tRP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=rw","X-HELO":"foss.arm.com","Subject":"Re: [AArch64] Merge stores of D register values of different modes","To":"GCC Patches <gcc-patches@gcc.gnu.org>, James.Greenhalgh@arm.com,\n\tRichard.Earnshaw@arm.com, richard.sandiford@linaro.org","References":"<93501be4-3408-a2d9-f5e1-56a5569bda96@foss.arm.com>\n\t<07fc21e1-8747-eca0-4dde-4f364ef1a414@foss.arm.com>\n\t<871snbbwdo.fsf@linaro.org>","From":"Jackson Woodruff <jackson.woodruff@foss.arm.com>","Message-ID":"<020fd264-3996-5cd1-ebc9-9f1aa82543ec@foss.arm.com>","Date":"Wed, 13 Sep 2017 14:35:52 +0100","User-Agent":"Mozilla/5.0 (X11; Linux x86_64;\n\trv:52.0) Gecko/20100101 Thunderbird/52.3.0","MIME-Version":"1.0","In-Reply-To":"<871snbbwdo.fsf@linaro.org>","Content-Type":"multipart/mixed;\n\tboundary=\"------------2AEE4FF69DC01075083B352F\"","X-IsSubscribed":"yes"}}]