Patch Detail

Supported methods:
- get: Show a patch.
- patch: Update a patch.
- put: Update a patch.
GET /api/patches/2218974/?format=api
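A response like the one below can be consumed programmatically. This is a minimal sketch of pulling the commonly used fields out of a patch-detail payload; the JSON literal is a hand-abridged copy of the response that follows (only the fields this sketch touches — the real response carries many more), not something fetched live.

```python
import json

# Abridged copy of the patch-detail response below: just the fields
# this sketch reads. The full payload also includes project, headers,
# content, diff, checks, and more.
payload = json.loads("""
{
  "id": 2218974,
  "state": "new",
  "name": "middle-end/121467 - split the Standard Pattern Names section",
  "submitter": {"name": "Richard Biener", "email": "rguenther@suse.de"},
  "mbox": "http://patchwork.ozlabs.org/project/gcc/patch/20260402081207.A91884A0B0@imap1.dmz-prg2.suse.org/mbox/",
  "series": [{"id": 498449, "version": 1}]
}
""")

# Typical consumer steps: identify the patch, check its review state,
# and find the mbox URL for applying the patch with `git am`.
print(payload["id"], payload["state"])
print(payload["submitter"]["email"])
print(payload["mbox"])
```

The `mbox` URL is the usual entry point for automation: fetching it yields the raw patch email, suitable for piping into `git am`.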
{ "id": 2218974, "url": "http://patchwork.ozlabs.org/api/patches/2218974/?format=api", "web_url": "http://patchwork.ozlabs.org/project/gcc/patch/20260402081207.A91884A0B0@imap1.dmz-prg2.suse.org/", "project": { "id": 17, "url": "http://patchwork.ozlabs.org/api/projects/17/?format=api", "name": "GNU Compiler Collection", "link_name": "gcc", "list_id": "gcc-patches.gcc.gnu.org", "list_email": "gcc-patches@gcc.gnu.org", "web_url": null, "scm_url": null, "webscm_url": null, "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<20260402081207.A91884A0B0@imap1.dmz-prg2.suse.org>", "list_archive_url": null, "date": "2026-04-02T08:11:52", "name": "middle-end/121467 - split the Standard Pattern Names section", "commit_ref": null, "pull_url": null, "state": "new", "archived": false, "hash": "cd05a871596f1c52167f4d7e2d7bc6c18e5320a6", "submitter": { "id": 4338, "url": "http://patchwork.ozlabs.org/api/people/4338/?format=api", "name": "Richard Biener", "email": "rguenther@suse.de" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/gcc/patch/20260402081207.A91884A0B0@imap1.dmz-prg2.suse.org/mbox/", "series": [ { "id": 498449, "url": "http://patchwork.ozlabs.org/api/series/498449/?format=api", "web_url": "http://patchwork.ozlabs.org/project/gcc/list/?series=498449", "date": "2026-04-02T08:11:52", "name": "middle-end/121467 - split the Standard Pattern Names section", "version": 1, "mbox": "http://patchwork.ozlabs.org/series/498449/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/patches/2218974/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/2218974/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>", "X-Original-To": [ "incoming@patchwork.ozlabs.org", "gcc-patches@gcc.gnu.org" ], "Delivered-To": [ "patchwork-incoming@legolas.ozlabs.org", "gcc-patches@gcc.gnu.org" ], "Authentication-Results": [ 
"legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=QEUTA0kz;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=gZ+M5G+b;\n\tdkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de\n header.a=rsa-sha256 header.s=susede2_rsa header.b=QEUTA0kz;\n\tdkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=gZ+M5G+b;\n\tdkim-atps=neutral", "legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=38.145.34.32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)", "sourceware.org;\n\tdkim=pass (1024-bit key,\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=QEUTA0kz;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=gZ+M5G+b;\n\tdkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de\n header.a=rsa-sha256 header.s=susede2_rsa header.b=QEUTA0kz;\n\tdkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=gZ+M5G+b", "sourceware.org;\n dmarc=pass (p=none dis=none) header.from=suse.de", "sourceware.org; spf=pass smtp.mailfrom=suse.de", "server2.sourceware.org;\n arc=none smtp.remote-ip=195.135.223.130", "smtp-out1.suse.de;\n\tnone" ], "Received": [ "from vm01.sourceware.org (vm01.sourceware.org [38.145.34.32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fmZPd2PJfz1yCs\n\tfor <incoming@patchwork.ozlabs.org>; Thu, 02 Apr 2026 19:13:57 +1100 (AEDT)", "from vm01.sourceware.org (localhost 
[127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 5E0214BA23CF\n\tfor <incoming@patchwork.ozlabs.org>; Thu, 2 Apr 2026 08:13:54 +0000 (GMT)", "from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130])\n by sourceware.org (Postfix) with ESMTPS id 1445C4BA2E1A\n for <gcc-patches@gcc.gnu.org>; Thu, 2 Apr 2026 08:12:09 +0000 (GMT)", "from imap1.dmz-prg2.suse.org (unknown [10.150.64.97])\n (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest\n SHA256)\n (No client certificate requested)\n by smtp-out1.suse.de (Postfix) with ESMTPS id F03AD4D31B;\n Thu, 2 Apr 2026 08:12:07 +0000 (UTC)", "from imap1.dmz-prg2.suse.org (localhost [127.0.0.1])\n (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest\n SHA256)\n (No client certificate requested)\n by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id A91884A0B0;\n Thu, 2 Apr 2026 08:12:07 +0000 (UTC)", "from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167])\n by imap1.dmz-prg2.suse.org with ESMTPSA id +RmxJ9ckzmniGQAAD6G6ig\n (envelope-from <rguenther@suse.de>); Thu, 02 Apr 2026 08:12:07 +0000" ], "DKIM-Filter": [ "OpenDKIM Filter v2.11.0 sourceware.org 5E0214BA23CF", "OpenDKIM Filter v2.11.0 sourceware.org 1445C4BA2E1A" ], "DMARC-Filter": "OpenDMARC Filter v1.4.2 sourceware.org 1445C4BA2E1A", "ARC-Filter": "OpenARC Filter v1.0.0 sourceware.org 1445C4BA2E1A", "ARC-Seal": "i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1775117529; cv=none;\n b=RgiIltAMogDs1Oc/EEh7tslX2/e22I4RTPak1LKtgjyZuocmap/cB8C7yp5d3xH/ubPPu8OiEfNA3X3tDFMJR4kD+WDYFebW87NpeC2XFKqa3yEO5oGpIqGGwvl14D/mvyZ+ONILSOuUx8fdbMMxOFQAAwUmF9fyzlFs/tPVIRU=", "ARC-Message-Signature": "i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1775117529; c=relaxed/simple;\n bh=N1k2evpz5OVU5f25P00v8NGn6b2YEyqeHeQB5j/XuX8=;\n h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date:\n 
From:To:Subject:MIME-Version:Message-Id;\n b=SlvmPX1iWW5qYli7dRrwFun9bzPuu027M6c0oMv6MmRVaTVuURuQ7kvJ0bAG9d4fM4Mn/ZtaNTs4Prk3oKG1prW0ccNByQWXtrGYTrVLy5ypX6+EAklpeYhyj96YdJuau2btT7wjDiz/Au/aqB/I7SimePrvGagzk6m+dG7OdLI=", "ARC-Authentication-Results": "i=1; server2.sourceware.org", "DKIM-Signature": [ "v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1775117528;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type;\n bh=Ex9vgq6UQTghOnpXiK/m93gUP+bcr5e4RvnbOPTt6sk=;\n b=QEUTA0kzmjW14labOWx91Eci1+To9243HWRA9+L/v+dlL+a6zWZRI9j7cTCYzAaY9QWyLY\n qvH7/YmR2OtCpAYnavfnJQDn1gw0GmSo+R+Tk6evjB5Y2qqqCLFQ6mIgQC70zpYvebVxVG\n TJl2IGxXPEM9uFSXo4DmKW3Mo3DCSwA=", "v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1775117528;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type;\n bh=Ex9vgq6UQTghOnpXiK/m93gUP+bcr5e4RvnbOPTt6sk=;\n b=gZ+M5G+bs53zSOvlaFIjRNI328rZho9mK2IAqfDdvyXr0HB0fTDksIXiI6UvSzzpsebENB\n sNil60mNimI69UDw==", "v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1775117528;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type;\n bh=Ex9vgq6UQTghOnpXiK/m93gUP+bcr5e4RvnbOPTt6sk=;\n b=QEUTA0kzmjW14labOWx91Eci1+To9243HWRA9+L/v+dlL+a6zWZRI9j7cTCYzAaY9QWyLY\n qvH7/YmR2OtCpAYnavfnJQDn1gw0GmSo+R+Tk6evjB5Y2qqqCLFQ6mIgQC70zpYvebVxVG\n TJl2IGxXPEM9uFSXo4DmKW3Mo3DCSwA=", "v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1775117528;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type;\n bh=Ex9vgq6UQTghOnpXiK/m93gUP+bcr5e4RvnbOPTt6sk=;\n b=gZ+M5G+bs53zSOvlaFIjRNI328rZho9mK2IAqfDdvyXr0HB0fTDksIXiI6UvSzzpsebENB\n sNil60mNimI69UDw==" ], "Date": "Thu, 2 Apr 2026 10:11:52 +0200 (CEST)", "From": "Richard Biener 
<rguenther@suse.de>", "To": "gcc-patches@gcc.gnu.org", "cc": "sloosemore@baylibre.com", "Subject": "[PATCH] middle-end/121467 - split the Standard Pattern Names section", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=US-ASCII", "Message-Id": "<20260402081207.A91884A0B0@imap1.dmz-prg2.suse.org>", "X-Spamd-Result": "default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%];\n NEURAL_HAM_LONG(-1.00)[-1.000];\n NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain];\n FUZZY_RATELIMITED(0.00)[rspamd.com];\n RCVD_VIA_SMTP_AUTH(0.00)[]; ARC_NA(0.00)[];\n MISSING_XM_UA(0.00)[]; RCPT_COUNT_TWO(0.00)[2];\n RCVD_TLS_ALL(0.00)[];\n DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519];\n TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_HAS_DN(0.00)[];\n MIME_TRACE(0.00)[0:+]; FROM_EQ_ENVFROM(0.00)[];\n TO_DN_NONE(0.00)[]; RCVD_COUNT_TWO(0.00)[2];\n DBL_BLOCKED_OPENRESOLVER(0.00)[imap1.dmz-prg2.suse.org:helo,\n imap1.dmz-prg2.suse.org:mid]", "X-BeenThere": "gcc-patches@gcc.gnu.org", "X-Mailman-Version": "2.1.30", "Precedence": "list", "List-Id": "Gcc-patches mailing list <gcc-patches.gcc.gnu.org>", "List-Unsubscribe": "<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>", "List-Archive": "<https://gcc.gnu.org/pipermail/gcc-patches/>", "List-Post": "<mailto:gcc-patches@gcc.gnu.org>", "List-Help": "<mailto:gcc-patches-request@gcc.gnu.org?subject=help>", "List-Subscribe": "<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>", "Errors-To": "gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org" }, "content": "The following splits the standard pattern names section into two\n(for now), listing vector related patterns separately. It also\nadds a separate index for the many standard pattern names we have,\nsomething long overdue.\n\nv2 adds proper menu and node to the new subsections and transitions\nto the new index. 
I have elided the 'instruction pattern' designation\non the index entries, that just made the index awful to look at and\nis entirely redundant.\n\nI built and inspected the texinfo and pdf internals manual using\ntexinfo 7.1.\n\nOK?\n\nThanks,\nRichard.\n\n\t* doc/gccint.texi: Add named pattern index with @mdindex.\n\t* doc/md.texi (Standard Pattern Names For Generation): Split\n\ttable into two using subsections, splitting out vectorizer\n\trelated standard patterns. Use @mdindex for all standard\n\tpattern names.\n---\n gcc/doc/gccint.texi | 9 +\n gcc/doc/md.texi | 6365 ++++++++++++++++++++++---------------------\n 2 files changed, 3199 insertions(+), 3175 deletions(-)", "diff": "diff --git a/gcc/doc/gccint.texi b/gcc/doc/gccint.texi\nindex 30c5509ca41..3f248c59428 100644\n--- a/gcc/doc/gccint.texi\n+++ b/gcc/doc/gccint.texi\n@@ -18,6 +18,9 @@\n @c Likewise for parameters.\n @defcodeindex pa\n \n+@c And for named patterns.\n+@defcodeindex md\n+\n @c Merge the standard indexes into a single one.\n @syncodeindex fn cp\n @syncodeindex vr cp\n@@ -143,6 +146,7 @@ Additional tutorial information is linked to from\n * Option Index:: Index to command line options.\n * Parameter Index:: Index to parameters settable from the command line.\n * Concept Index:: Index of concepts and symbol names.\n+* Named Pattern Index:: Index of standard pattern names.\n @end menu\n \n @include contribute.texi\n@@ -212,6 +216,11 @@ form; it may sometimes be useful to look up both forms.\n \n @printindex cp\n \n+@node Named Pattern Index\n+@unnumbered Named Pattern Index\n+\n+@printindex md\n+\n @c ---------------------------------------------------------------------\n @c Epilogue\n @c ---------------------------------------------------------------------\ndiff --git a/gcc/doc/md.texi b/gcc/doc/md.texi\nindex edbdb1d50f1..9e4cccaf5ee 100644\n--- a/gcc/doc/md.texi\n+++ b/gcc/doc/md.texi\n@@ -4804,13 +4804,21 @@ definition from the i386 machine description.)\n @cindex pattern names\n @cindex 
names, pattern\n \n-Here is a table of the instruction names that are meaningful in the RTL\n+Here are tables of the instruction names that are meaningful in the RTL\n generation pass of the compiler. Giving one of these names to an\n instruction pattern tells the RTL generation pass that it can use the\n pattern to accomplish a certain task.\n \n+@menu\n+* General Standard Pattern Names::\n+* Standard Pattern Names for Vectorization::\n+@end menu\n+\n+@node General Standard Pattern Names\n+@subsection General Standard Pattern Names\n+\n @table @asis\n-@cindex @code{mov@var{m}} instruction pattern\n+@mdindex @code{mov@var{m}}\n @item @samp{mov@var{m}}\n Here @var{m} stands for a two-letter machine mode name, in lowercase.\n This instruction pattern moves data with that machine mode from operand\n@@ -4898,8 +4906,8 @@ floating point registers, then the constraints of the fixed point\n @samp{mov@var{m}} instructions must be designed to avoid ever trying to\n reload into a floating point register.\n \n-@cindex @code{reload_in} instruction pattern\n-@cindex @code{reload_out} instruction pattern\n+@mdindex @code{reload_in}\n+@mdindex @code{reload_out}\n @item @samp{reload_in@var{m}}\n @itemx @samp{reload_out@var{m}}\n These named patterns have been obsoleted by the target hook\n@@ -4921,14 +4929,14 @@ matches the @code{ALL_REGS} register class. 
This may relieve ports\n of the burden of defining an @code{ALL_REGS} constraint letter just\n for these patterns.\n \n-@cindex @code{movstrict@var{m}} instruction pattern\n+@mdindex @code{movstrict@var{m}}\n @item @samp{movstrict@var{m}}\n Like @samp{mov@var{m}} except that if operand 0 is a @code{subreg}\n with mode @var{m} of a register whose natural mode is wider,\n the @samp{movstrict@var{m}} instruction is guaranteed not to alter\n any of the register except the part which belongs to mode @var{m}.\n \n-@cindex @code{movmisalign@var{m}} instruction pattern\n+@mdindex @code{movmisalign@var{m}}\n @item @samp{movmisalign@var{m}}\n This variant of a move pattern is designed to load or store a value\n from a memory address that is not naturally aligned for its mode.\n@@ -4939,7 +4947,7 @@ memory, so that it's easy to tell whether this is a load or store.\n This pattern is used by the autovectorizer, and when expanding a\n @code{MISALIGNED_INDIRECT_REF} expression.\n \n-@cindex @code{load_multiple} instruction pattern\n+@mdindex @code{load_multiple}\n @item @samp{load_multiple}\n Load several consecutive memory locations into consecutive registers.\n Operand 0 is the first of the consecutive registers, operand 1\n@@ -4962,3809 +4970,3816 @@ also need @code{use} or @code{clobber} elements). Use a\n @code{match_parallel} (@pxref{RTL Template}) to recognize the insn. See\n @file{rs6000.md} for examples of the use of this insn pattern.\n \n-@cindex @samp{store_multiple} instruction pattern\n+@mdindex @samp{store_multiple}\n @item @samp{store_multiple}\n Similar to @samp{load_multiple}, but store several consecutive registers\n into consecutive memory locations. 
Operand 0 is the first of the\n consecutive memory locations, operand 1 is the first register, and\n operand 2 is a constant: the number of consecutive registers.\n \n-@cindex @code{vec_load_lanes@var{m}@var{n}} instruction pattern\n-@item @samp{vec_load_lanes@var{m}@var{n}}\n-Perform an interleaved load of several vectors from memory operand 1\n-into register operand 0. Both operands have mode @var{m}. The register\n-operand is viewed as holding consecutive vectors of mode @var{n},\n-while the memory operand is a flat array that contains the same number\n-of elements. The operation is equivalent to:\n+@mdindex @code{push@var{m}1}\n+@item @samp{push@var{m}1}\n+Output a push instruction. Operand 0 is value to push. Used only when\n+@code{PUSH_ROUNDING} is defined. For historical reason, this pattern may be\n+missing and in such case an @code{mov} expander is used instead, with a\n+@code{MEM} expression forming the push operation. The @code{mov} expander\n+method is deprecated.\n \n-@smallexample\n-int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n-for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)\n- for (i = 0; i < c; i++)\n- operand0[i][j] = operand1[j * c + i];\n-@end smallexample\n+@mdindex @code{add@var{m}3}\n+@item @samp{add@var{m}3}\n+Add operand 2 and operand 1, storing the result in operand 0. All operands\n+must have mode @var{m}. This can be used even on two-address machines, by\n+means of constraints requiring operands 1 and 0 to be the same location.\n \n-For example, @samp{vec_load_lanestiv4hi} loads 8 16-bit values\n-from memory into a register of mode @samp{TI}@. 
The register\n-contains two consecutive vectors of mode @samp{V4HI}@.\n+@mdindex @code{ssadd@var{m}3}\n+@mdindex @code{usadd@var{m}3}\n+@mdindex @code{sub@var{m}3}\n+@mdindex @code{sssub@var{m}3}\n+@mdindex @code{ussub@var{m}3}\n+@mdindex @code{mul@var{m}3}\n+@mdindex @code{ssmul@var{m}3}\n+@mdindex @code{usmul@var{m}3}\n+@mdindex @code{div@var{m}3}\n+@mdindex @code{ssdiv@var{m}3}\n+@mdindex @code{udiv@var{m}3}\n+@mdindex @code{usdiv@var{m}3}\n+@mdindex @code{mod@var{m}3}\n+@mdindex @code{umod@var{m}3}\n+@mdindex @code{umin@var{m}3}\n+@mdindex @code{umax@var{m}3}\n+@mdindex @code{and@var{m}3}\n+@mdindex @code{ior@var{m}3}\n+@mdindex @code{xor@var{m}3}\n+@item @samp{ssadd@var{m}3}, @samp{usadd@var{m}3}\n+@itemx @samp{sub@var{m}3}, @samp{sssub@var{m}3}, @samp{ussub@var{m}3}\n+@itemx @samp{mul@var{m}3}, @samp{ssmul@var{m}3}, @samp{usmul@var{m}3}\n+@itemx @samp{div@var{m}3}, @samp{ssdiv@var{m}3}\n+@itemx @samp{udiv@var{m}3}, @samp{usdiv@var{m}3}\n+@itemx @samp{mod@var{m}3}, @samp{umod@var{m}3}\n+@itemx @samp{umin@var{m}3}, @samp{umax@var{m}3}\n+@itemx @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3}\n+Similar, for other arithmetic operations.\n \n-This pattern can only be used if:\n-@smallexample\n-TARGET_ARRAY_MODE_SUPPORTED_P (@var{n}, @var{c})\n-@end smallexample\n-is true. GCC assumes that, if a target supports this kind of\n-instruction for some mode @var{n}, it also supports unaligned\n-loads for vectors of mode @var{n}.\n+@mdindex @code{ustrunc@var{m}@var{n}2}\n+@item @samp{ustrunc@var{m}@var{n}2}\n+Truncate the operand 1, and storing the result in operand 0. There will\n+be saturation during the trunction. The result will be saturated to the\n+maximal value of operand 0 type if there is overflow when truncation. 
The\n+operand 1 must have mode @var{n}, and the operand 0 must have mode @var{m}.\n+Both scalar and vector integer modes are allowed.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{sstrunc@var{m}@var{n}2}\n+@item @samp{sstrunc@var{m}@var{n}2}\n+Similar but for signed.\n \n-@cindex @code{vec_mask_load_lanes@var{m}@var{n}} instruction pattern\n-@item @samp{vec_mask_load_lanes@var{m}@var{n}}\n-Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional\n-mask operand (operand 2) that specifies which elements of the destination\n-vectors should be loaded. Other elements of the destination vectors are\n-taken from operand 3, which is an else operand in the subvector mode\n-@var{n}, similar to the one in @code{maskload}.\n-The operation is equivalent to:\n+@mdindex @code{andn@var{m}3}\n+@item @samp{andn@var{m}3}\n+Like @code{and@var{m}3}, but it uses bitwise-complement of operand 2\n+rather than operand 2 itself.\n \n-@smallexample\n-int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n-for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)\n- if (operand2[j])\n- for (i = 0; i < c; i++)\n- operand0[i][j] = operand1[j * c + i];\n- else\n- for (i = 0; i < c; i++)\n- operand0[i][j] = operand3[j];\n-@end smallexample\n+@mdindex @code{iorn@var{m}3}\n+@item @samp{iorn@var{m}3}\n+Like @code{ior@var{m}3}, but it uses bitwise-complement of operand 2\n+rather than operand 2 itself.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{addv@var{m}4}\n+@item @samp{addv@var{m}4}\n+Like @code{add@var{m}3} but takes a @code{code_label} as operand 3 and\n+emits code to jump to it if signed overflow occurs during the addition.\n+This pattern is used to implement the built-in functions performing\n+signed integer addition with overflow checking.\n \n-@cindex @code{vec_mask_len_load_lanes@var{m}@var{n}} instruction pattern\n-@item @samp{vec_mask_len_load_lanes@var{m}@var{n}}\n-Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional\n-mask 
operand (operand 2), length operand (operand 4) as well as bias operand\n-(operand 5) that specifies which elements of the destination vectors should be\n-loaded. Other elements of the destination vectors are taken from operand 3,\n-which is an else operand similar to the one in @code{maskload}.\n-The operation is equivalent to:\n+@mdindex @code{subv@var{m}4}\n+@mdindex @code{mulv@var{m}4}\n+@item @samp{subv@var{m}4}, @samp{mulv@var{m}4}\n+Similar, for other signed arithmetic operations.\n \n-@smallexample\n-int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n-for (j = 0; j < operand4 + operand5; j++)\n- for (i = 0; i < c; i++)\n- if (operand2[j])\n- operand0[i][j] = operand1[j * c + i];\n- else\n- operand0[i][j] = operand3[j];\n-@end smallexample\n+@mdindex @code{uaddv@var{m}4}\n+@item @samp{uaddv@var{m}4}\n+Like @code{addv@var{m}4} but for unsigned addition. That is to\n+say, the operation is the same as signed addition but the jump\n+is taken only on unsigned overflow.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{usubv@var{m}4}\n+@mdindex @code{umulv@var{m}4}\n+@item @samp{usubv@var{m}4}, @samp{umulv@var{m}4}\n+Similar, for other unsigned arithmetic operations.\n \n-@cindex @code{vec_store_lanes@var{m}@var{n}} instruction pattern\n-@item @samp{vec_store_lanes@var{m}@var{n}}\n-Equivalent to @samp{vec_load_lanes@var{m}@var{n}}, with the memory\n-and register operands reversed. That is, the instruction is\n-equivalent to:\n+@mdindex @code{uaddc@var{m}5}\n+@item @samp{uaddc@var{m}5}\n+Adds unsigned operands 2, 3 and 4 (where the last operand is guaranteed to\n+have only values 0 or 1) together, sets operand 0 to the result of the\n+addition of the 3 operands and sets operand 1 to 1 iff there was\n+overflow on the unsigned additions, and to 0 otherwise. 
So, it is\n+an addition with carry in (operand 4) and carry out (operand 1).\n+All operands have the same mode.\n \n-@smallexample\n-int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n-for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)\n- for (i = 0; i < c; i++)\n- operand0[j * c + i] = operand1[i][j];\n-@end smallexample\n+@mdindex @code{usubc@var{m}5}\n+@item @samp{usubc@var{m}5}\n+Similarly to @samp{uaddc@var{m}5}, except subtracts unsigned operands 3\n+and 4 from operand 2 instead of adding them. So, it is\n+a subtraction with carry/borrow in (operand 4) and carry/borrow out\n+(operand 1). All operands have the same mode.\n \n-for a memory operand 0 and register operand 1.\n+@mdindex @code{addptr@var{m}3}\n+@item @samp{addptr@var{m}3}\n+Like @code{add@var{m}3} but is guaranteed to only be used for address\n+calculations. The expanded code is not allowed to clobber the\n+condition code. It only needs to be defined if @code{add@var{m}3}\n+sets the condition code. If adds used for address calculations and\n+normal adds are not compatible it is required to expand a distinct\n+pattern (e.g.@: using an unspec). The pattern is used by LRA to emit\n+address calculations. @code{add@var{m}3} is used if\n+@code{addptr@var{m}3} is not defined.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{fma@var{m}4}\n+@item @samp{fma@var{m}4}\n+Multiply operand 2 and operand 1, then add operand 3, storing the\n+result in operand 0 without doing an intermediate rounding step. All\n+operands must have mode @var{m}. This pattern is used to implement\n+the @code{fma}, @code{fmaf}, and @code{fmal} builtin functions from\n+the ISO C99 standard.\n \n-@cindex @code{vec_mask_store_lanes@var{m}@var{n}} instruction pattern\n-@item @samp{vec_mask_store_lanes@var{m}@var{n}}\n-Like @samp{vec_store_lanes@var{m}@var{n}}, but takes an additional\n-mask operand (operand 2) that specifies which elements of the source\n-vectors should be stored. 
The operation is equivalent to:\n+@mdindex @code{fms@var{m}4}\n+@item @samp{fms@var{m}4}\n+Like @code{fma@var{m}4}, except operand 3 subtracted from the\n+product instead of added to the product. This is represented\n+in the rtl as\n \n @smallexample\n-int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n-for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)\n- if (operand2[j])\n- for (i = 0; i < c; i++)\n- operand0[j * c + i] = operand1[i][j];\n+(fma:@var{m} @var{op1} @var{op2} (neg:@var{m} @var{op3}))\n @end smallexample\n \n-This pattern is not allowed to @code{FAIL}.\n-\n-@cindex @code{vec_mask_len_store_lanes@var{m}@var{n}} instruction pattern\n-@item @samp{vec_mask_len_store_lanes@var{m}@var{n}}\n-Like @samp{vec_store_lanes@var{m}@var{n}}, but takes an additional\n-mask operand (operand 2), length operand (operand 3) as well as bias operand (operand 4)\n-that specifies which elements of the source vectors should be stored.\n-The operation is equivalent to:\n+@mdindex @code{fnma@var{m}4}\n+@item @samp{fnma@var{m}4}\n+Like @code{fma@var{m}4} except that the intermediate product\n+is negated before being added to operand 3. This is represented\n+in the rtl as\n \n @smallexample\n-int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n-for (j = 0; j < operand3 + operand4; j++)\n- if (operand2[j])\n- for (i = 0; i < c; i++)\n- operand0[j * c + i] = operand1[i][j];\n+(fma:@var{m} (neg:@var{m} @var{op1}) @var{op2} @var{op3})\n @end smallexample\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{fnms@var{m}4}\n+@item @samp{fnms@var{m}4}\n+Like @code{fms@var{m}4} except that the intermediate product\n+is negated before subtracting operand 3. 
This is represented\n+in the rtl as\n \n-@cindex @code{gather_load@var{m}@var{n}} instruction pattern\n-@item @samp{gather_load@var{m}@var{n}}\n-Load several separate memory locations into a vector of mode @var{m}.\n-Operand 1 is a scalar base address and operand 2 is a vector of mode @var{n}\n-containing offsets from that base. Operand 0 is a destination vector with\n-the same number of elements as @var{n}. For each element index @var{i}:\n+@smallexample\n+(fma:@var{m} (neg:@var{m} @var{op1}) @var{op2} (neg:@var{m} @var{op3}))\n+@end smallexample\n \n-@itemize @bullet\n-@item\n-extend the offset element @var{i} to address width, using zero\n-extension if operand 3 is 1 and sign extension if operand 3 is zero;\n-@item\n-multiply the extended offset by operand 4;\n-@item\n-add the result to the base; and\n-@item\n-load the value at that address into element @var{i} of operand 0.\n-@end itemize\n+@mdindex @code{min@var{m}3}\n+@mdindex @code{max@var{m}3}\n+@item @samp{smin@var{m}3}, @samp{smax@var{m}3}\n+Signed minimum and maximum operations. When used with floating point,\n+if both operands are zeros, or if either operand is @code{NaN}, then\n+it is unspecified which of the two operands is returned as the result.\n \n-The value of operand 3 does not matter if the offsets are already\n-address width.\n+@mdindex @code{fmin@var{m}3}\n+@mdindex @code{fmax@var{m}3}\n+@item @samp{fmin@var{m}3}, @samp{fmax@var{m}3}\n+IEEE-conformant minimum and maximum operations. If one operand is a quiet\n+@code{NaN}, then the other operand is returned. If both operands are quiet\n+@code{NaN}, then a quiet @code{NaN} is returned. 
In the case when gcc supports\n+signaling @code{NaN} (-fsignaling-nans) an invalid floating point exception is\n+raised and a quiet @code{NaN} is returned.\n \n-@cindex @code{mask_gather_load@var{m}@var{n}} instruction pattern\n-@item @samp{mask_gather_load@var{m}@var{n}}\n-Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand as\n-operand 5.\n-Other elements of the destination vectors are taken from operand 6,\n-which is an else operand similar to the one in @code{maskload}.\n-Bit @var{i} of the mask is set if element @var{i}\n-of the result should be loaded from memory and clear if element @var{i}\n-of the result should be set to operand 6.\n+All operands have mode @var{m}, which is a scalar or vector\n+floating-point mode. These patterns are not allowed to @code{FAIL}.\n \n-@cindex @code{mask_len_gather_load@var{m}@var{n}} instruction pattern\n-@item @samp{mask_len_gather_load@var{m}@var{n}}\n-Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand\n-(operand 5) and an else operand (operand 6) as well as a len operand\n-(operand 7) and a bias operand (operand 8).\n+@mdindex @code{ssad@var{m}}\n+@mdindex @code{usad@var{m}}\n+@item @samp{ssad@var{m}}\n+@item @samp{usad@var{m}}\n+Compute the sum of absolute differences of two signed/unsigned elements.\n+Operand 1 and operand 2 are of the same mode. Their absolute difference, which\n+is of a wider mode, is computed and added to operand 3. Operand 3 is of a mode\n+equal or wider than the mode of the absolute difference. 
The result is placed\n+in operand 0, which is of the same mode as operand 3.\n+@var{m} is the mode of operand 1 and operand 2.\n \n-Similar to mask_len_load the instruction loads at\n-most (operand 7 + operand 8) elements from memory.\n-Bit @var{i} of the mask is set if element @var{i} of the result should\n-be loaded from memory and clear if element @var{i} of the result should\n-be set to element @var{i} of operand 6.\n-Mask elements @var{i} with @var{i} > (operand 7 + operand 8) are ignored.\n+@mdindex @code{widen_ssum@var{n}@var{m}3}\n+@mdindex @code{widen_usum@var{n}@var{m}3}\n+@item @samp{widen_ssum@var{n}@var{m}3}\n+@itemx @samp{widen_usum@var{n}@var{m}3}\n+Operands 0 and 2 are of the same mode, which is wider than the mode of\n+operand 1. Add operand 1 to operand 2 and place the widened result in\n+operand 0. (This is used express accumulation of elements into an accumulator\n+of a wider mode.)\n+@var{m} is the mode of operand 1 and @var{n} is the mode of operand 0.\n \n-@cindex @code{mask_len_strided_load@var{m}} instruction pattern\n-@item @samp{mask_len_strided_load@var{m}}\n-Load several separate memory locations into a destination vector of mode @var{m}.\n-Operand 0 is a destination vector of mode @var{m}.\n-Operand 1 is a scalar base address and operand 2 is a scalar stride of Pmode.\n-operand 3 is mask operand, operand 4 is length operand and operand 5 is bias operand.\n-The instruction can be seen as a special case of @code{mask_len_gather_load@var{m}@var{n}}\n-with an offset vector that is a @code{vec_series} with zero as base and operand 2 as step.\n-For each element the load address is operand 1 + @var{i} * operand 2.\n-Similar to mask_len_load, the instruction loads at most (operand 4 + operand 5) elements from memory.\n-Element @var{i} of the mask (operand 3) is set if element @var{i} of the result should\n-be loaded from memory and clear if element @var{i} of the result should be zero.\n-Mask elements @var{i} with @var{i} > (operand 4 + 
operand 5) are ignored.\n+@mdindex @code{smulhs@var{m}3}\n+@mdindex @code{umulhs@var{m}3}\n+@item @samp{smulhs@var{m}3}\n+@itemx @samp{umulhs@var{m}3}\n+Signed/unsigned multiply high with scale. This is equivalent to the C code:\n+@smallexample\n+narrow op0, op1, op2;\n+@dots{}\n+op0 = (narrow) (((wide) op1 * (wide) op2) >> (N / 2 - 1));\n+@end smallexample\n+where the sign of @samp{narrow} determines whether this is a signed\n+or unsigned operation, and @var{N} is the size of @samp{wide} in bits.\n+@var{m} is the mode for all 3 operands (narrow). The wide mode is not specified\n+and is defined to fit the whole multiply.\n \n-@cindex @code{scatter_store@var{m}@var{n}} instruction pattern\n-@item @samp{scatter_store@var{m}@var{n}}\n-Store a vector of mode @var{m} into several distinct memory locations.\n-Operand 0 is a scalar base address and operand 1 is a vector of mode\n-@var{n} containing offsets from that base. Operand 4 is the vector of\n-values that should be stored, which has the same number of elements as\n-@var{n}. For each element index @var{i}:\n+@mdindex @code{smulhrs@var{m}3}\n+@mdindex @code{umulhrs@var{m}3}\n+@item @samp{smulhrs@var{m}3}\n+@itemx @samp{umulhrs@var{m}3}\n+Signed/unsigned multiply high with round and scale. This is\n+equivalent to the C code:\n+@smallexample\n+narrow op0, op1, op2;\n+@dots{}\n+op0 = (narrow) (((((wide) op1 * (wide) op2) >> (N / 2 - 2)) + 1) >> 1);\n+@end smallexample\n+where the sign of @samp{narrow} determines whether this is a signed\n+or unsigned operation, and @var{N} is the size of @samp{wide} in bits.\n+@var{m} is the mode for all 3 operands (narrow). 
The wide mode is not specified\n+and is defined to fit the whole multiply.\n \n-@itemize @bullet\n-@item\n-extend the offset element @var{i} to address width, using zero\n-extension if operand 2 is 1 and sign extension if operand 2 is zero;\n-@item\n-multiply the extended offset by operand 3;\n-@item\n-add the result to the base; and\n-@item\n-store element @var{i} of operand 4 to that address.\n-@end itemize\n+@mdindex @code{sdiv_pow2@var{m}3}\n+@item @samp{sdiv_pow2@var{m}3}\n+Signed division by power-of-2 immediate. Equivalent to:\n+@smallexample\n+signed op0, op1;\n+@dots{}\n+op0 = op1 / (1 << imm);\n+@end smallexample\n \n-The value of operand 2 does not matter if the offsets are already\n-address width.\n+@mdindex @code{mulhisi3}\n+@item @samp{mulhisi3}\n+Multiply operands 1 and 2, which have mode @code{HImode}, and store\n+a @code{SImode} product in operand 0.\n \n-@cindex @code{mask_scatter_store@var{m}@var{n}} instruction pattern\n-@item @samp{mask_scatter_store@var{m}@var{n}}\n-Like @samp{scatter_store@var{m}@var{n}}, but takes an extra mask operand as\n-operand 5. Bit @var{i} of the mask is set if element @var{i}\n-of the result should be stored to memory.\n+@mdindex @code{mulqihi3}\n+@mdindex @code{mulsidi3}\n+@item @samp{mulqihi3}, @samp{mulsidi3}\n+Similar widening-multiplication instructions of other widths.\n \n-@cindex @code{mask_len_scatter_store@var{m}@var{n}} instruction pattern\n-@item @samp{mask_len_scatter_store@var{m}@var{n}}\n-Like @samp{scatter_store@var{m}@var{n}}, but takes an extra mask operand (operand 5),\n-a len operand (operand 6) as well as a bias operand (operand 7). 
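The widening and high-part multiplies documented above can be sketched in plain C. This is an illustrative sketch only; the helper names are ours, not GCC pattern names, and it fixes the narrow mode at 16 bits (so @code{wide} is 32 bits and @var{N}/2 - 1 = 15):

```c
#include <stdint.h>

/* mulhisi3-style widening multiply: HImode (16-bit) operands,
   SImode (32-bit) product.  Widening before the multiply keeps the
   full product, which a plain 16-bit multiply would truncate.  */
static int32_t widen_smul16 (int16_t op1, int16_t op2)
{
  return (int32_t) op1 * (int32_t) op2;
}

/* smulhs-style signed multiply high with scale: the 32-bit product
   is shifted right by N/2 - 1 = 15 bits before narrowing, matching
   the documentation's C snippet.  */
static int16_t smulhs16 (int16_t op1, int16_t op2)
{
  return (int16_t) (((int32_t) op1 * (int32_t) op2) >> 15);
}
```

For Q15 fixed-point values the high-part-with-scale form keeps the product in Q15, which is why DSP-oriented targets provide it as one instruction.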
The instruction stores\n-at most (operand 6 + operand 7) elements of (operand 4) to memory.\n-Bit @var{i} of the mask is set if element @var{i} of (operand 4) should be stored.\n-Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.\n+@mdindex @code{umulqihi3}\n+@mdindex @code{umulhisi3}\n+@mdindex @code{umulsidi3}\n+@item @samp{umulqihi3}, @samp{umulhisi3}, @samp{umulsidi3}\n+Similar widening-multiplication instructions that do unsigned\n+multiplication.\n \n-@cindex @code{mask_len_strided_store@var{m}} instruction pattern\n-@item @samp{mask_len_strided_store@var{m}}\n-Store a vector of mode m into several distinct memory locations.\n-Operand 0 is a scalar base address and operand 1 is scalar stride of Pmode.\n-Operand 2 is the vector of values that should be stored, which is of mode @var{m}.\n-operand 3 is mask operand, operand 4 is length operand and operand 5 is bias operand.\n-The instruction can be seen as a special case of @code{mask_len_scatter_store@var{m}@var{n}}\n-with an offset vector that is a @code{vec_series} with zero as base and operand 1 as step.\n-For each element the store address is operand 0 + @var{i} * operand 1.\n-Similar to mask_len_store, the instruction stores at most (operand 4 + operand 5) elements of\n-mask (operand 3) to memory. Element @var{i} of the mask is set if element @var{i} of (operand 3)\n-should be stored. Mask elements @var{i} with @var{i} > (operand 4 + operand 5) are ignored.\n+@mdindex @code{usmulqihi3}\n+@mdindex @code{usmulhisi3}\n+@mdindex @code{usmulsidi3}\n+@item @samp{usmulqihi3}, @samp{usmulhisi3}, @samp{usmulsidi3}\n+Similar widening-multiplication instructions that interpret the first\n+operand as unsigned and the second operand as signed, then do a signed\n+multiplication.\n \n-@cindex @code{vec_set@var{m}} instruction pattern\n-@item @samp{vec_set@var{m}}\n-Set given field in the vector value. 
Operand 0 is the vector to modify,\n-operand 1 is new value of field and operand 2 specify the field index.\n+@mdindex @code{smul@var{m}3_highpart}\n+@item @samp{smul@var{m}3_highpart}\n+Perform a signed multiplication of operands 1 and 2, which have mode\n+@var{m}, and store the most significant half of the product in operand 0.\n+The least significant half of the product is discarded. This may be\n+represented in RTL using a @code{smul_highpart} RTX expression.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{umul@var{m}3_highpart}\n+@item @samp{umul@var{m}3_highpart}\n+Similar, but the multiplication is unsigned. This may be represented\n+in RTL using an @code{umul_highpart} RTX expression.\n \n-@cindex @code{vec_extract@var{m}@var{n}} instruction pattern\n-@item @samp{vec_extract@var{m}@var{n}}\n-Extract given field from the vector value. Operand 1 is the vector, operand 2\n-specify field index and operand 0 place to store value into. The\n-@var{n} mode is the mode of the field or vector of fields that should be\n-extracted, should be either element mode of the vector mode @var{m}, or\n-a vector mode with the same element mode and smaller number of elements.\n-If @var{n} is a vector mode the index is counted in multiples of\n-mode @var{n}.\n+@mdindex @code{madd@var{m}@var{n}4}\n+@item @samp{madd@var{m}@var{n}4}\n+Multiply operands 1 and 2, sign-extend them to mode @var{n}, add\n+operand 3, and store the result in operand 0. Operands 1 and 2\n+have mode @var{m} and operands 0 and 3 have mode @var{n}.\n+Both modes must be integer or fixed-point modes and @var{n} must be twice\n+the size of @var{m}.\n \n-This pattern is not allowed to @code{FAIL}.\n+In other words, @code{madd@var{m}@var{n}4} is like\n+@code{mul@var{m}@var{n}3} except that it also adds operand 3.\n \n-@cindex @code{vec_init@var{m}@var{n}} instruction pattern\n-@item @samp{vec_init@var{m}@var{n}}\n-Initialize the vector to given values. 
Operand 0 is the vector to initialize\n-and operand 1 is parallel containing values for individual fields. The\n-@var{n} mode is the mode of the elements, should be either element mode of\n-the vector mode @var{m}, or a vector mode with the same element mode and\n-smaller number of elements.\n+These instructions are not allowed to @code{FAIL}.\n \n-@cindex @code{vec_duplicate@var{m}} instruction pattern\n-@item @samp{vec_duplicate@var{m}}\n-Initialize vector output operand 0 so that each element has the value given\n-by scalar input operand 1. The vector has mode @var{m} and the scalar has\n-the mode appropriate for one element of @var{m}.\n+@mdindex @code{umadd@var{m}@var{n}4}\n+@item @samp{umadd@var{m}@var{n}4}\n+Like @code{madd@var{m}@var{n}4}, but zero-extend the multiplication\n+operands instead of sign-extending them.\n \n-This pattern only handles duplicates of non-constant inputs. Constant\n-vectors go through the @code{mov@var{m}} pattern instead.\n+@mdindex @code{ssmadd@var{m}@var{n}4}\n+@item @samp{ssmadd@var{m}@var{n}4}\n+Like @code{madd@var{m}@var{n}4}, but all involved operations must be\n+signed-saturating.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{usmadd@var{m}@var{n}4}\n+@item @samp{usmadd@var{m}@var{n}4}\n+Like @code{umadd@var{m}@var{n}4}, but all involved operations must be\n+unsigned-saturating.\n \n-@cindex @code{vec_series@var{m}} instruction pattern\n-@item @samp{vec_series@var{m}}\n-Initialize vector output operand 0 so that element @var{i} is equal to\n-operand 1 plus @var{i} times operand 2. In other words, create a linear\n-series whose base value is operand 1 and whose step is operand 2.\n+@mdindex @code{msub@var{m}@var{n}4}\n+@item @samp{msub@var{m}@var{n}4}\n+Multiply operands 1 and 2, sign-extend them to mode @var{n}, subtract the\n+result from operand 3, and store the result in operand 0. 
Operands 1 and 2\n+have mode @var{m} and operands 0 and 3 have mode @var{n}.\n+Both modes must be integer or fixed-point modes and @var{n} must be twice\n+the size of @var{m}.\n \n-The vector output has mode @var{m} and the scalar inputs have the mode\n-appropriate for one element of @var{m}. This pattern is not used for\n-floating-point vectors, in order to avoid having to specify the\n-rounding behavior for @var{i} > 1.\n+In other words, @code{msub@var{m}@var{n}4} is like\n+@code{mul@var{m}@var{n}3} except that it also subtracts the result\n+from operand 3.\n \n-This pattern is not allowed to @code{FAIL}.\n+These instructions are not allowed to @code{FAIL}.\n \n-@cindex @code{while_ult@var{m}@var{n}} instruction pattern\n-@item @code{while_ult@var{m}@var{n}}\n-Set operand 0 to a mask that is true while incrementing operand 1\n-gives a value that is less than operand 2, for a vector length up to operand 3.\n-Operand 0 has mode @var{n} and operands 1 and 2 are scalar integers of mode\n-@var{m}. Operand 3 should be omitted when @var{n} is a vector mode, and\n-a @code{CONST_INT} otherwise. 
The operation for vector modes is equivalent to:\n+@mdindex @code{umsub@var{m}@var{n}4}\n+@item @samp{umsub@var{m}@var{n}4}\n+Like @code{msub@var{m}@var{n}4}, but zero-extend the multiplication\n+operands instead of sign-extending them.\n \n-@smallexample\n-operand0[0] = operand1 < operand2;\n-for (i = 1; i < GET_MODE_NUNITS (@var{n}); i++)\n- operand0[i] = operand0[i - 1] && (operand1 + i < operand2);\n-@end smallexample\n+@mdindex @code{ssmsub@var{m}@var{n}4}\n+@item @samp{ssmsub@var{m}@var{n}4}\n+Like @code{msub@var{m}@var{n}4}, but all involved operations must be\n+signed-saturating.\n \n-And for non-vector modes the operation is equivalent to:\n+@mdindex @code{usmsub@var{m}@var{n}4}\n+@item @samp{usmsub@var{m}@var{n}4}\n+Like @code{umsub@var{m}@var{n}4}, but all involved operations must be\n+unsigned-saturating.\n \n-@smallexample\n-operand0[0] = operand1 < operand2;\n-for (i = 1; i < operand3; i++)\n- operand0[i] = operand0[i - 1] && (operand1 + i < operand2);\n-@end smallexample\n+@mdindex @code{divmod@var{m}4}\n+@item @samp{divmod@var{m}4}\n+Signed division that produces both a quotient and a remainder.\n+Operand 1 is divided by operand 2 to produce a quotient stored\n+in operand 0 and a remainder stored in operand 3.\n \n-@cindex @code{select_vl@var{m}@var{n}} instruction pattern\n-@item @code{select_vl@var{m}@var{n}}\n-Set operand 0 (of mode @var{n}) to the number of scalar iterations that\n-should be handled by one iteration of a vector loop. Operand 1 is the\n-total number of scalar iterations that the loop needs to process and\n-operand 2 is a maximum bound on the result (also known as the\n-maximum ``vectorization factor''). Operand 3 (of mode @var{m}) is\n-a dummy parameter to pass the vector mode to be used.\n+For machines with an instruction that produces both a quotient and a\n+remainder, provide a pattern for @samp{divmod@var{m}4} but do not\n+provide patterns for @samp{div@var{m}3} and @samp{mod@var{m}3}. 
This\n+allows optimization in the relatively common case when both the quotient\n+and remainder are computed.\n \n-The maximum value of operand 0 is given by:\n-@smallexample\n-operand0 = MIN (operand1, operand2)\n-@end smallexample\n-However, targets might choose a lower value than this, based on\n-target-specific criteria. Each iteration of the vector loop might\n-therefore process a different number of scalar iterations, which in turn\n-means that induction variables will have a variable step. Because of\n-this, it is generally not useful to define this instruction if it will\n-always calculate the maximum value.\n+If an instruction that just produces a quotient or just a remainder\n+exists and is more efficient than the instruction that produces both,\n+write the output routine of @samp{divmod@var{m}4} to call\n+@code{find_reg_note} and look for a @code{REG_UNUSED} note on the\n+quotient or remainder and generate the appropriate instruction.\n \n-This optab is only useful on targets that implement @samp{len_load_@var{m}}\n-and/or @samp{len_store_@var{m}} or the associated @samp{_len} variants.\n+@mdindex @code{udivmod@var{m}4}\n+@item @samp{udivmod@var{m}4}\n+Similar, but does unsigned division.\n \n-@cindex @code{check_raw_ptrs@var{m}} instruction pattern\n-@item @samp{check_raw_ptrs@var{m}}\n-Check whether, given two pointers @var{a} and @var{b} and a length @var{len},\n-a write of @var{len} bytes at @var{a} followed by a read of @var{len} bytes\n-at @var{b} can be split into interleaved byte accesses\n-@samp{@var{a}[0], @var{b}[0], @var{a}[1], @var{b}[1], @dots{}}\n-without affecting the dependencies between the bytes. 
Set operand 0\n-to true if the split is possible and false otherwise.\n+@anchor{shift patterns}\n+@mdindex @code{ashl@var{m}3}\n+@mdindex @code{ssashl@var{m}3}\n+@mdindex @code{usashl@var{m}3}\n+@item @samp{ashl@var{m}3}, @samp{ssashl@var{m}3}, @samp{usashl@var{m}3}\n+Arithmetic-shift operand 1 left by a number of bits specified by operand\n+2, and store the result in operand 0. Here @var{m} is the mode of\n+operand 0 and operand 1; operand 2's mode is specified by the\n+instruction pattern, and the compiler will convert the operand to that\n+mode before generating the instruction. The shift or rotate expander\n+or instruction pattern should explicitly specify the mode of operand 2;\n+it must never be @code{VOIDmode}. The meaning of out-of-range shift\n+counts can optionally be specified by @code{TARGET_SHIFT_TRUNCATION_MASK}.\n+@xref{TARGET_SHIFT_TRUNCATION_MASK}. Operand 2 is always a scalar type.\n \n-Operands 1, 2 and 3 provide the values of @var{a}, @var{b} and @var{len}\n-respectively. Operand 4 is a constant integer that provides the known\n-common alignment of @var{a} and @var{b}. All inputs have mode @var{m}.\n+@mdindex @code{ashr@var{m}3}\n+@mdindex @code{lshr@var{m}3}\n+@mdindex @code{rotl@var{m}3}\n+@mdindex @code{rotr@var{m}3}\n+@item @samp{ashr@var{m}3}, @samp{lshr@var{m}3}, @samp{rotl@var{m}3}, @samp{rotr@var{m}3}\n+Other shift and rotate instructions, analogous to the\n+@code{ashl@var{m}3} instructions. 
Operand 2 is always a scalar type.\n \n-This split is possible if:\n+@mdindex @code{vashl@var{m}3}\n+@mdindex @code{vashr@var{m}3}\n+@mdindex @code{vlshr@var{m}3}\n+@mdindex @code{vrotl@var{m}3}\n+@mdindex @code{vrotr@var{m}3}\n+@item @samp{vashl@var{m}3}, @samp{vashr@var{m}3}, @samp{vlshr@var{m}3}, @samp{vrotl@var{m}3}, @samp{vrotr@var{m}3}\n+Vector shift and rotate instructions that take vectors as operand 2\n+instead of a scalar type.\n \n+@mdindex @code{uabd@var{m}3}\n+@mdindex @code{sabd@var{m}3}\n+@item @samp{uabd@var{m}3}, @samp{sabd@var{m}3}\n+Signed and unsigned absolute difference instructions. These\n+instructions find the difference between operands 1 and 2\n+then return the absolute value. A C code equivalent would be:\n @smallexample\n-@var{a} == @var{b} || @var{a} + @var{len} <= @var{b} || @var{b} + @var{len} <= @var{a}\n+op0 = op1 > op2 ? op1 - op2 : op2 - op1;\n @end smallexample\n \n-You should only define this pattern if the target has a way of accelerating\n-the test without having to do the individual comparisons.\n-\n-@cindex @code{check_war_ptrs@var{m}} instruction pattern\n-@item @samp{check_war_ptrs@var{m}}\n-Like @samp{check_raw_ptrs@var{m}}, but with the read and write swapped round.\n-The split is possible in this case if:\n-\n+@mdindex @code{avg@var{m}3_floor}\n+@mdindex @code{uavg@var{m}3_floor}\n+@item @samp{avg@var{m}3_floor}\n+@itemx @samp{uavg@var{m}3_floor}\n+Signed and unsigned average instructions. These instructions add\n+operands 1 and 2 without truncation, divide the result by 2,\n+round towards -Inf, and store the result in operand 0. 
This is\n+equivalent to the C code:\n @smallexample\n-@var{b} <= @var{a} || @var{a} + @var{len} <= @var{b}\n+narrow op0, op1, op2;\n+@dots{}\n+op0 = (narrow) (((wide) op1 + (wide) op2) >> 1);\n @end smallexample\n+where the sign of @samp{narrow} determines whether this is a signed\n+or unsigned operation.\n \n-@cindex @code{vec_cmp@var{m}@var{n}} instruction pattern\n-@item @samp{vec_cmp@var{m}@var{n}}\n-Output a vector comparison. Operand 0 of mode @var{n} is the destination for\n-predicate in operand 1 which is a signed vector comparison with operands of\n-mode @var{m} in operands 2 and 3. Predicate is computed by elementwise\n-evaluation of the vector comparison with a truth value of all-ones and a false\n-value of all-zeros.\n+@mdindex @code{avg@var{m}3_ceil}\n+@mdindex @code{uavg@var{m}3_ceil}\n+@item @samp{avg@var{m}3_ceil}\n+@itemx @samp{uavg@var{m}3_ceil}\n+Like @samp{avg@var{m}3_floor} and @samp{uavg@var{m}3_floor}, but round\n+towards +Inf. This is equivalent to the C code:\n+@smallexample\n+narrow op0, op1, op2;\n+@dots{}\n+op0 = (narrow) (((wide) op1 + (wide) op2 + 1) >> 1);\n+@end smallexample\n \n-@cindex @code{vec_cmpu@var{m}@var{n}} instruction pattern\n-@item @samp{vec_cmpu@var{m}@var{n}}\n-Similar to @code{vec_cmp@var{m}@var{n}} but perform unsigned vector comparison.\n+@mdindex @code{bswap@var{m}2}\n+@item @samp{bswap@var{m}2}\n+Reverse the order of bytes of operand 1 and store the result in operand 0.\n \n-@cindex @code{vec_cmpeq@var{m}@var{n}} instruction pattern\n-@item @samp{vec_cmpeq@var{m}@var{n}}\n-Similar to @code{vec_cmp@var{m}@var{n}} but perform equality or non-equality\n-vector comparison only. 
If @code{vec_cmp@var{m}@var{n}}\n-or @code{vec_cmpu@var{m}@var{n}} instruction pattern is supported,\n-it will be preferred over @code{vec_cmpeq@var{m}@var{n}}, so there is\n-no need to define this instruction pattern if the others are supported.\n+@mdindex @code{neg@var{m}2}\n+@mdindex @code{ssneg@var{m}2}\n+@mdindex @code{usneg@var{m}2}\n+@item @samp{neg@var{m}2}, @samp{ssneg@var{m}2}, @samp{usneg@var{m}2}\n+Negate operand 1 and store the result in operand 0.\n \n-@cindex @code{vcond_mask_@var{m}@var{n}} instruction pattern\n-@item @samp{vcond_mask_@var{m}@var{n}}\n-Output a conditional vector move. Operand 0 is the destination to\n-receive a combination of operand 1 and operand 2, depending on the\n-mask in operand 3. Operands 0, 1, and 2 have mode @var{m} while\n-operand 3 has mode @var{n}.\n+@mdindex @code{negv@var{m}3}\n+@item @samp{negv@var{m}3}\n+Like @code{neg@var{m}2} but takes a @code{code_label} as operand 2 and\n+emits code to jump to it if signed overflow occurs during the negation.\n \n-Suppose that @var{m} has @var{e} elements. There are then two\n-supported forms of @var{n}. The first form is an integer or\n-boolean vector that also has @var{e} elements. In this case, each\n-element is -1 or 0, with -1 selecting elements from operand 1 and\n-0 selecting elements from operand 2. The second supported form\n-of @var{n} is a scalar integer that has at least @var{e} bits.\n-A set bit then selects from operand 1 and a clear bit selects\n-from operand 2. Bits @var{e} and above have no effect.\n+@mdindex @code{abs@var{m}2}\n+@item @samp{abs@var{m}2}\n+Store the absolute value of operand 1 into operand 0.\n \n-Subject to those restrictions, the behavior is equivalent to:\n+@mdindex @code{sqrt@var{m}2}\n+@item @samp{sqrt@var{m}2}\n+Store the square root of operand 1 into operand 0. Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n-@smallexample\n-for (i = 0; i < @var{e}; i++)\n- op0[i] = op3[i] ? 
op1[i] : op2[i];\n-@end smallexample\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vcond_mask_len_@var{m}@var{n}} instruction pattern\n-@item @samp{vcond_mask_len_@var{m}@var{n}}\n-Set each element of operand 0 to the corresponding element of operand 2\n-or operand 3. Choose operand 2 if both the element index is less than\n-operand 4 plus operand 5 and the corresponding element of operand 1\n-is nonzero:\n+@mdindex @code{rsqrt@var{m}2}\n+@item @samp{rsqrt@var{m}2}\n+Store the reciprocal of the square root of operand 1 into operand 0.\n+Both operands have mode @var{m}, which is a scalar or vector\n+floating-point mode.\n \n-@smallexample\n-for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n- op0[i] = i < op4 + op5 && op1[i] ? op2[i] : op3[i];\n-@end smallexample\n+On most architectures this pattern is only approximate, so either\n+its C condition or the @code{TARGET_OPTAB_SUPPORTED_P} hook should\n+check for the appropriate math flags. (Using the C condition is\n+more direct, but using @code{TARGET_OPTAB_SUPPORTED_P} can be useful\n+if a target-specific built-in also uses the @samp{rsqrt@var{m}2}\n+pattern.)\n \n-Operands 0, 2 and 3 have mode @var{m}. Operand 1 has mode @var{n}.\n-Operands 4 and 5 have a target-dependent scalar integer mode.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{maskload@var{m}@var{n}} instruction pattern\n-@item @samp{maskload@var{m}@var{n}}\n-Perform a masked load of vector from memory operand 1 of mode @var{m}\n-into register operand 0. The mask is provided in register operand 2 of\n-mode @var{n}. Operand 3 (the ``else value'') is of mode @var{m} and\n-specifies which value is loaded when the mask is unset.\n-The predicate of operand 3 must only accept the else values that the target\n-actually supports. Currently three values are attempted, zero, -1, and\n-undefined. 
GCC handles an else value of zero more efficiently than -1 or\n-undefined.\n+@mdindex @code{fmod@var{m}3}\n+@item @samp{fmod@var{m}3}\n+Store the remainder of dividing operand 1 by operand 2 into\n+operand 0, rounded towards zero to an integer. All operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{maskstore@var{m}@var{n}} instruction pattern\n-@item @samp{maskstore@var{m}@var{n}}\n-Perform a masked store of vector from register operand 1 of mode @var{m}\n-into memory operand 0. Mask is provided in register operand 2 of\n-mode @var{n}.\n+@mdindex @code{remainder@var{m}3}\n+@item @samp{remainder@var{m}3}\n+Store the remainder of dividing operand 1 by operand 2 into\n+operand 0, rounded to the nearest integer. All operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{len_load_@var{m}} instruction pattern\n-@item @samp{len_load_@var{m}}\n-Load (operand 3 + operand 4) elements from memory operand 1\n-into vector register operand 0. Operands 0 and 1 have mode @var{m},\n-which must be a vector mode. Operand 3 has whichever integer mode the\n-target prefers. Operand 2 (the ``else value'') is of mode @var{m} and\n-specifies which value is loaded for the remaining elements. The predicate\n-of operand 2 must only accept the else values that the target actually\n-supports. Operand 4 conceptually has mode @code{QI}.\n-\n-Operand 3 can be a variable or a constant amount. Operand 4 specifies a\n-constant bias: it is either a constant 0 or a constant -1. The predicate on\n-operand 4 must only accept the bias values that the target actually supports.\n-GCC handles a bias of 0 more efficiently than a bias of -1.\n+@mdindex @code{scalb@var{m}3}\n+@item @samp{scalb@var{m}3}\n+Raise @code{FLT_RADIX} to the power of operand 2, multiply it by\n+operand 1, and store the result in operand 0. 
All operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n-If (operand 3 + operand 4) exceeds the number of elements in mode\n-@var{m}, the behavior is undefined.\n+This pattern is not allowed to @code{FAIL}.\n \n-If the target prefers the length to be measured in bytes rather than\n-elements, it should only implement this pattern for vectors of @code{QI}\n-elements.\n+@mdindex @code{ldexp@var{m}3}\n+@item @samp{ldexp@var{m}3}\n+Raise 2 to the power of operand 2, multiply it by operand 1, and store\n+the result in operand 0. Operands 0 and 1 have mode @var{m}, which is\n+a scalar or vector floating-point mode. Operand 2's mode has\n+the same number of elements as @var{m} and each element is wide\n+enough to store an @code{int}. The integers are signed.\n \n This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{len_store_@var{m}} instruction pattern\n-@item @samp{len_store_@var{m}}\n-Store (operand 2 + operand 3) vector elements from vector register operand 1\n-into memory operand 0, leaving the other elements of\n-operand 0 unchanged. Operands 0 and 1 have mode @var{m}, which must be\n-a vector mode. Operand 2 has whichever integer mode the target prefers.\n-Operand 3 conceptually has mode @code{QI}.\n-\n-Operand 2 can be a variable or a constant amount. Operand 3 specifies a\n-constant bias: it is either a constant 0 or a constant -1. The predicate on\n-operand 3 must only accept the bias values that the target actually supports.\n-GCC handles a bias of 0 more efficiently than a bias of -1.\n+@mdindex @code{cos@var{m}2}\n+@item @samp{cos@var{m}2}\n+Store the cosine of operand 1 into operand 0. 
Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n-If (operand 2 + operand 3) exceeds the number of elements in mode\n-@var{m}, the behavior is undefined.\n+This pattern is not allowed to @code{FAIL}.\n \n-If the target prefers the length to be measured in bytes\n-rather than elements, it should only implement this pattern for vectors\n-of @code{QI} elements.\n+@mdindex @code{sin@var{m}2}\n+@item @samp{sin@var{m}2}\n+Store the sine of operand 1 into operand 0. Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{mask_len_load@var{m}@var{n}} instruction pattern\n-@item @samp{mask_len_load@var{m}@var{n}}\n-Perform a masked load from the memory location pointed to by operand 1\n-into register operand 0. (operand 3 + operand 4) elements are loaded from\n-memory and other elements in operand 0 are set to undefined values.\n-This is a combination of len_load and maskload.\n-Operands 0 and 1 have mode @var{m}, which must be a vector mode. Operand 3\n-has whichever integer mode the target prefers. A mask is specified in\n-operand 2 which must be of type @var{n}. The mask has lower precedence than\n-the length and is itself subject to length masking,\n-i.e. only mask indices < (operand 4 + operand 5) are used.\n-Operand 3 is an else operand similar to the one in @code{maskload}.\n-Operand 4 conceptually has mode @code{QI}.\n-\n-Operand 4 can be a variable or a constant amount. Operand 5 specifies a\n-constant bias: it is either a constant 0 or a constant -1. The predicate on\n-operand 5 must only accept the bias values that the target actually supports.\n-GCC handles a bias of 0 more efficiently than a bias of -1.\n+@mdindex @code{sincos@var{m}3}\n+@item @samp{sincos@var{m}3}\n+Store the cosine of operand 2 into operand 0 and the sine of\n+operand 2 into operand 1. 
All operands have mode @var{m},\n+which is a scalar or vector floating-point mode.\n \n-If (operand 4 + operand 5) exceeds the number of elements in mode\n-@var{m}, the behavior is undefined.\n+Targets that can calculate the sine and cosine simultaneously can\n+implement this pattern as opposed to implementing individual\n+@code{sin@var{m}2} and @code{cos@var{m}2} patterns. The @code{sin}\n+and @code{cos} built-in functions will then be expanded to the\n+@code{sincos@var{m}3} pattern, with one of the output values\n+left unused.\n \n-If the target prefers the length to be measured in bytes\n-rather than elements, it should only implement this pattern for vectors\n-of @code{QI} elements.\n+@mdindex @code{tan@var{m}2}\n+@item @samp{tan@var{m}2}\n+Store the tangent of operand 1 into operand 0. Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{mask_len_store@var{m}@var{n}} instruction pattern\n-@item @samp{mask_len_store@var{m}@var{n}}\n-Perform a masked store from vector register operand 1 into memory operand 0.\n-(operand 3 + operand 4) elements are stored to memory\n-and leave the other elements of operand 0 unchanged.\n-This is a combination of len_store and maskstore.\n-Operands 0 and 1 have mode @var{m}, which must be a vector mode. Operand 3 has whichever\n-integer mode the target prefers. A mask is specified in operand 2 which must be\n-of type @var{n}. The mask has lower precedence than the length and is itself subject to\n-length masking, i.e. only mask indices < (operand 3 + operand 4) are used.\n-Operand 4 conceptually has mode @code{QI}.\n-\n-Operand 2 can be a variable or a constant amount. Operand 3 specifies a\n-constant bias: it is either a constant 0 or a constant -1. 
The predicate on\n-operand 4 must only accept the bias values that the target actually supports.\n-GCC handles a bias of 0 more efficiently than a bias of -1.\n+@mdindex @code{asin@var{m}2}\n+@item @samp{asin@var{m}2}\n+Store the arc sine of operand 1 into operand 0. Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n-If (operand 2 + operand 4) exceeds the number of elements in mode\n-@var{m}, the behavior is undefined.\n+This pattern is not allowed to @code{FAIL}.\n \n-If the target prefers the length to be measured in bytes\n-rather than elements, it should only implement this pattern for vectors\n-of @code{QI} elements.\n+@mdindex @code{acos@var{m}2}\n+@item @samp{acos@var{m}2}\n+Store the arc cosine of operand 1 into operand 0. Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vec_perm@var{m}} instruction pattern\n-@item @samp{vec_perm@var{m}}\n-Output a (variable) vector permutation. Operand 0 is the destination\n-to receive elements from operand 1 and operand 2, which are of mode\n-@var{m}. Operand 3 is the @dfn{selector}. It is an integral mode\n-vector of the same width and number of elements as mode @var{m}.\n-\n-The input elements are numbered from 0 in operand 1 through\n-@math{2*@var{N}-1} in operand 2. The elements of the selector must\n-be computed modulo @math{2*@var{N}}. Note that if\n-@code{rtx_equal_p(operand1, operand2)}, this can be implemented\n-with just operand 1 and selector elements modulo @var{N}.\n+@mdindex @code{atan@var{m}2}\n+@item @samp{atan@var{m}2}\n+Store the arc tangent of operand 1 into operand 0. 
Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n+This pattern is not allowed to @code{FAIL}.\n \n+@mdindex @code{fegetround@var{m}}\n+@item @samp{fegetround@var{m}}\n+Store the current machine floating-point rounding mode into operand 0.\n+Operand 0 has mode @var{m}, which is scalar. This pattern is used to\n+implement the @code{fegetround} function from the ISO C99 standard.\n \n+@mdindex @code{feclearexcept@var{m}}\n+@mdindex @code{feraiseexcept@var{m}}\n+@item @samp{feclearexcept@var{m}}\n+@itemx @samp{feraiseexcept@var{m}}\n+Clear or raise the supported machine floating-point exceptions\n+represented by the bits in operand 1. Error status is stored as a\n+nonzero value in operand 0. Both operands have mode @var{m}, which is\n+a scalar. These patterns are used to implement the\n+@code{feclearexcept} and @code{feraiseexcept} functions from the ISO\n+C99 standard.\n \n-@cindex @code{push@var{m}1} instruction pattern\n-@item @samp{push@var{m}1}\n-Output a push instruction. Operand 0 is value to push. Used only when\n-@code{PUSH_ROUNDING} is defined. For historical reason, this pattern may be\n-missing and in such case an @code{mov} expander is used instead, with a\n-@code{MEM} expression forming the push operation. The @code{mov} expander\n-method is deprecated.\n \n-@cindex @code{add@var{m}3} instruction pattern\n-@item @samp{add@var{m}3}\n-Add operand 2 and operand 1, storing the result in operand 0. All operands\n-must have mode @var{m}. 
This can be used even on two-address machines, by\n-means of constraints requiring operands 1 and 0 to be the same location.\n+@mdindex @code{exp@var{m}2}\n+@item @samp{exp@var{m}2}\n+Raise e (the base of natural logarithms) to the power of operand 1\n+and store the result in operand 0. Both operands have mode @var{m},\n+which is a scalar or vector floating-point mode.\n \n-@cindex @code{ssadd@var{m}3} instruction pattern\n-@cindex @code{usadd@var{m}3} instruction pattern\n-@cindex @code{sub@var{m}3} instruction pattern\n-@cindex @code{sssub@var{m}3} instruction pattern\n-@cindex @code{ussub@var{m}3} instruction pattern\n-@cindex @code{mul@var{m}3} instruction pattern\n-@cindex @code{ssmul@var{m}3} instruction pattern\n-@cindex @code{usmul@var{m}3} instruction pattern\n-@cindex @code{div@var{m}3} instruction pattern\n-@cindex @code{ssdiv@var{m}3} instruction pattern\n-@cindex @code{udiv@var{m}3} instruction pattern\n-@cindex @code{usdiv@var{m}3} instruction pattern\n-@cindex @code{mod@var{m}3} instruction pattern\n-@cindex @code{umod@var{m}3} instruction pattern\n-@cindex @code{umin@var{m}3} instruction pattern\n-@cindex @code{umax@var{m}3} instruction pattern\n-@cindex @code{and@var{m}3} instruction pattern\n-@cindex @code{ior@var{m}3} instruction pattern\n-@cindex @code{xor@var{m}3} instruction pattern\n-@item @samp{ssadd@var{m}3}, @samp{usadd@var{m}3}\n-@itemx @samp{sub@var{m}3}, @samp{sssub@var{m}3}, @samp{ussub@var{m}3}\n-@itemx @samp{mul@var{m}3}, @samp{ssmul@var{m}3}, @samp{usmul@var{m}3}\n-@itemx @samp{div@var{m}3}, @samp{ssdiv@var{m}3}\n-@itemx @samp{udiv@var{m}3}, @samp{usdiv@var{m}3}\n-@itemx @samp{mod@var{m}3}, @samp{umod@var{m}3}\n-@itemx @samp{umin@var{m}3}, @samp{umax@var{m}3}\n-@itemx @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3}\n-Similar, for other arithmetic operations.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{ustrunc@var{m}@var{n}2} instruction pattern\n-@item @samp{ustrunc@var{m}@var{n}2}\n-Truncate the 
operand 1, and storing the result in operand 0. There will\n-be saturation during the trunction. The result will be saturated to the\n-maximal value of operand 0 type if there is overflow when truncation. The\n-operand 1 must have mode @var{n}, and the operand 0 must have mode @var{m}.\n-Both scalar and vector integer modes are allowed.\n+@mdindex @code{expm1@var{m}2}\n+@item @samp{expm1@var{m}2}\n+Raise e (the base of natural logarithms) to the power of operand 1,\n+subtract 1, and store the result in operand 0. Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode.\n \n-@cindex @code{sstrunc@var{m}@var{n}2} instruction pattern\n-@item @samp{sstrunc@var{m}@var{n}2}\n-Similar but for signed.\n+For inputs close to zero, the pattern is expected to be more\n+accurate than a separate @code{exp@var{m}2} and @code{sub@var{m}3}\n+would be.\n \n-@cindex @code{andn@var{m}3} instruction pattern\n-@item @samp{andn@var{m}3}\n-Like @code{and@var{m}3}, but it uses bitwise-complement of operand 2\n-rather than operand 2 itself.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{iorn@var{m}3} instruction pattern\n-@item @samp{iorn@var{m}3}\n-Like @code{ior@var{m}3}, but it uses bitwise-complement of operand 2\n-rather than operand 2 itself.\n+@mdindex @code{exp10@var{m}2}\n+@item @samp{exp10@var{m}2}\n+Raise 10 to the power of operand 1 and store the result in operand 0.\n+Both operands have mode @var{m}, which is a scalar or vector\n+floating-point mode.\n \n-@cindex @code{addv@var{m}4} instruction pattern\n-@item @samp{addv@var{m}4}\n-Like @code{add@var{m}3} but takes a @code{code_label} as operand 3 and\n-emits code to jump to it if signed overflow occurs during the addition.\n-This pattern is used to implement the built-in functions performing\n-signed integer addition with overflow checking.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{subv@var{m}4} instruction pattern\n-@cindex @code{mulv@var{m}4} instruction 
pattern\n-@item @samp{subv@var{m}4}, @samp{mulv@var{m}4}\n-Similar, for other signed arithmetic operations.\n+@mdindex @code{exp2@var{m}2}\n+@item @samp{exp2@var{m}2}\n+Raise 2 to the power of operand 1 and store the result in operand 0.\n+Both operands have mode @var{m}, which is a scalar or vector\n+floating-point mode.\n \n-@cindex @code{uaddv@var{m}4} instruction pattern\n-@item @samp{uaddv@var{m}4}\n-Like @code{addv@var{m}4} but for unsigned addition. That is to\n-say, the operation is the same as signed addition but the jump\n-is taken only on unsigned overflow.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{usubv@var{m}4} instruction pattern\n-@cindex @code{umulv@var{m}4} instruction pattern\n-@item @samp{usubv@var{m}4}, @samp{umulv@var{m}4}\n-Similar, for other unsigned arithmetic operations.\n+@mdindex @code{log@var{m}2}\n+@item @samp{log@var{m}2}\n+Store the natural logarithm of operand 1 into operand 0. Both operands\n+have mode @var{m}, which is a scalar or vector floating-point mode.\n \n-@cindex @code{uaddc@var{m}5} instruction pattern\n-@item @samp{uaddc@var{m}5}\n-Adds unsigned operands 2, 3 and 4 (where the last operand is guaranteed to\n-have only values 0 or 1) together, sets operand 0 to the result of the\n-addition of the 3 operands and sets operand 1 to 1 iff there was\n-overflow on the unsigned additions, and to 0 otherwise. So, it is\n-an addition with carry in (operand 4) and carry out (operand 1).\n-All operands have the same mode.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{usubc@var{m}5} instruction pattern\n-@item @samp{usubc@var{m}5}\n-Similarly to @samp{uaddc@var{m}5}, except subtracts unsigned operands 3\n-and 4 from operand 2 instead of adding them. So, it is\n-a subtraction with carry/borrow in (operand 4) and carry/borrow out\n-(operand 1). 
All operands have the same mode.\n+@mdindex @code{log1p@var{m}2}\n+@item @samp{log1p@var{m}2}\n+Add 1 to operand 1, compute the natural logarithm, and store\n+the result in operand 0. Both operands have mode @var{m}, which is\n+a scalar or vector floating-point mode.\n \n-@cindex @code{addptr@var{m}3} instruction pattern\n-@item @samp{addptr@var{m}3}\n-Like @code{add@var{m}3} but is guaranteed to only be used for address\n-calculations. The expanded code is not allowed to clobber the\n-condition code. It only needs to be defined if @code{add@var{m}3}\n-sets the condition code. If adds used for address calculations and\n-normal adds are not compatible it is required to expand a distinct\n-pattern (e.g.@: using an unspec). The pattern is used by LRA to emit\n-address calculations. @code{add@var{m}3} is used if\n-@code{addptr@var{m}3} is not defined.\n+For inputs close to zero, the pattern is expected to be more\n+accurate than a separate @code{add@var{m}3} and @code{log@var{m}2}\n+would be.\n \n-@cindex @code{fma@var{m}4} instruction pattern\n-@item @samp{fma@var{m}4}\n-Multiply operand 2 and operand 1, then add operand 3, storing the\n-result in operand 0 without doing an intermediate rounding step. All\n-operands must have mode @var{m}. This pattern is used to implement\n-the @code{fma}, @code{fmaf}, and @code{fmal} builtin functions from\n-the ISO C99 standard.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{fms@var{m}4} instruction pattern\n-@item @samp{fms@var{m}4}\n-Like @code{fma@var{m}4}, except operand 3 subtracted from the\n-product instead of added to the product. This is represented\n-in the rtl as\n+@mdindex @code{log10@var{m}2}\n+@item @samp{log10@var{m}2}\n+Store the base-10 logarithm of operand 1 into operand 0. 
Both operands\n+have mode @var{m}, which is a scalar or vector floating-point mode.\n \n-@smallexample\n-(fma:@var{m} @var{op1} @var{op2} (neg:@var{m} @var{op3}))\n-@end smallexample\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{fnma@var{m}4} instruction pattern\n-@item @samp{fnma@var{m}4}\n-Like @code{fma@var{m}4} except that the intermediate product\n-is negated before being added to operand 3. This is represented\n-in the rtl as\n+@mdindex @code{log2@var{m}2}\n+@item @samp{log2@var{m}2}\n+Store the base-2 logarithm of operand 1 into operand 0. Both operands\n+have mode @var{m}, which is a scalar or vector floating-point mode.\n \n-@smallexample\n-(fma:@var{m} (neg:@var{m} @var{op1}) @var{op2} @var{op3})\n-@end smallexample\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{fnms@var{m}4} instruction pattern\n-@item @samp{fnms@var{m}4}\n-Like @code{fms@var{m}4} except that the intermediate product\n-is negated before subtracting operand 3. This is represented\n-in the rtl as\n+@mdindex @code{logb@var{m}2}\n+@item @samp{logb@var{m}2}\n+Store the base-@code{FLT_RADIX} logarithm of operand 1 into operand 0.\n+Both operands have mode @var{m}, which is a scalar or vector\n+floating-point mode.\n \n-@smallexample\n-(fma:@var{m} (neg:@var{m} @var{op1}) @var{op2} (neg:@var{m} @var{op3}))\n-@end smallexample\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{min@var{m}3} instruction pattern\n-@cindex @code{max@var{m}3} instruction pattern\n-@item @samp{smin@var{m}3}, @samp{smax@var{m}3}\n-Signed minimum and maximum operations. When used with floating point,\n-if both operands are zeros, or if either operand is @code{NaN}, then\n-it is unspecified which of the two operands is returned as the result.\n+@mdindex @code{signbit@var{m}2}\n+@item @samp{signbit@var{m}2}\n+Store the sign bit of floating-point operand 1 in operand 0.\n+@var{m} is either a scalar or vector mode. 
When it is a scalar,\n+operand 1 has mode @var{m} but operand 0 must have mode @code{SImode}.\n+When @var{m} is a vector, operand 1 has mode @var{m}, and\n+operand 0's mode should be a vector integer mode that has\n+the same number of elements and the same size as mode @var{m}.\n \n-@cindex @code{fmin@var{m}3} instruction pattern\n-@cindex @code{fmax@var{m}3} instruction pattern\n-@item @samp{fmin@var{m}3}, @samp{fmax@var{m}3}\n-IEEE-conformant minimum and maximum operations. If one operand is a quiet\n-@code{NaN}, then the other operand is returned. If both operands are quiet\n-@code{NaN}, then a quiet @code{NaN} is returned. In the case when gcc supports\n-signaling @code{NaN} (-fsignaling-nans) an invalid floating point exception is\n-raised and a quiet @code{NaN} is returned.\n+This pattern is not allowed to @code{FAIL}.\n \n-All operands have mode @var{m}, which is a scalar or vector\n-floating-point mode. These patterns are not allowed to @code{FAIL}.\n+@mdindex @code{significand@var{m}2}\n+@item @samp{significand@var{m}2}\n+Store the significand of floating-point operand 1 in operand 0.\n+Both operands have mode @var{m}, which is a scalar or vector\n+floating-point mode.\n \n-@cindex @code{reduc_smin_scal_@var{m}} instruction pattern\n-@cindex @code{reduc_smax_scal_@var{m}} instruction pattern\n-@item @samp{reduc_smin_scal_@var{m}}, @samp{reduc_smax_scal_@var{m}}\n-Find the signed minimum/maximum of the elements of a vector. The vector is\n-operand 1, and operand 0 is the scalar result, with mode equal to the mode of\n-the elements of the input vector.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{reduc_umin_scal_@var{m}} instruction pattern\n-@cindex @code{reduc_umax_scal_@var{m}} instruction pattern\n-@item @samp{reduc_umin_scal_@var{m}}, @samp{reduc_umax_scal_@var{m}}\n-Find the unsigned minimum/maximum of the elements of a vector. 
The vector is\n-operand 1, and operand 0 is the scalar result, with mode equal to the mode of\n-the elements of the input vector.\n+@mdindex @code{pow@var{m}3}\n+@item @samp{pow@var{m}3}\n+Store the value of operand 1 raised to the exponent operand 2\n+into operand 0. All operands have mode @var{m}, which is a scalar\n+or vector floating-point mode.\n \n-@cindex @code{reduc_fmin_scal_@var{m}} instruction pattern\n-@cindex @code{reduc_fmax_scal_@var{m}} instruction pattern\n-@item @samp{reduc_fmin_scal_@var{m}}, @samp{reduc_fmax_scal_@var{m}}\n-Find the floating-point minimum/maximum of the elements of a vector,\n-using the same rules as @code{fmin@var{m}3} and @code{fmax@var{m}3}.\n-Operand 1 is a vector of mode @var{m} and operand 0 is the scalar\n-result, which has mode @code{GET_MODE_INNER (@var{m})}.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{reduc_plus_scal_@var{m}} instruction pattern\n-@item @samp{reduc_plus_scal_@var{m}}\n-Compute the sum of the elements of a vector. The vector is operand 1, and\n-operand 0 is the scalar result, with mode equal to the mode of the elements of\n-the input vector.\n+@mdindex @code{atan2@var{m}3}\n+@item @samp{atan2@var{m}3}\n+Store the arc tangent (inverse tangent) of operand 1 divided by\n+operand 2 into operand 0, using the signs of both arguments to\n+determine the quadrant of the result. All operands have mode\n+@var{m}, which is a scalar or vector floating-point mode.\n \n-@cindex @code{reduc_and_scal_@var{m}} instruction pattern\n-@cindex @code{reduc_ior_scal_@var{m}} instruction pattern\n-@cindex @code{reduc_xor_scal_@var{m}} instruction pattern\n-@item @samp{reduc_and_scal_@var{m}}\n-@itemx @samp{reduc_ior_scal_@var{m}}\n-@itemx @samp{reduc_xor_scal_@var{m}}\n-Compute the bitwise @code{AND}/@code{IOR}/@code{XOR} reduction of the elements\n-of a vector of mode @var{m}. Operand 1 is the vector input and operand 0\n-is the scalar result. 
The mode of the scalar result is the same as one\n-element of @var{m}.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{reduc_sbool_and_scal_@var{m}} instruction pattern\n-@cindex @code{reduc_sbool_ior_scal_@var{m}} instruction pattern\n-@cindex @code{reduc_sbool_xor_scal_@var{m}} instruction pattern\n-@item @samp{reduc_sbool_and_scal_@var{m}}\n-@itemx @samp{reduc_sbool_ior_scal_@var{m}}\n-@itemx @samp{reduc_sbool_xor_scal_@var{m}}\n-Compute the bitwise @code{AND}/@code{IOR}/@code{XOR} reduction of the elements\n-of a vector boolean of mode @var{m}. Operand 1 is the vector input and\n-operand 0 is the scalar result. The mode of the scalar result is @var{QImode}\n-with its value either zero or one. If mode @var{m} is a scalar integer mode\n-then operand 2 is the number of elements in the input vector to provide\n-disambiguation for the case @var{m} is ambiguous.\n+@mdindex @code{floor@var{m}2}\n+@item @samp{floor@var{m}2}\n+Store the largest integral value not greater than operand 1 in operand 0.\n+Both operands have mode @var{m}, which is a scalar or vector\n+floating-point mode. If @option{-ffp-int-builtin-inexact} is in\n+effect, the ``inexact'' exception may be raised for noninteger\n+operands; otherwise, it may not.\n \n-@cindex @code{extract_last_@var{m}} instruction pattern\n-@item @code{extract_last_@var{m}}\n-Find the last set bit in mask operand 1 and extract the associated element\n-of vector operand 2. Store the result in scalar operand 0. Operand 2\n-has vector mode @var{m} while operand 0 has the mode appropriate for one\n-element of @var{m}. 
Operand 1 has the usual mask mode for vectors of mode\n-@var{m}; see @code{TARGET_VECTORIZE_GET_MASK_MODE}.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{fold_extract_last_@var{m}} instruction pattern\n-@item @code{fold_extract_last_@var{m}}\n-If any bits of mask operand 2 are set, find the last set bit, extract\n-the associated element from vector operand 3, and store the result\n-in operand 0. Store operand 1 in operand 0 otherwise. Operand 3\n-has mode @var{m} and operands 0 and 1 have the mode appropriate for\n-one element of @var{m}. Operand 2 has the usual mask mode for vectors\n-of mode @var{m}; see @code{TARGET_VECTORIZE_GET_MASK_MODE}.\n+@mdindex @code{btrunc@var{m}2}\n+@item @samp{btrunc@var{m}2}\n+Round operand 1 to an integer, towards zero, and store the result in\n+operand 0. Both operands have mode @var{m}, which is a scalar or\n+vector floating-point mode. If @option{-ffp-int-builtin-inexact} is\n+in effect, the ``inexact'' exception may be raised for noninteger\n+operands; otherwise, it may not.\n \n-@cindex @code{len_fold_extract_last_@var{m}} instruction pattern\n-@item @code{len_fold_extract_last_@var{m}}\n-Like @samp{fold_extract_last_@var{m}}, but takes an extra length operand as\n-operand 4 and an extra bias operand as operand 5. The last associated element\n-is extracted should have the index i < len (operand 4) + bias (operand 5).\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{fold_left_plus_@var{m}} instruction pattern\n-@item @code{fold_left_plus_@var{m}}\n-Take scalar operand 1 and successively add each element from vector\n-operand 2. Store the result in scalar operand 0. The vector has\n-mode @var{m} and the scalars have the mode appropriate for one\n-element of @var{m}. 
The operation is strictly in-order: there is\n-no reassociation.\n+@mdindex @code{round@var{m}2}\n+@item @samp{round@var{m}2}\n+Round operand 1 to the nearest integer, rounding away from zero in the\n+event of a tie, and store the result in operand 0. Both operands have\n+mode @var{m}, which is a scalar or vector floating-point mode. If\n+@option{-ffp-int-builtin-inexact} is in effect, the ``inexact''\n+exception may be raised for noninteger operands; otherwise, it may\n+not.\n \n-@cindex @code{mask_fold_left_plus_@var{m}} instruction pattern\n-@item @code{mask_fold_left_plus_@var{m}}\n-Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand\n-(operand 3) that specifies which elements of the source vector should be added.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{mask_len_fold_left_plus_@var{m}} instruction pattern\n-@item @code{mask_len_fold_left_plus_@var{m}}\n-Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand\n-(operand 3), len operand (operand 4) and bias operand (operand 5) that\n-performs following operations strictly in-order (no reassociation):\n+@mdindex @code{ceil@var{m}2}\n+@item @samp{ceil@var{m}2}\n+Store the smallest integral value not less than operand 1 in operand 0.\n+Both operands have mode @var{m}, which is a scalar or vector\n+floating-point mode. If @option{-ffp-int-builtin-inexact} is in\n+effect, the ``inexact'' exception may be raised for noninteger\n+operands; otherwise, it may not.\n \n-@smallexample\n-operand0 = operand1;\n-for (i = 0; i < LEN + BIAS; i++)\n- if (operand3[i])\n- operand0 += operand2[i];\n-@end smallexample\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{sdot_prod@var{m}@var{n}} instruction pattern\n-@item @samp{sdot_prod@var{m}@var{n}}\n+@mdindex @code{nearbyint@var{m}2}\n+@item @samp{nearbyint@var{m}2}\n+Round operand 1 to an integer, using the current rounding mode, and\n+store the result in operand 0. 
Do not raise an inexact condition when\n+the result is different from the argument. Both operands have mode\n+@var{m}, which is a scalar or vector floating-point mode.\n \n-Multiply operand 1 by operand 2 without loss of precision, given that\n-both operands contain signed elements. Add each product to the overlapping\n-element of operand 3 and store the result in operand 0. Operands 0 and 3\n-have mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n}\n-having narrower elements than @var{m}.\n+This pattern is not allowed to @code{FAIL}.\n \n-Semantically the expressions perform the multiplication in the following signs\n+@mdindex @code{rint@var{m}2}\n+@item @samp{rint@var{m}2}\n+Round operand 1 to an integer, using the current rounding mode, and\n+store the result in operand 0. Raise an inexact condition when\n+the result is different from the argument. Both operands have mode\n+@var{m}, which is a scalar or vector floating-point mode.\n \n-@smallexample\n-sdot<signed op0, signed op1, signed op2, signed op3> ==\n- op0 = sign-ext (op1) * sign-ext (op2) + op3\n-@dots{}\n-@end smallexample\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{udot_prod@var{m}@var{n}} instruction pattern\n-@item @samp{udot_prod@var{m}@var{n}}\n+@mdindex @code{lrint@var{m}@var{n}2}\n+@item @samp{lrint@var{m}@var{n}2}\n+Convert operand 1 (valid for floating point mode @var{m}) to fixed\n+point mode @var{n} as a signed number according to the current\n+rounding mode and store in operand 0 (which has mode @var{n}).\n \n-Multiply operand 1 by operand 2 without loss of precision, given that\n-both operands contain unsigned elements. Add each product to the overlapping\n-element of operand 3 and store the result in operand 0. 
Operands 0 and 3\n-have mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n}\n-having narrower elements than @var{m}.\n+@mdindex @code{lround@var{m}@var{n}2}\n+@item @samp{lround@var{m}@var{n}2}\n+Convert operand 1 (valid for floating point mode @var{m}) to fixed\n+point mode @var{n} as a signed number rounding to nearest and away\n+from zero and store in operand 0 (which has mode @var{n}).\n \n-Semantically the expressions perform the multiplication in the following signs\n+@mdindex @code{lfloor@var{m}@var{n}2}\n+@item @samp{lfloor@var{m}@var{n}2}\n+Convert operand 1 (valid for floating point mode @var{m}) to fixed\n+point mode @var{n} as a signed number rounding down and store in\n+operand 0 (which has mode @var{n}).\n \n-@smallexample\n-udot<unsigned op0, unsigned op1, unsigned op2, unsigned op3> ==\n- op0 = zero-ext (op1) * zero-ext (op2) + op3\n-@dots{}\n-@end smallexample\n+@mdindex @code{lceil@var{m}@var{n}2}\n+@item @samp{lceil@var{m}@var{n}2}\n+Convert operand 1 (valid for floating point mode @var{m}) to fixed\n+point mode @var{n} as a signed number rounding up and store in\n+operand 0 (which has mode @var{n}).\n \n-@cindex @code{usdot_prod@var{m}@var{n}} instruction pattern\n-@item @samp{usdot_prod@var{m}@var{n}}\n-Compute the sum of the products of elements of different signs.\n-Multiply operand 1 by operand 2 without loss of precision, given that operand 1\n-is unsigned and operand 2 is signed. Add each product to the overlapping\n-element of operand 3 and store the result in operand 0. Operands 0 and 3 have\n-mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n} having\n-narrower elements than @var{m}.\n+@mdindex @code{copysign@var{m}3}\n+@item @samp{copysign@var{m}3}\n+Store a value with the magnitude of operand 1 and the sign of operand\n+2 into operand 0. 
All operands have mode @var{m}, which is a scalar or\n+vector floating-point mode.\n \n-Semantically the expressions perform the multiplication in the following signs\n+This pattern is not allowed to @code{FAIL}.\n \n-@smallexample\n-usdot<signed op0, unsigned op1, signed op2, signed op3> ==\n- op0 = ((signed-conv) zero-ext (op1)) * sign-ext (op2) + op3\n-@dots{}\n-@end smallexample\n+@mdindex @code{xorsign@var{m}3}\n+@item @samp{xorsign@var{m}3}\n+Equivalent to @samp{op0 = op1 * copysign (1.0, op2)}: store a value with\n+the magnitude of operand 1 and the sign of operand 2 into operand 0.\n+All operands have mode @var{m}, which is a scalar or vector\n+floating-point mode.\n \n-@cindex @code{ssad@var{m}} instruction pattern\n-@cindex @code{usad@var{m}} instruction pattern\n-@item @samp{ssad@var{m}}\n-@item @samp{usad@var{m}}\n-Compute the sum of absolute differences of two signed/unsigned elements.\n-Operand 1 and operand 2 are of the same mode. Their absolute difference, which\n-is of a wider mode, is computed and added to operand 3. Operand 3 is of a mode\n-equal or wider than the mode of the absolute difference. The result is placed\n-in operand 0, which is of the same mode as operand 3.\n-@var{m} is the mode of operand 1 and operand 2.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{widen_ssum@var{n}@var{m}3} instruction pattern\n-@cindex @code{widen_usum@var{n}@var{m}3} instruction pattern\n-@item @samp{widen_ssum@var{n}@var{m}3}\n-@itemx @samp{widen_usum@var{n}@var{m}3}\n-Operands 0 and 2 are of the same mode, which is wider than the mode of\n-operand 1. Add operand 1 to operand 2 and place the widened result in\n-operand 0. 
(This is used express accumulation of elements into an accumulator\n-of a wider mode.)\n-@var{m} is the mode of operand 1 and @var{n} is the mode of operand 0.\n+@mdindex @code{issignaling@var{m}2}\n+@item @samp{issignaling@var{m}2}\n+Set operand 0 to 1 if operand 1 is a signaling NaN and to 0 otherwise.\n \n-@cindex @code{smulhs@var{m}3} instruction pattern\n-@cindex @code{umulhs@var{m}3} instruction pattern\n-@item @samp{smulhs@var{m}3}\n-@itemx @samp{umulhs@var{m}3}\n-Signed/unsigned multiply high with scale. This is equivalent to the C code:\n-@smallexample\n-narrow op0, op1, op2;\n-@dots{}\n-op0 = (narrow) (((wide) op1 * (wide) op2) >> (N / 2 - 1));\n-@end smallexample\n-where the sign of @samp{narrow} determines whether this is a signed\n-or unsigned operation, and @var{N} is the size of @samp{wide} in bits.\n-@var{m} is the mode for all 3 operands (narrow). The wide mode is not specified\n-and is defined to fit the whole multiply.\n+@mdindex @code{ffs@var{m}2}\n+@item @samp{ffs@var{m}2}\n+Store into operand 0 one plus the index of the least significant 1-bit\n+of operand 1. If operand 1 is zero, store zero.\n \n-@cindex @code{smulhrs@var{m}3} instruction pattern\n-@cindex @code{umulhrs@var{m}3} instruction pattern\n-@item @samp{smulhrs@var{m}3}\n-@itemx @samp{umulhrs@var{m}3}\n-Signed/unsigned multiply high with round and scale. This is\n-equivalent to the C code:\n-@smallexample\n-narrow op0, op1, op2;\n-@dots{}\n-op0 = (narrow) (((((wide) op1 * (wide) op2) >> (N / 2 - 2)) + 1) >> 1);\n-@end smallexample\n-where the sign of @samp{narrow} determines whether this is a signed\n-or unsigned operation, and @var{N} is the size of @samp{wide} in bits.\n-@var{m} is the mode for all 3 operands (narrow). The wide mode is not specified\n-and is defined to fit the whole multiply.\n+@var{m} is either a scalar or vector integer mode. When it is a scalar,\n+operand 1 has mode @var{m} but operand 0 can have whatever scalar\n+integer mode is suitable for the target. 
The compiler will insert\n+conversion instructions as necessary (typically to convert the result\n+to the same width as @code{int}). When @var{m} is a vector, both\n+operands must have mode @var{m}.\n \n-@cindex @code{sdiv_pow2@var{m}3} instruction pattern\n-@cindex @code{sdiv_pow2@var{m}3} instruction pattern\n-@item @samp{sdiv_pow2@var{m}3}\n-@itemx @samp{sdiv_pow2@var{m}3}\n-Signed division by power-of-2 immediate. Equivalent to:\n-@smallexample\n-signed op0, op1;\n-@dots{}\n-op0 = op1 / (1 << imm);\n-@end smallexample\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vec_shl_insert_@var{m}} instruction pattern\n-@item @samp{vec_shl_insert_@var{m}}\n-Shift the elements in vector input operand 1 left one element (i.e.@:\n-away from element 0) and fill the vacated element 0 with the scalar\n-in operand 2. Store the result in vector output operand 0. Operands\n-0 and 1 have mode @var{m} and operand 2 has the mode appropriate for\n-one element of @var{m}.\n+@mdindex @code{clrsb@var{m}2}\n+@item @samp{clrsb@var{m}2}\n+Count leading redundant sign bits.\n+Store into operand 0 the number of redundant sign bits in operand 1, starting\n+at the most significant bit position.\n+A redundant sign bit is defined as any sign bit after the first. As such,\n+this count will be one less than the count of leading sign bits.\n \n-@cindex @code{vec_shl_@var{m}} instruction pattern\n-@item @samp{vec_shl_@var{m}}\n-Whole vector left shift in bits, i.e.@: away from element 0.\n-Operand 1 is a vector to be shifted.\n-Operand 2 is an integer shift amount in bits.\n-Operand 0 is where the resulting shifted vector is stored.\n-The output and input vectors should have the same modes.\n+@var{m} is either a scalar or vector integer mode. When it is a scalar,\n+operand 1 has mode @var{m} but operand 0 can have whatever scalar\n+integer mode is suitable for the target. 
The compiler will insert\n+conversion instructions as necessary (typically to convert the result\n+to the same width as @code{int}). When @var{m} is a vector, both\n+operands must have mode @var{m}.\n \n-@cindex @code{vec_shr_@var{m}} instruction pattern\n-@item @samp{vec_shr_@var{m}}\n-Whole vector right shift in bits, i.e.@: towards element 0.\n-Operand 1 is a vector to be shifted.\n-Operand 2 is an integer shift amount in bits.\n-Operand 0 is where the resulting shifted vector is stored.\n-The output and input vectors should have the same modes.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vec_pack_trunc_@var{m}} instruction pattern\n-@item @samp{vec_pack_trunc_@var{m}}\n-Narrow (demote) and merge the elements of two vectors. Operands 1 and 2\n-are vectors of the same mode having N integral or floating point elements\n-of size S@. Operand 0 is the resulting vector in which 2*N elements of\n-size S/2 are concatenated after narrowing them down using truncation.\n+@mdindex @code{clz@var{m}2}\n+@item @samp{clz@var{m}2}\n+Store into operand 0 the number of leading 0-bits in operand 1, starting\n+at the most significant bit position. If operand 1 is 0, the\n+@code{CLZ_DEFINED_VALUE_AT_ZERO} (@pxref{Misc}) macro defines if\n+the result is undefined or has a useful value.\n \n-@cindex @code{vec_pack_sbool_trunc_@var{m}} instruction pattern\n-@item @samp{vec_pack_sbool_trunc_@var{m}}\n-Narrow and merge the elements of two vectors. Operands 1 and 2 are vectors\n-of the same type having N boolean elements. Operand 0 is the resulting\n-vector in which 2*N elements are concatenated. The last operand (operand 3)\n-is the number of elements in the output vector 2*N as a @code{CONST_INT}.\n-This instruction pattern is used when all the vector input and output\n-operands have the same scalar mode @var{m} and thus using\n-@code{vec_pack_trunc_@var{m}} would be ambiguous.\n+@var{m} is either a scalar or vector integer mode. 
When it is a scalar,\n+operand 1 has mode @var{m} but operand 0 can have whatever scalar\n+integer mode is suitable for the target. The compiler will insert\n+conversion instructions as necessary (typically to convert the result\n+to the same width as @code{int}). When @var{m} is a vector, both\n+operands must have mode @var{m}.\n \n-@cindex @code{vec_pack_ssat_@var{m}} instruction pattern\n-@cindex @code{vec_pack_usat_@var{m}} instruction pattern\n-@item @samp{vec_pack_ssat_@var{m}}, @samp{vec_pack_usat_@var{m}}\n-Narrow (demote) and merge the elements of two vectors. Operands 1 and 2\n-are vectors of the same mode having N integral elements of size S.\n-Operand 0 is the resulting vector in which the elements of the two input\n-vectors are concatenated after narrowing them down using signed/unsigned\n-saturating arithmetic.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vec_pack_sfix_trunc_@var{m}} instruction pattern\n-@cindex @code{vec_pack_ufix_trunc_@var{m}} instruction pattern\n-@item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}}\n-Narrow, convert to signed/unsigned integral type and merge the elements\n-of two vectors. Operands 1 and 2 are vectors of the same mode having N\n-floating point elements of size S@. Operand 0 is the resulting vector\n-in which 2*N elements of size S/2 are concatenated.\n+@mdindex @code{ctz@var{m}2}\n+@item @samp{ctz@var{m}2}\n+Store into operand 0 the number of trailing 0-bits in operand 1, starting\n+at the least significant bit position. If operand 1 is 0, the\n+@code{CTZ_DEFINED_VALUE_AT_ZERO} (@pxref{Misc}) macro defines if\n+the result is undefined or has a useful value.\n \n-@cindex @code{vec_packs_float_@var{m}} instruction pattern\n-@cindex @code{vec_packu_float_@var{m}} instruction pattern\n-@item @samp{vec_packs_float_@var{m}}, @samp{vec_packu_float_@var{m}}\n-Narrow, convert to floating point type and merge the elements\n-of two vectors. 
Operands 1 and 2 are vectors of the same mode having N\n-signed/unsigned integral elements of size S@. Operand 0 is the resulting vector\n-in which 2*N elements of size S/2 are concatenated.\n+@var{m} is either a scalar or vector integer mode. When it is a scalar,\n+operand 1 has mode @var{m} but operand 0 can have whatever scalar\n+integer mode is suitable for the target. The compiler will insert\n+conversion instructions as necessary (typically to convert the result\n+to the same width as @code{int}). When @var{m} is a vector, both\n+operands must have mode @var{m}.\n \n-@cindex @code{vec_unpacks_hi_@var{m}} instruction pattern\n-@cindex @code{vec_unpacks_lo_@var{m}} instruction pattern\n-@item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}\n-Extract and widen (promote) the high/low part of a vector of signed\n-integral or floating point elements. The input vector (operand 1) has N\n-elements of size S@. Widen (promote) the high/low elements of the vector\n-using signed or floating point extension and place the resulting N/2\n-values of size 2*S in the output vector (operand 0).\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vec_unpacku_hi_@var{m}} instruction pattern\n-@cindex @code{vec_unpacku_lo_@var{m}} instruction pattern\n-@item @samp{vec_unpacku_hi_@var{m}}, @samp{vec_unpacku_lo_@var{m}}\n-Extract and widen (promote) the high/low part of a vector of unsigned\n-integral elements. 
The input vector (operand 1) has N elements of size S.\n-Widen (promote) the high/low elements of the vector using zero extension and\n-place the resulting N/2 values of size 2*S in the output vector (operand 0).\n+@mdindex @code{popcount@var{m}2}\n+@item @samp{popcount@var{m}2}\n+Store into operand 0 the number of 1-bits in operand 1.\n \n-@cindex @code{vec_unpacks_sbool_hi_@var{m}} instruction pattern\n-@cindex @code{vec_unpacks_sbool_lo_@var{m}} instruction pattern\n-@item @samp{vec_unpacks_sbool_hi_@var{m}}, @samp{vec_unpacks_sbool_lo_@var{m}}\n-Extract the high/low part of a vector of boolean elements that have scalar\n-mode @var{m}. The input vector (operand 1) has N elements, the output\n-vector (operand 0) has N/2 elements. The last operand (operand 2) is the\n-number of elements of the input vector N as a @code{CONST_INT}. These\n-patterns are used if both the input and output vectors have the same scalar\n-mode @var{m} and thus using @code{vec_unpacks_hi_@var{m}} or\n-@code{vec_unpacks_lo_@var{m}} would be ambiguous.\n+@var{m} is either a scalar or vector integer mode. When it is a scalar,\n+operand 1 has mode @var{m} but operand 0 can have whatever scalar\n+integer mode is suitable for the target. The compiler will insert\n+conversion instructions as necessary (typically to convert the result\n+to the same width as @code{int}). When @var{m} is a vector, both\n+operands must have mode @var{m}.\n \n-@cindex @code{vec_unpacks_float_hi_@var{m}} instruction pattern\n-@cindex @code{vec_unpacks_float_lo_@var{m}} instruction pattern\n-@cindex @code{vec_unpacku_float_hi_@var{m}} instruction pattern\n-@cindex @code{vec_unpacku_float_lo_@var{m}} instruction pattern\n-@item @samp{vec_unpacks_float_hi_@var{m}}, @samp{vec_unpacks_float_lo_@var{m}}\n-@itemx @samp{vec_unpacku_float_hi_@var{m}}, @samp{vec_unpacku_float_lo_@var{m}}\n-Extract, convert to floating point type and widen the high/low part of a\n-vector of signed/unsigned integral elements. 
The input vector (operand 1)\n-has N elements of size S@. Convert the high/low elements of the vector using\n-floating point conversion and place the resulting N/2 values of size 2*S in\n-the output vector (operand 0).\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vec_unpack_sfix_trunc_hi_@var{m}} instruction pattern\n-@cindex @code{vec_unpack_sfix_trunc_lo_@var{m}} instruction pattern\n-@cindex @code{vec_unpack_ufix_trunc_hi_@var{m}} instruction pattern\n-@cindex @code{vec_unpack_ufix_trunc_lo_@var{m}} instruction pattern\n-@item @samp{vec_unpack_sfix_trunc_hi_@var{m}},\n-@itemx @samp{vec_unpack_sfix_trunc_lo_@var{m}}\n-@itemx @samp{vec_unpack_ufix_trunc_hi_@var{m}}\n-@itemx @samp{vec_unpack_ufix_trunc_lo_@var{m}}\n-Extract, convert to signed/unsigned integer type and widen the high/low part of a\n-vector of floating point elements. The input vector (operand 1)\n-has N elements of size S@. Convert the high/low elements of the vector\n-to integers and place the resulting N/2 values of size 2*S in\n-the output vector (operand 0).\n+@mdindex @code{parity@var{m}2}\n+@item @samp{parity@var{m}2}\n+Store into operand 0 the parity of operand 1, i.e.@: the number of 1-bits\n+in operand 1 modulo 2.\n \n-@cindex @code{vec_widen_umult_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_umult_lo_@var{m}} instruction pattern\n-@cindex @code{vec_widen_smult_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_smult_lo_@var{m}} instruction pattern\n-@cindex @code{vec_widen_umult_even_@var{m}} instruction pattern\n-@cindex @code{vec_widen_umult_odd_@var{m}} instruction pattern\n-@cindex @code{vec_widen_smult_even_@var{m}} instruction pattern\n-@cindex @code{vec_widen_smult_odd_@var{m}} instruction pattern\n-@item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}}\n-@itemx @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}}\n-@itemx @samp{vec_widen_umult_even_@var{m}}, @samp{vec_widen_umult_odd_@var{m}}\n-@itemx 
@samp{vec_widen_smult_even_@var{m}}, @samp{vec_widen_smult_odd_@var{m}}\n-Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)\n-are vectors with N signed/unsigned elements of size S@. Multiply the high/low\n-or even/odd elements of the two vectors, and put the N/2 products of size 2*S\n-in the output vector (operand 0). A target shouldn't implement even/odd pattern\n-pair if it is less efficient than lo/hi one.\n+@var{m} is either a scalar or vector integer mode. When it is a scalar,\n+operand 1 has mode @var{m} but operand 0 can have whatever scalar\n+integer mode is suitable for the target. The compiler will insert\n+conversion instructions as necessary (typically to convert the result\n+to the same width as @code{int}). When @var{m} is a vector, both\n+operands must have mode @var{m}.\n \n-@cindex @code{vec_widen_ushiftl_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_ushiftl_lo_@var{m}} instruction pattern\n-@cindex @code{vec_widen_sshiftl_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_sshiftl_lo_@var{m}} instruction pattern\n-@item @samp{vec_widen_ushiftl_hi_@var{m}}, @samp{vec_widen_ushiftl_lo_@var{m}}\n-@itemx @samp{vec_widen_sshiftl_hi_@var{m}}, @samp{vec_widen_sshiftl_lo_@var{m}}\n-Signed/Unsigned widening shift left. The first input (operand 1) is a vector\n-with N signed/unsigned elements of size S@. Operand 2 is a constant. 
Shift\n-the high/low elements of operand 1, and put the N/2 results of size 2*S in the\n-output vector (operand 0).\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vec_widen_saddl_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_saddl_lo_@var{m}} instruction pattern\n-@cindex @code{vec_widen_uaddl_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_uaddl_lo_@var{m}} instruction pattern\n-@item @samp{vec_widen_uaddl_hi_@var{m}}, @samp{vec_widen_uaddl_lo_@var{m}}\n-@itemx @samp{vec_widen_saddl_hi_@var{m}}, @samp{vec_widen_saddl_lo_@var{m}}\n-Signed/Unsigned widening add long. Operands 1 and 2 are vectors with N\n-signed/unsigned elements of size S@. Add the high/low elements of 1 and 2\n-together, widen the resulting elements and put the N/2 results of size 2*S in\n-the output vector (operand 0).\n+@mdindex @code{one_cmpl@var{m}2}\n+@item @samp{one_cmpl@var{m}2}\n+Store the bitwise-complement of operand 1 into operand 0.\n \n-@cindex @code{vec_widen_ssubl_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_ssubl_lo_@var{m}} instruction pattern\n-@cindex @code{vec_widen_usubl_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_usubl_lo_@var{m}} instruction pattern\n-@item @samp{vec_widen_usubl_hi_@var{m}}, @samp{vec_widen_usubl_lo_@var{m}}\n-@itemx @samp{vec_widen_ssubl_hi_@var{m}}, @samp{vec_widen_ssubl_lo_@var{m}}\n-Signed/Unsigned widening subtract long. Operands 1 and 2 are vectors with N\n-signed/unsigned elements of size S@. Subtract the high/low elements of 2 from\n-1 and widen the resulting elements. Put the N/2 results of size 2*S in the\n-output vector (operand 0).\n+@mdindex @code{cpymem@var{m}}\n+@item @samp{cpymem@var{m}}\n+Block copy instruction. 
The destination and source blocks of memory\n+are the first two operands, and both are @code{mem:BLK}s with an\n+address in mode @code{Pmode}.\n \n-@cindex @code{vec_widen_sabd_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_sabd_lo_@var{m}} instruction pattern\n-@cindex @code{vec_widen_sabd_odd_@var{m}} instruction pattern\n-@cindex @code{vec_widen_sabd_even_@var{m}} instruction pattern\n-@cindex @code{vec_widen_uabd_hi_@var{m}} instruction pattern\n-@cindex @code{vec_widen_uabd_lo_@var{m}} instruction pattern\n-@cindex @code{vec_widen_uabd_odd_@var{m}} instruction pattern\n-@cindex @code{vec_widen_uabd_even_@var{m}} instruction pattern\n-@item @samp{vec_widen_uabd_hi_@var{m}}, @samp{vec_widen_uabd_lo_@var{m}}\n-@itemx @samp{vec_widen_uabd_odd_@var{m}}, @samp{vec_widen_uabd_even_@var{m}}\n-@itemx @samp{vec_widen_sabd_hi_@var{m}}, @samp{vec_widen_sabd_lo_@var{m}}\n-@itemx @samp{vec_widen_sabd_odd_@var{m}}, @samp{vec_widen_sabd_even_@var{m}}\n-Signed/Unsigned widening absolute difference. Operands 1 and 2 are\n-vectors with N signed/unsigned elements of size S@. Find the absolute\n-difference between operands 1 and 2 and widen the resulting elements.\n-Put the N/2 results of size 2*S in the output vector (operand 0).\n+The number of bytes to copy is the third operand, in mode @var{m}.\n+Usually, you specify @code{Pmode} for @var{m}. 
However, if you can\n+generate better code knowing the range of valid lengths is smaller than\n+those representable in a full @code{Pmode} pointer, you should provide\n+a pattern with a\n+mode corresponding to the range of values you can handle efficiently\n+(e.g., @code{QImode} for values in the range 0--127; note we avoid numbers\n+that appear negative) and also a pattern with @code{Pmode}.\n \n-@cindex @code{vec_trunc_add_high@var{m}} instruction pattern\n-@item @samp{vec_trunc_add_high@var{m}}\n-Signed or unsigned addition of two input integer vectors of mode @var{m}, then\n-extracts the most significant half of each result element and narrows it to\n-elements of half the original width.\n+The fourth operand is the known shared alignment of the source and\n+destination, in the form of a @code{const_int} rtx. Thus, if the\n+compiler knows that both source and destination are word-aligned,\n+it may provide the value 4 for this operand.\n \n-Concretely, it computes:\n-@code{(bits(a)/2)((a + b) >> bits(a)/2)}\n+Optional operands 5 and 6 specify expected alignment and size of block\n+respectively. The expected alignment differs from the alignment in operand 4\n+in that the blocks are not required to actually be aligned to it in\n+all cases. This expected alignment is also in bytes, just like operand 4.\n+Expected size, when unknown, is set to @code{(const_int -1)}.\n \n-where @code{bits(a)} is the width in bits of each input element.\n+Descriptions of multiple @code{cpymem@var{m}} patterns can only be\n+beneficial if the patterns for smaller modes have fewer restrictions\n+on their first, second and fourth operands. Note that the mode @var{m}\n+in @code{cpymem@var{m}} does not impose any restriction on the mode of\n+individually copied data units in the block.\n \n-Operand 1 and 2 are of integer vector mode @var{m} containing the same number\n-of signed or unsigned integral elements. 
The result (operand @code{0}) is of an\n-integer vector mode with the same number of elements but elements of half of the\n-width of those of mode @var{m}.\n+The @code{cpymem@var{m}} patterns need not give special consideration\n+to the possibility that the source and destination strings might\n+overlap. An exception is the case where source and destination are\n+equal; this case needs to be handled correctly.\n+These patterns are used to do inline expansion of @code{__builtin_memcpy}.\n \n-This operation currently only used for early break result compression when the\n-result of a vector boolean can be represented as 0 or -1.\n+@mdindex @code{movmem@var{m}}\n+@item @samp{movmem@var{m}}\n+Block move instruction. The destination and source blocks of memory\n+are the first two operands, and both are @code{mem:BLK}s with an\n+address in mode @code{Pmode}.\n \n-@cindex @code{vec_addsub@var{m}3} instruction pattern\n-@item @samp{vec_addsub@var{m}3}\n-Alternating subtract, add with even lanes doing subtract and odd\n-lanes doing addition. Operands 1 and 2 and the outout operand are vectors\n-with mode @var{m}.\n+The number of bytes to copy is the third operand, in mode @var{m}.\n+Usually, you specify @code{Pmode} for @var{m}. However, if you can\n+generate better code knowing the range of valid lengths is smaller than\n+those representable in a full @code{Pmode} pointer, you should provide\n+a pattern with a\n+mode corresponding to the range of values you can handle efficiently\n+(e.g., @code{QImode} for values in the range 0--127; note we avoid numbers\n+that appear negative) and also a pattern with @code{Pmode}.\n \n-@cindex @code{vec_fmaddsub@var{m}4} instruction pattern\n-@item @samp{vec_fmaddsub@var{m}4}\n-Alternating multiply subtract, add with even lanes doing subtract and odd\n-lanes doing addition of the third operand to the multiplication result\n-of the first two operands. 
Operands 1, 2 and 3 and the outout operand are vectors\n-with mode @var{m}.\n+The fourth operand is the known shared alignment of the source and\n+destination, in the form of a @code{const_int} rtx. Thus, if the\n+compiler knows that both source and destination are word-aligned,\n+it may provide the value 4 for this operand.\n \n-@cindex @code{vec_fmsubadd@var{m}4} instruction pattern\n-@item @samp{vec_fmsubadd@var{m}4}\n-Alternating multiply add, subtract with even lanes doing addition and odd\n-lanes doing subtraction of the third operand to the multiplication result\n-of the first two operands. Operands 1, 2 and 3 and the outout operand are vectors\n-with mode @var{m}.\n+Optional operands 5 and 6 specify expected alignment and size of block\n+respectively. The expected alignment differs from the alignment in operand 4\n+in that the blocks are not required to actually be aligned to it in\n+all cases. This expected alignment is also in bytes, just like operand 4.\n+Expected size, when unknown, is set to @code{(const_int -1)}.\n \n-These instructions are not allowed to @code{FAIL}.\n+Descriptions of multiple @code{movmem@var{m}} patterns can only be\n+beneficial if the patterns for smaller modes have fewer restrictions\n+on their first, second and fourth operands. Note that the mode @var{m}\n+in @code{movmem@var{m}} does not impose any restriction on the mode of\n+individually copied data units in the block.\n \n-@cindex @code{mulhisi3} instruction pattern\n-@item @samp{mulhisi3}\n-Multiply operands 1 and 2, which have mode @code{HImode}, and store\n-a @code{SImode} product in operand 0.\n+The @code{movmem@var{m}} patterns must correctly handle the case where\n+the source and destination strings overlap. 
These patterns are used to\n+do inline expansion of @code{__builtin_memmove}.\n \n-@cindex @code{mulqihi3} instruction pattern\n-@cindex @code{mulsidi3} instruction pattern\n-@item @samp{mulqihi3}, @samp{mulsidi3}\n-Similar widening-multiplication instructions of other widths.\n+@mdindex @code{movstr}\n+@item @samp{movstr}\n+String copy instruction, with @code{stpcpy} semantics. Operand 0 is\n+an output operand in mode @code{Pmode}. The addresses of the\n+destination and source strings are operands 1 and 2, and both are\n+@code{mem:BLK}s with addresses in mode @code{Pmode}. The execution of\n+the expansion of this pattern should store in operand 0 the address in\n+which the @code{NUL} terminator was stored in the destination string.\n \n-@cindex @code{umulqihi3} instruction pattern\n-@cindex @code{umulhisi3} instruction pattern\n-@cindex @code{umulsidi3} instruction pattern\n-@item @samp{umulqihi3}, @samp{umulhisi3}, @samp{umulsidi3}\n-Similar widening-multiplication instructions that do unsigned\n-multiplication.\n+This pattern also has several optional operands that are the same as in\n+@code{setmem}.\n \n-@cindex @code{usmulqihi3} instruction pattern\n-@cindex @code{usmulhisi3} instruction pattern\n-@cindex @code{usmulsidi3} instruction pattern\n-@item @samp{usmulqihi3}, @samp{usmulhisi3}, @samp{usmulsidi3}\n-Similar widening-multiplication instructions that interpret the first\n-operand as unsigned and the second operand as signed, then do a signed\n-multiplication.\n+@mdindex @code{setmem@var{m}}\n+@item @samp{setmem@var{m}}\n+Block set instruction. The destination string is the first operand,\n+given as a @code{mem:BLK} whose address is in mode @code{Pmode}. The\n+number of bytes to set is the second operand, in mode @var{m}. The value to\n+initialize the memory with is the third operand. Targets that only support the\n+clearing of memory should reject any value that is not the constant 0. 
See\n+@samp{cpymem@var{m}} for a discussion of the choice of mode.\n \n-@cindex @code{smul@var{m}3_highpart} instruction pattern\n-@item @samp{smul@var{m}3_highpart}\n-Perform a signed multiplication of operands 1 and 2, which have mode\n-@var{m}, and store the most significant half of the product in operand 0.\n-The least significant half of the product is discarded. This may be\n-represented in RTL using a @code{smul_highpart} RTX expression.\n+The fourth operand is the known alignment of the destination, in the form\n+of a @code{const_int} rtx. Thus, if the compiler knows that the\n+destination is word-aligned, it may provide the value 4 for this\n+operand.\n \n-@cindex @code{umul@var{m}3_highpart} instruction pattern\n-@item @samp{umul@var{m}3_highpart}\n-Similar, but the multiplication is unsigned. This may be represented\n-in RTL using an @code{umul_highpart} RTX expression.\n+Optional operands 5 and 6 specify expected alignment and size of block\n+respectively. The expected alignment differs from the alignment in operand 4\n+in that the blocks are not required to actually be aligned to it in\n+all cases. This expected alignment is also in bytes, just like operand 4.\n+Expected size, when unknown, is set to @code{(const_int -1)}.\n+Operand 7 is the minimal size of the block and operand 8 is the\n+maximal size of the block (NULL if it cannot be represented as CONST_INT).\n+Operand 9 is the probable maximal size (i.e.@: we cannot rely on it for\n+correctness, but it can be used for choosing a proper code sequence for a\n+given size).\n \n-@cindex @code{madd@var{m}@var{n}4} instruction pattern\n-@item @samp{madd@var{m}@var{n}4}\n-Multiply operands 1 and 2, sign-extend them to mode @var{n}, add\n-operand 3, and store the result in operand 0. 
Operands 1 and 2\n-have mode @var{m} and operands 0 and 3 have mode @var{n}.\n-Both modes must be integer or fixed-point modes and @var{n} must be twice\n-the size of @var{m}.\n+The use for multiple @code{setmem@var{m}} is as for @code{cpymem@var{m}}.\n \n-In other words, @code{madd@var{m}@var{n}4} is like\n-@code{mul@var{m}@var{n}3} except that it also adds operand 3.\n+@mdindex @code{cmpstrn@var{m}}\n+@item @samp{cmpstrn@var{m}}\n+String compare instruction, with five operands. Operand 0 is the output;\n+it has mode @var{m}. The remaining four operands are like the operands\n+of @samp{cpymem@var{m}}. The two memory blocks specified are compared\n+byte by byte in lexicographic order starting at the beginning of each\n+string. The instruction is not allowed to prefetch more than one byte\n+at a time since either string may end in the first byte and reading past\n+that may access an invalid page or segment and cause a fault. The\n+comparison terminates early if the fetched bytes are different or if\n+they are equal to zero. The effect of the instruction is to store a\n+value in operand 0 whose sign indicates the result of the comparison.\n \n-These instructions are not allowed to @code{FAIL}.\n+@mdindex @code{cmpstr@var{m}}\n+@item @samp{cmpstr@var{m}}\n+String compare instruction, without known maximum length. Operand 0 is the\n+output; it has mode @var{m}. The second and third operand are the blocks of\n+memory to be compared; both are @code{mem:BLK} with an address in mode\n+@code{Pmode}.\n \n-@cindex @code{umadd@var{m}@var{n}4} instruction pattern\n-@item @samp{umadd@var{m}@var{n}4}\n-Like @code{madd@var{m}@var{n}4}, but zero-extend the multiplication\n-operands instead of sign-extending them.\n+The fourth operand is the known shared alignment of the source and\n+destination, in the form of a @code{const_int} rtx. 
Thus, if the\n+compiler knows that both source and destination are word-aligned,\n+it may provide the value 4 for this operand.\n \n-@cindex @code{ssmadd@var{m}@var{n}4} instruction pattern\n-@item @samp{ssmadd@var{m}@var{n}4}\n-Like @code{madd@var{m}@var{n}4}, but all involved operations must be\n-signed-saturating.\n+The two memory blocks specified are compared byte by byte in lexicographic\n+order starting at the beginning of each string. The instruction is not allowed\n+to prefetch more than one byte at a time since either string may end in the\n+first byte and reading past that may access an invalid page or segment and\n+cause a fault. The comparison will terminate when the fetched bytes\n+are different or when they are equal to zero. The effect of the\n+instruction is to store a value in operand 0 whose sign indicates the\n+result of the comparison.\n \n-@cindex @code{usmadd@var{m}@var{n}4} instruction pattern\n-@item @samp{usmadd@var{m}@var{n}4}\n-Like @code{umadd@var{m}@var{n}4}, but all involved operations must be\n-unsigned-saturating.\n+@mdindex @code{cmpmem@var{m}}\n+@item @samp{cmpmem@var{m}}\n+Block compare instruction, with five operands like the operands\n+of @samp{cmpstr@var{m}}. The two memory blocks specified are compared\n+byte by byte in lexicographic order starting at the beginning of each\n+block. Unlike @samp{cmpstr@var{m}} the instruction can prefetch\n+any bytes in the two memory blocks. Also unlike @samp{cmpstr@var{m}}\n+the comparison will not stop if both bytes are zero. The effect of\n+the instruction is to store a value in operand 0 whose sign indicates\n+the result of the comparison.\n \n-@cindex @code{msub@var{m}@var{n}4} instruction pattern\n-@item @samp{msub@var{m}@var{n}4}\n-Multiply operands 1 and 2, sign-extend them to mode @var{n}, subtract the\n-result from operand 3, and store the result in operand 0. 
Operands 1 and 2\n-have mode @var{m} and operands 0 and 3 have mode @var{n}.\n-Both modes must be integer or fixed-point modes and @var{n} must be twice\n-the size of @var{m}.\n+@mdindex @code{strlen@var{m}}\n+@item @samp{strlen@var{m}}\n+Compute the length of a string, with three operands.\n+Operand 0 is the result (of mode @var{m}), operand 1 is\n+a @code{mem} referring to the first character of the string,\n+operand 2 is the character to search for (normally zero),\n+and operand 3 is a constant describing the known alignment\n+of the beginning of the string.\n \n-In other words, @code{msub@var{m}@var{n}4} is like\n-@code{mul@var{m}@var{n}3} except that it also subtracts the result\n-from operand 3.\n+@mdindex @code{rawmemchr@var{m}}\n+@item @samp{rawmemchr@var{m}}\n+Scan memory referred to by operand 1 for the first occurrence of operand 2.\n+Operand 1 is a @code{mem} and operand 2 a @code{const_int} of mode @var{m}.\n+Operand 0 is the result, i.e., a pointer to the first occurrence of operand 2\n+in the memory block given by operand 1.\n \n-These instructions are not allowed to @code{FAIL}.\n+@mdindex @code{float@var{m}@var{n}2}\n+@item @samp{float@var{m}@var{n}2}\n+Convert signed integer operand 1 (valid for fixed point mode @var{m}) to\n+floating point mode @var{n} and store in operand 0 (which has mode\n+@var{n}).\n \n-@cindex @code{umsub@var{m}@var{n}4} instruction pattern\n-@item @samp{umsub@var{m}@var{n}4}\n-Like @code{msub@var{m}@var{n}4}, but zero-extend the multiplication\n-operands instead of sign-extending them.\n+@mdindex @code{floatuns@var{m}@var{n}2}\n+@item @samp{floatuns@var{m}@var{n}2}\n+Convert unsigned integer operand 1 (valid for fixed point mode @var{m})\n+to floating point mode @var{n} and store in operand 0 (which has mode\n+@var{n}).\n \n-@cindex @code{ssmsub@var{m}@var{n}4} instruction pattern\n-@item @samp{ssmsub@var{m}@var{n}4}\n-Like @code{msub@var{m}@var{n}4}, but all involved operations must be\n-signed-saturating.\n+@mdindex 
@code{fix@var{m}@var{n}2}\n+@item @samp{fix@var{m}@var{n}2}\n+Convert operand 1 (valid for floating point mode @var{m}) to fixed\n+point mode @var{n} as a signed number and store in operand 0 (which\n+has mode @var{n}). This instruction's result is defined only when\n+the value of operand 1 is an integer.\n \n-@cindex @code{usmsub@var{m}@var{n}4} instruction pattern\n-@item @samp{usmsub@var{m}@var{n}4}\n-Like @code{umsub@var{m}@var{n}4}, but all involved operations must be\n-unsigned-saturating.\n+If the machine description defines this pattern, it also needs to\n+define the @code{ftrunc} pattern.\n \n-@cindex @code{divmod@var{m}4} instruction pattern\n-@item @samp{divmod@var{m}4}\n-Signed division that produces both a quotient and a remainder.\n-Operand 1 is divided by operand 2 to produce a quotient stored\n-in operand 0 and a remainder stored in operand 3.\n+@mdindex @code{fixuns@var{m}@var{n}2}\n+@item @samp{fixuns@var{m}@var{n}2}\n+Convert operand 1 (valid for floating point mode @var{m}) to fixed\n+point mode @var{n} as an unsigned number and store in operand 0 (which\n+has mode @var{n}). This instruction's result is defined only when the\n+value of operand 1 is an integer.\n \n-For machines with an instruction that produces both a quotient and a\n-remainder, provide a pattern for @samp{divmod@var{m}4} but do not\n-provide patterns for @samp{div@var{m}3} and @samp{mod@var{m}3}. 
This\n-allows optimization in the relatively common case when both the quotient\n-and remainder are computed.\n+@mdindex @code{ftrunc@var{m}2}\n+@item @samp{ftrunc@var{m}2}\n+Convert operand 1 (valid for floating point mode @var{m}) to an\n+integer value, still represented in floating point mode @var{m}, and\n+store it in operand 0 (valid for floating point mode @var{m}).\n \n-If an instruction that just produces a quotient or just a remainder\n-exists and is more efficient than the instruction that produces both,\n-write the output routine of @samp{divmod@var{m}4} to call\n-@code{find_reg_note} and look for a @code{REG_UNUSED} note on the\n-quotient or remainder and generate the appropriate instruction.\n+@mdindex @code{fix_trunc@var{m}@var{n}2}\n+@item @samp{fix_trunc@var{m}@var{n}2}\n+Like @samp{fix@var{m}@var{n}2} but works for any floating point value\n+of mode @var{m} by converting the value to an integer.\n \n-@cindex @code{udivmod@var{m}4} instruction pattern\n-@item @samp{udivmod@var{m}4}\n-Similar, but does unsigned division.\n+@mdindex @code{fixuns_trunc@var{m}@var{n}2}\n+@item @samp{fixuns_trunc@var{m}@var{n}2}\n+Like @samp{fixuns@var{m}@var{n}2} but works for any floating point\n+value of mode @var{m} by converting the value to an integer.\n \n-@anchor{shift patterns}\n-@cindex @code{ashl@var{m}3} instruction pattern\n-@cindex @code{ssashl@var{m}3} instruction pattern\n-@cindex @code{usashl@var{m}3} instruction pattern\n-@item @samp{ashl@var{m}3}, @samp{ssashl@var{m}3}, @samp{usashl@var{m}3}\n-Arithmetic-shift operand 1 left by a number of bits specified by operand\n-2, and store the result in operand 0. Here @var{m} is the mode of\n-operand 0 and operand 1; operand 2's mode is specified by the\n-instruction pattern, and the compiler will convert the operand to that\n-mode before generating the instruction. The shift or rotate expander\n-or instruction pattern should explicitly specify the mode of the operand 2,\n-it should never be @code{VOIDmode}. 
The meaning of out-of-range shift\n-counts can optionally be specified by @code{TARGET_SHIFT_TRUNCATION_MASK}.\n-@xref{TARGET_SHIFT_TRUNCATION_MASK}. Operand 2 is always a scalar type.\n+@mdindex @code{trunc@var{m}@var{n}2}\n+@item @samp{trunc@var{m}@var{n}2}\n+Truncate operand 1 (valid for mode @var{m}) to mode @var{n} and\n+store in operand 0 (which has mode @var{n}). Both modes must be fixed\n+point or both floating point.\n \n-@cindex @code{ashr@var{m}3} instruction pattern\n-@cindex @code{lshr@var{m}3} instruction pattern\n-@cindex @code{rotl@var{m}3} instruction pattern\n-@cindex @code{rotr@var{m}3} instruction pattern\n-@item @samp{ashr@var{m}3}, @samp{lshr@var{m}3}, @samp{rotl@var{m}3}, @samp{rotr@var{m}3}\n-Other shift and rotate instructions, analogous to the\n-@code{ashl@var{m}3} instructions. Operand 2 is always a scalar type.\n+@mdindex @code{extend@var{m}@var{n}2}\n+@item @samp{extend@var{m}@var{n}2}\n+Sign-extend operand 1 (valid for mode @var{m}) to mode @var{n} and\n+store in operand 0 (which has mode @var{n}). Both modes must be fixed\n+point or both floating point.\n \n-@cindex @code{vashl@var{m}3} instruction pattern\n-@cindex @code{vashr@var{m}3} instruction pattern\n-@cindex @code{vlshr@var{m}3} instruction pattern\n-@cindex @code{vrotl@var{m}3} instruction pattern\n-@cindex @code{vrotr@var{m}3} instruction pattern\n-@item @samp{vashl@var{m}3}, @samp{vashr@var{m}3}, @samp{vlshr@var{m}3}, @samp{vrotl@var{m}3}, @samp{vrotr@var{m}3}\n-Vector shift and rotate instructions that take vectors as operand 2\n-instead of a scalar type.\n+@mdindex @code{zero_extend@var{m}@var{n}2}\n+@item @samp{zero_extend@var{m}@var{n}2}\n+Zero-extend operand 1 (valid for mode @var{m}) to mode @var{n} and\n+store in operand 0 (which has mode @var{n}). 
Both modes must be fixed\n+point.\n \n-@cindex @code{uabd@var{m}3} instruction pattern\n-@cindex @code{sabd@var{m}3} instruction pattern\n-@item @samp{uabd@var{m}}, @samp{sabd@var{m}}\n-Signed and unsigned absolute difference instructions. These\n-instructions find the difference between operands 1 and 2\n-then return the absolute value. A C code equivalent would be:\n-@smallexample\n-op0 = op1 > op2 ? op1 - op2 : op2 - op1;\n-@end smallexample\n+@mdindex @code{fract@var{m}@var{n}2}\n+@item @samp{fract@var{m}@var{n}2}\n+Convert operand 1 of mode @var{m} to mode @var{n} and store in\n+operand 0 (which has mode @var{n}). Mode @var{m} and mode @var{n}\n+could be fixed-point to fixed-point, signed integer to fixed-point,\n+fixed-point to signed integer, floating-point to fixed-point,\n+or fixed-point to floating-point.\n+When overflows or underflows happen, the results are undefined.\n \n-@cindex @code{avg@var{m}3_floor} instruction pattern\n-@cindex @code{uavg@var{m}3_floor} instruction pattern\n-@item @samp{avg@var{m}3_floor}\n-@itemx @samp{uavg@var{m}3_floor}\n-Signed and unsigned average instructions. These instructions add\n-operands 1 and 2 without truncation, divide the result by 2,\n-round towards -Inf, and store the result in operand 0. This is\n-equivalent to the C code:\n-@smallexample\n-narrow op0, op1, op2;\n-@dots{}\n-op0 = (narrow) (((wide) op1 + (wide) op2) >> 1);\n-@end smallexample\n-where the sign of @samp{narrow} determines whether this is a signed\n-or unsigned operation.\n+@mdindex @code{satfract@var{m}@var{n}2}\n+@item @samp{satfract@var{m}@var{n}2}\n+Convert operand 1 of mode @var{m} to mode @var{n} and store in\n+operand 0 (which has mode @var{n}). 
Mode @var{m} and mode @var{n}\n+could be fixed-point to fixed-point, signed integer to fixed-point,\n+or floating-point to fixed-point.\n+When overflows or underflows happen, the instruction saturates the\n+results to the maximum or the minimum.\n \n-@cindex @code{avg@var{m}3_ceil} instruction pattern\n-@cindex @code{uavg@var{m}3_ceil} instruction pattern\n-@item @samp{avg@var{m}3_ceil}\n-@itemx @samp{uavg@var{m}3_ceil}\n-Like @samp{avg@var{m}3_floor} and @samp{uavg@var{m}3_floor}, but round\n-towards +Inf. This is equivalent to the C code:\n-@smallexample\n-narrow op0, op1, op2;\n-@dots{}\n-op0 = (narrow) (((wide) op1 + (wide) op2 + 1) >> 1);\n-@end smallexample\n+@mdindex @code{fractuns@var{m}@var{n}2}\n+@item @samp{fractuns@var{m}@var{n}2}\n+Convert operand 1 of mode @var{m} to mode @var{n} and store in\n+operand 0 (which has mode @var{n}). Mode @var{m} and mode @var{n}\n+could be unsigned integer to fixed-point, or\n+fixed-point to unsigned integer.\n+When overflows or underflows happen, the results are undefined.\n \n-@cindex @code{bswap@var{m}2} instruction pattern\n-@item @samp{bswap@var{m}2}\n-Reverse the order of bytes of operand 1 and store the result in operand 0.\n+@mdindex @code{satfractuns@var{m}@var{n}2}\n+@item @samp{satfractuns@var{m}@var{n}2}\n+Convert unsigned integer operand 1 of mode @var{m} to fixed-point mode\n+@var{n} and store in operand 0 (which has mode @var{n}).\n+When overflows or underflows happen, the instruction saturates the\n+results to the maximum or the minimum.\n \n-@cindex @code{neg@var{m}2} instruction pattern\n-@cindex @code{ssneg@var{m}2} instruction pattern\n-@cindex @code{usneg@var{m}2} instruction pattern\n-@item @samp{neg@var{m}2}, @samp{ssneg@var{m}2}, @samp{usneg@var{m}2}\n-Negate operand 1 and store the result in operand 0.\n+@mdindex @code{extv@var{m}}\n+@item @samp{extv@var{m}}\n+Extract a bit-field from register operand 1, sign-extend it, and store\n+it in operand 0. 
Operand 2 specifies the width of the field in bits\n+and operand 3 the starting bit, which counts from the most significant\n+bit if @samp{BITS_BIG_ENDIAN} is true and from the least significant bit\n+otherwise.\n \n-@cindex @code{negv@var{m}3} instruction pattern\n-@item @samp{negv@var{m}3}\n-Like @code{neg@var{m}2} but takes a @code{code_label} as operand 2 and\n-emits code to jump to it if signed overflow occurs during the negation.\n+Operands 0 and 1 both have mode @var{m}. Operands 2 and 3 have a\n+target-specific mode.\n \n-@cindex @code{abs@var{m}2} instruction pattern\n-@item @samp{abs@var{m}2}\n-Store the absolute value of operand 1 into operand 0.\n+@mdindex @code{extvmisalign@var{m}}\n+@item @samp{extvmisalign@var{m}}\n+Extract a bit-field from memory operand 1, sign extend it, and store\n+it in operand 0. Operand 2 specifies the width in bits and operand 3\n+the starting bit. The starting bit is always somewhere in the first byte of\n+operand 1; it counts from the most significant bit if @samp{BITS_BIG_ENDIAN}\n+is true and from the least significant bit otherwise.\n \n-@cindex @code{sqrt@var{m}2} instruction pattern\n-@item @samp{sqrt@var{m}2}\n-Store the square root of operand 1 into operand 0. 
Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+Operand 0 has mode @var{m} while operand 1 has @code{BLK} mode.\n+Operands 2 and 3 have a target-specific mode.\n \n-This pattern is not allowed to @code{FAIL}.\n+The instruction must not read beyond the last byte of the bit-field.\n \n-@cindex @code{rsqrt@var{m}2} instruction pattern\n-@item @samp{rsqrt@var{m}2}\n-Store the reciprocal of the square root of operand 1 into operand 0.\n-Both operands have mode @var{m}, which is a scalar or vector\n-floating-point mode.\n+@mdindex @code{extzv@var{m}}\n+@item @samp{extzv@var{m}}\n+Like @samp{extv@var{m}} except that the bit-field value is zero-extended.\n \n-On most architectures this pattern is only approximate, so either\n-its C condition or the @code{TARGET_OPTAB_SUPPORTED_P} hook should\n-check for the appropriate math flags. (Using the C condition is\n-more direct, but using @code{TARGET_OPTAB_SUPPORTED_P} can be useful\n-if a target-specific built-in also uses the @samp{rsqrt@var{m}2}\n-pattern.)\n+@mdindex @code{extzvmisalign@var{m}}\n+@item @samp{extzvmisalign@var{m}}\n+Like @samp{extvmisalign@var{m}} except that the bit-field value is\n+zero-extended.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{insv@var{m}}\n+@item @samp{insv@var{m}}\n+Insert operand 3 into a bit-field of register operand 0. Operand 1\n+specifies the width of the field in bits and operand 2 the starting bit,\n+which counts from the most significant bit if @samp{BITS_BIG_ENDIAN}\n+is true and from the least significant bit otherwise.\n \n-@cindex @code{fmod@var{m}3} instruction pattern\n-@item @samp{fmod@var{m}3}\n-Store the remainder of dividing operand 1 by operand 2 into\n-operand 0, rounded towards zero to an integer. All operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+Operands 0 and 3 both have mode @var{m}. 
Operands 1 and 2 have a\n+target-specific mode.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{insvmisalign@var{m}}\n+@item @samp{insvmisalign@var{m}}\n+Insert operand 3 into a bit-field of memory operand 0. Operand 1\n+specifies the width of the field in bits and operand 2 the starting bit.\n+The starting bit is always somewhere in the first byte of operand 0;\n+it counts from the most significant bit if @samp{BITS_BIG_ENDIAN}\n+is true and from the least significant bit otherwise.\n \n-@cindex @code{remainder@var{m}3} instruction pattern\n-@item @samp{remainder@var{m}3}\n-Store the remainder of dividing operand 1 by operand 2 into\n-operand 0, rounded to the nearest integer. All operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+Operand 3 has mode @var{m} while operand 0 has @code{BLK} mode.\n+Operands 1 and 2 have a target-specific mode.\n \n-This pattern is not allowed to @code{FAIL}.\n+The instruction must not read or write beyond the last byte of the bit-field.\n \n-@cindex @code{scalb@var{m}3} instruction pattern\n-@item @samp{scalb@var{m}3}\n-Raise @code{FLT_RADIX} to the power of operand 2, multiply it by\n-operand 1, and store the result in operand 0. All operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n-\n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{extv}\n+@item @samp{extv}\n+Extract a bit-field from operand 1 (a register or memory operand), where\n+operand 2 specifies the width in bits and operand 3 the starting bit,\n+and store it in operand 0. Operand 0 must have mode @code{word_mode}.\n+Operand 1 may have mode @code{byte_mode} or @code{word_mode}; often\n+@code{word_mode} is allowed only for registers. Operands 2 and 3 must\n+be valid for @code{word_mode}.\n \n-@cindex @code{ldexp@var{m}3} instruction pattern\n-@item @samp{ldexp@var{m}3}\n-Raise 2 to the power of operand 2, multiply it by operand 1, and store\n-the result in operand 0. 
Operands 0 and 1 have mode @var{m}, which is\n-a scalar or vector floating-point mode. Operand 2's mode has\n-the same number of elements as @var{m} and each element is wide\n-enough to store an @code{int}. The integers are signed.\n+The RTL generation pass generates this instruction only with constants\n+for operands 2 and 3 and the constant is never zero for operand 2.\n \n-This pattern is not allowed to @code{FAIL}.\n+The bit-field value is sign-extended to a full word integer\n+before it is stored in operand 0.\n \n-@cindex @code{cos@var{m}2} instruction pattern\n-@item @samp{cos@var{m}2}\n-Store the cosine of operand 1 into operand 0. Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+This pattern is deprecated; please use @samp{extv@var{m}} and\n+@code{extvmisalign@var{m}} instead.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{extzv}\n+@item @samp{extzv}\n+Like @samp{extv} except that the bit-field value is zero-extended.\n \n-@cindex @code{sin@var{m}2} instruction pattern\n-@item @samp{sin@var{m}2}\n-Store the sine of operand 1 into operand 0. Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+This pattern is deprecated; please use @samp{extzv@var{m}} and\n+@code{extzvmisalign@var{m}} instead.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{insv}\n+@item @samp{insv}\n+Store operand 3 (which must be valid for @code{word_mode}) into a\n+bit-field in operand 0, where operand 1 specifies the width in bits and\n+operand 2 the starting bit. Operand 0 may have mode @code{byte_mode} or\n+@code{word_mode}; often @code{word_mode} is allowed only for registers.\n+Operands 1 and 2 must be valid for @code{word_mode}.\n \n-@cindex @code{sincos@var{m}3} instruction pattern\n-@item @samp{sincos@var{m}3}\n-Store the cosine of operand 2 into operand 0 and the sine of\n-operand 2 into operand 1. 
All operands have mode @var{m},\n-which is a scalar or vector floating-point mode.\n+The RTL generation pass generates this instruction only with constants\n+for operands 1 and 2 and the constant is never zero for operand 1.\n \n-Targets that can calculate the sine and cosine simultaneously can\n-implement this pattern as opposed to implementing individual\n-@code{sin@var{m}2} and @code{cos@var{m}2} patterns. The @code{sin}\n-and @code{cos} built-in functions will then be expanded to the\n-@code{sincos@var{m}3} pattern, with one of the output values\n-left unused.\n+This pattern is deprecated; please use @samp{insv@var{m}} and\n+@code{insvmisalign@var{m}} instead.\n \n-@cindex @code{tan@var{m}2} instruction pattern\n-@item @samp{tan@var{m}2}\n-Store the tangent of operand 1 into operand 0. Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+@mdindex @code{mov@var{mode}cc}\n+@item @samp{mov@var{mode}cc}\n+Conditionally move operand 2 or operand 3 into operand 0 according to the\n+comparison in operand 1. If the comparison is true, operand 2 is moved\n+into operand 0, otherwise operand 3 is moved.\n \n-This pattern is not allowed to @code{FAIL}.\n+The mode of the operands being compared need not be the same as the operands\n+being moved. Some machines, sparc64 for example, have instructions that\n+conditionally move an integer value based on the floating point condition\n+codes and vice versa.\n \n-@cindex @code{asin@var{m}2} instruction pattern\n-@item @samp{asin@var{m}2}\n-Store the arc sine of operand 1 into operand 0. Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+If the machine does not have conditional move instructions, do not\n+define these patterns.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{add@var{mode}cc}\n+@item @samp{add@var{mode}cc}\n+Similar to @samp{mov@var{mode}cc} but for conditional addition. 
Conditionally\n+move operand 2 or (operand 2 + operand 3) into operand 0 according to the\n+comparison in operand 1. If the comparison is false, operand 2 is moved into\n+operand 0, otherwise (operand 2 + operand 3) is moved.\n \n-@cindex @code{acos@var{m}2} instruction pattern\n-@item @samp{acos@var{m}2}\n-Store the arc cosine of operand 1 into operand 0. Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+@mdindex @code{neg@var{mode}cc}\n+@item @samp{neg@var{mode}cc}\n+Similar to @samp{mov@var{mode}cc} but for conditional negation. Conditionally\n+move the negation of operand 2 or the unchanged operand 3 into operand 0\n+according to the comparison in operand 1. If the comparison is true, the negation\n+of operand 2 is moved into operand 0, otherwise operand 3 is moved.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{not@var{mode}cc}\n+@item @samp{not@var{mode}cc}\n+Similar to @samp{neg@var{mode}cc} but for conditional complement.\n+Conditionally move the bitwise complement of operand 2 or the unchanged\n+operand 3 into operand 0 according to the comparison in operand 1.\n+If the comparison is true, the complement of operand 2 is moved into\n+operand 0, otherwise operand 3 is moved.\n \n-@cindex @code{atan@var{m}2} instruction pattern\n-@item @samp{atan@var{m}2}\n-Store the arc tangent of operand 1 into operand 0. Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+@mdindex @code{cstore@var{mode}4}\n+@item @samp{cstore@var{mode}4}\n+Store zero or nonzero in operand 0 according to whether a comparison\n+is true. Operand 1 is a comparison operator. Operand 2 and operand 3\n+are the first and second operand of the comparison, respectively.\n+You specify the mode that operand 0 must have when you write the\n+@code{match_operand} expression. 
The compiler automatically sees which\n+mode you have used and supplies an operand of that mode.\n \n-This pattern is not allowed to @code{FAIL}.\n+The value stored for a true condition must have 1 as its low bit, or\n+else must be negative. Otherwise the instruction is not suitable and\n+you should omit it from the machine description. You describe to the\n+compiler exactly which value is stored by defining the macro\n+@code{STORE_FLAG_VALUE} (@pxref{Misc}). If a description cannot be\n+found that can be used for all the possible comparison operators, you\n+should pick one and use a @code{define_expand} to map all results\n+onto the one you chose.\n \n-@cindex @code{fegetround@var{m}} instruction pattern\n-@item @samp{fegetround@var{m}}\n-Store the current machine floating-point rounding mode into operand 0.\n-Operand 0 has mode @var{m}, which is scalar. This pattern is used to\n-implement the @code{fegetround} function from the ISO C99 standard.\n+These operations may @code{FAIL}, but should do so only in relatively\n+uncommon cases; if they would @code{FAIL} for common cases involving\n+integer comparisons, it is best to restrict the predicates to not\n+allow these operands. Likewise if a given comparison operator will\n+always fail, independent of the operands (for floating-point modes, the\n+@code{ordered_comparison_operator} predicate is often useful in this case).\n \n-@cindex @code{feclearexcept@var{m}} instruction pattern\n-@cindex @code{feraiseexcept@var{m}} instruction pattern\n-@item @samp{feclearexcept@var{m}}\n-@item @samp{feraiseexcept@var{m}}\n-Clears or raises the supported machine floating-point exceptions\n-represented by the bits in operand 1. Error status is stored as\n-nonzero value in operand 0. Both operands have mode @var{m}, which is\n-a scalar. 
These patterns are used to implement the\n-@code{feclearexcept} and @code{feraiseexcept} functions from the ISO\n-C99 standard.\n+If this pattern is omitted, the compiler will generate a conditional\n+branch---for example, it may copy a constant one to the target and branch\n+around an assignment of zero to the target---or a libcall. If the predicate\n+for operand 1 only rejects some operators, it will also try reordering the\n+operands and/or inverting the result value (e.g.@: by an exclusive OR).\n+These possibilities could be cheaper or equivalent to the instructions\n+used for the @samp{cstore@var{mode}4} pattern followed by those required\n+to convert a positive result from @code{STORE_FLAG_VALUE} to 1; in this\n+case, you can and should make operand 1's predicate reject some operators\n+in the @samp{cstore@var{mode}4} pattern, or remove the pattern altogether\n+from the machine description.\n \n-@cindex @code{exp@var{m}2} instruction pattern\n-@item @samp{exp@var{m}2}\n-Raise e (the base of natural logarithms) to the power of operand 1\n-and store the result in operand 0. Both operands have mode @var{m},\n-which is a scalar or vector floating-point mode.\n+@mdindex @code{tbranch_@var{op}@var{mode}3}\n+@item @samp{tbranch_@var{op}@var{mode}3}\n+Conditional branch instruction combined with a bit test-and-compare\n+instruction. Operand 0 is the operand of the comparison. Operand 1 is the bit\n+position of operand 0 to test. Operand 2 is the @code{code_label} to jump to.\n+@var{op} is one of @var{eq} or @var{ne}.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{jump}\n+@item @samp{jump}\n+A jump inside a function; an unconditional branch. Operand 0 is the\n+@code{code_label} to jump to. This pattern name is mandatory on all\n+machines.\n \n-@cindex @code{expm1@var{m}2} instruction pattern\n-@item @samp{expm1@var{m}2}\n-Raise e (the base of natural logarithms) to the power of operand 1,\n-subtract 1, and store the result in operand 0. 
Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode.\n+@mdindex @code{call}\n+@item @samp{call}\n+Subroutine call instruction returning no value. Operand 0 is the\n+function to call; operand 1 is the number of bytes of arguments pushed\n+as a @code{const_int}. Operand 2 is the result of calling the target\n+hook @code{TARGET_FUNCTION_ARG} with the second argument @code{arg}\n+yielding true for @code{arg.end_marker_p ()}, in a call after all\n+parameters have been passed to that hook. By default this is the first\n+register beyond those used for arguments in the call, or @code{NULL} if\n+all the argument-registers are used in the call.\n \n-For inputs close to zero, the pattern is expected to be more\n-accurate than a separate @code{exp@var{m}2} and @code{sub@var{m}3}\n-would be.\n+On most machines, operand 2 is not actually stored into the RTL\n+pattern. It is supplied for the sake of some RISC machines which need\n+to put this information into the assembler code; they can put it in\n+the RTL instead of operand 1.\n \n-This pattern is not allowed to @code{FAIL}.\n+Operand 0 should be a @code{mem} RTX whose address is the address of the\n+function. Note, however, that this address can be a @code{symbol_ref}\n+expression even if it would not be a legitimate memory address on the\n+target machine. If it is also not a valid argument for a call\n+instruction, the pattern for this operation should be a\n+@code{define_expand} (@pxref{Expander Definitions}) that places the\n+address into a register and uses that register in the call instruction.\n \n-@cindex @code{exp10@var{m}2} instruction pattern\n-@item @samp{exp10@var{m}2}\n-Raise 10 to the power of operand 1 and store the result in operand 0.\n-Both operands have mode @var{m}, which is a scalar or vector\n-floating-point mode.\n+@mdindex @code{call_value}\n+@item @samp{call_value}\n+Subroutine call instruction returning a value. 
Operand 0 is the hard\n+register in which the value is returned. There are three more\n+operands, the same as the three operands of the @samp{call}\n+instruction (but with numbers increased by one).\n \n-This pattern is not allowed to @code{FAIL}.\n+Subroutines that return @code{BLKmode} objects use the @samp{call}\n+insn.\n \n-@cindex @code{exp2@var{m}2} instruction pattern\n-@item @samp{exp2@var{m}2}\n-Raise 2 to the power of operand 1 and store the result in operand 0.\n-Both operands have mode @var{m}, which is a scalar or vector\n-floating-point mode.\n+@mdindex @code{call_pop}\n+@mdindex @code{call_value_pop}\n+@item @samp{call_pop}, @samp{call_value_pop}\n+Similar to @samp{call} and @samp{call_value}, except used if defined and\n+if @code{RETURN_POPS_ARGS} is nonzero. They should emit a @code{parallel}\n+that contains both the function call and a @code{set} to indicate the\n+adjustment made to the frame pointer.\n \n-This pattern is not allowed to @code{FAIL}.\n+For machines where @code{RETURN_POPS_ARGS} can be nonzero, the use of these\n+patterns increases the number of functions for which the frame pointer\n+can be eliminated, if desired.\n \n-@cindex @code{log@var{m}2} instruction pattern\n-@item @samp{log@var{m}2}\n-Store the natural logarithm of operand 1 into operand 0. Both operands\n-have mode @var{m}, which is a scalar or vector floating-point mode.\n+@mdindex @code{untyped_call}\n+@item @samp{untyped_call}\n+Subroutine call instruction returning a value of any type. 
Operand 0 is\n+the function to call; operand 1 is a memory location where the result of\n+calling the function is to be stored; operand 2 is a @code{parallel}\n+expression where each element is a @code{set} expression that indicates\n+the saving of a function return value into the result block.\n \n-This pattern is not allowed to @code{FAIL}.\n+This instruction pattern should be defined to support\n+@code{__builtin_apply} on machines where special instructions are needed\n+to call a subroutine with arbitrary arguments or to save the value\n+returned. This instruction pattern is required on machines that have\n+multiple registers that can hold a return value\n+(i.e.@: @code{FUNCTION_VALUE_REGNO_P} is true for more than one register).\n \n-@cindex @code{log1p@var{m}2} instruction pattern\n-@item @samp{log1p@var{m}2}\n-Add 1 to operand 1, compute the natural logarithm, and store\n-the result in operand 0. Both operands have mode @var{m}, which is\n-a scalar or vector floating-point mode.\n+@mdindex @code{return}\n+@item @samp{return}\n+Subroutine return instruction. This instruction pattern name should be\n+defined only if a single instruction can do all the work of returning\n+from a function.\n \n-For inputs close to zero, the pattern is expected to be more\n-accurate than a separate @code{add@var{m}3} and @code{log@var{m}2}\n-would be.\n+Like the @samp{mov@var{m}} patterns, this pattern is also used after the\n+RTL generation phase. In this case it is to support machines where\n+multiple instructions are usually needed to return from a function, but\n+some class of functions only requires one instruction to implement a\n+return. 
Normally, the applicable functions are those which do not need\n+to save any registers or allocate stack space.\n \n-This pattern is not allowed to @code{FAIL}.\n+It is valid for this pattern to expand to an instruction using\n+@code{simple_return} if no epilogue is required.\n \n-@cindex @code{log10@var{m}2} instruction pattern\n-@item @samp{log10@var{m}2}\n-Store the base-10 logarithm of operand 1 into operand 0. Both operands\n-have mode @var{m}, which is a scalar or vector floating-point mode.\n+@mdindex @code{simple_return}\n+@item @samp{simple_return}\n+Subroutine return instruction. This instruction pattern name should be\n+defined only if a single instruction can do all the work of returning\n+from a function on a path where no epilogue is required. This pattern\n+is very similar to the @code{return} instruction pattern, but it is emitted\n+only by the shrink-wrapping optimization on paths where the function\n+prologue has not been executed, and a function return should occur without\n+any of the effects of the epilogue. Additional uses may be introduced on\n+paths where both the prologue and the epilogue have executed.\n \n-This pattern is not allowed to @code{FAIL}.\n-\n-@cindex @code{log2@var{m}2} instruction pattern\n-@item @samp{log2@var{m}2}\n-Store the base-2 logarithm of operand 1 into operand 0. Both operands\n-have mode @var{m}, which is a scalar or vector floating-point mode.\n+@findex reload_completed\n+@findex leaf_function_p\n+For such machines, the condition specified in this pattern should only\n+be true when @code{reload_completed} is nonzero and the function's\n+epilogue would only be a single instruction. 
For machines with register\n+windows, the routine @code{leaf_function_p} may be used to determine if\n+a register window push is required.\n \n-This pattern is not allowed to @code{FAIL}.\n+Machines that have conditional return instructions should define patterns\n+such as\n \n-@cindex @code{logb@var{m}2} instruction pattern\n-@item @samp{logb@var{m}2}\n-Store the base-@code{FLT_RADIX} logarithm of operand 1 into operand 0.\n-Both operands have mode @var{m}, which is a scalar or vector\n-floating-point mode.\n+@smallexample\n+(define_insn \"\"\n+ [(set (pc)\n+ (if_then_else (match_operator\n+ 0 \"comparison_operator\"\n+ [(reg:CC CC_REG) (const_int 0)])\n+ (return)\n+ (pc)))]\n+ \"@var{condition}\"\n+ \"@dots{}\")\n+@end smallexample\n \n-This pattern is not allowed to @code{FAIL}.\n+where @var{condition} would normally be the same condition specified on the\n+named @samp{return} pattern.\n \n-@cindex @code{signbit@var{m}2} instruction pattern\n-@item @samp{signbit@var{m}2}\n-Store the sign bit of floating-point operand 1 in operand 0.\n-@var{m} is either a scalar or vector mode. When it is a scalar,\n-operand 1 has mode @var{m} but operand 0 must have mode @code{SImode}.\n-When @var{m} is a vector, operand 1 has the mode @var{m}.\n-operand 0's mode should be an vector integer mode which has\n-the same number of elements and the same size as mode @var{m}.\n+@mdindex @code{untyped_return}\n+@item @samp{untyped_return}\n+Untyped subroutine return instruction. 
This instruction pattern should\n+be defined to support @code{__builtin_return} on machines where special\n+instructions are needed to return a value of any type.\n \n-This pattern is not allowed to @code{FAIL}.\n+Operand 0 is a memory location where the result of calling a function\n+with @code{__builtin_apply} is stored; operand 1 is a @code{parallel}\n+expression where each element is a @code{set} expression that indicates\n+the restoring of a function return value from the result block.\n \n-@cindex @code{significand@var{m}2} instruction pattern\n-@item @samp{significand@var{m}2}\n-Store the significand of floating-point operand 1 in operand 0.\n-Both operands have mode @var{m}, which is a scalar or vector\n-floating-point mode.\n+@mdindex @code{nop}\n+@item @samp{nop}\n+No-op instruction. This instruction pattern name should always be defined\n+to output a no-op in assembler code. @code{(const_int 0)} will do as an\n+RTL pattern.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{indirect_jump}\n+@item @samp{indirect_jump}\n+An instruction to jump to an address which is operand zero.\n+This pattern name is mandatory on all machines.\n \n-@cindex @code{pow@var{m}3} instruction pattern\n-@item @samp{pow@var{m}3}\n-Store the value of operand 1 raised to the exponent operand 2\n-into operand 0. All operands have mode @var{m}, which is a scalar\n-or vector floating-point mode.\n+@mdindex @code{casesi}\n+@item @samp{casesi}\n+Instruction to jump through a dispatch table, including bounds checking.\n+This instruction takes five operands:\n \n-This pattern is not allowed to @code{FAIL}.\n+@enumerate\n+@item\n+The index to dispatch on, which has mode @code{SImode}.\n \n-@cindex @code{atan2@var{m}3} instruction pattern\n-@item @samp{atan2@var{m}3}\n-Store the arc tangent (inverse tangent) of operand 1 divided by\n-operand 2 into operand 0, using the signs of both arguments to\n-determine the quadrant of the result. 
All operands have mode\n-@var{m}, which is a scalar or vector floating-point mode.\n+@item\n+The lower bound for indices in the table, an integer constant.\n \n-This pattern is not allowed to @code{FAIL}.\n+@item\n+The total range of indices in the table---the largest index\n+minus the smallest one (both inclusive).\n \n-@cindex @code{floor@var{m}2} instruction pattern\n-@item @samp{floor@var{m}2}\n-Store the largest integral value not greater than operand 1 in operand 0.\n-Both operands have mode @var{m}, which is a scalar or vector\n-floating-point mode. If @option{-ffp-int-builtin-inexact} is in\n-effect, the ``inexact'' exception may be raised for noninteger\n-operands; otherwise, it may not.\n+@item\n+A label that precedes the table itself.\n \n-This pattern is not allowed to @code{FAIL}.\n+@item\n+A label to jump to if the index has a value outside the bounds.\n+@end enumerate\n \n-@cindex @code{btrunc@var{m}2} instruction pattern\n-@item @samp{btrunc@var{m}2}\n-Round operand 1 to an integer, towards zero, and store the result in\n-operand 0. Both operands have mode @var{m}, which is a scalar or\n-vector floating-point mode. If @option{-ffp-int-builtin-inexact} is\n-in effect, the ``inexact'' exception may be raised for noninteger\n-operands; otherwise, it may not.\n+The table is an @code{addr_vec} or @code{addr_diff_vec} inside of a\n+@code{jump_table_data}. The number of elements in the table is one plus the\n+difference between the upper bound and the lower bound.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{tablejump}\n+@item @samp{tablejump}\n+Instruction to jump to a variable address. This is a low-level\n+capability which can be used to implement a dispatch table when there\n+is no @samp{casesi} pattern.\n \n-@cindex @code{round@var{m}2} instruction pattern\n-@item @samp{round@var{m}2}\n-Round operand 1 to the nearest integer, rounding away from zero in the\n-event of a tie, and store the result in operand 0. 
Both operands have\n-mode @var{m}, which is a scalar or vector floating-point mode. If\n-@option{-ffp-int-builtin-inexact} is in effect, the ``inexact''\n-exception may be raised for noninteger operands; otherwise, it may\n-not.\n+This pattern requires two operands: the address or offset, and a label\n+which should immediately precede the jump table. If the macro\n+@code{CASE_VECTOR_PC_RELATIVE} evaluates to a nonzero value then the first\n+operand is an offset which counts from the address of the table; otherwise,\n+it is an absolute address to jump to. In either case, the first operand has\n+mode @code{Pmode}.\n \n-This pattern is not allowed to @code{FAIL}.\n+The @samp{tablejump} insn is always the last insn before the jump\n+table it uses. Its assembler code normally has no need to use the\n+second operand, but you should incorporate it in the RTL pattern so\n+that the jump optimizer will not delete the table as unreachable code.\n \n-@cindex @code{ceil@var{m}2} instruction pattern\n-@item @samp{ceil@var{m}2}\n-Store the smallest integral value not less than operand 1 in operand 0.\n-Both operands have mode @var{m}, which is a scalar or vector\n-floating-point mode. If @option{-ffp-int-builtin-inexact} is in\n-effect, the ``inexact'' exception may be raised for noninteger\n-operands; otherwise, it may not.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{doloop_end}\n+@item @samp{doloop_end}\n+Conditional branch instruction that decrements a register and\n+jumps if the register is nonzero. Operand 0 is the register to\n+decrement and test; operand 1 is the label to jump to if the\n+register is nonzero.\n+@xref{Looping Patterns}.\n \n-@cindex @code{nearbyint@var{m}2} instruction pattern\n-@item @samp{nearbyint@var{m}2}\n-Round operand 1 to an integer, using the current rounding mode, and\n-store the result in operand 0. Do not raise an inexact condition when\n-the result is different from the argument. 
Both operands have mode\n-@var{m}, which is a scalar or vector floating-point mode.\n+This optional instruction pattern should be defined for machines with\n+low-overhead looping instructions as the loop optimizer will try to\n+modify suitable loops to utilize it. The target hook\n+@code{TARGET_CAN_USE_DOLOOP_P} controls the conditions under which\n+low-overhead loops can be used.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{doloop_begin}\n+@item @samp{doloop_begin}\n+Companion instruction to @code{doloop_end} required for machines that\n+need to perform some initialization, such as loading a special counter\n+register. Operand 1 is the associated @code{doloop_end} pattern and\n+operand 0 is the register that it decrements.\n \n-@cindex @code{rint@var{m}2} instruction pattern\n-@item @samp{rint@var{m}2}\n-Round operand 1 to an integer, using the current rounding mode, and\n-store the result in operand 0. Raise an inexact condition when\n-the result is different from the argument. 
Both operands have mode\n-@var{m}, which is a scalar or vector floating-point mode.\n+If initialization insns do not always need to be emitted, use a\n+@code{define_expand} (@pxref{Expander Definitions}) and make it fail.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{canonicalize_funcptr_for_compare}\n+@item @samp{canonicalize_funcptr_for_compare}\n+Canonicalize the function pointer in operand 1 and store the result\n+into operand 0.\n \n-@cindex @code{lrint@var{m}@var{n}2}\n-@item @samp{lrint@var{m}@var{n}2}\n-Convert operand 1 (valid for floating point mode @var{m}) to fixed\n-point mode @var{n} as a signed number according to the current\n-rounding mode and store in operand 0 (which has mode @var{n}).\n+Operand 0 is always a @code{reg} and has mode @code{Pmode}; operand 1\n+may be a @code{reg}, @code{mem}, @code{symbol_ref}, @code{const_int}, etc\n+and also has mode @code{Pmode}.\n \n-@cindex @code{lround@var{m}@var{n}2}\n-@item @samp{lround@var{m}@var{n}2}\n-Convert operand 1 (valid for floating point mode @var{m}) to fixed\n-point mode @var{n} as a signed number rounding to nearest and away\n-from zero and store in operand 0 (which has mode @var{n}).\n+Canonicalization of a function pointer usually involves computing\n+the address of the function which would be called if the function\n+pointer were used in an indirect call.\n \n-@cindex @code{lfloor@var{m}@var{n}2}\n-@item @samp{lfloor@var{m}@var{n}2}\n-Convert operand 1 (valid for floating point mode @var{m}) to fixed\n-point mode @var{n} as a signed number rounding down and store in\n-operand 0 (which has mode @var{n}).\n+Only define this pattern if function pointers on the target machine\n+can have different values but still call the same function when\n+used in an indirect call.\n \n-@cindex @code{lceil@var{m}@var{n}2}\n-@item @samp{lceil@var{m}@var{n}2}\n-Convert operand 1 (valid for floating point mode @var{m}) to fixed\n-point mode @var{n} as a signed number rounding up and store 
in\n-operand 0 (which has mode @var{n}).\n+@mdindex @code{save_stack_block}\n+@mdindex @code{save_stack_function}\n+@mdindex @code{save_stack_nonlocal}\n+@mdindex @code{restore_stack_block}\n+@mdindex @code{restore_stack_function}\n+@mdindex @code{restore_stack_nonlocal}\n+@item @samp{save_stack_block}\n+@itemx @samp{save_stack_function}\n+@itemx @samp{save_stack_nonlocal}\n+@itemx @samp{restore_stack_block}\n+@itemx @samp{restore_stack_function}\n+@itemx @samp{restore_stack_nonlocal}\n+Most machines save and restore the stack pointer by copying it to or\n+from an object of mode @code{Pmode}. Do not define these patterns on\n+such machines.\n \n-@cindex @code{copysign@var{m}3} instruction pattern\n-@item @samp{copysign@var{m}3}\n-Store a value with the magnitude of operand 1 and the sign of operand\n-2 into operand 0. All operands have mode @var{m}, which is a scalar or\n-vector floating-point mode.\n+Some machines require special handling for stack pointer saves and\n+restores. On those machines, define the patterns corresponding to the\n+non-standard cases by using a @code{define_expand} (@pxref{Expander\n+Definitions}) that produces the required insns. 
The three types of\n+saves and restores are:\n \n-This pattern is not allowed to @code{FAIL}.\n+@enumerate\n+@item\n+@samp{save_stack_block} saves the stack pointer at the start of a block\n+that allocates a variable-sized object, and @samp{restore_stack_block}\n+restores the stack pointer when the block is exited.\n \n-@cindex @code{xorsign@var{m}3} instruction pattern\n-@item @samp{xorsign@var{m}3}\n-Equivalent to @samp{op0 = op1 * copysign (1.0, op2)}: store a value with\n-the magnitude of operand 1 and the sign of operand 2 into operand 0.\n-All operands have mode @var{m}, which is a scalar or vector\n-floating-point mode.\n+@item\n+@samp{save_stack_function} and @samp{restore_stack_function} do a\n+similar job for the outermost block of a function and are used when the\n+function allocates variable-sized objects or calls @code{alloca}. Only\n+the epilogue uses the restored stack pointer, allowing a simpler save or\n+restore sequence on some machines.\n \n-This pattern is not allowed to @code{FAIL}.\n+@item\n+@samp{save_stack_nonlocal} is used in functions that contain labels\n+branched to by nested functions. It saves the stack pointer in such a\n+way that the inner function can use @samp{restore_stack_nonlocal} to\n+restore the stack pointer. The compiler generates code to restore the\n+frame and argument pointer registers, but some machines require saving\n+and restoring additional data such as register window information or\n+stack backchains. Place insns in these patterns to save and restore any\n+such required data.\n+@end enumerate\n \n-@cindex @code{issignaling@var{m}2} instruction pattern\n-@item @samp{issignaling@var{m}2}\n-Set operand 0 to 1 if operand 1 is a signaling NaN and to 0 otherwise.\n+When saving the stack pointer, operand 0 is the save area and operand 1\n+is the stack pointer. 
The mode used to allocate the save area defaults\n+to @code{Pmode} but you can override that choice by defining the\n+@code{STACK_SAVEAREA_MODE} macro (@pxref{Storage Layout}). You must\n+specify an integral mode, or @code{VOIDmode} if no save area is needed\n+for a particular type of save (either because no save is needed or\n+because a machine-specific save area can be used). Operand 0 is the\n+stack pointer and operand 1 is the save area for restore operations. If\n+@samp{save_stack_block} is defined, operand 0 must not be\n+@code{VOIDmode} since these saves can be arbitrarily nested.\n \n-@cindex @code{cadd90@var{m}3} instruction pattern\n-@item @samp{cadd90@var{m}3}\n-Perform vector add and subtract on even/odd number pairs. The operation being\n-matched is semantically described as\n+A save area is a @code{mem} that is at a constant offset from\n+@code{virtual_stack_vars_rtx} when the stack pointer is saved for use by\n+nonlocal gotos and a @code{reg} in the other two cases.\n \n-@smallexample\n- for (int i = 0; i < N; i += 2)\n- @{\n- c[i] = a[i] - b[i+1];\n- c[i+1] = a[i+1] + b[i];\n- @}\n-@end smallexample\n+@mdindex @code{allocate_stack}\n+@item @samp{allocate_stack}\n+Subtract (or add if @code{STACK_GROWS_DOWNWARD} is undefined) operand 1 from\n+the stack pointer to create space for dynamically allocated data.\n \n-This operation is semantically equivalent to performing a vector addition of\n-complex numbers in operand 1 with operand 2 rotated by 90 degrees around\n-the argand plane and storing the result in operand 0.\n+Store the resultant pointer to this space into operand 0. If you\n+are allocating space from the main stack, do this by emitting a\n+move insn to copy @code{virtual_stack_dynamic_rtx} to operand 0.\n+If you are allocating the space elsewhere, generate code to copy the\n+location of the space to operand 0. 
In the latter case, you must\n+ensure this space gets freed when the corresponding space on the main\n+stack is free.\n \n-In GCC lane ordering the real part of the number must be in the even lanes with\n-the imaginary part in the odd lanes.\n+Do not define this pattern if all that must be done is the subtraction.\n+Some machines require other operations such as stack probes or\n+maintaining the back chain. Define this pattern to emit those\n+operations in addition to updating the stack pointer.\n \n-The operation is only supported for vector modes @var{m}.\n+@mdindex @code{check_stack}\n+@item @samp{check_stack}\n+If stack checking (@pxref{Stack Checking}) cannot be done on your system by\n+probing the stack, define this pattern to perform the needed check and signal\n+an error if the stack has overflowed. The single operand is the address in\n+the stack farthest from the current stack pointer that you need to validate.\n+Normally, on platforms where this pattern is needed, you would obtain the\n+stack limit from a global or thread-specific variable or register.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{probe_stack_address}\n+@item @samp{probe_stack_address}\n+If stack checking (@pxref{Stack Checking}) can be done on your system by\n+probing the stack but without the need to actually access it, define this\n+pattern and signal an error if the stack has overflowed. The single operand\n+is the memory address in the stack that needs to be probed.\n \n-@cindex @code{cadd270@var{m}3} instruction pattern\n-@item @samp{cadd270@var{m}3}\n-Perform vector add and subtract on even/odd number pairs. 
The operation being\n-matched is semantically described as\n+@mdindex @code{probe_stack}\n+@item @samp{probe_stack}\n+If stack checking (@pxref{Stack Checking}) can be done on your system by\n+probing the stack but doing it with a ``store zero'' instruction is not valid\n+or optimal, define this pattern to do the probing differently and signal an\n+error if the stack has overflowed. The single operand is the memory reference\n+in the stack that needs to be probed.\n \n-@smallexample\n- for (int i = 0; i < N; i += 2)\n- @{\n- c[i] = a[i] + b[i+1];\n- c[i+1] = a[i+1] - b[i];\n- @}\n-@end smallexample\n+@mdindex @code{nonlocal_goto}\n+@item @samp{nonlocal_goto}\n+Emit code to generate a non-local goto, e.g., a jump from one function\n+to a label in an outer function. This pattern has four arguments,\n+each representing a value to be used in the jump. The first\n+argument is to be loaded into the frame pointer, the second is\n+the address to branch to (code to dispatch to the actual label),\n+the third is the address of a location where the stack is saved,\n+and the last is the address of the label, to be placed in the\n+location for the incoming static chain.\n \n-This operation is semantically equivalent to performing a vector addition of\n-complex numbers in operand 1 with operand 2 rotated by 270 degrees around\n-the argand plane and storing the result in operand 0.\n+On most machines you need not define this pattern, since GCC will\n+already generate the correct code, which is to load the frame pointer\n+and static chain, restore the stack (using the\n+@samp{restore_stack_nonlocal} pattern, if defined), and jump indirectly\n+to the dispatcher. 
You need only define this pattern if this code will\n+not work on your machine.\n \n-In GCC lane ordering the real part of the number must be in the even lanes with\n-the imaginary part in the odd lanes.\n+@mdindex @code{nonlocal_goto_receiver}\n+@item @samp{nonlocal_goto_receiver}\n+This pattern, if defined, contains code needed at the target of a\n+nonlocal goto after the code already generated by GCC@. You will not\n+normally need to define this pattern. A typical reason why you might\n+need this pattern is if some value, such as a pointer to a global table,\n+must be restored when the frame pointer is restored. Note that a nonlocal\n+goto only occurs within a unit-of-translation, so a global table pointer\n+that is shared by all functions of a given module need not be restored.\n+There are no arguments.\n \n-The operation is only supported for vector modes @var{m}.\n+@mdindex @code{exception_receiver}\n+@item @samp{exception_receiver}\n+This pattern, if defined, contains code needed at the site of an\n+exception handler that isn't needed at the site of a nonlocal goto. You\n+will not normally need to define this pattern. A typical reason why you\n+might need this pattern is if some value, such as a pointer to a global\n+table, must be restored after control flow is branched to the handler of\n+an exception. There are no arguments.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{builtin_setjmp_setup}\n+@item @samp{builtin_setjmp_setup}\n+This pattern, if defined, contains additional code needed to initialize\n+the @code{jmp_buf}. You will not normally need to define this pattern.\n+A typical reason why you might need this pattern is if some value, such\n+as a pointer to a global table, must be restored, though it is\n+preferred that the pointer value be recalculated if possible (given the\n+address of a label, for instance). The single argument is a pointer to\n+the @code{jmp_buf}. 
Note that the buffer is five words long and that\n+the first three are normally used by the generic mechanism.\n \n-@cindex @code{cmla@var{m}4} instruction pattern\n-@item @samp{cmla@var{m}4}\n-Perform a vector multiply and accumulate that is semantically the same as\n-a multiply and accumulate of complex numbers.\n+@mdindex @code{builtin_setjmp_receiver}\n+@item @samp{builtin_setjmp_receiver}\n+This pattern, if defined, contains code needed at the site of a\n+built-in setjmp that isn't needed at the site of a nonlocal goto. You\n+will not normally need to define this pattern. A typical reason why you\n+might need this pattern is if some value, such as a pointer to a global\n+table, must be restored. It takes one argument, which is the label\n+to which builtin_longjmp transferred control; this pattern may be emitted\n+at a small offset from that label.\n \n-@smallexample\n- complex TYPE op0[N];\n- complex TYPE op1[N];\n- complex TYPE op2[N];\n- complex TYPE op3[N];\n- for (int i = 0; i < N; i += 1)\n- @{\n- op0[i] = op1[i] * op2[i] + op3[i];\n- @}\n-@end smallexample\n+@mdindex @code{builtin_longjmp}\n+@item @samp{builtin_longjmp}\n+This pattern, if defined, performs the entire action of the longjmp.\n+You will not normally need to define this pattern unless you also define\n+@code{builtin_setjmp_setup}. The single argument is a pointer to the\n+@code{jmp_buf}.\n \n-In GCC lane ordering the real part of the number must be in the even lanes with\n-the imaginary part in the odd lanes.\n+@mdindex @code{eh_return}\n+@item @samp{eh_return}\n+This pattern, if defined, affects the way @code{__builtin_eh_return},\n+and thence the call frame exception handling library routines, are\n+built. It is intended to handle non-trivial actions needed along\n+the abnormal return path.\n \n-The operation is only supported for vector modes @var{m}.\n+The address of the exception handler to which the function should return\n+is passed as operand to this pattern. 
It will normally need to be copied by\n+the pattern to some special register or memory location.\n+If the pattern needs to determine the location of the target call\n+frame in order to do so, it may use @code{EH_RETURN_STACKADJ_RTX},\n+if defined; it will have already been assigned.\n \n-This pattern is not allowed to @code{FAIL}.\n+If this pattern is not defined, the default action will be to simply\n+copy the return address to @code{EH_RETURN_HANDLER_RTX}. Either\n+that macro or this pattern needs to be defined if call frame exception\n+handling is to be used.\n \n-@cindex @code{cmla_conj@var{m}4} instruction pattern\n-@item @samp{cmla_conj@var{m}4}\n-Perform a vector multiply by conjugate and accumulate that is semantically\n-the same as a multiply and accumulate of complex numbers where the second\n-multiply arguments is conjugated.\n+@mdindex @code{prologue}\n+@anchor{prologue instruction pattern}\n+@item @samp{prologue}\n+This pattern, if defined, emits RTL for entry to a function. The function\n+entry is responsible for setting up the stack frame, initializing the frame\n+pointer register, saving callee saved registers, etc.\n \n-@smallexample\n- complex TYPE op0[N];\n- complex TYPE op1[N];\n- complex TYPE op2[N];\n- complex TYPE op3[N];\n- for (int i = 0; i < N; i += 1)\n- @{\n- op0[i] = op1[i] * conj (op2[i]) + op3[i];\n- @}\n-@end smallexample\n+Using a prologue pattern is generally preferred over defining\n+@code{TARGET_ASM_FUNCTION_PROLOGUE} to emit assembly code for the prologue.\n \n-In GCC lane ordering the real part of the number must be in the even lanes with\n-the imaginary part in the odd lanes.\n+The @code{prologue} pattern is particularly useful for targets which perform\n+instruction scheduling.\n \n-The operation is only supported for vector modes @var{m}.\n+@mdindex @code{window_save}\n+@anchor{window_save instruction pattern}\n+@item @samp{window_save}\n+This pattern, if defined, emits RTL for a register window save. 
It should\n+be defined if the target machine has register windows but the window events\n+are decoupled from calls to subroutines. The canonical example is the SPARC\n+architecture.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{epilogue}\n+@anchor{epilogue instruction pattern}\n+@item @samp{epilogue}\n+This pattern emits RTL for exit from a function. The function\n+exit is responsible for deallocating the stack frame, restoring callee saved\n+registers and emitting the return instruction.\n \n-@cindex @code{cmls@var{m}4} instruction pattern\n-@item @samp{cmls@var{m}4}\n-Perform a vector multiply and subtract that is semantically the same as\n-a multiply and subtract of complex numbers.\n+Using an epilogue pattern is generally preferred over defining\n+@code{TARGET_ASM_FUNCTION_EPILOGUE} to emit assembly code for the epilogue.\n \n-@smallexample\n- complex TYPE op0[N];\n- complex TYPE op1[N];\n- complex TYPE op2[N];\n- complex TYPE op3[N];\n- for (int i = 0; i < N; i += 1)\n- @{\n- op0[i] = op1[i] * op2[i] - op3[i];\n- @}\n-@end smallexample\n+The @code{epilogue} pattern is particularly useful for targets which perform\n+instruction scheduling or which have delay slots for their return instruction.\n \n-In GCC lane ordering the real part of the number must be in the even lanes with\n-the imaginary part in the odd lanes.\n+@mdindex @code{sibcall_epilogue}\n+@item @samp{sibcall_epilogue}\n+This pattern, if defined, emits RTL for exit from a function without the final\n+branch back to the calling function. 
This pattern will be emitted before any\n+sibling call (aka tail call) sites.\n \n-The operation is only supported for vector modes @var{m}.\n+The @code{sibcall_epilogue} pattern must not clobber any arguments used for\n+parameter passing or any stack slots for arguments passed to the current\n+function.\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{trap}\n+@item @samp{trap}\n+This pattern, if defined, signals an error, typically by causing some\n+kind of signal to be raised.\n \n-@cindex @code{cmls_conj@var{m}4} instruction pattern\n-@item @samp{cmls_conj@var{m}4}\n-Perform a vector multiply by conjugate and subtract that is semantically\n-the same as a multiply and subtract of complex numbers where the second\n-multiply arguments is conjugated.\n+@mdindex @code{ctrap@var{MM}4}\n+@item @samp{ctrap@var{MM}4}\n+Conditional trap instruction. Operand 0 is a piece of RTL which\n+performs a comparison, and operands 1 and 2 are the arms of the\n+comparison. Operand 3 is the trap code, an integer.\n+\n+A typical @code{ctrap} pattern looks like\n \n @smallexample\n- complex TYPE op0[N];\n- complex TYPE op1[N];\n- complex TYPE op2[N];\n- complex TYPE op3[N];\n- for (int i = 0; i < N; i += 1)\n- @{\n- op0[i] = op1[i] * conj (op2[i]) - op3[i];\n- @}\n+(define_insn \"ctrapsi4\"\n+ [(trap_if (match_operator 0 \"trap_operator\"\n+ [(match_operand 1 \"register_operand\")\n+ (match_operand 2 \"immediate_operand\")])\n+ (match_operand 3 \"const_int_operand\" \"i\"))]\n+ \"\"\n+ \"@dots{}\")\n @end smallexample\n \n-In GCC lane ordering the real part of the number must be in the even lanes with\n-the imaginary part in the odd lanes.\n-\n-The operation is only supported for vector modes @var{m}.\n+@mdindex @code{prefetch}\n+@item @samp{prefetch}\n+This pattern, if defined, emits code for a non-faulting data prefetch\n+instruction. Operand 0 is the address of the memory to prefetch. 
Operand 1\n+is a constant 1 if the prefetch is preparing for a write to the memory\n+address, or a constant 0 otherwise. Operand 2 is the expected degree of\n+temporal locality of the data and is a value between 0 and 3, inclusive; 0\n+means that the data has no temporal locality, so it need not be left in the\n+cache after the access; 3 means that the data has a high degree of temporal\n+locality and should be left in all levels of cache possible; 1 and 2 mean,\n+respectively, a low or moderate degree of temporal locality.\n \n-This pattern is not allowed to @code{FAIL}.\n+Targets that do not support write prefetches or locality hints can ignore\n+the values of operands 1 and 2.\n \n-@cindex @code{cmul@var{m}4} instruction pattern\n-@item @samp{cmul@var{m}4}\n-Perform a vector multiply that is semantically the same as multiply of\n-complex numbers.\n+@mdindex @code{blockage}\n+@item @samp{blockage}\n+This pattern defines a pseudo insn that prevents the instruction\n+scheduler and other passes from moving instructions and using register\n+equivalences across the boundary defined by the blockage insn.\n+This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.\n \n-@smallexample\n- complex TYPE op0[N];\n- complex TYPE op1[N];\n- complex TYPE op2[N];\n- for (int i = 0; i < N; i += 1)\n- @{\n- op0[i] = op1[i] * op2[i];\n- @}\n-@end smallexample\n+@mdindex @code{memory_blockage}\n+@item @samp{memory_blockage}\n+This pattern, if defined, represents a compiler memory barrier, and will be\n+placed at points across which RTL passes may not propagate memory accesses.\n+This instruction needs to read and write volatile BLKmode memory. It does\n+not need to generate any machine instruction. 
If this pattern is not defined,\n+the compiler falls back to emitting an instruction corresponding\n+to @code{asm volatile (\"\" ::: \"memory\")}.\n \n-In GCC lane ordering the real part of the number must be in the even lanes with\n-the imaginary part in the odd lanes.\n+@mdindex @code{memory_barrier}\n+@item @samp{memory_barrier}\n+If the target memory model is not fully synchronous, then this pattern\n+should be defined to an instruction that orders both loads and stores\n+before the instruction with respect to loads and stores after the instruction.\n+This pattern has no operands.\n \n-The operation is only supported for vector modes @var{m}.\n+@mdindex @code{speculation_barrier}\n+@item @samp{speculation_barrier}\n+If the target can support speculative execution, then this pattern should\n+be defined to an instruction that will block subsequent execution until\n+any prior speculation conditions have been resolved. The pattern must also\n+ensure that the compiler cannot move memory operations past the barrier,\n+so it needs to be an UNSPEC_VOLATILE pattern. The pattern has no\n+operands.\n \n-This pattern is not allowed to @code{FAIL}.\n+If this pattern is not defined then the default expansion of\n+@code{__builtin_speculation_safe_value} will emit a warning. You can\n+suppress this warning by defining this pattern with a final condition\n+of @code{0} (zero), which tells the compiler that a speculation\n+barrier is not needed for this target.\n \n-@cindex @code{cmul_conj@var{m}4} instruction pattern\n-@item @samp{cmul_conj@var{m}4}\n-Perform a vector multiply by conjugate that is semantically the same as a\n-multiply of complex numbers where the second multiply arguments is conjugated.\n+@mdindex @code{sync_compare_and_swap@var{mode}}\n+@item @samp{sync_compare_and_swap@var{mode}}\n+This pattern, if defined, emits code for an atomic compare-and-swap\n+operation. Operand 1 is the memory on which the atomic operation is\n+performed. 
Operand 2 is the ``old'' value to be compared against the\n+current contents of the memory location. Operand 3 is the ``new'' value\n+to store in the memory if the compare succeeds. Operand 0 is the result\n+of the operation; it should contain the contents of the memory\n+before the operation. If the compare succeeds, this should obviously be\n+a copy of operand 2.\n \n-@smallexample\n- complex TYPE op0[N];\n- complex TYPE op1[N];\n- complex TYPE op2[N];\n- for (int i = 0; i < N; i += 1)\n- @{\n- op0[i] = op1[i] * conj (op2[i]);\n- @}\n-@end smallexample\n+This pattern must show that both operand 0 and operand 1 are modified.\n \n-In GCC lane ordering the real part of the number must be in the even lanes with\n-the imaginary part in the odd lanes.\n+This pattern must issue any memory barrier instructions such that all\n+memory operations before the atomic operation occur before the atomic\n+operation and all memory operations after the atomic operation occur\n+after the atomic operation.\n \n-The operation is only supported for vector modes @var{m}.\n+For targets where the success or failure of the compare-and-swap\n+operation is available via the status flags, it is possible to\n+avoid a separate compare operation and issue the subsequent\n+branch or store-flag operation immediately after the compare-and-swap.\n+To this end, GCC will look for a @code{MODE_CC} set in the\n+output of @code{sync_compare_and_swap@var{mode}}; if the machine\n+description includes such a set, the target should also define special\n+@code{cbranchcc4} and/or @code{cstorecc4} instructions. 
GCC will then\n+be able to take the destination of the @code{MODE_CC} set and pass it\n+to the @code{cbranchcc4} or @code{cstorecc4} pattern as the first\n+operand of the comparison (the second will be @code{(const_int 0)}).\n \n-This pattern is not allowed to @code{FAIL}.\n+For targets where the operating system may provide support for this\n+operation via library calls, the @code{sync_compare_and_swap_optab}\n+may be initialized to a function with the same interface as the\n+@code{__sync_val_compare_and_swap_@var{n}} built-in. If the entire\n+set of @var{__sync} builtins are supported via library calls, the\n+target can initialize all of the optabs at once with\n+@code{init_sync_libfuncs}.\n+For the purposes of C++11 @code{std::atomic::is_lock_free}, it is\n+assumed that these library calls do @emph{not} use any kind of\n+interruptable locking.\n \n-@cindex @code{ffs@var{m}2} instruction pattern\n-@item @samp{ffs@var{m}2}\n-Store into operand 0 one plus the index of the least significant 1-bit\n-of operand 1. If operand 1 is zero, store zero.\n+@mdindex @code{sync_add@var{mode}}\n+@mdindex @code{sync_sub@var{mode}}\n+@mdindex @code{sync_ior@var{mode}}\n+@mdindex @code{sync_and@var{mode}}\n+@mdindex @code{sync_xor@var{mode}}\n+@mdindex @code{sync_nand@var{mode}}\n+@item @samp{sync_add@var{mode}}, @samp{sync_sub@var{mode}}\n+@itemx @samp{sync_ior@var{mode}}, @samp{sync_and@var{mode}}\n+@itemx @samp{sync_xor@var{mode}}, @samp{sync_nand@var{mode}}\n+These patterns emit code for an atomic operation on memory.\n+Operand 0 is the memory on which the atomic operation is performed.\n+Operand 1 is the second operand to the binary operator.\n \n-@var{m} is either a scalar or vector integer mode. When it is a scalar,\n-operand 1 has mode @var{m} but operand 0 can have whatever scalar\n-integer mode is suitable for the target. The compiler will insert\n-conversion instructions as necessary (typically to convert the result\n-to the same width as @code{int}). 
When @var{m} is a vector, both\n-operands must have mode @var{m}.\n+This pattern must issue any memory barrier instructions such that all\n+memory operations before the atomic operation occur before the atomic\n+operation and all memory operations after the atomic operation occur\n+after the atomic operation.\n \n-This pattern is not allowed to @code{FAIL}.\n+If these patterns are not defined, the operation will be constructed\n+from a compare-and-swap operation, if defined.\n \n-@cindex @code{clrsb@var{m}2} instruction pattern\n-@item @samp{clrsb@var{m}2}\n-Count leading redundant sign bits.\n-Store into operand 0 the number of redundant sign bits in operand 1, starting\n-at the most significant bit position.\n-A redundant sign bit is defined as any sign bit after the first. As such,\n-this count will be one less than the count of leading sign bits.\n+@mdindex @code{sync_old_add@var{mode}}\n+@mdindex @code{sync_old_sub@var{mode}}\n+@mdindex @code{sync_old_ior@var{mode}}\n+@mdindex @code{sync_old_and@var{mode}}\n+@mdindex @code{sync_old_xor@var{mode}}\n+@mdindex @code{sync_old_nand@var{mode}}\n+@item @samp{sync_old_add@var{mode}}, @samp{sync_old_sub@var{mode}}\n+@itemx @samp{sync_old_ior@var{mode}}, @samp{sync_old_and@var{mode}}\n+@itemx @samp{sync_old_xor@var{mode}}, @samp{sync_old_nand@var{mode}}\n+These patterns emit code for an atomic operation on memory,\n+and return the value that the memory contained before the operation.\n+Operand 0 is the result value, operand 1 is the memory on which the\n+atomic operation is performed, and operand 2 is the second operand\n+to the binary operator.\n \n-@var{m} is either a scalar or vector integer mode. When it is a scalar,\n-operand 1 has mode @var{m} but operand 0 can have whatever scalar\n-integer mode is suitable for the target. The compiler will insert\n-conversion instructions as necessary (typically to convert the result\n-to the same width as @code{int}). 
When @var{m} is a vector, both\n-operands must have mode @var{m}.\n+This pattern must issue any memory barrier instructions such that all\n+memory operations before the atomic operation occur before the atomic\n+operation and all memory operations after the atomic operation occur\n+after the atomic operation.\n \n-This pattern is not allowed to @code{FAIL}.\n+If these patterns are not defined, the operation will be constructed\n+from a compare-and-swap operation, if defined.\n \n-@cindex @code{clz@var{m}2} instruction pattern\n-@item @samp{clz@var{m}2}\n-Store into operand 0 the number of leading 0-bits in operand 1, starting\n-at the most significant bit position. If operand 1 is 0, the\n-@code{CLZ_DEFINED_VALUE_AT_ZERO} (@pxref{Misc}) macro defines if\n-the result is undefined or has a useful value.\n+@mdindex @code{sync_new_add@var{mode}}\n+@mdindex @code{sync_new_sub@var{mode}}\n+@mdindex @code{sync_new_ior@var{mode}}\n+@mdindex @code{sync_new_and@var{mode}}\n+@mdindex @code{sync_new_xor@var{mode}}\n+@mdindex @code{sync_new_nand@var{mode}}\n+@item @samp{sync_new_add@var{mode}}, @samp{sync_new_sub@var{mode}}\n+@itemx @samp{sync_new_ior@var{mode}}, @samp{sync_new_and@var{mode}}\n+@itemx @samp{sync_new_xor@var{mode}}, @samp{sync_new_nand@var{mode}}\n+These patterns are like their @code{sync_old_@var{op}} counterparts,\n+except that they return the value that exists in the memory location\n+after the operation, rather than before the operation.\n \n-@var{m} is either a scalar or vector integer mode. When it is a scalar,\n-operand 1 has mode @var{m} but operand 0 can have whatever scalar\n-integer mode is suitable for the target. The compiler will insert\n-conversion instructions as necessary (typically to convert the result\n-to the same width as @code{int}). 
When @var{m} is a vector, both\n-operands must have mode @var{m}.\n+@mdindex @code{sync_lock_test_and_set@var{mode}}\n+@item @samp{sync_lock_test_and_set@var{mode}}\n+This pattern takes two forms, based on the capabilities of the target.\n+In either case, operand 0 is the result of the operation, operand 1 is\n+the memory on which the atomic operation is performed, and operand 2\n+is the value to set in the lock.\n \n-This pattern is not allowed to @code{FAIL}.\n+In the ideal case, this operation is an atomic exchange operation, in\n+which the previous value in memory operand is copied into the result\n+operand, and the value operand is stored in the memory operand.\n \n-@cindex @code{ctz@var{m}2} instruction pattern\n-@item @samp{ctz@var{m}2}\n-Store into operand 0 the number of trailing 0-bits in operand 1, starting\n-at the least significant bit position. If operand 1 is 0, the\n-@code{CTZ_DEFINED_VALUE_AT_ZERO} (@pxref{Misc}) macro defines if\n-the result is undefined or has a useful value.\n+For less capable targets, any value operand that is not the constant 1\n+should be rejected with @code{FAIL}. In this case the target may use\n+an atomic test-and-set bit operation. The result operand should contain\n+1 if the bit was previously set and 0 if the bit was previously clear.\n+The true contents of the memory operand are implementation defined.\n \n-@var{m} is either a scalar or vector integer mode. When it is a scalar,\n-operand 1 has mode @var{m} but operand 0 can have whatever scalar\n-integer mode is suitable for the target. The compiler will insert\n-conversion instructions as necessary (typically to convert the result\n-to the same width as @code{int}). 
When @var{m} is a vector, both\n-operands must have mode @var{m}.\n+This pattern must issue any memory barrier instructions such that the\n+pattern as a whole acts as an acquire barrier, that is all memory\n+operations after the pattern do not occur until the lock is acquired.\n \n-This pattern is not allowed to @code{FAIL}.\n+If this pattern is not defined, the operation will be constructed from\n+a compare-and-swap operation, if defined.\n \n-@cindex @code{popcount@var{m}2} instruction pattern\n-@item @samp{popcount@var{m}2}\n-Store into operand 0 the number of 1-bits in operand 1.\n+@mdindex @code{sync_lock_release@var{mode}}\n+@item @samp{sync_lock_release@var{mode}}\n+This pattern, if defined, releases a lock set by\n+@code{sync_lock_test_and_set@var{mode}}. Operand 0 is the memory\n+that contains the lock; operand 1 is the value to store in the lock.\n \n-@var{m} is either a scalar or vector integer mode. When it is a scalar,\n-operand 1 has mode @var{m} but operand 0 can have whatever scalar\n-integer mode is suitable for the target. The compiler will insert\n-conversion instructions as necessary (typically to convert the result\n-to the same width as @code{int}). 
When @var{m} is a vector, both\n-operands must have mode @var{m}.\n+If the target doesn't implement full semantics for\n+@code{sync_lock_test_and_set@var{mode}}, any value operand which is not\n+the constant 0 should be rejected with @code{FAIL}, and the true contents\n+of the memory operand are implementation defined.\n \n-This pattern is not allowed to @code{FAIL}.\n+This pattern must issue any memory barrier instructions such that the\n+pattern as a whole acts as a release barrier, that is the lock is\n+released only after all previous memory operations have completed.\n \n-@cindex @code{parity@var{m}2} instruction pattern\n-@item @samp{parity@var{m}2}\n-Store into operand 0 the parity of operand 1, i.e.@: the number of 1-bits\n-in operand 1 modulo 2.\n+If this pattern is not defined, then a @code{memory_barrier} pattern\n+will be emitted, followed by a store of the value to the memory operand.\n \n-@var{m} is either a scalar or vector integer mode. When it is a scalar,\n-operand 1 has mode @var{m} but operand 0 can have whatever scalar\n-integer mode is suitable for the target. The compiler will insert\n-conversion instructions as necessary (typically to convert the result\n-to the same width as @code{int}). When @var{m} is a vector, both\n-operands must have mode @var{m}.\n+@mdindex @code{atomic_compare_and_swap@var{mode}}\n+@item @samp{atomic_compare_and_swap@var{mode}}\n+This pattern, if defined, emits code for an atomic compare-and-swap\n+operation with memory model semantics. Operand 2 is the memory on which\n+the atomic operation is performed. Operand 0 is an output operand which\n+is set to true or false based on whether the operation succeeded. Operand\n+1 is an output operand which is set to the contents of the memory before\n+the operation was attempted. Operand 3 is the value that is expected to\n+be in memory. Operand 4 is the value to put in memory if the expected\n+value is found there. 
Operand 5 is set to 1 if this compare and swap is to\n+be treated as a weak operation. Operand 6 is the memory model to be used\n+if the operation is a success. Operand 7 is the memory model to be used\n+if the operation fails.\n \n-This pattern is not allowed to @code{FAIL}.\n+If memory referred to in operand 2 contains the value in operand 3, then\n+operand 4 is stored in memory pointed to by operand 2 and fencing based on\n+the memory model in operand 6 is issued. \n \n-@cindex @code{one_cmpl@var{m}2} instruction pattern\n-@item @samp{one_cmpl@var{m}2}\n-Store the bitwise-complement of operand 1 into operand 0.\n+If memory referred to in operand 2 does not contain the value in operand 3,\n+then fencing based on the memory model in operand 7 is issued.\n \n-@cindex @code{cpymem@var{m}} instruction pattern\n-@item @samp{cpymem@var{m}}\n-Block copy instruction. The destination and source blocks of memory\n-are the first two operands, and both are @code{mem:BLK}s with an\n-address in mode @code{Pmode}.\n+If a target does not support weak compare-and-swap operations, or the port\n+elects not to implement weak operations, the argument in operand 5 can be\n+ignored. Note a strong implementation must be provided.\n \n-The number of bytes to copy is the third operand, in mode @var{m}.\n-Usually, you specify @code{Pmode} for @var{m}. 
However, if you can\n-generate better code knowing the range of valid lengths is smaller than\n-those representable in a full Pmode pointer, you should provide\n-a pattern with a\n-mode corresponding to the range of values you can handle efficiently\n-(e.g., @code{QImode} for values in the range 0--127; note we avoid numbers\n-that appear negative) and also a pattern with @code{Pmode}.\n+If this pattern is not provided, the @code{__atomic_compare_exchange}\n+built-in functions will utilize the legacy @code{sync_compare_and_swap}\n+pattern with an @code{__ATOMIC_SEQ_CST} memory model.\n \n-The fourth operand is the known shared alignment of the source and\n-destination, in the form of a @code{const_int} rtx. Thus, if the\n-compiler knows that both source and destination are word-aligned,\n-it may provide the value 4 for this operand.\n+@mdindex @code{atomic_load@var{mode}}\n+@item @samp{atomic_load@var{mode}}\n+This pattern implements an atomic load operation with memory model\n+semantics. Operand 1 is the memory address being loaded from. Operand 0\n+is the result of the load. Operand 2 is the memory model to be used for\n+the load operation.\n \n-Optional operands 5 and 6 specify expected alignment and size of block\n-respectively. The expected alignment differs from alignment in operand 4\n-in a way that the blocks are not required to be aligned according to it in\n-all cases. This expected alignment is also in bytes, just like operand 4.\n-Expected size, when unknown, is set to @code{(const_int -1)}.\n+If not present, the @code{__atomic_load} built-in function will either\n+resort to a normal load with memory barriers, or a compare-and-swap\n+operation if a normal load would not be atomic.\n \n-Descriptions of multiple @code{cpymem@var{m}} patterns can only be\n-beneficial if the patterns for smaller modes have fewer restrictions\n-on their first, second and fourth operands. 
Note that the mode @var{m}\n-in @code{cpymem@var{m}} does not impose any restriction on the mode of\n-individually copied data units in the block.\n+@mdindex @code{atomic_store@var{mode}}\n+@item @samp{atomic_store@var{mode}}\n+This pattern implements an atomic store operation with memory model\n+semantics. Operand 0 is the memory address being stored to. Operand 1\n+is the value to be written. Operand 2 is the memory model to be used for\n+the operation.\n \n-The @code{cpymem@var{m}} patterns need not give special consideration\n-to the possibility that the source and destination strings might\n-overlap. An exception is the case where source and destination are\n-equal, this case needs to be handled correctly.\n-These patterns are used to do inline expansion of @code{__builtin_memcpy}.\n+If not present, the @code{__atomic_store} built-in function will attempt to\n+perform a normal store and surround it with any required memory fences. If\n+the store would not be atomic, then an @code{__atomic_exchange} is\n+attempted with the result being ignored.\n \n-@cindex @code{movmem@var{m}} instruction pattern\n-@item @samp{movmem@var{m}}\n-Block move instruction. The destination and source blocks of memory\n-are the first two operands, and both are @code{mem:BLK}s with an\n-address in mode @code{Pmode}.\n+@mdindex @code{atomic_exchange@var{mode}}\n+@item @samp{atomic_exchange@var{mode}}\n+This pattern implements an atomic exchange operation with memory model\n+semantics. Operand 1 is the memory location the operation is performed on.\n+Operand 0 is an output operand which is set to the original value contained\n+in the memory pointed to by operand 1. Operand 2 is the value to be\n+stored. Operand 3 is the memory model to be used.\n \n-The number of bytes to copy is the third operand, in mode @var{m}.\n-Usually, you specify @code{Pmode} for @var{m}. 
However, if you can\n-generate better code knowing the range of valid lengths is smaller than\n-those representable in a full Pmode pointer, you should provide\n-a pattern with a\n-mode corresponding to the range of values you can handle efficiently\n-(e.g., @code{QImode} for values in the range 0--127; note we avoid numbers\n-that appear negative) and also a pattern with @code{Pmode}.\n+If this pattern is not present, the built-in function\n+@code{__atomic_exchange} will attempt to perform the operation with a\n+compare-and-swap loop.\n \n-The fourth operand is the known shared alignment of the source and\n-destination, in the form of a @code{const_int} rtx. Thus, if the\n-compiler knows that both source and destination are word-aligned,\n-it may provide the value 4 for this operand.\n+@mdindex @code{atomic_add@var{mode}}\n+@mdindex @code{atomic_sub@var{mode}}\n+@mdindex @code{atomic_or@var{mode}}\n+@mdindex @code{atomic_and@var{mode}}\n+@mdindex @code{atomic_xor@var{mode}}\n+@mdindex @code{atomic_nand@var{mode}}\n+@item @samp{atomic_add@var{mode}}, @samp{atomic_sub@var{mode}}\n+@itemx @samp{atomic_or@var{mode}}, @samp{atomic_and@var{mode}}\n+@itemx @samp{atomic_xor@var{mode}}, @samp{atomic_nand@var{mode}}\n+These patterns emit code for an atomic operation on memory with memory\n+model semantics. Operand 0 is the memory on which the atomic operation is\n+performed. Operand 1 is the second operand to the binary operator.\n+Operand 2 is the memory model to be used by the operation.\n \n-Optional operands 5 and 6 specify expected alignment and size of block\n-respectively. The expected alignment differs from alignment in operand 4\n-in a way that the blocks are not required to be aligned according to it in\n-all cases. 
This expected alignment is also in bytes, just like operand 4.\n-Expected size, when unknown, is set to @code{(const_int -1)}.\n+If these patterns are not defined, attempts will be made to use legacy\n+@code{sync} patterns, or equivalent patterns which return a result. If\n+none of these are available a compare-and-swap loop will be used.\n \n-Descriptions of multiple @code{movmem@var{m}} patterns can only be\n-beneficial if the patterns for smaller modes have fewer restrictions\n-on their first, second and fourth operands. Note that the mode @var{m}\n-in @code{movmem@var{m}} does not impose any restriction on the mode of\n-individually copied data units in the block.\n+@mdindex @code{atomic_fetch_add@var{mode}}\n+@mdindex @code{atomic_fetch_sub@var{mode}}\n+@mdindex @code{atomic_fetch_or@var{mode}}\n+@mdindex @code{atomic_fetch_and@var{mode}}\n+@mdindex @code{atomic_fetch_xor@var{mode}}\n+@mdindex @code{atomic_fetch_nand@var{mode}}\n+@item @samp{atomic_fetch_add@var{mode}}, @samp{atomic_fetch_sub@var{mode}}\n+@itemx @samp{atomic_fetch_or@var{mode}}, @samp{atomic_fetch_and@var{mode}}\n+@itemx @samp{atomic_fetch_xor@var{mode}}, @samp{atomic_fetch_nand@var{mode}}\n+These patterns emit code for an atomic operation on memory with memory\n+model semantics, and return the original value. Operand 0 is an output \n+operand which contains the value of the memory location before the \n+operation was performed. Operand 1 is the memory on which the atomic \n+operation is performed. Operand 2 is the second operand to the binary\n+operator. Operand 3 is the memory model to be used by the operation.\n \n-The @code{movmem@var{m}} patterns must correctly handle the case where\n-the source and destination strings overlap. These patterns are used to\n-do inline expansion of @code{__builtin_memmove}.\n+If these patterns are not defined, attempts will be made to use legacy\n+@code{sync} patterns. 
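The fetch-and-operate semantics above differ from the operate-and-fetch variants only in which value is returned; a C sketch of the distinction (names illustrative):

```c
#include <stdint.h>

/* __atomic_fetch_add returns the value before the addition;
   __atomic_add_fetch returns the value after it.  */
uint32_t
demo_fetch_add (void)
{
  uint32_t counter;
  __atomic_store_n (&counter, 10, __ATOMIC_SEQ_CST);
  uint32_t before = __atomic_fetch_add (&counter, 5, __ATOMIC_SEQ_CST);
  uint32_t after = __atomic_add_fetch (&counter, 5, __ATOMIC_SEQ_CST);
  return before * 100 + after;   /* encodes both results for checking */
}
```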
If none of these are available a compare-and-swap\n+loop will be used.\n \n-@cindex @code{movstr} instruction pattern\n-@item @samp{movstr}\n-String copy instruction, with @code{stpcpy} semantics. Operand 0 is\n-an output operand in mode @code{Pmode}. The addresses of the\n-destination and source strings are operands 1 and 2, and both are\n-@code{mem:BLK}s with addresses in mode @code{Pmode}. The execution of\n-the expansion of this pattern should store in operand 0 the address in\n-which the @code{NUL} terminator was stored in the destination string.\n+@mdindex @code{atomic_add_fetch@var{mode}}\n+@mdindex @code{atomic_sub_fetch@var{mode}}\n+@mdindex @code{atomic_or_fetch@var{mode}}\n+@mdindex @code{atomic_and_fetch@var{mode}}\n+@mdindex @code{atomic_xor_fetch@var{mode}}\n+@mdindex @code{atomic_nand_fetch@var{mode}}\n+@item @samp{atomic_add_fetch@var{mode}}, @samp{atomic_sub_fetch@var{mode}}\n+@itemx @samp{atomic_or_fetch@var{mode}}, @samp{atomic_and_fetch@var{mode}}\n+@itemx @samp{atomic_xor_fetch@var{mode}}, @samp{atomic_nand_fetch@var{mode}}\n+These patterns emit code for an atomic operation on memory with memory\n+model semantics and return the result after the operation is performed.\n+Operand 0 is an output operand which contains the value after the\n+operation. Operand 1 is the memory on which the atomic operation is\n+performed. Operand 2 is the second operand to the binary operator.\n+Operand 3 is the memory model to be used by the operation.\n \n-This pattern has also several optional operands that are same as in\n-@code{setmem}.\n+If these patterns are not defined, attempts will be made to use legacy\n+@code{sync} patterns, or equivalent patterns which return the result before\n+the operation followed by the arithmetic operation required to produce the\n+result. If none of these are available a compare-and-swap loop will be\n+used.\n \n-@cindex @code{setmem@var{m}} instruction pattern\n-@item @samp{setmem@var{m}}\n-Block set instruction. 
The destination string is the first operand,\n-given as a @code{mem:BLK} whose address is in mode @code{Pmode}. The\n-number of bytes to set is the second operand, in mode @var{m}. The value to\n-initialize the memory with is the third operand. Targets that only support the\n-clearing of memory should reject any value that is not the constant 0. See\n-@samp{cpymem@var{m}} for a discussion of the choice of mode.\n+@mdindex @code{atomic_test_and_set}\n+@item @samp{atomic_test_and_set}\n+This pattern emits code for @code{__builtin_atomic_test_and_set}.\n+Operand 0 is an output operand which is set to true if the\n+previous contents of the byte were \"set\", and false otherwise. Operand 1\n+is the @code{QImode} memory to be modified. Operand 2 is the memory\n+model to be used.\n \n-The fourth operand is the known alignment of the destination, in the form\n-of a @code{const_int} rtx. Thus, if the compiler knows that the\n-destination is word-aligned, it may provide the value 4 for this\n-operand.\n+The specific value that defines \"set\" is implementation defined, and\n+is normally based on what is performed by the native atomic test and set\n+instruction.\n \n-Optional operands 5 and 6 specify expected alignment and size of block\n-respectively. The expected alignment differs from alignment in operand 4\n-in a way that the blocks are not required to be aligned according to it in\n-all cases. 
This expected alignment is also in bytes, just like operand 4.\n-Expected size, when unknown, is set to @code{(const_int -1)}.\n-Operand 7 is the minimal size of the block and operand 8 is the\n-maximal size of the block (NULL if it cannot be represented as CONST_INT).\n-Operand 9 is the probable maximal size (i.e.@: we cannot rely on it for\n-correctness, but it can be used for choosing proper code sequence for a\n-given size).\n+@mdindex @code{atomic_bit_test_and_set@var{mode}}\n+@mdindex @code{atomic_bit_test_and_complement@var{mode}}\n+@mdindex @code{atomic_bit_test_and_reset@var{mode}}\n+@item @samp{atomic_bit_test_and_set@var{mode}}\n+@itemx @samp{atomic_bit_test_and_complement@var{mode}}\n+@itemx @samp{atomic_bit_test_and_reset@var{mode}}\n+These patterns emit code for an atomic bitwise operation on memory with memory\n+model semantics, and return the original value of the specified bit.\n+Operand 0 is an output operand which contains the value of the specified bit\n+from the memory location before the operation was performed. Operand 1 is the\n+memory on which the atomic operation is performed. Operand 2 is the bit within\n+the operand, starting with the least significant bit. Operand 3 is the memory model\n+to be used by the operation. 
Operand 4 is a flag - it is @code{const1_rtx}\n+if operand 0 should contain the original value of the specified bit in the\n+least significant bit of the operand, and @code{const0_rtx} if the bit should\n+be in its original position in the operand.\n+@code{atomic_bit_test_and_set@var{mode}} atomically sets the specified bit after\n+remembering its original value, @code{atomic_bit_test_and_complement@var{mode}}\n+inverts the specified bit and @code{atomic_bit_test_and_reset@var{mode}} clears\n+the specified bit.\n \n-The use for multiple @code{setmem@var{m}} is as for @code{cpymem@var{m}}.\n+If these patterns are not defined, attempts will be made to use\n+@code{atomic_fetch_or@var{mode}}, @code{atomic_fetch_xor@var{mode}} or\n+@code{atomic_fetch_and@var{mode}} instruction patterns, or their @code{sync}\n+counterparts. If none of these are available a compare-and-swap\n+loop will be used.\n \n-@cindex @code{cmpstrn@var{m}} instruction pattern\n-@item @samp{cmpstrn@var{m}}\n-String compare instruction, with five operands. Operand 0 is the output;\n-it has mode @var{m}. The remaining four operands are like the operands\n-of @samp{cpymem@var{m}}. The two memory blocks specified are compared\n-byte by byte in lexicographic order starting at the beginning of each\n-string. The instruction is not allowed to prefetch more than one byte\n-at a time since either string may end in the first byte and reading past\n-that may access an invalid page or segment and cause a fault. The\n-comparison terminates early if the fetched bytes are different or if\n-they are equal to zero. 
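The @code{atomic_fetch_or@var{mode}} fallback described above amounts to the following C sketch, with the original bit returned in the least significant position (function name illustrative):

```c
#include <stdint.h>

/* Emulate atomic_bit_test_and_set via __atomic_fetch_or: set the bit
   and return its previous value as 0 or 1.  */
int
bit_test_and_set (uint32_t *word, unsigned int bit)
{
  uint32_t mask = (uint32_t) 1 << bit;
  uint32_t old = __atomic_fetch_or (word, mask, __ATOMIC_SEQ_CST);
  return (old & mask) != 0;
}
```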
The effect of the instruction is to store a\n-value in operand 0 whose sign indicates the result of the comparison.\n+@mdindex @code{atomic_add_fetch_cmp_0@var{mode}}\n+@mdindex @code{atomic_sub_fetch_cmp_0@var{mode}}\n+@mdindex @code{atomic_and_fetch_cmp_0@var{mode}}\n+@mdindex @code{atomic_or_fetch_cmp_0@var{mode}}\n+@mdindex @code{atomic_xor_fetch_cmp_0@var{mode}}\n+@item @samp{atomic_add_fetch_cmp_0@var{mode}}\n+@itemx @samp{atomic_sub_fetch_cmp_0@var{mode}}\n+@itemx @samp{atomic_and_fetch_cmp_0@var{mode}}\n+@itemx @samp{atomic_or_fetch_cmp_0@var{mode}}\n+@itemx @samp{atomic_xor_fetch_cmp_0@var{mode}}\n+These patterns emit code for an atomic operation on memory with memory\n+model semantics if the fetch result is used only in a comparison against\n+zero.\n+Operand 0 is an output operand which contains a boolean result of the\n+comparison of the value after the operation against zero. Operand 1 is the\n+memory on which the atomic operation is performed. Operand 2 is the second\n+operand to the binary operator. Operand 3 is the memory model to be used by\n+the operation. Operand 4 is an integer holding the comparison code, one of\n+@code{EQ}, @code{NE}, @code{LT}, @code{GT}, @code{LE} or @code{GE}.\n \n-@cindex @code{cmpstr@var{m}} instruction pattern\n-@item @samp{cmpstr@var{m}}\n-String compare instruction, without known maximum length. Operand 0 is the\n-output; it has mode @var{m}. The second and third operand are the blocks of\n-memory to be compared; both are @code{mem:BLK} with an address in mode\n-@code{Pmode}.\n+If these patterns are not defined, attempts will be made to use a separate\n+atomic operation and fetch pattern followed by a comparison of the result\n+against zero.\n \n-The fourth operand is the known shared alignment of the source and\n-destination, in the form of a @code{const_int} rtx. 
Thus, if the\n-compiler knows that both source and destination are word-aligned,\n-it may provide the value 4 for this operand.\n+@mdindex @code{mem_thread_fence}\n+@item @samp{mem_thread_fence}\n+This pattern emits code required to implement a thread fence with\n+memory model semantics. Operand 0 is the memory model to be used.\n \n-The two memory blocks specified are compared byte by byte in lexicographic\n-order starting at the beginning of each string. The instruction is not allowed\n-to prefetch more than one byte at a time since either string may end in the\n-first byte and reading past that may access an invalid page or segment and\n-cause a fault. The comparison will terminate when the fetched bytes\n-are different or if they are equal to zero. The effect of the\n-instruction is to store a value in operand 0 whose sign indicates the\n-result of the comparison.\n+For the @code{__ATOMIC_RELAXED} model no instructions need to be issued\n+and this expansion is not invoked.\n \n-@cindex @code{cmpmem@var{m}} instruction pattern\n-@item @samp{cmpmem@var{m}}\n-Block compare instruction, with five operands like the operands\n-of @samp{cmpstr@var{m}}. The two memory blocks specified are compared\n-byte by byte in lexicographic order starting at the beginning of each\n-block. Unlike @samp{cmpstr@var{m}} the instruction can prefetch\n-any bytes in the two memory blocks. Also unlike @samp{cmpstr@var{m}}\n-the comparison will not stop if both bytes are zero. 
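At the source level the fence corresponds to @code{__atomic_thread_fence}; a release/acquire handoff sketch (run single-threaded here, so only the data path is checked):

```c
/* Release/acquire handoff built from explicit fences, which the
   mem_thread_fence pattern expands.  */
static int data;
static int ready;

void
publish (int value)
{
  data = value;
  __atomic_thread_fence (__ATOMIC_RELEASE);  /* data store ordered before flag */
  __atomic_store_n (&ready, 1, __ATOMIC_RELAXED);
}

int
consume (void)
{
  while (!__atomic_load_n (&ready, __ATOMIC_RELAXED))
    ;
  __atomic_thread_fence (__ATOMIC_ACQUIRE);  /* flag load ordered before data load */
  return data;
}
```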
The effect of\n-the instruction is to store a value in operand 0 whose sign indicates\n-the result of the comparison.\n+The compiler always emits a compiler memory barrier regardless of what\n+expanding this pattern produced.\n \n-@cindex @code{strlen@var{m}} instruction pattern\n-@item @samp{strlen@var{m}}\n-Compute the length of a string, with three operands.\n-Operand 0 is the result (of mode @var{m}), operand 1 is\n-a @code{mem} referring to the first character of the string,\n-operand 2 is the character to search for (normally zero),\n-and operand 3 is a constant describing the known alignment\n-of the beginning of the string.\n+If this pattern is not defined, the compiler falls back to expanding the\n+@code{memory_barrier} pattern, then to emitting a @code{__sync_synchronize}\n+library call, and finally to just placing a compiler memory barrier.\n \n-@cindex @code{rawmemchr@var{m}} instruction pattern\n-@item @samp{rawmemchr@var{m}}\n-Scan memory referred to by operand 1 for the first occurrence of operand 2.\n-Operand 1 is a @code{mem} and operand 2 a @code{const_int} of mode @var{m}.\n-Operand 0 is the result, i.e., a pointer to the first occurrence of operand 2\n-in the memory block given by operand 1.\n+@mdindex @code{get_thread_pointer@var{mode}}\n+@mdindex @code{set_thread_pointer@var{mode}}\n+@item @samp{get_thread_pointer@var{mode}}\n+@itemx @samp{set_thread_pointer@var{mode}}\n+These patterns emit code that reads/sets the TLS thread pointer. 
Currently,\n+these are only needed if the target needs to support the\n+@code{__builtin_thread_pointer} and @code{__builtin_set_thread_pointer}\n+builtins.\n \n-@cindex @code{float@var{m}@var{n}2} instruction pattern\n-@item @samp{float@var{m}@var{n}2}\n-Convert signed integer operand 1 (valid for fixed point mode @var{m}) to\n-floating point mode @var{n} and store in operand 0 (which has mode\n-@var{n}).\n+The get/set patterns have a single output/input operand respectively,\n+with @var{mode} intended to be @code{Pmode}.\n \n-@cindex @code{floatuns@var{m}@var{n}2} instruction pattern\n-@item @samp{floatuns@var{m}@var{n}2}\n-Convert unsigned integer operand 1 (valid for fixed point mode @var{m})\n-to floating point mode @var{n} and store in operand 0 (which has mode\n-@var{n}).\n+@mdindex @code{stack_protect_combined_set}\n+@item @samp{stack_protect_combined_set}\n+This pattern, if defined, moves a @code{ptr_mode} value from an address\n+whose declaration RTX is given in operand 1 to the memory in operand 0\n+without leaving the value in a register afterward. If several\n+instructions are needed by the target to perform the operation (eg. to\n+load the address from a GOT entry then load the @code{ptr_mode} value\n+and finally store it), it is the backend's responsibility to ensure no\n+intermediate result gets spilled. This is to avoid leaking the value\n+some place that an attacker might use to rewrite the stack guard slot\n+after having clobbered it.\n \n-@cindex @code{fix@var{m}@var{n}2} instruction pattern\n-@item @samp{fix@var{m}@var{n}2}\n-Convert operand 1 (valid for floating point mode @var{m}) to fixed\n-point mode @var{n} as a signed number and store in operand 0 (which\n-has mode @var{n}). 
This instruction's result is defined only when\n-the value of operand 1 is an integer.\n+If this pattern is not defined, then the address declaration is\n+expanded first in the standard way and a @code{stack_protect_set}\n+pattern is then generated to move the value from that address to the\n+address in operand 0.\n \n-If the machine description defines this pattern, it also needs to\n-define the @code{ftrunc} pattern.\n+@mdindex @code{stack_protect_set}\n+@item @samp{stack_protect_set}\n+This pattern, if defined, moves a @code{ptr_mode} value from the valid\n+memory location in operand 1 to the memory in operand 0 without leaving\n+the value in a register afterward. This is to avoid leaking the value\n+some place that an attacker might use to rewrite the stack guard slot\n+after having clobbered it.\n \n-@cindex @code{fixuns@var{m}@var{n}2} instruction pattern\n-@item @samp{fixuns@var{m}@var{n}2}\n-Convert operand 1 (valid for floating point mode @var{m}) to fixed\n-point mode @var{n} as an unsigned number and store in operand 0 (which\n-has mode @var{n}). 
This instruction's result is defined only when the\n-value of operand 1 is an integer.\n+Note: on targets where the addressing modes do not allow loading\n+directly from the stack guard address, the address is expanded in a standard\n+way first, which could cause some spills.\n \n-@cindex @code{ftrunc@var{m}2} instruction pattern\n-@item @samp{ftrunc@var{m}2}\n-Convert operand 1 (valid for floating point mode @var{m}) to an\n-integer value, still represented in floating point mode @var{m}, and\n-store it in operand 0 (valid for floating point mode @var{m}).\n+If this pattern is not defined, then a plain move pattern is generated.\n \n-@cindex @code{fix_trunc@var{m}@var{n}2} instruction pattern\n-@item @samp{fix_trunc@var{m}@var{n}2}\n-Like @samp{fix@var{m}@var{n}2} but works for any floating point value\n-of mode @var{m} by converting the value to an integer.\n+@mdindex @code{stack_protect_combined_test}\n+@item @samp{stack_protect_combined_test}\n+This pattern, if defined, compares a @code{ptr_mode} value from an\n+address whose declaration RTX is given in operand 1 with the memory in\n+operand 0 without leaving the value in a register afterward and\n+branches to operand 2 if the values were equal. If several\n+instructions are needed by the target to perform the operation (e.g.@: to\n+load the address from a GOT entry then load the @code{ptr_mode} value\n+and finally store it), it is the backend's responsibility to ensure no\n+intermediate result gets spilled. 
This is to avoid leaking the value\n+some place that an attacker might use to rewrite the stack guard slot\n+after having clobbered it.\n \n-@cindex @code{fixuns_trunc@var{m}@var{n}2} instruction pattern\n-@item @samp{fixuns_trunc@var{m}@var{n}2}\n-Like @samp{fixuns@var{m}@var{n}2} but works for any floating point\n-value of mode @var{m} by converting the value to an integer.\n+If this pattern is not defined, then the address declaration is\n+expanded first in the standard way and a @code{stack_protect_test}\n+pattern is then generated to compare the value from that address to the\n+value at the memory in operand 0.\n \n-@cindex @code{trunc@var{m}@var{n}2} instruction pattern\n-@item @samp{trunc@var{m}@var{n}2}\n-Truncate operand 1 (valid for mode @var{m}) to mode @var{n} and\n-store in operand 0 (which has mode @var{n}). Both modes must be fixed\n-point or both floating point.\n+@mdindex @code{stack_protect_test}\n+@item @samp{stack_protect_test}\n+This pattern, if defined, compares a @code{ptr_mode} value from the\n+valid memory location in operand 1 with the memory in operand 0 without\n+leaving the value in a register afterward and branches to operand 2 if\n+the values were equal.\n \n-@cindex @code{extend@var{m}@var{n}2} instruction pattern\n-@item @samp{extend@var{m}@var{n}2}\n-Sign-extend operand 1 (valid for mode @var{m}) to mode @var{n} and\n-store in operand 0 (which has mode @var{n}). Both modes must be fixed\n-point or both floating point.\n+If this pattern is not defined, then a plain compare pattern and\n+conditional branch pattern is used.\n \n-@cindex @code{zero_extend@var{m}@var{n}2} instruction pattern\n-@item @samp{zero_extend@var{m}@var{n}2}\n-Zero-extend operand 1 (valid for mode @var{m}) to mode @var{n} and\n-store in operand 0 (which has mode @var{n}). 
Both modes must be fixed\n-point.\n+@mdindex @code{tag_memory}\n+@item @samp{tag_memory}\n+This pattern tags an object that begins at the address specified by\n+operand 0, has the byte size indicated by operand 2, and uses the\n+tag from operand 1.\n \n-@cindex @code{fract@var{m}@var{n}2} instruction pattern\n-@item @samp{fract@var{m}@var{n}2}\n-Convert operand 1 of mode @var{m} to mode @var{n} and store in\n-operand 0 (which has mode @var{n}). Mode @var{m} and mode @var{n}\n-could be fixed-point to fixed-point, signed integer to fixed-point,\n-fixed-point to signed integer, floating-point to fixed-point,\n-or fixed-point to floating-point.\n-When overflows or underflows happen, the results are undefined.\n+@mdindex @code{compose_tag}\n+@item @samp{compose_tag}\n+This pattern composes a tagged address specified by operand 1 with\n+mode @code{ptr_mode}, with an integer operand 2 representing the tag\n+offset. It returns the result in operand 0 with mode @code{ptr_mode}.\n \n-@cindex @code{satfract@var{m}@var{n}2} instruction pattern\n-@item @samp{satfract@var{m}@var{n}2}\n-Convert operand 1 of mode @var{m} to mode @var{n} and store in\n-operand 0 (which has mode @var{n}). Mode @var{m} and mode @var{n}\n-could be fixed-point to fixed-point, signed integer to fixed-point,\n-or floating-point to fixed-point.\n-When overflows or underflows happen, the instruction saturates the\n-results to the maximum or the minimum.\n+@mdindex @code{clear_cache}\n+@item @samp{clear_cache}\n+This pattern, if defined, flushes the instruction cache for a region of\n+memory. The region is bounded by the @code{Pmode} pointers in operand 0\n+inclusive and operand 1 exclusive.\n \n-@cindex @code{fractuns@var{m}@var{n}2} instruction pattern\n-@item @samp{fractuns@var{m}@var{n}2}\n-Convert operand 1 of mode @var{m} to mode @var{n} and store in\n-operand 0 (which has mode @var{n}). 
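A C sketch of the value mapping the @samp{spaceship@var{m}4} pattern computes in the comparison-use case (operand 3 equal to @code{const0_rtx}); the function name is illustrative:

```c
/* Three-way float comparison: -1, 0, 1, or -128 when unordered.  */
int
spaceship_sf (float a, float b)
{
  if (a < b)
    return -1;
  if (a == b)
    return 0;
  if (a > b)
    return 1;
  return -128;   /* unordered: at least one operand is a NaN */
}
```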
Mode @var{m} and mode @var{n}\n-could be unsigned integer to fixed-point, or\n-fixed-point to unsigned integer.\n-When overflows or underflows happen, the results are undefined.\n+If this pattern is not defined, a call to the library function\n+@code{__clear_cache} is used.\n \n-@cindex @code{satfractuns@var{m}@var{n}2} instruction pattern\n-@item @samp{satfractuns@var{m}@var{n}2}\n-Convert unsigned integer operand 1 of mode @var{m} to fixed-point mode\n-@var{n} and store in operand 0 (which has mode @var{n}).\n-When overflows or underflows happen, the instruction saturates the\n-results to the maximum or the minimum.\n+@mdindex @code{spaceship@var{m}4}\n+@item @samp{spaceship@var{m}4}\n+Initialize output operand 0 with mode of integer type to -1, 0, 1 or -128\n+if operand 1 with mode @var{m} compares less than operand 2, equal to\n+operand 2, greater than operand 2 or is unordered with operand 2.\n+Operand 3 should be @code{const0_rtx} if the result is used in comparisons,\n+@code{const1_rtx} if the result is used as integer value and the comparison\n+is integral unsigned, @code{constm1_rtx} if the result is used as integer\n+value and the comparison is integral signed and some other @code{CONST_INT}\n+if the result is used as integer value and the comparison is floating point.\n+In the last case, instead of setting output operand 0 to -128 for unordered,\n+set it to operand 3.\n+@var{m} should be a scalar floating point mode.\n \n-@cindex @code{extv@var{m}} instruction pattern\n-@item @samp{extv@var{m}}\n-Extract a bit-field from register operand 1, sign-extend it, and store\n-it in operand 0. Operand 2 specifies the width of the field in bits\n-and operand 3 the starting bit, which counts from the most significant\n-bit if @samp{BITS_BIG_ENDIAN} is true and from the least significant bit\n-otherwise.\n+This pattern is not allowed to @code{FAIL}.\n \n-Operands 0 and 1 both have mode @var{m}. 
Operands 2 and 3 have a\n-target-specific mode.\n+@mdindex @code{isfinite@var{m}2}\n+@item @samp{isfinite@var{m}2}\n+Return 1 if operand 1 is a finite floating point number and 0\n+otherwise. @var{m} is a scalar floating point mode. Operand 0\n+has mode @code{SImode}, and operand 1 has mode @var{m}.\n \n-@cindex @code{extvmisalign@var{m}} instruction pattern\n-@item @samp{extvmisalign@var{m}}\n-Extract a bit-field from memory operand 1, sign extend it, and store\n-it in operand 0. Operand 2 specifies the width in bits and operand 3\n-the starting bit. The starting bit is always somewhere in the first byte of\n-operand 1; it counts from the most significant bit if @samp{BITS_BIG_ENDIAN}\n-is true and from the least significant bit otherwise.\n+@mdindex @code{isnan@var{m}2}\n+@item @samp{isnan@var{m}2}\n+Return 1 if operand 1 is a @code{NaN} and 0 otherwise.\n+@var{m} is a scalar floating point mode. Operand 0\n+has mode @code{SImode}, and operand 1 has mode @var{m}.\n \n-Operand 0 has mode @var{m} while operand 1 has @code{BLK} mode.\n-Operands 2 and 3 have a target-specific mode.\n+@mdindex @code{isnormal@var{m}2}\n+@item @samp{isnormal@var{m}2}\n+Return 1 if operand 1 is a normal floating point number and 0\n+otherwise. @var{m} is a scalar floating point mode. Operand 0\n+has mode @code{SImode}, and operand 1 has mode @var{m}.\n \n-The instruction must not read beyond the last byte of the bit-field.\n+@mdindex @code{crc@var{m}@var{n}4}\n+@item @samp{crc@var{m}@var{n}4}\n+Calculate a bit-forward CRC using operands 1, 2 and 3,\n+then store the result in operand 0.\n+Operand 1 is the initial CRC, operand 2 is the data and operand 3 is the\n+polynomial without the leading 1.\n+Operands 0, 1 and 3 have mode @var{n} and operand 2 has mode @var{m}, where\n+both modes are integers. 
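As an illustration of the computation these CRC patterns perform, here is a conventional bit-forward CRC-8 update for the case where both modes are @code{QImode}; the polynomial 0x07 (with the leading 1 omitted, as the pattern's operand 3 expects) is an arbitrary example, not something the patch prescribes:

```c
#include <stdint.h>

/* One bit-forward CRC-8 update step: fold the data into the CRC, then
   shift eight times, XORing in the polynomial when the top bit is set.  */
uint8_t
crc8_update (uint8_t crc, uint8_t data)
{
  crc ^= data;
  for (int bit = 0; bit < 8; bit++)
    crc = (crc & 0x80) ? (uint8_t) ((crc << 1) ^ 0x07)
                       : (uint8_t) (crc << 1);
  return crc;
}
```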
The size of CRC to be calculated is determined by the\n+mode; for example, if @var{n} is @code{HImode}, a CRC16 is calculated.\n \n-@cindex @code{extzv@var{m}} instruction pattern\n-@item @samp{extzv@var{m}}\n-Like @samp{extv@var{m}} except that the bit-field value is zero-extended.\n+@mdindex @code{crc_rev@var{m}@var{n}4}\n+@item @samp{crc_rev@var{m}@var{n}4}\n+Similar to @samp{crc@var{m}@var{n}4}, but calculates a bit-reversed CRC.\n \n-@cindex @code{extzvmisalign@var{m}} instruction pattern\n-@item @samp{extzvmisalign@var{m}}\n-Like @samp{extvmisalign@var{m}} except that the bit-field value is\n-zero-extended.\n+@end table\n \n-@cindex @code{insv@var{m}} instruction pattern\n-@item @samp{insv@var{m}}\n-Insert operand 3 into a bit-field of register operand 0. Operand 1\n-specifies the width of the field in bits and operand 2 the starting bit,\n-which counts from the most significant bit if @samp{BITS_BIG_ENDIAN}\n-is true and from the least significant bit otherwise.\n+@node Standard Pattern Names for Vectorization\n+@subsection Standard Pattern Names for Vectorization\n+@mdindex vectorizer patterns\n+\n+@table @asis\n+@mdindex @code{vec_load_lanes@var{m}@var{n}}\n+@item @samp{vec_load_lanes@var{m}@var{n}}\n+Perform an interleaved load of several vectors from memory operand 1\n+into register operand 0. Both operands have mode @var{m}. The register\n+operand is viewed as holding consecutive vectors of mode @var{n},\n+while the memory operand is a flat array that contains the same number\n+of elements. The operation is equivalent to:\n+\n+@smallexample\n+int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n+for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)\n+ for (i = 0; i < c; i++)\n+ operand0[i][j] = operand1[j * c + i];\n+@end smallexample\n+\n+For example, @samp{vec_load_lanestiv4hi} loads 8 16-bit values\n+from memory into a register of mode @samp{TI}@. 
The register\n+contains two consecutive vectors of mode @samp{V4HI}@.\n+\n+This pattern can only be used if:\n+@smallexample\n+TARGET_ARRAY_MODE_SUPPORTED_P (@var{n}, @var{c})\n+@end smallexample\n+is true. GCC assumes that, if a target supports this kind of\n+instruction for some mode @var{n}, it also supports unaligned\n+loads for vectors of mode @var{n}.\n+\n+This pattern is not allowed to @code{FAIL}.\n+\n+@mdindex @code{vec_mask_load_lanes@var{m}@var{n}}\n+@item @samp{vec_mask_load_lanes@var{m}@var{n}}\n+Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional\n+mask operand (operand 2) that specifies which elements of the destination\n+vectors should be loaded. Other elements of the destination vectors are\n+taken from operand 3, which is an else operand in the subvector mode\n+@var{n}, similar to the one in @code{maskload}.\n+The operation is equivalent to:\n+\n+@smallexample\n+int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n+for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)\n+ if (operand2[j])\n+ for (i = 0; i < c; i++)\n+ operand0[i][j] = operand1[j * c + i];\n+ else\n+ for (i = 0; i < c; i++)\n+ operand0[i][j] = operand3[j];\n+@end smallexample\n+\n+This pattern is not allowed to @code{FAIL}.\n+\n+@mdindex @code{vec_mask_len_load_lanes@var{m}@var{n}}\n+@item @samp{vec_mask_len_load_lanes@var{m}@var{n}}\n+Like @samp{vec_load_lanes@var{m}@var{n}}, but takes an additional\n+mask operand (operand 2), length operand (operand 4) as well as bias operand\n+(operand 5) that specifies which elements of the destination vectors should be\n+loaded. Other elements of the destination vectors are taken from operand 3,\n+which is an else operand similar to the one in @code{maskload}.\n+The operation is equivalent to:\n \n-Operands 0 and 3 both have mode @var{m}. 
Operands 1 and 2 have a\n-target-specific mode.\n+@smallexample\n+int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n+for (j = 0; j < operand4 + operand5; j++)\n+ for (i = 0; i < c; i++)\n+ if (operand2[j])\n+ operand0[i][j] = operand1[j * c + i];\n+ else\n+ operand0[i][j] = operand3[j];\n+@end smallexample\n \n-@cindex @code{insvmisalign@var{m}} instruction pattern\n-@item @samp{insvmisalign@var{m}}\n-Insert operand 3 into a bit-field of memory operand 0. Operand 1\n-specifies the width of the field in bits and operand 2 the starting bit.\n-The starting bit is always somewhere in the first byte of operand 0;\n-it counts from the most significant bit if @samp{BITS_BIG_ENDIAN}\n-is true and from the least significant bit otherwise.\n+This pattern is not allowed to @code{FAIL}.\n \n-Operand 3 has mode @var{m} while operand 0 has @code{BLK} mode.\n-Operands 1 and 2 have a target-specific mode.\n+@mdindex @code{vec_store_lanes@var{m}@var{n}}\n+@item @samp{vec_store_lanes@var{m}@var{n}}\n+Equivalent to @samp{vec_load_lanes@var{m}@var{n}}, with the memory\n+and register operands reversed. That is, the instruction is\n+equivalent to:\n \n-The instruction must not read or write beyond the last byte of the bit-field.\n+@smallexample\n+int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n+for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)\n+ for (i = 0; i < c; i++)\n+ operand0[j * c + i] = operand1[i][j];\n+@end smallexample\n \n-@cindex @code{extv} instruction pattern\n-@item @samp{extv}\n-Extract a bit-field from operand 1 (a register or memory operand), where\n-operand 2 specifies the width in bits and operand 3 the starting bit,\n-and store it in operand 0. Operand 0 must have mode @code{word_mode}.\n-Operand 1 may have mode @code{byte_mode} or @code{word_mode}; often\n-@code{word_mode} is allowed only for registers. 
Operands 2 and 3 must\n-be valid for @code{word_mode}.\n+for a memory operand 0 and register operand 1.\n \n-The RTL generation pass generates this instruction only with constants\n-for operands 2 and 3 and the constant is never zero for operand 2.\n+This pattern is not allowed to @code{FAIL}.\n \n-The bit-field value is sign-extended to a full word integer\n-before it is stored in operand 0.\n+@mdindex @code{vec_mask_store_lanes@var{m}@var{n}}\n+@item @samp{vec_mask_store_lanes@var{m}@var{n}}\n+Like @samp{vec_store_lanes@var{m}@var{n}}, but takes an additional\n+mask operand (operand 2) that specifies which elements of the source\n+vectors should be stored. The operation is equivalent to:\n \n-This pattern is deprecated; please use @samp{extv@var{m}} and\n-@code{extvmisalign@var{m}} instead.\n+@smallexample\n+int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n+for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)\n+ if (operand2[j])\n+ for (i = 0; i < c; i++)\n+ operand0[j * c + i] = operand1[i][j];\n+@end smallexample\n \n-@cindex @code{extzv} instruction pattern\n-@item @samp{extzv}\n-Like @samp{extv} except that the bit-field value is zero-extended.\n+This pattern is not allowed to @code{FAIL}.\n \n-This pattern is deprecated; please use @samp{extzv@var{m}} and\n-@code{extzvmisalign@var{m}} instead.\n+@mdindex @code{vec_mask_len_store_lanes@var{m}@var{n}}\n+@item @samp{vec_mask_len_store_lanes@var{m}@var{n}}\n+Like @samp{vec_store_lanes@var{m}@var{n}}, but takes an additional\n+mask operand (operand 2), length operand (operand 3) as well as bias operand (operand 4)\n+that specifies which elements of the source vectors should be stored.\n+The operation is equivalent to:\n \n-@cindex @code{insv} instruction pattern\n-@item @samp{insv}\n-Store operand 3 (which must be valid for @code{word_mode}) into a\n-bit-field in operand 0, where operand 1 specifies the width in bits and\n-operand 2 the starting bit. 
Operand 0 may have mode @code{byte_mode} or\n-@code{word_mode}; often @code{word_mode} is allowed only for registers.\n-Operands 1 and 2 must be valid for @code{word_mode}.\n+@smallexample\n+int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});\n+for (j = 0; j < operand3 + operand4; j++)\n+ if (operand2[j])\n+ for (i = 0; i < c; i++)\n+ operand0[j * c + i] = operand1[i][j];\n+@end smallexample\n \n-The RTL generation pass generates this instruction only with constants\n-for operands 1 and 2 and the constant is never zero for operand 1.\n+This pattern is not allowed to @code{FAIL}.\n \n-This pattern is deprecated; please use @samp{insv@var{m}} and\n-@code{insvmisalign@var{m}} instead.\n+@mdindex @code{gather_load@var{m}@var{n}}\n+@item @samp{gather_load@var{m}@var{n}}\n+Load several separate memory locations into a vector of mode @var{m}.\n+Operand 1 is a scalar base address and operand 2 is a vector of mode @var{n}\n+containing offsets from that base. Operand 0 is a destination vector with\n+the same number of elements as @var{n}. For each element index @var{i}:\n \n-@cindex @code{mov@var{mode}cc} instruction pattern\n-@item @samp{mov@var{mode}cc}\n-Conditionally move operand 2 or operand 3 into operand 0 according to the\n-comparison in operand 1. If the comparison is true, operand 2 is moved\n-into operand 0, otherwise operand 3 is moved.\n+@itemize @bullet\n+@item\n+extend the offset element @var{i} to address width, using zero\n+extension if operand 3 is 1 and sign extension if operand 3 is zero;\n+@item\n+multiply the extended offset by operand 4;\n+@item\n+add the result to the base; and\n+@item\n+load the value at that address into element @var{i} of operand 0.\n+@end itemize\n \n-The mode of the operands being compared need not be the same as the operands\n-being moved. 
Some machines, sparc64 for example, have instructions that\n-conditionally move an integer value based on the floating point condition\n-codes and vice versa.\n+The value of operand 3 does not matter if the offsets are already\n+address width.\n \n-If the machine does not have conditional move instructions, do not\n-define these patterns.\n+@mdindex @code{mask_gather_load@var{m}@var{n}}\n+@item @samp{mask_gather_load@var{m}@var{n}}\n+Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand as\n+operand 5.\n+Other elements of the destination vector are taken from operand 6,\n+which is an else operand similar to the one in @code{maskload}.\n+Bit @var{i} of the mask is set if element @var{i}\n+of the result should be loaded from memory and clear if element @var{i}\n+of the result should be set to element @var{i} of operand 6.\n \n-@cindex @code{add@var{mode}cc} instruction pattern\n-@item @samp{add@var{mode}cc}\n-Similar to @samp{mov@var{mode}cc} but for conditional addition. Conditionally\n-move operand 2 or (operands 2 + operand 3) into operand 0 according to the\n-comparison in operand 1. 
If the comparison is false, operand 2 is moved into\n-operand 0, otherwise (operand 2 + operand 3) is moved.\n+@mdindex @code{mask_len_gather_load@var{m}@var{n}}\n+@item @samp{mask_len_gather_load@var{m}@var{n}}\n+Like @samp{gather_load@var{m}@var{n}}, but takes an extra mask operand\n+(operand 5) and an else operand (operand 6) as well as a len operand\n+(operand 7) and a bias operand (operand 8).\n \n-@cindex @code{cond_neg@var{mode}} instruction pattern\n-@cindex @code{cond_one_cmpl@var{mode}} instruction pattern\n-@cindex @code{cond_sqrt@var{mode}} instruction pattern\n-@cindex @code{cond_ceil@var{mode}} instruction pattern\n-@cindex @code{cond_floor@var{mode}} instruction pattern\n-@cindex @code{cond_round@var{mode}} instruction pattern\n-@cindex @code{cond_rint@var{mode}} instruction pattern\n-@item @samp{cond_neg@var{mode}}\n-@itemx @samp{cond_one_cmpl@var{mode}}\n-@itemx @samp{cond_sqrt@var{mode}}\n-@itemx @samp{cond_ceil@var{mode}}\n-@itemx @samp{cond_floor@var{mode}}\n-@itemx @samp{cond_round@var{mode}}\n-@itemx @samp{cond_rint@var{mode}}\n-When operand 1 is true, perform an operation on operands 2 and\n-store the result in operand 0, otherwise store operand 3 in operand 0.\n-The operation works elementwise if the operands are vectors.\n+Similar to mask_len_load the instruction loads at\n+most (operand 7 + operand 8) elements from memory.\n+Bit @var{i} of the mask is set if element @var{i} of the result should\n+be loaded from memory and clear if element @var{i} of the result should\n+be set to element @var{i} of operand 6.\n+Mask elements @var{i} with @var{i} > (operand 7 + operand 8) are ignored.\n \n-The scalar case is equivalent to:\n+@mdindex @code{mask_len_strided_load@var{m}}\n+@item @samp{mask_len_strided_load@var{m}}\n+Load several separate memory locations into a destination vector of mode @var{m}.\n+Operand 0 is a destination vector of mode @var{m}.\n+Operand 1 is a scalar base address and operand 2 is a scalar stride of Pmode.\n+operand 3 is 
the mask operand, operand 4 is the length operand and operand 5 is the bias operand.\n+The instruction can be seen as a special case of @code{mask_len_gather_load@var{m}@var{n}}\n+with an offset vector that is a @code{vec_series} with zero as base and operand 2 as step.\n+For each element the load address is operand 1 + @var{i} * operand 2.\n+Similar to mask_len_load, the instruction loads at most (operand 4 + operand 5) elements from memory.\n+Element @var{i} of the mask (operand 3) is set if element @var{i} of the result should\n+be loaded from memory and clear if element @var{i} of the result should be zero.\n+Mask elements @var{i} with @var{i} > (operand 4 + operand 5) are ignored.\n \n-@smallexample\n-op0 = op1 ? @var{op} op2 : op3;\n-@end smallexample\n+@mdindex @code{scatter_store@var{m}@var{n}}\n+@item @samp{scatter_store@var{m}@var{n}}\n+Store a vector of mode @var{m} into several distinct memory locations.\n+Operand 0 is a scalar base address and operand 1 is a vector of mode\n+@var{n} containing offsets from that base. Operand 4 is the vector of\n+values that should be stored, which has the same number of elements as\n+@var{n}. For each element index @var{i}:\n \n-while the vector case is equivalent to:\n+@itemize @bullet\n+@item\n+extend the offset element @var{i} to address width, using zero\n+extension if operand 2 is 1 and sign extension if operand 2 is zero;\n+@item\n+multiply the extended offset by operand 3;\n+@item\n+add the result to the base; and\n+@item\n+store element @var{i} of operand 4 to that address.\n+@end itemize\n \n-@smallexample\n-for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n- op0[i] = op1[i] ? 
@var{op} op2[i] : op3[i];\n-@end smallexample\n+The value of operand 2 does not matter if the offsets are already\n+address width.\n \n-where, for example, @var{op} is @code{~} for @samp{cond_one_cmpl@var{mode}}.\n+@mdindex @code{mask_scatter_store@var{m}@var{n}}\n+@item @samp{mask_scatter_store@var{m}@var{n}}\n+Like @samp{scatter_store@var{m}@var{n}}, but takes an extra mask operand as\n+operand 5. Bit @var{i} of the mask is set if element @var{i}\n+of the result should be stored to memory.\n \n-When defined for floating-point modes, the contents of @samp{op2[i]}\n-are not interpreted if @samp{op1[i]} is false, just like they would not\n-be in a normal C @samp{?:} condition.\n+@mdindex @code{mask_len_scatter_store@var{m}@var{n}}\n+@item @samp{mask_len_scatter_store@var{m}@var{n}}\n+Like @samp{scatter_store@var{m}@var{n}}, but takes an extra mask operand (operand 5),\n+a len operand (operand 6) as well as a bias operand (operand 7). The instruction stores\n+at most (operand 6 + operand 7) elements of (operand 4) to memory.\n+Bit @var{i} of the mask is set if element @var{i} of (operand 4) should be stored.\n+Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.\n \n-Operands 0, 2, and 3 all have mode @var{m}. 
Operand 1 is a scalar\n-integer if @var{m} is scalar, otherwise it has the mode returned by\n-@code{TARGET_VECTORIZE_GET_MASK_MODE}.\n+@mdindex @code{mask_len_strided_store@var{m}}\n+@item @samp{mask_len_strided_store@var{m}}\n+Store a vector of mode @var{m} into several distinct memory locations.\n+Operand 0 is a scalar base address and operand 1 is a scalar stride of Pmode.\n+Operand 2 is the vector of values that should be stored, which is of mode @var{m}.\n+Operand 3 is the mask operand, operand 4 is the length operand and operand 5 is the bias operand.\n+The instruction can be seen as a special case of @code{mask_len_scatter_store@var{m}@var{n}}\n+with an offset vector that is a @code{vec_series} with zero as base and operand 1 as step.\n+For each element the store address is operand 0 + @var{i} * operand 1.\n+Similar to mask_len_store, the instruction stores at most (operand 4 + operand 5) elements of\n+(operand 2) to memory. Element @var{i} of the mask (operand 3) is set if element @var{i} of (operand 2)\n+should be stored. Mask elements @var{i} with @var{i} > (operand 4 + operand 5) are ignored.\n \n-@samp{cond_@var{op}@var{mode}} generally corresponds to a conditional\n-form of @samp{@var{op}@var{mode}2}.\n+@mdindex @code{while_ult@var{m}@var{n}}\n+@item @code{while_ult@var{m}@var{n}}\n+Set operand 0 to a mask that is true while incrementing operand 1\n+gives a value that is less than operand 2, for a vector length up to operand 3.\n+Operand 0 has mode @var{n} and operands 1 and 2 are scalar integers of mode\n+@var{m}. Operand 3 should be omitted when @var{n} is a vector mode, and\n+a @code{CONST_INT} otherwise. 
The operation for vector modes is equivalent to:\n \n-@cindex @code{cond_add@var{mode}} instruction pattern\n-@cindex @code{cond_sub@var{mode}} instruction pattern\n-@cindex @code{cond_mul@var{mode}} instruction pattern\n-@cindex @code{cond_div@var{mode}} instruction pattern\n-@cindex @code{cond_udiv@var{mode}} instruction pattern\n-@cindex @code{cond_mod@var{mode}} instruction pattern\n-@cindex @code{cond_umod@var{mode}} instruction pattern\n-@cindex @code{cond_and@var{mode}} instruction pattern\n-@cindex @code{cond_ior@var{mode}} instruction pattern\n-@cindex @code{cond_xor@var{mode}} instruction pattern\n-@cindex @code{cond_smin@var{mode}} instruction pattern\n-@cindex @code{cond_smax@var{mode}} instruction pattern\n-@cindex @code{cond_umin@var{mode}} instruction pattern\n-@cindex @code{cond_umax@var{mode}} instruction pattern\n-@cindex @code{cond_copysign@var{mode}} instruction pattern\n-@cindex @code{cond_fmin@var{mode}} instruction pattern\n-@cindex @code{cond_fmax@var{mode}} instruction pattern\n-@cindex @code{cond_ashl@var{mode}} instruction pattern\n-@cindex @code{cond_ashr@var{mode}} instruction pattern\n-@cindex @code{cond_lshr@var{mode}} instruction pattern\n-@item @samp{cond_add@var{mode}}\n-@itemx @samp{cond_sub@var{mode}}\n-@itemx @samp{cond_mul@var{mode}}\n-@itemx @samp{cond_div@var{mode}}\n-@itemx @samp{cond_udiv@var{mode}}\n-@itemx @samp{cond_mod@var{mode}}\n-@itemx @samp{cond_umod@var{mode}}\n-@itemx @samp{cond_and@var{mode}}\n-@itemx @samp{cond_ior@var{mode}}\n-@itemx @samp{cond_xor@var{mode}}\n-@itemx @samp{cond_smin@var{mode}}\n-@itemx @samp{cond_smax@var{mode}}\n-@itemx @samp{cond_umin@var{mode}}\n-@itemx @samp{cond_umax@var{mode}}\n-@itemx @samp{cond_copysign@var{mode}}\n-@itemx @samp{cond_fmin@var{mode}}\n-@itemx @samp{cond_fmax@var{mode}}\n-@itemx @samp{cond_ashl@var{mode}}\n-@itemx @samp{cond_ashr@var{mode}}\n-@itemx @samp{cond_lshr@var{mode}}\n-When operand 1 is true, perform an operation on operands 2 and 3 and\n-store the result in 
operand 0, otherwise store operand 4 in operand 0.\n-The operation works elementwise if the operands are vectors.\n+@smallexample\n+operand0[0] = operand1 < operand2;\n+for (i = 1; i < GET_MODE_NUNITS (@var{n}); i++)\n+ operand0[i] = operand0[i - 1] && (operand1 + i < operand2);\n+@end smallexample\n \n-The scalar case is equivalent to:\n+And for non-vector modes the operation is equivalent to:\n \n @smallexample\n-op0 = op1 ? op2 @var{op} op3 : op4;\n+operand0[0] = operand1 < operand2;\n+for (i = 1; i < operand3; i++)\n+ operand0[i] = operand0[i - 1] && (operand1 + i < operand2);\n @end smallexample\n \n-while the vector case is equivalent to:\n+@mdindex @code{select_vl@var{m}@var{n}}\n+@item @code{select_vl@var{m}@var{n}}\n+Set operand 0 (of mode @var{n}) to the number of scalar iterations that\n+should be handled by one iteration of a vector loop. Operand 1 is the\n+total number of scalar iterations that the loop needs to process and\n+operand 2 is a maximum bound on the result (also known as the\n+maximum ``vectorization factor''). Operand 3 (of mode @var{m}) is\n+a dummy parameter to pass the vector mode to be used.\n \n+The maximum value of operand 0 is given by:\n @smallexample\n-for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n- op0[i] = op1[i] ? op2[i] @var{op} op3[i] : op4[i];\n+operand0 = MIN (operand1, operand2)\n @end smallexample\n+However, targets might choose a lower value than this, based on\n+target-specific criteria. Each iteration of the vector loop might\n+therefore process a different number of scalar iterations, which in turn\n+means that induction variables will have a variable step. 
Because of\n+this, it is generally not useful to define this instruction if it will\n+always calculate the maximum value.\n \n-where, for example, @var{op} is @code{+} for @samp{cond_add@var{mode}}.\n+This optab is only useful on targets that implement @samp{len_load_@var{m}}\n+and/or @samp{len_store_@var{m}} or the associated @samp{_len} variants.\n \n-When defined for floating-point modes, the contents of @samp{op3[i]}\n-are not interpreted if @samp{op1[i]} is false, just like they would not\n-be in a normal C @samp{?:} condition.\n+@mdindex @code{vec_set@var{m}}\n+@item @samp{vec_set@var{m}}\n+Set the given field in the vector value. Operand 0 is the vector to modify,\n+operand 1 is the new value of the field and operand 2 specifies the field index.\n \n-Operands 0, 2, 3 and 4 all have mode @var{m}. Operand 1 is a scalar\n-integer if @var{m} is scalar, otherwise it has the mode returned by\n-@code{TARGET_VECTORIZE_GET_MASK_MODE}.\n+This pattern is not allowed to @code{FAIL}.\n \n-@samp{cond_@var{op}@var{mode}} generally corresponds to a conditional\n-form of @samp{@var{op}@var{mode}3}. As an exception, the vector forms\n-of shifts correspond to patterns like @code{vashl@var{mode}3} rather\n-than patterns like @code{ashl@var{mode}3}.\n+@mdindex @code{vec_extract@var{m}@var{n}}\n+@item @samp{vec_extract@var{m}@var{n}}\n+Extract the given field from the vector value. Operand 1 is the vector, operand 2\n+specifies the field index and operand 0 is the place to store the value into. The\n+@var{n} mode is the mode of the field or vector of fields that should be\n+extracted; it should be either the element mode of the vector mode @var{m} or\n+a vector mode with the same element mode and a smaller number of elements.\n+If @var{n} is a vector mode the index is counted in multiples of\n+mode @var{n}.\n \n-@samp{cond_copysign@var{mode}} is only defined for floating point modes.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{cond_fma@var{mode}} instruction pattern\n-@cindex @code{cond_fms@var{mode}} instruction pattern\n-@cindex @code{cond_fnma@var{mode}} instruction pattern\n-@cindex @code{cond_fnms@var{mode}} instruction pattern\n-@item @samp{cond_fma@var{mode}}\n-@itemx @samp{cond_fms@var{mode}}\n-@itemx @samp{cond_fnma@var{mode}}\n-@itemx @samp{cond_fnms@var{mode}}\n-Like @samp{cond_add@var{m}}, except that the conditional operation\n-takes 3 operands rather than two. For example, the vector form of\n-@samp{cond_fma@var{mode}} is equivalent to:\n+@mdindex @code{vec_init@var{m}@var{n}}\n+@item @samp{vec_init@var{m}@var{n}}\n+Initialize the vector to the given values. Operand 0 is the vector to initialize\n+and operand 1 is a parallel containing values for the individual fields. The\n+@var{n} mode is the mode of the elements; it should be either the element mode of\n+the vector mode @var{m} or a vector mode with the same element mode and\n+a smaller number of elements.\n \n-@smallexample\n-for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n- op0[i] = op1[i] ? fma (op2[i], op3[i], op4[i]) : op5[i];\n-@end smallexample\n+@mdindex @code{vec_duplicate@var{m}}\n+@item @samp{vec_duplicate@var{m}}\n+Initialize vector output operand 0 so that each element has the value given\n+by scalar input operand 1. 
The vector has mode @var{m} and the scalar has\n+the mode appropriate for one element of @var{m}.\n \n-@cindex @code{cond_len_neg@var{mode}} instruction pattern\n-@cindex @code{cond_len_one_cmpl@var{mode}} instruction pattern\n-@cindex @code{cond_len_sqrt@var{mode}} instruction pattern\n-@cindex @code{cond_len_ceil@var{mode}} instruction pattern\n-@cindex @code{cond_len_floor@var{mode}} instruction pattern\n-@cindex @code{cond_len_round@var{mode}} instruction pattern\n-@cindex @code{cond_len_rint@var{mode}} instruction pattern\n-@item @samp{cond_len_neg@var{mode}}\n-@itemx @samp{cond_len_one_cmpl@var{mode}}\n-@itemx @samp{cond_len_sqrt@var{mode}}\n-@itemx @samp{cond_len_ceil@var{mode}}\n-@itemx @samp{cond_len_floor@var{mode}}\n-@itemx @samp{cond_len_round@var{mode}}\n-@itemx @samp{cond_len_rint@var{mode}}\n-When operand 1 is true and element index < operand 4 + operand 5, perform an operation on operands 1 and\n-store the result in operand 0, otherwise store operand 2 in operand 0.\n-The operation only works for the operands are vectors.\n+This pattern only handles duplicates of non-constant inputs. Constant\n+vectors go through the @code{mov@var{m}} pattern instead.\n \n-@smallexample\n-for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n- op0[i] = (i < ops[4] + ops[5] && op1[i]\n- ? @var{op} op2[i]\n- : op3[i]);\n-@end smallexample\n+This pattern is not allowed to @code{FAIL}.\n \n-where, for example, @var{op} is @code{~} for @samp{cond_len_one_cmpl@var{mode}}.\n+@mdindex @code{vec_series@var{m}}\n+@item @samp{vec_series@var{m}}\n+Initialize vector output operand 0 so that element @var{i} is equal to\n+operand 1 plus @var{i} times operand 2. 
In other words, create a linear\n+series whose base value is operand 1 and whose step is operand 2.\n \n-When defined for floating-point modes, the contents of @samp{op2[i]}\n-are not interpreted if @samp{op1[i]} is false, just like they would not\n-be in a normal C @samp{?:} condition.\n+The vector output has mode @var{m} and the scalar inputs have the mode\n+appropriate for one element of @var{m}. This pattern is not used for\n+floating-point vectors, in order to avoid having to specify the\n+rounding behavior for @var{i} > 1.\n \n-Operands 0, 2, and 3 all have mode @var{m}. Operand 1 is a scalar\n-integer if @var{m} is scalar, otherwise it has the mode returned by\n-@code{TARGET_VECTORIZE_GET_MASK_MODE}. Operand 4 has whichever\n-integer mode the target prefers.\n+This pattern is not allowed to @code{FAIL}.\n \n-@samp{cond_len_@var{op}@var{mode}} generally corresponds to a conditional\n-form of @samp{@var{op}@var{mode}2}.\n+@mdindex @code{check_raw_ptrs@var{m}}\n+@item @samp{check_raw_ptrs@var{m}}\n+Check whether, given two pointers @var{a} and @var{b} and a length @var{len},\n+a write of @var{len} bytes at @var{a} followed by a read of @var{len} bytes\n+at @var{b} can be split into interleaved byte accesses\n+@samp{@var{a}[0], @var{b}[0], @var{a}[1], @var{b}[1], @dots{}}\n+without affecting the dependencies between the bytes. Set operand 0\n+to true if the split is possible and false otherwise.\n \n+Operands 1, 2 and 3 provide the values of @var{a}, @var{b} and @var{len}\n+respectively. Operand 4 is a constant integer that provides the known\n+common alignment of @var{a} and @var{b}. 
All inputs have mode @var{m}.\n \n-@cindex @code{cond_len_add@var{mode}} instruction pattern\n-@cindex @code{cond_len_sub@var{mode}} instruction pattern\n-@cindex @code{cond_len_mul@var{mode}} instruction pattern\n-@cindex @code{cond_len_div@var{mode}} instruction pattern\n-@cindex @code{cond_len_udiv@var{mode}} instruction pattern\n-@cindex @code{cond_len_mod@var{mode}} instruction pattern\n-@cindex @code{cond_len_umod@var{mode}} instruction pattern\n-@cindex @code{cond_len_and@var{mode}} instruction pattern\n-@cindex @code{cond_len_ior@var{mode}} instruction pattern\n-@cindex @code{cond_len_xor@var{mode}} instruction pattern\n-@cindex @code{cond_len_smin@var{mode}} instruction pattern\n-@cindex @code{cond_len_smax@var{mode}} instruction pattern\n-@cindex @code{cond_len_umin@var{mode}} instruction pattern\n-@cindex @code{cond_len_umax@var{mode}} instruction pattern\n-@cindex @code{cond_len_copysign@var{mode}} instruction pattern\n-@cindex @code{cond_len_fmin@var{mode}} instruction pattern\n-@cindex @code{cond_len_fmax@var{mode}} instruction pattern\n-@cindex @code{cond_len_ashl@var{mode}} instruction pattern\n-@cindex @code{cond_len_ashr@var{mode}} instruction pattern\n-@cindex @code{cond_len_lshr@var{mode}} instruction pattern\n-@item @samp{cond_len_add@var{mode}}\n-@itemx @samp{cond_len_sub@var{mode}}\n-@itemx @samp{cond_len_mul@var{mode}}\n-@itemx @samp{cond_len_div@var{mode}}\n-@itemx @samp{cond_len_udiv@var{mode}}\n-@itemx @samp{cond_len_mod@var{mode}}\n-@itemx @samp{cond_len_umod@var{mode}}\n-@itemx @samp{cond_len_and@var{mode}}\n-@itemx @samp{cond_len_ior@var{mode}}\n-@itemx @samp{cond_len_xor@var{mode}}\n-@itemx @samp{cond_len_smin@var{mode}}\n-@itemx @samp{cond_len_smax@var{mode}}\n-@itemx @samp{cond_len_umin@var{mode}}\n-@itemx @samp{cond_len_umax@var{mode}}\n-@itemx @samp{cond_len_copysign@var{mode}}\n-@itemx @samp{cond_len_fmin@var{mode}}\n-@itemx @samp{cond_len_fmax@var{mode}}\n-@itemx @samp{cond_len_ashl@var{mode}}\n-@itemx 
@samp{cond_len_ashr@var{mode}}\n-@itemx @samp{cond_len_lshr@var{mode}}\n-When operand 1 is true and element index < operand 5 + operand 6, perform an operation on operands 2 and 3 and\n-store the result in operand 0, otherwise store operand 4 in operand 0.\n-The operation only works for the operands are vectors.\n+This split is possible if:\n \n @smallexample\n-for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n- op0[i] = (i < ops[5] + ops[6] && op1[i]\n- ? op2[i] @var{op} op3[i]\n- : op4[i]);\n+@var{a} == @var{b} || @var{a} + @var{len} <= @var{b} || @var{b} + @var{len} <= @var{a}\n @end smallexample\n \n-where, for example, @var{op} is @code{+} for @samp{cond_len_add@var{mode}}.\n+You should only define this pattern if the target has a way of accelerating\n+the test without having to do the individual comparisons.\n \n-When defined for floating-point modes, the contents of @samp{op3[i]}\n-are not interpreted if @samp{op1[i]} is false, just like they would not\n-be in a normal C @samp{?:} condition.\n+@mdindex @code{check_war_ptrs@var{m}}\n+@item @samp{check_war_ptrs@var{m}}\n+Like @samp{check_raw_ptrs@var{m}}, but with the read and write swapped round.\n+The split is possible in this case if:\n \n-Operands 0, 2, 3 and 4 all have mode @var{m}. Operand 1 is a scalar\n-integer if @var{m} is scalar, otherwise it has the mode returned by\n-@code{TARGET_VECTORIZE_GET_MASK_MODE}. Operand 5 has whichever\n-integer mode the target prefers.\n+@smallexample\n+@var{b} <= @var{a} || @var{a} + @var{len} <= @var{b}\n+@end smallexample\n \n-@samp{cond_len_@var{op}@var{mode}} generally corresponds to a conditional\n-form of @samp{@var{op}@var{mode}3}. As an exception, the vector forms\n-of shifts correspond to patterns like @code{vashl@var{mode}3} rather\n-than patterns like @code{ashl@var{mode}3}.\n+@mdindex @code{vec_cmp@var{m}@var{n}}\n+@item @samp{vec_cmp@var{m}@var{n}}\n+Output a vector comparison. 
Operand 0 of mode @var{n} is the destination for the\n+predicate in operand 1, which is a signed vector comparison with operands of\n+mode @var{m} in operands 2 and 3. The predicate is computed by elementwise\n+evaluation of the vector comparison with a truth value of all-ones and a false\n+value of all-zeros.\n \n-@samp{cond_len_copysign@var{mode}} is only defined for floating point modes.\n+@mdindex @code{vec_cmpu@var{m}@var{n}}\n+@item @samp{vec_cmpu@var{m}@var{n}}\n+Similar to @code{vec_cmp@var{m}@var{n}} but performs an unsigned vector comparison.\n \n-@cindex @code{cond_len_fma@var{mode}} instruction pattern\n-@cindex @code{cond_len_fms@var{mode}} instruction pattern\n-@cindex @code{cond_len_fnma@var{mode}} instruction pattern\n-@cindex @code{cond_len_fnms@var{mode}} instruction pattern\n-@item @samp{cond_len_fma@var{mode}}\n-@itemx @samp{cond_len_fms@var{mode}}\n-@itemx @samp{cond_len_fnma@var{mode}}\n-@itemx @samp{cond_len_fnms@var{mode}}\n-Like @samp{cond_len_add@var{m}}, except that the conditional operation\n-takes 3 operands rather than two. For example, the vector form of\n-@samp{cond_len_fma@var{mode}} is equivalent to:\n+@mdindex @code{vec_cmpeq@var{m}@var{n}}\n+@item @samp{vec_cmpeq@var{m}@var{n}}\n+Similar to @code{vec_cmp@var{m}@var{n}} but performs an equality or non-equality\n+vector comparison only. If the @code{vec_cmp@var{m}@var{n}}\n+or @code{vec_cmpu@var{m}@var{n}} instruction pattern is supported,\n+it will be preferred over @code{vec_cmpeq@var{m}@var{n}}, so there is\n+no need to define this instruction pattern if the others are supported.\n \n-@smallexample\n-for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n- op0[i] = (i < ops[6] + ops[7] && op1[i]\n- ? fma (op2[i], op3[i], op4[i])\n- : op5[i]);\n-@end smallexample\n+@mdindex @code{vcond_mask_@var{m}@var{n}}\n+@item @samp{vcond_mask_@var{m}@var{n}}\n+Output a conditional vector move. Operand 0 is the destination to\n+receive a combination of operand 1 and operand 2, depending on the\n+mask in operand 3. 
Operands 0, 1, and 2 have mode @var{m} while\n+operand 3 has mode @var{n}.\n \n-@cindex @code{neg@var{mode}cc} instruction pattern\n-@item @samp{neg@var{mode}cc}\n-Similar to @samp{mov@var{mode}cc} but for conditional negation. Conditionally\n-move the negation of operand 2 or the unchanged operand 3 into operand 0\n-according to the comparison in operand 1. If the comparison is true, the negation\n-of operand 2 is moved into operand 0, otherwise operand 3 is moved.\n+Suppose that @var{m} has @var{e} elements. There are then two\n+supported forms of @var{n}. The first form is an integer or\n+boolean vector that also has @var{e} elements. In this case, each\n+element is -1 or 0, with -1 selecting elements from operand 1 and\n+0 selecting elements from operand 2. The second supported form\n+of @var{n} is a scalar integer that has at least @var{e} bits.\n+A set bit then selects from operand 1 and a clear bit selects\n+from operand 2. Bits @var{e} and above have no effect.\n \n-@cindex @code{not@var{mode}cc} instruction pattern\n-@item @samp{not@var{mode}cc}\n-Similar to @samp{neg@var{mode}cc} but for conditional complement.\n-Conditionally move the bitwise complement of operand 2 or the unchanged\n-operand 3 into operand 0 according to the comparison in operand 1.\n-If the comparison is true, the complement of operand 2 is moved into\n-operand 0, otherwise operand 3 is moved.\n+Subject to those restrictions, the behavior is equivalent to:\n \n-@cindex @code{cstore@var{mode}4} instruction pattern\n-@item @samp{cstore@var{mode}4}\n-Store zero or nonzero in operand 0 according to whether a comparison\n-is true. Operand 1 is a comparison operator. Operand 2 and operand 3\n-are the first and second operand of the comparison, respectively.\n-You specify the mode that operand 0 must have when you write the\n-@code{match_operand} expression. 
The compiler automatically sees which\n-mode you have used and supplies an operand of that mode.\n+@smallexample\n+for (i = 0; i < @var{e}; i++)\n+ op0[i] = op3[i] ? op1[i] : op2[i];\n+@end smallexample\n \n-The value stored for a true condition must have 1 as its low bit, or\n-else must be negative. Otherwise the instruction is not suitable and\n-you should omit it from the machine description. You describe to the\n-compiler exactly which value is stored by defining the macro\n-@code{STORE_FLAG_VALUE} (@pxref{Misc}). If a description cannot be\n-found that can be used for all the possible comparison operators, you\n-should pick one and use a @code{define_expand} to map all results\n-onto the one you chose.\n+@mdindex @code{vcond_mask_len_@var{m}@var{n}}\n+@item @samp{vcond_mask_len_@var{m}@var{n}}\n+Set each element of operand 0 to the corresponding element of operand 2\n+or operand 3. Choose operand 2 if both the element index is less than\n+operand 4 plus operand 5 and the corresponding element of operand 1\n+is nonzero:\n \n-These operations may @code{FAIL}, but should do so only in relatively\n-uncommon cases; if they would @code{FAIL} for common cases involving\n-integer comparisons, it is best to restrict the predicates to not\n-allow these operands. Likewise if a given comparison operator will\n-always fail, independent of the operands (for floating-point modes, the\n-@code{ordered_comparison_operator} predicate is often useful in this case).\n+@smallexample\n+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n+ op0[i] = i < op4 + op5 && op1[i] ? op2[i] : op3[i];\n+@end smallexample\n \n-If this pattern is omitted, the compiler will generate a conditional\n-branch---for example, it may copy a constant one to the target and branching\n-around an assignment of zero to the target---or a libcall. 
If the predicate\n-for operand 1 only rejects some operators, it will also try reordering the\n-operands and/or inverting the result value (e.g.@: by an exclusive OR).\n-These possibilities could be cheaper or equivalent to the instructions\n-used for the @samp{cstore@var{mode}4} pattern followed by those required\n-to convert a positive result from @code{STORE_FLAG_VALUE} to 1; in this\n-case, you can and should make operand 1's predicate reject some operators\n-in the @samp{cstore@var{mode}4} pattern, or remove the pattern altogether\n-from the machine description.\n+Operands 0, 2 and 3 have mode @var{m}. Operand 1 has mode @var{n}.\n+Operands 4 and 5 have a target-dependent scalar integer mode.\n \n-@cindex @code{tbranch_@var{op}@var{mode}3} instruction pattern\n-@item @samp{tbranch_@var{op}@var{mode}3}\n-Conditional branch instruction combined with a bit test-and-compare\n-instruction. Operand 0 is the operand of the comparison. Operand 1 is the bit\n-position of Operand 1 to test. Operand 3 is the @code{code_label} to jump to.\n-@var{op} is one of @var{eq} or @var{ne}.\n+@mdindex @code{maskload@var{m}@var{n}}\n+@item @samp{maskload@var{m}@var{n}}\n+Perform a masked load of vector from memory operand 1 of mode @var{m}\n+into register operand 0. The mask is provided in register operand 2 of\n+mode @var{n}. Operand 3 (the ``else value'') is of mode @var{m} and\n+specifies which value is loaded when the mask is unset.\n+The predicate of operand 3 must only accept the else values that the target\n+actually supports. Currently three values are attempted, zero, -1, and\n+undefined. GCC handles an else value of zero more efficiently than -1 or\n+undefined.\n \n-@cindex @code{cbranch@var{mode}4} instruction pattern\n-@item @samp{cbranch@var{mode}4}\n-Conditional branch instruction combined with a compare instruction.\n-Operand 0 is a comparison operator. Operand 1 and operand 2 are the\n-first and second operands of the comparison, respectively. 
Operand 3\n-is the @code{code_label} to jump to. For vectors this optab is only used for\n-comparisons of VECTOR_BOOLEAN_TYPE_P values and it never called for\n-data-registers. Data vector operands should use one of the patterns below\n-instead.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{vec_cbranch_any@var{mode}} instruction pattern\n-@item @samp{vec_cbranch_any@var{mode}}\n-Conditional branch instruction based on a vector compare that branches\n-when at least one of the elementwise comparisons of the two input\n-vectors is true.\n-Operand 0 is a comparison operator. Operand 1 and operand 2 are the\n-first and second operands of the comparison, respectively. Operand 3\n-is the @code{code_label} to jump to.\n+@mdindex @code{maskstore@var{m}@var{n}}\n+@item @samp{maskstore@var{m}@var{n}}\n+Perform a masked store of vector from register operand 1 of mode @var{m}\n+into memory operand 0. Mask is provided in register operand 2 of\n+mode @var{n}.\n \n-@cindex @code{vec_cbranch_all@var{mode}} instruction pattern\n-@item @samp{vec_cbranch_all@var{mode}}\n-Conditional branch instruction based on a vector compare that branches\n-when all of the elementwise comparisons of the two input vectors is true.\n-Operand 0 is a comparison operator. Operand 1 and operand 2 are the\n-first and second operands of the comparison, respectively. Operand 3\n-is the @code{code_label} to jump to.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{cond_vec_cbranch_any@var{mode}} instruction pattern\n-@item @samp{cond_vec_cbranch_any@var{mode}}\n-Masked conditional branch instruction based on a vector compare that branches\n-when at least one of the elementwise comparisons of the two input\n-vectors is true.\n-Operand 0 is a comparison operator. Operand 1 is the mask operand.\n-Operand 2 and operand 3 are the first and second operands of the comparison,\n-respectively. Operand 5 is the @code{code_label} to jump to. 
Inactive lanes in\n-the mask operand should not influence the decision to branch.\n+@mdindex @code{len_load_@var{m}}\n+@item @samp{len_load_@var{m}}\n+Load (operand 3 + operand 4) elements from memory operand 1\n+into vector register operand 0. Operands 0 and 1 have mode @var{m},\n+which must be a vector mode. Operand 3 has whichever integer mode the\n+target prefers. Operand 2 (the ``else value'') is of mode @var{m} and\n+specifies which value is loaded for the remaining elements. The predicate\n+of operand 2 must only accept the else values that the target actually\n+supports. Operand 4 conceptually has mode @code{QI}.\n \n-@cindex @code{cond_vec_cbranch_all@var{mode}} instruction pattern\n-@item @samp{cond_vec_cbranch_all@var{mode}}\n-Masked conditional branch instruction based on a vector compare that branches\n-when all of the elementwise comparisons of the two input vectors is true.\n-Operand 0 is a comparison operator. Operand 1 is the mask operand.\n-Operand 2 and operand 3 are the first and second operands of the comparison,\n-respectively. Operand 5 is the @code{code_label} to jump to. Inactive lanes in\n-the mask operand should not influence the decision to branch.\n+Operand 3 can be a variable or a constant amount. Operand 4 specifies a\n+constant bias: it is either a constant 0 or a constant -1. The predicate on\n+operand 4 must only accept the bias values that the target actually supports.\n+GCC handles a bias of 0 more efficiently than a bias of -1.\n \n-@cindex @code{cond_len_vec_cbranch_any@var{mode}} instruction pattern\n-@item @samp{cond_len_vec_cbranch_any@var{mode}}\n-Len based conditional branch instruction based on a vector compare that branches\n-when at least one of the elementwise comparisons of the two input\n-vectors is true.\n-Operand 0 is a comparison operator. Operand 1 is the mask operand. 
Operand 2\n-and operand 3 are the first and second operands of the comparison, respectively.\n-Operand 4 is the len operand and Operand 5 is the bias operand. Operand 6 is\n-the @code{code_label} to jump to. Inactive lanes in the mask operand should not\n-influence the decision to branch.\n+If (operand 3 + operand 4) exceeds the number of elements in mode\n+@var{m}, the behavior is undefined.\n \n-@cindex @code{cond_len_vec_cbranch_all@var{mode}} instruction pattern\n-@item @samp{cond_len_vec_cbranch_all@var{mode}}\n-Len based conditional branch instruction based on a vector compare that branches\n-when all of the elementwise comparisons of the two input vectors is true.\n-Operand 0 is a comparison operator. Operand 1 is the mask operand. Operand 2\n-and operand 3 are the first and second operands of the comparison, respectively.\n-Operand 4 is the len operand and Operand 5 is the bias operand. Operand 6 is\n-the @code{code_label} to jump to. Inactive lanes in the mask operand should not\n-influence the decision to branch.\n+If the target prefers the length to be measured in bytes rather than\n+elements, it should only implement this pattern for vectors of @code{QI}\n+elements.\n \n-@cindex @code{jump} instruction pattern\n-@item @samp{jump}\n-A jump inside a function; an unconditional branch. Operand 0 is the\n-@code{code_label} to jump to. This pattern name is mandatory on all\n-machines.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{call} instruction pattern\n-@item @samp{call}\n-Subroutine call instruction returning no value. Operand 0 is the\n-function to call; operand 1 is the number of bytes of arguments pushed\n-as a @code{const_int}. Operand 2 is the result of calling the target\n-hook @code{TARGET_FUNCTION_ARG} with the second argument @code{arg}\n-yielding true for @code{arg.end_marker_p ()}, in a call after all\n-parameters have been passed to that hook. 
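+As an illustrative sketch only (pseudocode; @code{N} is the number of\n+elements in mode @var{m}, @code{LEN} and @code{BIAS} stand for operands 3\n+and 4), @samp{len_load_@var{m}} behaves as:\n+\n+@smallexample\n+for (i = 0; i < LEN + BIAS; i++)\n+ operand0[i] = operand1[i];\n+for (; i < N; i++)\n+ operand0[i] = operand2[i]; /* the else value */\n+@end smallexample\n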
By default this is the first\n-register beyond those used for arguments in the call, or @code{NULL} if\n-all the argument-registers are used in the call.\n+@mdindex @code{len_store_@var{m}}\n+@item @samp{len_store_@var{m}}\n+Store (operand 2 + operand 3) vector elements from vector register operand 1\n+into memory operand 0, leaving the other elements of\n+operand 0 unchanged. Operands 0 and 1 have mode @var{m}, which must be\n+a vector mode. Operand 2 has whichever integer mode the target prefers.\n+Operand 3 conceptually has mode @code{QI}.\n \n-On most machines, operand 2 is not actually stored into the RTL\n-pattern. It is supplied for the sake of some RISC machines which need\n-to put this information into the assembler code; they can put it in\n-the RTL instead of operand 1.\n+Operand 2 can be a variable or a constant amount. Operand 3 specifies a\n+constant bias: it is either a constant 0 or a constant -1. The predicate on\n+operand 3 must only accept the bias values that the target actually supports.\n+GCC handles a bias of 0 more efficiently than a bias of -1.\n \n-Operand 0 should be a @code{mem} RTX whose address is the address of the\n-function. Note, however, that this address can be a @code{symbol_ref}\n-expression even if it would not be a legitimate memory address on the\n-target machine. If it is also not a valid argument for a call\n-instruction, the pattern for this operation should be a\n-@code{define_expand} (@pxref{Expander Definitions}) that places the\n-address into a register and uses that register in the call instruction.\n+If (operand 2 + operand 3) exceeds the number of elements in mode\n+@var{m}, the behavior is undefined.\n \n-@cindex @code{call_value} instruction pattern\n-@item @samp{call_value}\n-Subroutine call instruction returning a value. Operand 0 is the hard\n-register in which the value is returned. 
There are three more\n-operands, the same as the three operands of the @samp{call}\n-instruction (but with numbers increased by one).\n+If the target prefers the length to be measured in bytes\n+rather than elements, it should only implement this pattern for vectors\n+of @code{QI} elements.\n \n-Subroutines that return @code{BLKmode} objects use the @samp{call}\n-insn.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{call_pop} instruction pattern\n-@cindex @code{call_value_pop} instruction pattern\n-@item @samp{call_pop}, @samp{call_value_pop}\n-Similar to @samp{call} and @samp{call_value}, except used if defined and\n-if @code{RETURN_POPS_ARGS} is nonzero. They should emit a @code{parallel}\n-that contains both the function call and a @code{set} to indicate the\n-adjustment made to the frame pointer.\n+@mdindex @code{mask_len_load@var{m}@var{n}}\n+@item @samp{mask_len_load@var{m}@var{n}}\n+Perform a masked load from the memory location pointed to by operand 1\n+into register operand 0. (operand 4 + operand 5) elements are loaded from\n+memory and other elements in operand 0 are set to undefined values.\n+This is a combination of len_load and maskload.\n+Operands 0 and 1 have mode @var{m}, which must be a vector mode. Operand 4\n+has whichever integer mode the target prefers. A mask is specified in\n+operand 2 which must be of type @var{n}. The mask has lower precedence than\n+the length and is itself subject to length masking,\n+i.e.@: only mask indices < (operand 4 + operand 5) are used.\n+Operand 3 is an else operand similar to the one in @code{maskload}.\n+Operand 5 conceptually has mode @code{QI}.\n \n-For machines where @code{RETURN_POPS_ARGS} can be nonzero, the use of these\n-patterns increases the number of functions for which the frame pointer\n-can be eliminated, if desired.\n+Operand 4 can be a variable or a constant amount. Operand 5 specifies a\n+constant bias: it is either a constant 0 or a constant -1. 
The predicate on\n+operand 5 must only accept the bias values that the target actually supports.\n+GCC handles a bias of 0 more efficiently than a bias of -1.\n \n-@cindex @code{untyped_call} instruction pattern\n-@item @samp{untyped_call}\n-Subroutine call instruction returning a value of any type. Operand 0 is\n-the function to call; operand 1 is a memory location where the result of\n-calling the function is to be stored; operand 2 is a @code{parallel}\n-expression where each element is a @code{set} expression that indicates\n-the saving of a function return value into the result block.\n+If (operand 4 + operand 5) exceeds the number of elements in mode\n+@var{m}, the behavior is undefined.\n \n-This instruction pattern should be defined to support\n-@code{__builtin_apply} on machines where special instructions are needed\n-to call a subroutine with arbitrary arguments or to save the value\n-returned. This instruction pattern is required on machines that have\n-multiple registers that can hold a return value\n-(i.e.@: @code{FUNCTION_VALUE_REGNO_P} is true for more than one register).\n+If the target prefers the length to be measured in bytes\n+rather than elements, it should only implement this pattern for vectors\n+of @code{QI} elements.\n \n-@cindex @code{return} instruction pattern\n-@item @samp{return}\n-Subroutine return instruction. This instruction pattern name should be\n-defined only if a single instruction can do all the work of returning\n-from a function.\n+This pattern is not allowed to @code{FAIL}.\n \n-Like the @samp{mov@var{m}} patterns, this pattern is also used after the\n-RTL generation phase. In this case it is to support machines where\n-multiple instructions are usually needed to return from a function, but\n-some class of functions only requires one instruction to implement a\n-return. 
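+As an illustrative sketch only (pseudocode; @code{LEN} and @code{BIAS}\n+stand for operands 4 and 5, and the remaining elements of operand 0 are\n+undefined), @samp{mask_len_load@var{m}@var{n}} behaves as:\n+\n+@smallexample\n+for (i = 0; i < LEN + BIAS; i++)\n+ operand0[i] = operand2[i] ? operand1[i] : operand3[i];\n+@end smallexample\n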
Normally, the applicable functions are those which do not need\n-to save any registers or allocate stack space.\n+@mdindex @code{mask_len_store@var{m}@var{n}}\n+@item @samp{mask_len_store@var{m}@var{n}}\n+Perform a masked store from vector register operand 1 into memory operand 0.\n+(operand 3 + operand 4) elements are stored to memory,\n+leaving the other elements of operand 0 unchanged.\n+This is a combination of len_store and maskstore.\n+Operands 0 and 1 have mode @var{m}, which must be a vector mode. Operand 3 has whichever\n+integer mode the target prefers. A mask is specified in operand 2 which must be\n+of type @var{n}. The mask has lower precedence than the length and is itself subject to\n+length masking, i.e.@: only mask indices < (operand 3 + operand 4) are used.\n+Operand 4 conceptually has mode @code{QI}.\n \n-It is valid for this pattern to expand to an instruction using\n-@code{simple_return} if no epilogue is required.\n+Operand 3 can be a variable or a constant amount. Operand 4 specifies a\n+constant bias: it is either a constant 0 or a constant -1. The predicate on\n+operand 4 must only accept the bias values that the target actually supports.\n+GCC handles a bias of 0 more efficiently than a bias of -1.\n \n-@cindex @code{simple_return} instruction pattern\n-@item @samp{simple_return}\n-Subroutine return instruction. This instruction pattern name should be\n-defined only if a single instruction can do all the work of returning\n-from a function on a path where no epilogue is required. This pattern\n-is very similar to the @code{return} instruction pattern, but it is emitted\n-only by the shrink-wrapping optimization on paths where the function\n-prologue has not been executed, and a function return should occur without\n-any of the effects of the epilogue. 
Additional uses may be introduced on\n-paths where both the prologue and the epilogue have executed.\n+If (operand 3 + operand 4) exceeds the number of elements in mode\n+@var{m}, the behavior is undefined.\n \n-@findex reload_completed\n-@findex leaf_function_p\n-For such machines, the condition specified in this pattern should only\n-be true when @code{reload_completed} is nonzero and the function's\n-epilogue would only be a single instruction. For machines with register\n-windows, the routine @code{leaf_function_p} may be used to determine if\n-a register window push is required.\n+If the target prefers the length to be measured in bytes\n+rather than elements, it should only implement this pattern for vectors\n+of @code{QI} elements.\n \n-Machines that have conditional return instructions should define patterns\n-such as\n+This pattern is not allowed to @code{FAIL}.\n \n-@smallexample\n-(define_insn \"\"\n- [(set (pc)\n- (if_then_else (match_operator\n- 0 \"comparison_operator\"\n- [(reg:CC CC_REG) (const_int 0)])\n- (return)\n- (pc)))]\n- \"@var{condition}\"\n- \"@dots{}\")\n-@end smallexample\n+@mdindex @code{vec_perm@var{m}}\n+@item @samp{vec_perm@var{m}}\n+Output a (variable) vector permutation. Operand 0 is the destination\n+to receive elements from operand 1 and operand 2, which are of mode\n+@var{m}. Operand 3 is the @dfn{selector}. It is an integral mode\n+vector of the same width and number of elements as mode @var{m}.\n \n-where @var{condition} would normally be the same condition specified on the\n-named @samp{return} pattern.\n+The input elements are numbered from 0 in operand 1 through\n+@math{2*@var{N}-1} in operand 2. The elements of the selector must\n+be computed modulo @math{2*@var{N}}. 
Note that if\n+@code{rtx_equal_p(operand1, operand2)}, this can be implemented\n+with just operand 1 and selector elements modulo @var{N}.\n \n-@cindex @code{untyped_return} instruction pattern\n-@item @samp{untyped_return}\n-Untyped subroutine return instruction. This instruction pattern should\n-be defined to support @code{__builtin_return} on machines where special\n-instructions are needed to return a value of any type.\n+In order to make things easy for a number of targets, if there is no\n+@samp{vec_perm} pattern for mode @var{m}, but there is for mode @var{q}\n+where @var{q} is a vector of @code{QImode} of the same width as @var{m},\n+the middle-end will lower the mode @var{m} @code{VEC_PERM_EXPR} to\n+mode @var{q}.\n \n-Operand 0 is a memory location where the result of calling a function\n-with @code{__builtin_apply} is stored; operand 1 is a @code{parallel}\n-expression where each element is a @code{set} expression that indicates\n-the restoring of a function return value from the result block.\n+See also @code{TARGET_VECTORIZER_VEC_PERM_CONST}, which performs\n+the analogous operation for constant selectors.\n \n-@cindex @code{nop} instruction pattern\n-@item @samp{nop}\n-No-op instruction. This instruction pattern name should always be defined\n-to output a no-op in assembler code. @code{(const_int 0)} will do as an\n-RTL pattern.\n+@mdindex @code{reduc_smin_scal_@var{m}}\n+@mdindex @code{reduc_smax_scal_@var{m}}\n+@item @samp{reduc_smin_scal_@var{m}}, @samp{reduc_smax_scal_@var{m}}\n+Find the signed minimum/maximum of the elements of a vector. 
The vector is\n+operand 1, and operand 0 is the scalar result, with mode equal to the mode of\n+the elements of the input vector.\n \n-@cindex @code{indirect_jump} instruction pattern\n-@item @samp{indirect_jump}\n-An instruction to jump to an address which is operand zero.\n-This pattern name is mandatory on all machines.\n+@mdindex @code{reduc_umin_scal_@var{m}}\n+@mdindex @code{reduc_umax_scal_@var{m}}\n+@item @samp{reduc_umin_scal_@var{m}}, @samp{reduc_umax_scal_@var{m}}\n+Find the unsigned minimum/maximum of the elements of a vector. The vector is\n+operand 1, and operand 0 is the scalar result, with mode equal to the mode of\n+the elements of the input vector.\n \n-@cindex @code{casesi} instruction pattern\n-@item @samp{casesi}\n-Instruction to jump through a dispatch table, including bounds checking.\n-This instruction takes five operands:\n+@mdindex @code{reduc_fmin_scal_@var{m}}\n+@mdindex @code{reduc_fmax_scal_@var{m}}\n+@item @samp{reduc_fmin_scal_@var{m}}, @samp{reduc_fmax_scal_@var{m}}\n+Find the floating-point minimum/maximum of the elements of a vector,\n+using the same rules as @code{fmin@var{m}3} and @code{fmax@var{m}3}.\n+Operand 1 is a vector of mode @var{m} and operand 0 is the scalar\n+result, which has mode @code{GET_MODE_INNER (@var{m})}.\n \n-@enumerate\n-@item\n-The index to dispatch on, which has mode @code{SImode}.\n+@mdindex @code{reduc_plus_scal_@var{m}}\n+@item @samp{reduc_plus_scal_@var{m}}\n+Compute the sum of the elements of a vector. 
The vector is operand 1, and\n+operand 0 is the scalar result, with mode equal to the mode of the elements of\n+the input vector.\n \n-@item\n-The lower bound for indices in the table, an integer constant.\n+@mdindex @code{reduc_and_scal_@var{m}}\n+@mdindex @code{reduc_ior_scal_@var{m}}\n+@mdindex @code{reduc_xor_scal_@var{m}}\n+@item @samp{reduc_and_scal_@var{m}}\n+@itemx @samp{reduc_ior_scal_@var{m}}\n+@itemx @samp{reduc_xor_scal_@var{m}}\n+Compute the bitwise @code{AND}/@code{IOR}/@code{XOR} reduction of the elements\n+of a vector of mode @var{m}. Operand 1 is the vector input and operand 0\n+is the scalar result. The mode of the scalar result is the same as one\n+element of @var{m}.\n \n-@item\n-The total range of indices in the table---the largest index\n-minus the smallest one (both inclusive).\n+@mdindex @code{reduc_sbool_and_scal_@var{m}}\n+@mdindex @code{reduc_sbool_ior_scal_@var{m}}\n+@mdindex @code{reduc_sbool_xor_scal_@var{m}}\n+@item @samp{reduc_sbool_and_scal_@var{m}}\n+@itemx @samp{reduc_sbool_ior_scal_@var{m}}\n+@itemx @samp{reduc_sbool_xor_scal_@var{m}}\n+Compute the bitwise @code{AND}/@code{IOR}/@code{XOR} reduction of the elements\n+of a boolean vector of mode @var{m}. Operand 1 is the vector input and\n+operand 0 is the scalar result. The mode of the scalar result is @code{QImode}\n+with its value either zero or one. If mode @var{m} is a scalar integer mode\n+then operand 2 is the number of elements in the input vector to provide\n+disambiguation for the case where @var{m} is ambiguous.\n \n-@item\n-A label that precedes the table itself.\n+@mdindex @code{extract_last_@var{m}}\n+@item @code{extract_last_@var{m}}\n+Find the last set bit in mask operand 1 and extract the associated element\n+of vector operand 2. Store the result in scalar operand 0. Operand 2\n+has vector mode @var{m} while operand 0 has the mode appropriate for one\n+element of @var{m}. 
Operand 1 has the usual mask mode for vectors of mode\n+@var{m}; see @code{TARGET_VECTORIZE_GET_MASK_MODE}.\n \n-@item\n-A label to jump to if the index has a value outside the bounds.\n-@end enumerate\n+@mdindex @code{fold_extract_last_@var{m}}\n+@item @code{fold_extract_last_@var{m}}\n+If any bits of mask operand 2 are set, find the last set bit, extract\n+the associated element from vector operand 3, and store the result\n+in operand 0. Store operand 1 in operand 0 otherwise. Operand 3\n+has mode @var{m} and operands 0 and 1 have the mode appropriate for\n+one element of @var{m}. Operand 2 has the usual mask mode for vectors\n+of mode @var{m}; see @code{TARGET_VECTORIZE_GET_MASK_MODE}.\n \n-The table is an @code{addr_vec} or @code{addr_diff_vec} inside of a\n-@code{jump_table_data}. The number of elements in the table is one plus the\n-difference between the upper bound and the lower bound.\n+@mdindex @code{len_fold_extract_last_@var{m}}\n+@item @code{len_fold_extract_last_@var{m}}\n+Like @samp{fold_extract_last_@var{m}}, but takes an extra length operand as\n+operand 4 and an extra bias operand as operand 5. The extracted element is\n+the last set element with index i < len (operand 4) + bias (operand 5).\n \n-@cindex @code{tablejump} instruction pattern\n-@item @samp{tablejump}\n-Instruction to jump to a variable address. This is a low-level\n-capability which can be used to implement a dispatch table when there\n-is no @samp{casesi} pattern.\n+@mdindex @code{fold_left_plus_@var{m}}\n+@item @code{fold_left_plus_@var{m}}\n+Take scalar operand 1 and successively add each element from vector\n+operand 2. Store the result in scalar operand 0. The vector has\n+mode @var{m} and the scalars have the mode appropriate for one\n+element of @var{m}. The operation is strictly in-order: there is\n+no reassociation.\n \n-This pattern requires two operands: the address or offset, and a label\n-which should immediately precede the jump table. 
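+As an illustrative sketch only (pseudocode; @code{N} is the number of\n+elements in mode @var{m}), @samp{fold_left_plus_@var{m}} behaves as:\n+\n+@smallexample\n+operand0 = operand1;\n+for (i = 0; i < N; i++)\n+ operand0 += operand2[i];\n+@end smallexample\n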
If the macro\n-@code{CASE_VECTOR_PC_RELATIVE} evaluates to a nonzero value then the first\n-operand is an offset which counts from the address of the table; otherwise,\n-it is an absolute address to jump to. In either case, the first operand has\n-mode @code{Pmode}.\n+@mdindex @code{mask_fold_left_plus_@var{m}}\n+@item @code{mask_fold_left_plus_@var{m}}\n+Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand\n+(operand 3) that specifies which elements of the source vector should be added.\n \n-The @samp{tablejump} insn is always the last insn before the jump\n-table it uses. Its assembler code normally has no need to use the\n-second operand, but you should incorporate it in the RTL pattern so\n-that the jump optimizer will not delete the table as unreachable code.\n+@mdindex @code{mask_len_fold_left_plus_@var{m}}\n+@item @code{mask_len_fold_left_plus_@var{m}}\n+Like @samp{fold_left_plus_@var{m}}, but takes an additional mask operand\n+(operand 3), a len operand (operand 4) and a bias operand (operand 5), and\n+performs the following operations strictly in-order (no reassociation):\n \n+@smallexample\n+operand0 = operand1;\n+for (i = 0; i < LEN + BIAS; i++)\n+ if (operand3[i])\n+ operand0 += operand2[i];\n+@end smallexample\n \n-@cindex @code{doloop_end} instruction pattern\n-@item @samp{doloop_end}\n-Conditional branch instruction that decrements a register and\n-jumps if the register is nonzero. Operand 0 is the register to\n-decrement and test; operand 1 is the label to jump to if the\n-register is nonzero.\n-@xref{Looping Patterns}.\n+@mdindex @code{sdot_prod@var{m}@var{n}}\n+@item @samp{sdot_prod@var{m}@var{n}}\n \n-This optional instruction pattern should be defined for machines with\n-low-overhead looping instructions as the loop optimizer will try to\n-modify suitable loops to utilize it. 
The target hook\n-@code{TARGET_CAN_USE_DOLOOP_P} controls the conditions under which\n-low-overhead loops can be used.\n+Multiply operand 1 by operand 2 without loss of precision, given that\n+both operands contain signed elements. Add each product to the overlapping\n+element of operand 3 and store the result in operand 0. Operands 0 and 3\n+have mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n}\n+having narrower elements than @var{m}.\n \n-@cindex @code{doloop_begin} instruction pattern\n-@item @samp{doloop_begin}\n-Companion instruction to @code{doloop_end} required for machines that\n-need to perform some initialization, such as loading a special counter\n-register. Operand 1 is the associated @code{doloop_end} pattern and\n-operand 0 is the register that it decrements.\n+Semantically the expressions perform the multiplication in the following signs\n \n-If initialization insns do not always need to be emitted, use a\n-@code{define_expand} (@pxref{Expander Definitions}) and make it fail.\n+@smallexample\n+sdot<signed op0, signed op1, signed op2, signed op3> ==\n+ op0 = sign-ext (op1) * sign-ext (op2) + op3\n+@dots{}\n+@end smallexample\n \n-@cindex @code{canonicalize_funcptr_for_compare} instruction pattern\n-@item @samp{canonicalize_funcptr_for_compare}\n-Canonicalize the function pointer in operand 1 and store the result\n-into operand 0.\n+@mdindex @code{udot_prod@var{m}@var{n}}\n+@item @samp{udot_prod@var{m}@var{n}}\n \n-Operand 0 is always a @code{reg} and has mode @code{Pmode}; operand 1\n-may be a @code{reg}, @code{mem}, @code{symbol_ref}, @code{const_int}, etc\n-and also has mode @code{Pmode}.\n+Multiply operand 1 by operand 2 without loss of precision, given that\n+both operands contain unsigned elements. Add each product to the overlapping\n+element of operand 3 and store the result in operand 0. 
Operands 0 and 3\n+have mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n}\n+having narrower elements than @var{m}.\n \n-Canonicalization of a function pointer usually involves computing\n-the address of the function which would be called if the function\n-pointer were used in an indirect call.\n+Semantically the expressions perform the multiplication in the following signs\n \n-Only define this pattern if function pointers on the target machine\n-can have different values but still call the same function when\n-used in an indirect call.\n+@smallexample\n+udot<unsigned op0, unsigned op1, unsigned op2, unsigned op3> ==\n+ op0 = zero-ext (op1) * zero-ext (op2) + op3\n+@dots{}\n+@end smallexample\n \n-@cindex @code{save_stack_block} instruction pattern\n-@cindex @code{save_stack_function} instruction pattern\n-@cindex @code{save_stack_nonlocal} instruction pattern\n-@cindex @code{restore_stack_block} instruction pattern\n-@cindex @code{restore_stack_function} instruction pattern\n-@cindex @code{restore_stack_nonlocal} instruction pattern\n-@item @samp{save_stack_block}\n-@itemx @samp{save_stack_function}\n-@itemx @samp{save_stack_nonlocal}\n-@itemx @samp{restore_stack_block}\n-@itemx @samp{restore_stack_function}\n-@itemx @samp{restore_stack_nonlocal}\n-Most machines save and restore the stack pointer by copying it to or\n-from an object of mode @code{Pmode}. Do not define these patterns on\n-such machines.\n+@mdindex @code{usdot_prod@var{m}@var{n}}\n+@item @samp{usdot_prod@var{m}@var{n}}\n+Compute the sum of the products of elements of different signs.\n+Multiply operand 1 by operand 2 without loss of precision, given that operand 1\n+is unsigned and operand 2 is signed. Add each product to the overlapping\n+element of operand 3 and store the result in operand 0. 
Operands 0 and 3 have\n+mode @var{m} and operands 1 and 2 have mode @var{n}, with @var{n} having\n+narrower elements than @var{m}.\n \n-Some machines require special handling for stack pointer saves and\n-restores. On those machines, define the patterns corresponding to the\n-non-standard cases by using a @code{define_expand} (@pxref{Expander\n-Definitions}) that produces the required insns. The three types of\n-saves and restores are:\n+Semantically the expressions perform the multiplication in the following signs\n \n-@enumerate\n-@item\n-@samp{save_stack_block} saves the stack pointer at the start of a block\n-that allocates a variable-sized object, and @samp{restore_stack_block}\n-restores the stack pointer when the block is exited.\n+@smallexample\n+usdot<signed op0, unsigned op1, signed op2, signed op3> ==\n+ op0 = ((signed-conv) zero-ext (op1)) * sign-ext (op2) + op3\n+@dots{}\n+@end smallexample\n \n-@item\n-@samp{save_stack_function} and @samp{restore_stack_function} do a\n-similar job for the outermost block of a function and are used when the\n-function allocates variable-sized objects or calls @code{alloca}. Only\n-the epilogue uses the restored stack pointer, allowing a simpler save or\n-restore sequence on some machines.\n+@mdindex @code{vec_shl_insert_@var{m}}\n+@item @samp{vec_shl_insert_@var{m}}\n+Shift the elements in vector input operand 1 left one element (i.e.@:\n+away from element 0) and fill the vacated element 0 with the scalar\n+in operand 2. Store the result in vector output operand 0. Operands\n+0 and 1 have mode @var{m} and operand 2 has the mode appropriate for\n+one element of @var{m}.\n \n-@item\n-@samp{save_stack_nonlocal} is used in functions that contain labels\n-branched to by nested functions. It saves the stack pointer in such a\n-way that the inner function can use @samp{restore_stack_nonlocal} to\n-restore the stack pointer. 
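+As an illustrative sketch only (pseudocode; @code{N} is the number of\n+elements in mode @var{m}), @samp{vec_shl_insert_@var{m}} behaves as:\n+\n+@smallexample\n+for (i = N - 1; i > 0; i--)\n+ operand0[i] = operand1[i-1];\n+operand0[0] = operand2;\n+@end smallexample\n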
The compiler generates code to restore the\n-frame and argument pointer registers, but some machines require saving\n-and restoring additional data such as register window information or\n-stack backchains. Place insns in these patterns to save and restore any\n-such required data.\n-@end enumerate\n+@mdindex @code{vec_shl_@var{m}}\n+@item @samp{vec_shl_@var{m}}\n+Whole vector left shift in bits, i.e.@: away from element 0.\n+Operand 1 is a vector to be shifted.\n+Operand 2 is an integer shift amount in bits.\n+Operand 0 is where the resulting shifted vector is stored.\n+The output and input vectors should have the same modes.\n+\n+@mdindex @code{vec_shr_@var{m}}\n+@item @samp{vec_shr_@var{m}}\n+Whole vector right shift in bits, i.e.@: towards element 0.\n+Operand 1 is a vector to be shifted.\n+Operand 2 is an integer shift amount in bits.\n+Operand 0 is where the resulting shifted vector is stored.\n+The output and input vectors should have the same modes.\n \n-When saving the stack pointer, operand 0 is the save area and operand 1\n-is the stack pointer. The mode used to allocate the save area defaults\n-to @code{Pmode} but you can override that choice by defining the\n-@code{STACK_SAVEAREA_MODE} macro (@pxref{Storage Layout}). You must\n-specify an integral mode, or @code{VOIDmode} if no save area is needed\n-for a particular type of save (either because no save is needed or\n-because a machine-specific save area can be used). Operand 0 is the\n-stack pointer and operand 1 is the save area for restore operations. If\n-@samp{save_stack_block} is defined, operand 0 must not be\n-@code{VOIDmode} since these saves can be arbitrarily nested.\n+@mdindex @code{vec_pack_trunc_@var{m}}\n+@item @samp{vec_pack_trunc_@var{m}}\n+Narrow (demote) and merge the elements of two vectors. Operands 1 and 2\n+are vectors of the same mode having N integral or floating point elements\n+of size S@. 
Operand 0 is the resulting vector in which 2*N elements of\n+size S/2 are concatenated after narrowing them down using truncation.\n \n-A save area is a @code{mem} that is at a constant offset from\n-@code{virtual_stack_vars_rtx} when the stack pointer is saved for use by\n-nonlocal gotos and a @code{reg} in the other two cases.\n+@mdindex @code{vec_pack_sbool_trunc_@var{m}}\n+@item @samp{vec_pack_sbool_trunc_@var{m}}\n+Narrow and merge the elements of two vectors. Operands 1 and 2 are vectors\n+of the same type having N boolean elements. Operand 0 is the resulting\n+vector in which 2*N elements are concatenated. The last operand (operand 3)\n+is the number of elements in the output vector 2*N as a @code{CONST_INT}.\n+This instruction pattern is used when all the vector input and output\n+operands have the same scalar mode @var{m} and thus using\n+@code{vec_pack_trunc_@var{m}} would be ambiguous.\n \n-@cindex @code{allocate_stack} instruction pattern\n-@item @samp{allocate_stack}\n-Subtract (or add if @code{STACK_GROWS_DOWNWARD} is undefined) operand 1 from\n-the stack pointer to create space for dynamically allocated data.\n+@mdindex @code{vec_pack_ssat_@var{m}}\n+@mdindex @code{vec_pack_usat_@var{m}}\n+@item @samp{vec_pack_ssat_@var{m}}, @samp{vec_pack_usat_@var{m}}\n+Narrow (demote) and merge the elements of two vectors. Operands 1 and 2\n+are vectors of the same mode having N integral elements of size S.\n+Operand 0 is the resulting vector in which the elements of the two input\n+vectors are concatenated after narrowing them down using signed/unsigned\n+saturating arithmetic.\n \n-Store the resultant pointer to this space into operand 0. If you\n-are allocating space from the main stack, do this by emitting a\n-move insn to copy @code{virtual_stack_dynamic_rtx} to operand 0.\n-If you are allocating the space elsewhere, generate code to copy the\n-location of the space to operand 0. 
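+As an illustrative sketch only (pseudocode; @code{narrow} stands for\n+truncating or saturating conversion as the pattern name specifies, and\n+element numbering follows the target's register element order), the pack\n+patterns behave as:\n+\n+@smallexample\n+for (i = 0; i < N; i++)\n+ operand0[i] = narrow (operand1[i]);\n+for (i = 0; i < N; i++)\n+ operand0[N + i] = narrow (operand2[i]);\n+@end smallexample\n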
In the latter case, you must\n-ensure this space gets freed when the corresponding space on the main\n-stack is free.\n+@mdindex @code{vec_pack_sfix_trunc_@var{m}}\n+@mdindex @code{vec_pack_ufix_trunc_@var{m}}\n+@item @samp{vec_pack_sfix_trunc_@var{m}}, @samp{vec_pack_ufix_trunc_@var{m}}\n+Narrow, convert to signed/unsigned integral type and merge the elements\n+of two vectors. Operands 1 and 2 are vectors of the same mode having N\n+floating point elements of size S@. Operand 0 is the resulting vector\n+in which 2*N elements of size S/2 are concatenated.\n \n-Do not define this pattern if all that must be done is the subtraction.\n-Some machines require other operations such as stack probes or\n-maintaining the back chain. Define this pattern to emit those\n-operations in addition to updating the stack pointer.\n+@mdindex @code{vec_packs_float_@var{m}}\n+@mdindex @code{vec_packu_float_@var{m}}\n+@item @samp{vec_packs_float_@var{m}}, @samp{vec_packu_float_@var{m}}\n+Narrow, convert to floating point type and merge the elements\n+of two vectors. Operands 1 and 2 are vectors of the same mode having N\n+signed/unsigned integral elements of size S@. Operand 0 is the resulting vector\n+in which 2*N elements of size S/2 are concatenated.\n \n-@cindex @code{check_stack} instruction pattern\n-@item @samp{check_stack}\n-If stack checking (@pxref{Stack Checking}) cannot be done on your system by\n-probing the stack, define this pattern to perform the needed check and signal\n-an error if the stack has overflowed. 
The single operand is the address in\n-the stack farthest from the current stack pointer that you need to validate.\n-Normally, on platforms where this pattern is needed, you would obtain the\n-stack limit from a global or thread-specific variable or register.\n+@mdindex @code{vec_unpacks_hi_@var{m}}\n+@mdindex @code{vec_unpacks_lo_@var{m}}\n+@item @samp{vec_unpacks_hi_@var{m}}, @samp{vec_unpacks_lo_@var{m}}\n+Extract and widen (promote) the high/low part of a vector of signed\n+integral or floating point elements. The input vector (operand 1) has N\n+elements of size S@. Widen (promote) the high/low elements of the vector\n+using signed or floating point extension and place the resulting N/2\n+values of size 2*S in the output vector (operand 0).\n \n-@cindex @code{probe_stack_address} instruction pattern\n-@item @samp{probe_stack_address}\n-If stack checking (@pxref{Stack Checking}) can be done on your system by\n-probing the stack but without the need to actually access it, define this\n-pattern and signal an error if the stack has overflowed. The single operand\n-is the memory address in the stack that needs to be probed.\n+@mdindex @code{vec_unpacku_hi_@var{m}}\n+@mdindex @code{vec_unpacku_lo_@var{m}}\n+@item @samp{vec_unpacku_hi_@var{m}}, @samp{vec_unpacku_lo_@var{m}}\n+Extract and widen (promote) the high/low part of a vector of unsigned\n+integral elements. The input vector (operand 1) has N elements of size S.\n+Widen (promote) the high/low elements of the vector using zero extension and\n+place the resulting N/2 values of size 2*S in the output vector (operand 0).\n \n-@cindex @code{probe_stack} instruction pattern\n-@item @samp{probe_stack}\n-If stack checking (@pxref{Stack Checking}) can be done on your system by\n-probing the stack but doing it with a ``store zero'' instruction is not valid\n-or optimal, define this pattern to do the probing differently and signal an\n-error if the stack has overflowed. 
The single operand is the memory reference\n-in the stack that needs to be probed.\n+@mdindex @code{vec_unpacks_sbool_hi_@var{m}}\n+@mdindex @code{vec_unpacks_sbool_lo_@var{m}}\n+@item @samp{vec_unpacks_sbool_hi_@var{m}}, @samp{vec_unpacks_sbool_lo_@var{m}}\n+Extract the high/low part of a vector of boolean elements that have scalar\n+mode @var{m}. The input vector (operand 1) has N elements, the output\n+vector (operand 0) has N/2 elements. The last operand (operand 2) is the\n+number of elements of the input vector N as a @code{CONST_INT}. These\n+patterns are used if both the input and output vectors have the same scalar\n+mode @var{m} and thus using @code{vec_unpacks_hi_@var{m}} or\n+@code{vec_unpacks_lo_@var{m}} would be ambiguous.\n \n-@cindex @code{nonlocal_goto} instruction pattern\n-@item @samp{nonlocal_goto}\n-Emit code to generate a non-local goto, e.g., a jump from one function\n-to a label in an outer function. This pattern has four arguments,\n-each representing a value to be used in the jump. The first\n-argument is to be loaded into the frame pointer, the second is\n-the address to branch to (code to dispatch to the actual label),\n-the third is the address of a location where the stack is saved,\n-and the last is the address of the label, to be placed in the\n-location for the incoming static chain.\n+@mdindex @code{vec_unpacks_float_hi_@var{m}}\n+@mdindex @code{vec_unpacks_float_lo_@var{m}}\n+@mdindex @code{vec_unpacku_float_hi_@var{m}}\n+@mdindex @code{vec_unpacku_float_lo_@var{m}}\n+@item @samp{vec_unpacks_float_hi_@var{m}}, @samp{vec_unpacks_float_lo_@var{m}}\n+@itemx @samp{vec_unpacku_float_hi_@var{m}}, @samp{vec_unpacku_float_lo_@var{m}}\n+Extract, convert to floating point type and widen the high/low part of a\n+vector of signed/unsigned integral elements. The input vector (operand 1)\n+has N elements of size S@. 
Convert the high/low elements of the vector using\n+floating point conversion and place the resulting N/2 values of size 2*S in\n+the output vector (operand 0).\n \n-On most machines you need not define this pattern, since GCC will\n-already generate the correct code, which is to load the frame pointer\n-and static chain, restore the stack (using the\n-@samp{restore_stack_nonlocal} pattern, if defined), and jump indirectly\n-to the dispatcher. You need only define this pattern if this code will\n-not work on your machine.\n+@mdindex @code{vec_unpack_sfix_trunc_hi_@var{m}}\n+@mdindex @code{vec_unpack_sfix_trunc_lo_@var{m}}\n+@mdindex @code{vec_unpack_ufix_trunc_hi_@var{m}}\n+@mdindex @code{vec_unpack_ufix_trunc_lo_@var{m}}\n+@item @samp{vec_unpack_sfix_trunc_hi_@var{m}},\n+@itemx @samp{vec_unpack_sfix_trunc_lo_@var{m}}\n+@itemx @samp{vec_unpack_ufix_trunc_hi_@var{m}}\n+@itemx @samp{vec_unpack_ufix_trunc_lo_@var{m}}\n+Extract, convert to signed/unsigned integer type and widen the high/low part of a\n+vector of floating point elements. The input vector (operand 1)\n+has N elements of size S@. Convert the high/low elements of the vector\n+to integers and place the resulting N/2 values of size 2*S in\n+the output vector (operand 0).\n \n-@cindex @code{nonlocal_goto_receiver} instruction pattern\n-@item @samp{nonlocal_goto_receiver}\n-This pattern, if defined, contains code needed at the target of a\n-nonlocal goto after the code already generated by GCC@. You will not\n-normally need to define this pattern. A typical reason why you might\n-need this pattern is if some value, such as a pointer to a global table,\n-must be restored when the frame pointer is restored. 
Note that a nonlocal\n-goto only occurs within a unit-of-translation, so a global table pointer\n-that is shared by all functions of a given module need not be restored.\n-There are no arguments.\n+@mdindex @code{vec_widen_umult_hi_@var{m}}\n+@mdindex @code{vec_widen_umult_lo_@var{m}}\n+@mdindex @code{vec_widen_smult_hi_@var{m}}\n+@mdindex @code{vec_widen_smult_lo_@var{m}}\n+@mdindex @code{vec_widen_umult_even_@var{m}}\n+@mdindex @code{vec_widen_umult_odd_@var{m}}\n+@mdindex @code{vec_widen_smult_even_@var{m}}\n+@mdindex @code{vec_widen_smult_odd_@var{m}}\n+@item @samp{vec_widen_umult_hi_@var{m}}, @samp{vec_widen_umult_lo_@var{m}}\n+@itemx @samp{vec_widen_smult_hi_@var{m}}, @samp{vec_widen_smult_lo_@var{m}}\n+@itemx @samp{vec_widen_umult_even_@var{m}}, @samp{vec_widen_umult_odd_@var{m}}\n+@itemx @samp{vec_widen_smult_even_@var{m}}, @samp{vec_widen_smult_odd_@var{m}}\n+Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)\n+are vectors with N signed/unsigned elements of size S@. Multiply the high/low\n+or even/odd elements of the two vectors, and put the N/2 products of size 2*S\n+in the output vector (operand 0). A target shouldn't implement the even/odd\n+pattern pair if it is less efficient than the lo/hi one.\n \n-@cindex @code{exception_receiver} instruction pattern\n-@item @samp{exception_receiver}\n-This pattern, if defined, contains code needed at the site of an\n-exception handler that isn't needed at the site of a nonlocal goto. You\n-will not normally need to define this pattern. A typical reason why you\n-might need this pattern is if some value, such as a pointer to a global\n-table, must be restored after control flow is branched to the handler of\n-an exception. 
There are no arguments.\n+@mdindex @code{vec_widen_ushiftl_hi_@var{m}}\n+@mdindex @code{vec_widen_ushiftl_lo_@var{m}}\n+@mdindex @code{vec_widen_sshiftl_hi_@var{m}}\n+@mdindex @code{vec_widen_sshiftl_lo_@var{m}}\n+@item @samp{vec_widen_ushiftl_hi_@var{m}}, @samp{vec_widen_ushiftl_lo_@var{m}}\n+@itemx @samp{vec_widen_sshiftl_hi_@var{m}}, @samp{vec_widen_sshiftl_lo_@var{m}}\n+Signed/Unsigned widening shift left. The first input (operand 1) is a vector\n+with N signed/unsigned elements of size S@. Operand 2 is a constant. Shift\n+the high/low elements of operand 1, and put the N/2 results of size 2*S in the\n+output vector (operand 0).\n \n-@cindex @code{builtin_setjmp_setup} instruction pattern\n-@item @samp{builtin_setjmp_setup}\n-This pattern, if defined, contains additional code needed to initialize\n-the @code{jmp_buf}. You will not normally need to define this pattern.\n-A typical reason why you might need this pattern is if some value, such\n-as a pointer to a global table, must be restored. Though it is\n-preferred that the pointer value be recalculated if possible (given the\n-address of a label for instance). The single argument is a pointer to\n-the @code{jmp_buf}. Note that the buffer is five words long and that\n-the first three are normally used by the generic mechanism.\n+@mdindex @code{vec_widen_saddl_hi_@var{m}}\n+@mdindex @code{vec_widen_saddl_lo_@var{m}}\n+@mdindex @code{vec_widen_uaddl_hi_@var{m}}\n+@mdindex @code{vec_widen_uaddl_lo_@var{m}}\n+@item @samp{vec_widen_uaddl_hi_@var{m}}, @samp{vec_widen_uaddl_lo_@var{m}}\n+@itemx @samp{vec_widen_saddl_hi_@var{m}}, @samp{vec_widen_saddl_lo_@var{m}}\n+Signed/Unsigned widening add long. Operands 1 and 2 are vectors with N\n+signed/unsigned elements of size S@. 
Add the high/low elements of 1 and 2\n+together, widen the resulting elements and put the N/2 results of size 2*S in\n+the output vector (operand 0).\n \n-@cindex @code{builtin_setjmp_receiver} instruction pattern\n-@item @samp{builtin_setjmp_receiver}\n-This pattern, if defined, contains code needed at the site of a\n-built-in setjmp that isn't needed at the site of a nonlocal goto. You\n-will not normally need to define this pattern. A typical reason why you\n-might need this pattern is if some value, such as a pointer to a global\n-table, must be restored. It takes one argument, which is the label\n-to which builtin_longjmp transferred control; this pattern may be emitted\n-at a small offset from that label.\n+@mdindex @code{vec_widen_ssubl_hi_@var{m}}\n+@mdindex @code{vec_widen_ssubl_lo_@var{m}}\n+@mdindex @code{vec_widen_usubl_hi_@var{m}}\n+@mdindex @code{vec_widen_usubl_lo_@var{m}}\n+@item @samp{vec_widen_usubl_hi_@var{m}}, @samp{vec_widen_usubl_lo_@var{m}}\n+@itemx @samp{vec_widen_ssubl_hi_@var{m}}, @samp{vec_widen_ssubl_lo_@var{m}}\n+Signed/Unsigned widening subtract long. Operands 1 and 2 are vectors with N\n+signed/unsigned elements of size S@. Subtract the high/low elements of 2 from\n+1 and widen the resulting elements. Put the N/2 results of size 2*S in the\n+output vector (operand 0).\n \n-@cindex @code{builtin_longjmp} instruction pattern\n-@item @samp{builtin_longjmp}\n-This pattern, if defined, performs the entire action of the longjmp.\n-You will not normally need to define this pattern unless you also define\n-@code{builtin_setjmp_setup}. 
The single argument is a pointer to the\n-@code{jmp_buf}.\n+@mdindex @code{vec_widen_sabd_hi_@var{m}}\n+@mdindex @code{vec_widen_sabd_lo_@var{m}}\n+@mdindex @code{vec_widen_sabd_odd_@var{m}}\n+@mdindex @code{vec_widen_sabd_even_@var{m}}\n+@mdindex @code{vec_widen_uabd_hi_@var{m}}\n+@mdindex @code{vec_widen_uabd_lo_@var{m}}\n+@mdindex @code{vec_widen_uabd_odd_@var{m}}\n+@mdindex @code{vec_widen_uabd_even_@var{m}}\n+@item @samp{vec_widen_uabd_hi_@var{m}}, @samp{vec_widen_uabd_lo_@var{m}}\n+@itemx @samp{vec_widen_uabd_odd_@var{m}}, @samp{vec_widen_uabd_even_@var{m}}\n+@itemx @samp{vec_widen_sabd_hi_@var{m}}, @samp{vec_widen_sabd_lo_@var{m}}\n+@itemx @samp{vec_widen_sabd_odd_@var{m}}, @samp{vec_widen_sabd_even_@var{m}}\n+Signed/Unsigned widening absolute difference. Operands 1 and 2 are\n+vectors with N signed/unsigned elements of size S@. Find the absolute\n+difference between operands 1 and 2 and widen the resulting elements.\n+Put the N/2 results of size 2*S in the output vector (operand 0).\n \n-@cindex @code{eh_return} instruction pattern\n-@item @samp{eh_return}\n-This pattern, if defined, affects the way @code{__builtin_eh_return},\n-and thence the call frame exception handling library routines, are\n-built. It is intended to handle non-trivial actions needed along\n-the abnormal return path.\n+@mdindex @code{vec_trunc_add_high@var{m}}\n+@item @samp{vec_trunc_add_high@var{m}}\n+Signed or unsigned addition of two input integer vectors of mode @var{m},\n+followed by extraction of the most significant half of each result element,\n+narrowing to elements of half the original width.\n \n-The address of the exception handler to which the function should return\n-is passed as operand to this pattern. 
It will normally need to copied by\n-the pattern to some special register or memory location.\n-If the pattern needs to determine the location of the target call\n-frame in order to do so, it may use @code{EH_RETURN_STACKADJ_RTX},\n-if defined; it will have already been assigned.\n+Concretely, it computes:\n+@code{(bits(a)/2)((a + b) >> bits(a)/2)}\n \n-If this pattern is not defined, the default action will be to simply\n-copy the return address to @code{EH_RETURN_HANDLER_RTX}. Either\n-that macro or this pattern needs to be defined if call frame exception\n-handling is to be used.\n+where @code{bits(a)} is the width in bits of each input element.\n \n-@cindex @code{prologue} instruction pattern\n-@anchor{prologue instruction pattern}\n-@item @samp{prologue}\n-This pattern, if defined, emits RTL for entry to a function. The function\n-entry is responsible for setting up the stack frame, initializing the frame\n-pointer register, saving callee saved registers, etc.\n+Operands 1 and 2 are of integer vector mode @var{m} containing the same number\n+of signed or unsigned integral elements. The result (operand @code{0}) is of an\n+integer vector mode with the same number of elements but elements of half of the\n+width of those of mode @var{m}.\n \n-Using a prologue pattern is generally preferred over defining\n-@code{TARGET_ASM_FUNCTION_PROLOGUE} to emit assembly code for the prologue.\n+This operation is currently only used for early break result compression when the\n+result of a vector boolean can be represented as 0 or -1.\n \n-The @code{prologue} pattern is particularly useful for targets which perform\n-instruction scheduling.\n+@mdindex @code{vec_addsub@var{m}3}\n+@item @samp{vec_addsub@var{m}3}\n+Alternating subtract, add with even lanes doing subtract and odd\n+lanes doing addition. 
Operands 1 and 2 and the output operand are vectors\n+with mode @var{m}.\n \n-@cindex @code{window_save} instruction pattern\n-@anchor{window_save instruction pattern}\n-@item @samp{window_save}\n-This pattern, if defined, emits RTL for a register window save. It should\n-be defined if the target machine has register windows but the window events\n-are decoupled from calls to subroutines. The canonical example is the SPARC\n-architecture.\n+@mdindex @code{vec_fmaddsub@var{m}4}\n+@item @samp{vec_fmaddsub@var{m}4}\n+Alternating multiply subtract, add with even lanes doing subtract and odd\n+lanes doing addition of the third operand to the multiplication result\n+of the first two operands. Operands 1, 2 and 3 and the output operand are vectors\n+with mode @var{m}.\n \n-@cindex @code{epilogue} instruction pattern\n-@anchor{epilogue instruction pattern}\n-@item @samp{epilogue}\n-This pattern emits RTL for exit from a function. The function\n-exit is responsible for deallocating the stack frame, restoring callee saved\n-registers and emitting the return instruction.\n+@mdindex @code{vec_fmsubadd@var{m}4}\n+@item @samp{vec_fmsubadd@var{m}4}\n+Alternating multiply add, subtract with even lanes doing addition and odd\n+lanes doing subtraction of the third operand to the multiplication result\n+of the first two operands. Operands 1, 2 and 3 and the output operand are vectors\n+with mode @var{m}.\n \n-Using an epilogue pattern is generally preferred over defining\n-@code{TARGET_ASM_FUNCTION_EPILOGUE} to emit assembly code for the epilogue.\n+These instructions are not allowed to @code{FAIL}.\n \n-The @code{epilogue} pattern is particularly useful for targets which perform\n-instruction scheduling or which have delay slots for their return instruction.\n+@mdindex @code{cadd90@var{m}3}\n+@item @samp{cadd90@var{m}3}\n+Perform vector add and subtract on even/odd number pairs. 
The operation being\n+matched is semantically described as\n \n-@cindex @code{sibcall_epilogue} instruction pattern\n-@item @samp{sibcall_epilogue}\n-This pattern, if defined, emits RTL for exit from a function without the final\n-branch back to the calling function. This pattern will be emitted before any\n-sibling call (aka tail call) sites.\n+@smallexample\n+ for (int i = 0; i < N; i += 2)\n+ @{\n+ c[i] = a[i] - b[i+1];\n+ c[i+1] = a[i+1] + b[i];\n+ @}\n+@end smallexample\n \n-The @code{sibcall_epilogue} pattern must not clobber any arguments used for\n-parameter passing or any stack slots for arguments passed to the current\n-function.\n+This operation is semantically equivalent to performing a vector addition of\n+complex numbers in operand 1 with operand 2 rotated by 90 degrees around\n+the argand plane and storing the result in operand 0.\n \n-@cindex @code{trap} instruction pattern\n-@item @samp{trap}\n-This pattern, if defined, signals an error, typically by causing some\n-kind of signal to be raised.\n+In GCC lane ordering the real part of the number must be in the even lanes with\n+the imaginary part in the odd lanes.\n \n-@cindex @code{ctrap@var{MM}4} instruction pattern\n-@item @samp{ctrap@var{MM}4}\n-Conditional trap instruction. Operand 0 is a piece of RTL which\n-performs a comparison, and operands 1 and 2 are the arms of the\n-comparison. Operand 3 is the trap code, an integer.\n+The operation is only supported for vector modes @var{m}.\n \n-A typical @code{ctrap} pattern looks like\n+This pattern is not allowed to @code{FAIL}.\n+\n+@mdindex @code{cadd270@var{m}3}\n+@item @samp{cadd270@var{m}3}\n+Perform vector add and subtract on even/odd number pairs. 
The operation being\n+matched is semantically described as\n \n @smallexample\n-(define_insn \"ctrapsi4\"\n- [(trap_if (match_operator 0 \"trap_operator\"\n- [(match_operand 1 \"register_operand\")\n- (match_operand 2 \"immediate_operand\")])\n- (match_operand 3 \"const_int_operand\" \"i\"))]\n- \"\"\n- \"@dots{}\")\n+ for (int i = 0; i < N; i += 2)\n+ @{\n+ c[i] = a[i] + b[i+1];\n+ c[i+1] = a[i+1] - b[i];\n+ @}\n @end smallexample\n \n-@cindex @code{prefetch} instruction pattern\n-@item @samp{prefetch}\n-This pattern, if defined, emits code for a non-faulting data prefetch\n-instruction. Operand 0 is the address of the memory to prefetch. Operand 1\n-is a constant 1 if the prefetch is preparing for a write to the memory\n-address, or a constant 0 otherwise. Operand 2 is the expected degree of\n-temporal locality of the data and is a value between 0 and 3, inclusive; 0\n-means that the data has no temporal locality, so it need not be left in the\n-cache after the access; 3 means that the data has a high degree of temporal\n-locality and should be left in all levels of cache possible; 1 and 2 mean,\n-respectively, a low or moderate degree of temporal locality.\n-\n-Targets that do not support write prefetches or locality hints can ignore\n-the values of operands 1 and 2.\n+This operation is semantically equivalent to performing a vector addition of\n+complex numbers in operand 1 with operand 2 rotated by 270 degrees around\n+the argand plane and storing the result in operand 0.\n \n-@cindex @code{blockage} instruction pattern\n-@item @samp{blockage}\n-This pattern defines a pseudo insn that prevents the instruction\n-scheduler and other passes from moving instructions and using register\n-equivalences across the boundary defined by the blockage insn.\n-This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.\n+In GCC lane ordering the real part of the number must be in the even lanes with\n+the imaginary part in the odd lanes.\n \n-@cindex 
@code{memory_blockage} instruction pattern\n-@item @samp{memory_blockage}\n-This pattern, if defined, represents a compiler memory barrier, and will be\n-placed at points across which RTL passes may not propagate memory accesses.\n-This instruction needs to read and write volatile BLKmode memory. It does\n-not need to generate any machine instruction. If this pattern is not defined,\n-the compiler falls back to emitting an instruction corresponding\n-to @code{asm volatile (\"\" ::: \"memory\")}.\n+The operation is only supported for vector modes @var{m}.\n \n-@cindex @code{memory_barrier} instruction pattern\n-@item @samp{memory_barrier}\n-If the target memory model is not fully synchronous, then this pattern\n-should be defined to an instruction that orders both loads and stores\n-before the instruction with respect to loads and stores after the instruction.\n-This pattern has no operands.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{speculation_barrier} instruction pattern\n-@item @samp{speculation_barrier}\n-If the target can support speculative execution, then this pattern should\n-be defined to an instruction that will block subsequent execution until\n-any prior speculation conditions has been resolved. The pattern must also\n-ensure that the compiler cannot move memory operations past the barrier,\n-so it needs to be an UNSPEC_VOLATILE pattern. The pattern has no\n-operands.\n+@mdindex @code{cmla@var{m}4}\n+@item @samp{cmla@var{m}4}\n+Perform a vector multiply and accumulate that is semantically the same as\n+a multiply and accumulate of complex numbers.\n \n-If this pattern is not defined then the default expansion of\n-@code{__builtin_speculation_safe_value} will emit a warning. 
You can\n-suppress this warning by defining this pattern with a final condition\n-of @code{0} (zero), which tells the compiler that a speculation\n-barrier is not needed for this target.\n+@smallexample\n+ complex TYPE op0[N];\n+ complex TYPE op1[N];\n+ complex TYPE op2[N];\n+ complex TYPE op3[N];\n+ for (int i = 0; i < N; i += 1)\n+ @{\n+ op0[i] = op1[i] * op2[i] + op3[i];\n+ @}\n+@end smallexample\n \n-@cindex @code{sync_compare_and_swap@var{mode}} instruction pattern\n-@item @samp{sync_compare_and_swap@var{mode}}\n-This pattern, if defined, emits code for an atomic compare-and-swap\n-operation. Operand 1 is the memory on which the atomic operation is\n-performed. Operand 2 is the ``old'' value to be compared against the\n-current contents of the memory location. Operand 3 is the ``new'' value\n-to store in the memory if the compare succeeds. Operand 0 is the result\n-of the operation; it should contain the contents of the memory\n-before the operation. If the compare succeeds, this should obviously be\n-a copy of operand 2.\n+In GCC lane ordering the real part of the number must be in the even lanes with\n+the imaginary part in the odd lanes.\n \n-This pattern must show that both operand 0 and operand 1 are modified.\n+The operation is only supported for vector modes @var{m}.\n \n-This pattern must issue any memory barrier instructions such that all\n-memory operations before the atomic operation occur before the atomic\n-operation and all memory operations after the atomic operation occur\n-after the atomic operation.\n+This pattern is not allowed to @code{FAIL}.\n \n-For targets where the success or failure of the compare-and-swap\n-operation is available via the status flags, it is possible to\n-avoid a separate compare operation and issue the subsequent\n-branch or store-flag operation immediately after the compare-and-swap.\n-To this end, GCC will look for a @code{MODE_CC} set in the\n-output of @code{sync_compare_and_swap@var{mode}}; if the 
machine\n-description includes such a set, the target should also define special\n-@code{cbranchcc4} and/or @code{cstorecc4} instructions. GCC will then\n-be able to take the destination of the @code{MODE_CC} set and pass it\n-to the @code{cbranchcc4} or @code{cstorecc4} pattern as the first\n-operand of the comparison (the second will be @code{(const_int 0)}).\n+@mdindex @code{cmla_conj@var{m}4}\n+@item @samp{cmla_conj@var{m}4}\n+Perform a vector multiply by conjugate and accumulate that is semantically\n+the same as a multiply and accumulate of complex numbers where the second\n+multiply argument is conjugated.\n \n-For targets where the operating system may provide support for this\n-operation via library calls, the @code{sync_compare_and_swap_optab}\n-may be initialized to a function with the same interface as the\n-@code{__sync_val_compare_and_swap_@var{n}} built-in. If the entire\n-set of @var{__sync} builtins are supported via library calls, the\n-target can initialize all of the optabs at once with\n-@code{init_sync_libfuncs}.\n-For the purposes of C++11 @code{std::atomic::is_lock_free}, it is\n-assumed that these library calls do @emph{not} use any kind of\n-interruptable locking.\n+@smallexample\n+ complex TYPE op0[N];\n+ complex TYPE op1[N];\n+ complex TYPE op2[N];\n+ complex TYPE op3[N];\n+ for (int i = 0; i < N; i += 1)\n+ @{\n+ op0[i] = op1[i] * conj (op2[i]) + op3[i];\n+ @}\n+@end smallexample\n \n-@cindex @code{sync_add@var{mode}} instruction pattern\n-@cindex @code{sync_sub@var{mode}} instruction pattern\n-@cindex @code{sync_ior@var{mode}} instruction pattern\n-@cindex @code{sync_and@var{mode}} instruction pattern\n-@cindex @code{sync_xor@var{mode}} instruction pattern\n-@cindex @code{sync_nand@var{mode}} instruction pattern\n-@item @samp{sync_add@var{mode}}, @samp{sync_sub@var{mode}}\n-@itemx @samp{sync_ior@var{mode}}, @samp{sync_and@var{mode}}\n-@itemx @samp{sync_xor@var{mode}}, @samp{sync_nand@var{mode}}\n-These patterns emit code for an atomic 
operation on memory.\n-Operand 0 is the memory on which the atomic operation is performed.\n-Operand 1 is the second operand to the binary operator.\n+In GCC lane ordering the real part of the number must be in the even lanes with\n+the imaginary part in the odd lanes.\n \n-This pattern must issue any memory barrier instructions such that all\n-memory operations before the atomic operation occur before the atomic\n-operation and all memory operations after the atomic operation occur\n-after the atomic operation.\n+The operation is only supported for vector modes @var{m}.\n \n-If these patterns are not defined, the operation will be constructed\n-from a compare-and-swap operation, if defined.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{sync_old_add@var{mode}} instruction pattern\n-@cindex @code{sync_old_sub@var{mode}} instruction pattern\n-@cindex @code{sync_old_ior@var{mode}} instruction pattern\n-@cindex @code{sync_old_and@var{mode}} instruction pattern\n-@cindex @code{sync_old_xor@var{mode}} instruction pattern\n-@cindex @code{sync_old_nand@var{mode}} instruction pattern\n-@item @samp{sync_old_add@var{mode}}, @samp{sync_old_sub@var{mode}}\n-@itemx @samp{sync_old_ior@var{mode}}, @samp{sync_old_and@var{mode}}\n-@itemx @samp{sync_old_xor@var{mode}}, @samp{sync_old_nand@var{mode}}\n-These patterns emit code for an atomic operation on memory,\n-and return the value that the memory contained before the operation.\n-Operand 0 is the result value, operand 1 is the memory on which the\n-atomic operation is performed, and operand 2 is the second operand\n-to the binary operator.\n+@mdindex @code{cmls@var{m}4}\n+@item @samp{cmls@var{m}4}\n+Perform a vector multiply and subtract that is semantically the same as\n+a multiply and subtract of complex numbers.\n \n-This pattern must issue any memory barrier instructions such that all\n-memory operations before the atomic operation occur before the atomic\n-operation and all memory operations after the atomic 
operation occur\n-after the atomic operation.\n+@smallexample\n+ complex TYPE op0[N];\n+ complex TYPE op1[N];\n+ complex TYPE op2[N];\n+ complex TYPE op3[N];\n+ for (int i = 0; i < N; i += 1)\n+ @{\n+ op0[i] = op1[i] * op2[i] - op3[i];\n+ @}\n+@end smallexample\n \n-If these patterns are not defined, the operation will be constructed\n-from a compare-and-swap operation, if defined.\n+In GCC lane ordering the real part of the number must be in the even lanes with\n+the imaginary part in the odd lanes.\n \n-@cindex @code{sync_new_add@var{mode}} instruction pattern\n-@cindex @code{sync_new_sub@var{mode}} instruction pattern\n-@cindex @code{sync_new_ior@var{mode}} instruction pattern\n-@cindex @code{sync_new_and@var{mode}} instruction pattern\n-@cindex @code{sync_new_xor@var{mode}} instruction pattern\n-@cindex @code{sync_new_nand@var{mode}} instruction pattern\n-@item @samp{sync_new_add@var{mode}}, @samp{sync_new_sub@var{mode}}\n-@itemx @samp{sync_new_ior@var{mode}}, @samp{sync_new_and@var{mode}}\n-@itemx @samp{sync_new_xor@var{mode}}, @samp{sync_new_nand@var{mode}}\n-These patterns are like their @code{sync_old_@var{op}} counterparts,\n-except that they return the value that exists in the memory location\n-after the operation, rather than before the operation.\n+The operation is only supported for vector modes @var{m}.\n \n-@cindex @code{sync_lock_test_and_set@var{mode}} instruction pattern\n-@item @samp{sync_lock_test_and_set@var{mode}}\n-This pattern takes two forms, based on the capabilities of the target.\n-In either case, operand 0 is the result of the operand, operand 1 is\n-the memory on which the atomic operation is performed, and operand 2\n-is the value to set in the lock.\n+This pattern is not allowed to @code{FAIL}.\n \n-In the ideal case, this operation is an atomic exchange operation, in\n-which the previous value in memory operand is copied into the result\n-operand, and the value operand is stored in the memory operand.\n+@mdindex 
@code{cmls_conj@var{m}4}\n+@item @samp{cmls_conj@var{m}4}\n+Perform a vector multiply by conjugate and subtract that is semantically\n+the same as a multiply and subtract of complex numbers where the second\n+multiply argument is conjugated.\n \n-For less capable targets, any value operand that is not the constant 1\n-should be rejected with @code{FAIL}. In this case the target may use\n-an atomic test-and-set bit operation. The result operand should contain\n-1 if the bit was previously set and 0 if the bit was previously clear.\n-The true contents of the memory operand are implementation defined.\n+@smallexample\n+ complex TYPE op0[N];\n+ complex TYPE op1[N];\n+ complex TYPE op2[N];\n+ complex TYPE op3[N];\n+ for (int i = 0; i < N; i += 1)\n+ @{\n+ op0[i] = op1[i] * conj (op2[i]) - op3[i];\n+ @}\n+@end smallexample\n \n-This pattern must issue any memory barrier instructions such that the\n-pattern as a whole acts as an acquire barrier, that is all memory\n-operations after the pattern do not occur until the lock is acquired.\n+In GCC lane ordering the real part of the number must be in the even lanes with\n+the imaginary part in the odd lanes.\n \n-If this pattern is not defined, the operation will be constructed from\n-a compare-and-swap operation, if defined.\n+The operation is only supported for vector modes @var{m}.\n \n-@cindex @code{sync_lock_release@var{mode}} instruction pattern\n-@item @samp{sync_lock_release@var{mode}}\n-This pattern, if defined, releases a lock set by\n-@code{sync_lock_test_and_set@var{mode}}. 
Operand 0 is the memory\n-that contains the lock; operand 1 is the value to store in the lock.\n+This pattern is not allowed to @code{FAIL}.\n \n-If the target doesn't implement full semantics for\n-@code{sync_lock_test_and_set@var{mode}}, any value operand which is not\n-the constant 0 should be rejected with @code{FAIL}, and the true contents\n-of the memory operand are implementation defined.\n+@mdindex @code{cmul@var{m}4}\n+@item @samp{cmul@var{m}4}\n+Perform a vector multiply that is semantically the same as a multiply of\n+complex numbers.\n \n-This pattern must issue any memory barrier instructions such that the\n-pattern as a whole acts as a release barrier, that is the lock is\n-released only after all previous memory operations have completed.\n+@smallexample\n+ complex TYPE op0[N];\n+ complex TYPE op1[N];\n+ complex TYPE op2[N];\n+ for (int i = 0; i < N; i += 1)\n+ @{\n+ op0[i] = op1[i] * op2[i];\n+ @}\n+@end smallexample\n \n-If this pattern is not defined, then a @code{memory_barrier} pattern\n-will be emitted, followed by a store of the value to the memory operand.\n+In GCC lane ordering the real part of the number must be in the even lanes with\n+the imaginary part in the odd lanes.\n \n-@cindex @code{atomic_compare_and_swap@var{mode}} instruction pattern\n-@item @samp{atomic_compare_and_swap@var{mode}} \n-This pattern, if defined, emits code for an atomic compare-and-swap\n-operation with memory model semantics. Operand 2 is the memory on which\n-the atomic operation is performed. Operand 0 is an output operand which\n-is set to true or false based on whether the operation succeeded. Operand\n-1 is an output operand which is set to the contents of the memory before\n-the operation was attempted. Operand 3 is the value that is expected to\n-be in memory. Operand 4 is the value to put in memory if the expected\n-value is found there. Operand 5 is set to 1 if this compare and swap is to\n-be treated as a weak operation. 
Operand 6 is the memory model to be used\n-if the operation is a success. Operand 7 is the memory model to be used\n-if the operation fails.\n+The operation is only supported for vector modes @var{m}.\n \n-If memory referred to in operand 2 contains the value in operand 3, then\n-operand 4 is stored in memory pointed to by operand 2 and fencing based on\n-the memory model in operand 6 is issued. \n+This pattern is not allowed to @code{FAIL}.\n \n-If memory referred to in operand 2 does not contain the value in operand 3,\n-then fencing based on the memory model in operand 7 is issued.\n+@mdindex @code{cmul_conj@var{m}4}\n+@item @samp{cmul_conj@var{m}4}\n+Perform a vector multiply by conjugate that is semantically the same as a\n+multiply of complex numbers where the second multiply argument is conjugated.\n \n-If a target does not support weak compare-and-swap operations, or the port\n-elects not to implement weak operations, the argument in operand 5 can be\n-ignored. Note a strong implementation must be provided.\n+@smallexample\n+ complex TYPE op0[N];\n+ complex TYPE op1[N];\n+ complex TYPE op2[N];\n+ for (int i = 0; i < N; i += 1)\n+ @{\n+ op0[i] = op1[i] * conj (op2[i]);\n+ @}\n+@end smallexample\n \n-If this pattern is not provided, the @code{__atomic_compare_exchange}\n-built-in functions will utilize the legacy @code{sync_compare_and_swap}\n-pattern with an @code{__ATOMIC_SEQ_CST} memory model.\n+In GCC lane ordering the real part of the number must be in the even lanes with\n+the imaginary part in the odd lanes.\n \n-@cindex @code{atomic_load@var{mode}} instruction pattern\n-@item @samp{atomic_load@var{mode}}\n-This pattern implements an atomic load operation with memory model\n-semantics. Operand 1 is the memory address being loaded from. Operand 0\n-is the result of the load. 
Operand 2 is the memory model to be used for\n-the load operation.\n+The operation is only supported for vector modes @var{m}.\n \n-If not present, the @code{__atomic_load} built-in function will either\n-resort to a normal load with memory barriers, or a compare-and-swap\n-operation if a normal load would not be atomic.\n+This pattern is not allowed to @code{FAIL}.\n \n-@cindex @code{atomic_store@var{mode}} instruction pattern\n-@item @samp{atomic_store@var{mode}}\n-This pattern implements an atomic store operation with memory model\n-semantics. Operand 0 is the memory address being stored to. Operand 1\n-is the value to be written. Operand 2 is the memory model to be used for\n-the operation.\n+@mdindex @code{cond_neg@var{mode}}\n+@mdindex @code{cond_one_cmpl@var{mode}}\n+@mdindex @code{cond_sqrt@var{mode}}\n+@mdindex @code{cond_ceil@var{mode}}\n+@mdindex @code{cond_floor@var{mode}}\n+@mdindex @code{cond_round@var{mode}}\n+@mdindex @code{cond_rint@var{mode}}\n+@item @samp{cond_neg@var{mode}}\n+@itemx @samp{cond_one_cmpl@var{mode}}\n+@itemx @samp{cond_sqrt@var{mode}}\n+@itemx @samp{cond_ceil@var{mode}}\n+@itemx @samp{cond_floor@var{mode}}\n+@itemx @samp{cond_round@var{mode}}\n+@itemx @samp{cond_rint@var{mode}}\n+When operand 1 is true, perform an operation on operand 2 and\n+store the result in operand 0, otherwise store operand 3 in operand 0.\n+The operation works elementwise if the operands are vectors.\n \n-If not present, the @code{__atomic_store} built-in function will attempt to\n-perform a normal store and surround it with any required memory fences. If\n-the store would not be atomic, then an @code{__atomic_exchange} is\n-attempted with the result being ignored.\n+The scalar case is equivalent to:\n \n-@cindex @code{atomic_exchange@var{mode}} instruction pattern\n-@item @samp{atomic_exchange@var{mode}}\n-This pattern implements an atomic exchange operation with memory model\n-semantics. 
Operand 1 is the memory location the operation is performed on.\n-Operand 0 is an output operand which is set to the original value contained\n-in the memory pointed to by operand 1. Operand 2 is the value to be\n-stored. Operand 3 is the memory model to be used.\n+@smallexample\n+op0 = op1 ? @var{op} op2 : op3;\n+@end smallexample\n \n-If this pattern is not present, the built-in function\n-@code{__atomic_exchange} will attempt to preform the operation with a\n-compare and swap loop.\n+while the vector case is equivalent to:\n \n-@cindex @code{atomic_add@var{mode}} instruction pattern\n-@cindex @code{atomic_sub@var{mode}} instruction pattern\n-@cindex @code{atomic_or@var{mode}} instruction pattern\n-@cindex @code{atomic_and@var{mode}} instruction pattern\n-@cindex @code{atomic_xor@var{mode}} instruction pattern\n-@cindex @code{atomic_nand@var{mode}} instruction pattern\n-@item @samp{atomic_add@var{mode}}, @samp{atomic_sub@var{mode}}\n-@itemx @samp{atomic_or@var{mode}}, @samp{atomic_and@var{mode}}\n-@itemx @samp{atomic_xor@var{mode}}, @samp{atomic_nand@var{mode}}\n-These patterns emit code for an atomic operation on memory with memory\n-model semantics. Operand 0 is the memory on which the atomic operation is\n-performed. Operand 1 is the second operand to the binary operator.\n-Operand 2 is the memory model to be used by the operation.\n+@smallexample\n+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n+ op0[i] = op1[i] ? @var{op} op2[i] : op3[i];\n+@end smallexample\n \n-If these patterns are not defined, attempts will be made to use legacy\n-@code{sync} patterns, or equivalent patterns which return a result. 
If\n-none of these are available a compare-and-swap loop will be used.\n+where, for example, @var{op} is @code{~} for @samp{cond_one_cmpl@var{mode}}.\n \n-@cindex @code{atomic_fetch_add@var{mode}} instruction pattern\n-@cindex @code{atomic_fetch_sub@var{mode}} instruction pattern\n-@cindex @code{atomic_fetch_or@var{mode}} instruction pattern\n-@cindex @code{atomic_fetch_and@var{mode}} instruction pattern\n-@cindex @code{atomic_fetch_xor@var{mode}} instruction pattern\n-@cindex @code{atomic_fetch_nand@var{mode}} instruction pattern\n-@item @samp{atomic_fetch_add@var{mode}}, @samp{atomic_fetch_sub@var{mode}}\n-@itemx @samp{atomic_fetch_or@var{mode}}, @samp{atomic_fetch_and@var{mode}}\n-@itemx @samp{atomic_fetch_xor@var{mode}}, @samp{atomic_fetch_nand@var{mode}}\n-These patterns emit code for an atomic operation on memory with memory\n-model semantics, and return the original value. Operand 0 is an output \n-operand which contains the value of the memory location before the \n-operation was performed. Operand 1 is the memory on which the atomic \n-operation is performed. Operand 2 is the second operand to the binary\n-operator. Operand 3 is the memory model to be used by the operation.\n+When defined for floating-point modes, the contents of @samp{op2[i]}\n+are not interpreted if @samp{op1[i]} is false, just like they would not\n+be in a normal C @samp{?:} condition.\n \n-If these patterns are not defined, attempts will be made to use legacy\n-@code{sync} patterns. If none of these are available a compare-and-swap\n-loop will be used.\n+Operands 0, 2, and 3 all have mode @var{m}. 
Operand 1 is a scalar\n+integer if @var{m} is scalar, otherwise it has the mode returned by\n+@code{TARGET_VECTORIZE_GET_MASK_MODE}.\n \n-@cindex @code{atomic_add_fetch@var{mode}} instruction pattern\n-@cindex @code{atomic_sub_fetch@var{mode}} instruction pattern\n-@cindex @code{atomic_or_fetch@var{mode}} instruction pattern\n-@cindex @code{atomic_and_fetch@var{mode}} instruction pattern\n-@cindex @code{atomic_xor_fetch@var{mode}} instruction pattern\n-@cindex @code{atomic_nand_fetch@var{mode}} instruction pattern\n-@item @samp{atomic_add_fetch@var{mode}}, @samp{atomic_sub_fetch@var{mode}}\n-@itemx @samp{atomic_or_fetch@var{mode}}, @samp{atomic_and_fetch@var{mode}}\n-@itemx @samp{atomic_xor_fetch@var{mode}}, @samp{atomic_nand_fetch@var{mode}}\n-These patterns emit code for an atomic operation on memory with memory\n-model semantics and return the result after the operation is performed.\n-Operand 0 is an output operand which contains the value after the\n-operation. Operand 1 is the memory on which the atomic operation is\n-performed. Operand 2 is the second operand to the binary operator.\n-Operand 3 is the memory model to be used by the operation.\n+@samp{cond_@var{op}@var{mode}} generally corresponds to a conditional\n+form of @samp{@var{op}@var{mode}2}.\n \n-If these patterns are not defined, attempts will be made to use legacy\n-@code{sync} patterns, or equivalent patterns which return the result before\n-the operation followed by the arithmetic operation required to produce the\n-result. 
If none of these are available a compare-and-swap loop will be\n-used.\n+@mdindex @code{cond_add@var{mode}}\n+@mdindex @code{cond_sub@var{mode}}\n+@mdindex @code{cond_mul@var{mode}}\n+@mdindex @code{cond_div@var{mode}}\n+@mdindex @code{cond_udiv@var{mode}}\n+@mdindex @code{cond_mod@var{mode}}\n+@mdindex @code{cond_umod@var{mode}}\n+@mdindex @code{cond_and@var{mode}}\n+@mdindex @code{cond_ior@var{mode}}\n+@mdindex @code{cond_xor@var{mode}}\n+@mdindex @code{cond_smin@var{mode}}\n+@mdindex @code{cond_smax@var{mode}}\n+@mdindex @code{cond_umin@var{mode}}\n+@mdindex @code{cond_umax@var{mode}}\n+@mdindex @code{cond_copysign@var{mode}}\n+@mdindex @code{cond_fmin@var{mode}}\n+@mdindex @code{cond_fmax@var{mode}}\n+@mdindex @code{cond_ashl@var{mode}}\n+@mdindex @code{cond_ashr@var{mode}}\n+@mdindex @code{cond_lshr@var{mode}}\n+@item @samp{cond_add@var{mode}}\n+@itemx @samp{cond_sub@var{mode}}\n+@itemx @samp{cond_mul@var{mode}}\n+@itemx @samp{cond_div@var{mode}}\n+@itemx @samp{cond_udiv@var{mode}}\n+@itemx @samp{cond_mod@var{mode}}\n+@itemx @samp{cond_umod@var{mode}}\n+@itemx @samp{cond_and@var{mode}}\n+@itemx @samp{cond_ior@var{mode}}\n+@itemx @samp{cond_xor@var{mode}}\n+@itemx @samp{cond_smin@var{mode}}\n+@itemx @samp{cond_smax@var{mode}}\n+@itemx @samp{cond_umin@var{mode}}\n+@itemx @samp{cond_umax@var{mode}}\n+@itemx @samp{cond_copysign@var{mode}}\n+@itemx @samp{cond_fmin@var{mode}}\n+@itemx @samp{cond_fmax@var{mode}}\n+@itemx @samp{cond_ashl@var{mode}}\n+@itemx @samp{cond_ashr@var{mode}}\n+@itemx @samp{cond_lshr@var{mode}}\n+When operand 1 is true, perform an operation on operands 2 and 3 and\n+store the result in operand 0, otherwise store operand 4 in operand 0.\n+The operation works elementwise if the operands are vectors.\n \n-@cindex @code{atomic_test_and_set} instruction pattern\n-@item @samp{atomic_test_and_set}\n-This pattern emits code for @code{__builtin_atomic_test_and_set}.\n-Operand 0 is an output operand which is set to true if the previous\n-previous 
contents of the byte was \"set\", and false otherwise. Operand 1\n-is the @code{QImode} memory to be modified. Operand 2 is the memory\n-model to be used.\n+The scalar case is equivalent to:\n \n-The specific value that defines \"set\" is implementation defined, and\n-is normally based on what is performed by the native atomic test and set\n-instruction.\n+@smallexample\n+op0 = op1 ? op2 @var{op} op3 : op4;\n+@end smallexample\n \n-@cindex @code{atomic_bit_test_and_set@var{mode}} instruction pattern\n-@cindex @code{atomic_bit_test_and_complement@var{mode}} instruction pattern\n-@cindex @code{atomic_bit_test_and_reset@var{mode}} instruction pattern\n-@item @samp{atomic_bit_test_and_set@var{mode}}\n-@itemx @samp{atomic_bit_test_and_complement@var{mode}}\n-@itemx @samp{atomic_bit_test_and_reset@var{mode}}\n-These patterns emit code for an atomic bitwise operation on memory with memory\n-model semantics, and return the original value of the specified bit.\n-Operand 0 is an output operand which contains the value of the specified bit\n-from the memory location before the operation was performed. Operand 1 is the\n-memory on which the atomic operation is performed. Operand 2 is the bit within\n-the operand, starting with least significant bit. Operand 3 is the memory model\n-to be used by the operation. 
Operand 4 is a flag - it is @code{const1_rtx}\n-if operand 0 should contain the original value of the specified bit in the\n-least significant bit of the operand, and @code{const0_rtx} if the bit should\n-be in its original position in the operand.\n-@code{atomic_bit_test_and_set@var{mode}} atomically sets the specified bit after\n-remembering its original value, @code{atomic_bit_test_and_complement@var{mode}}\n-inverts the specified bit and @code{atomic_bit_test_and_reset@var{mode}} clears\n-the specified bit.\n+while the vector case is equivalent to:\n \n-If these patterns are not defined, attempts will be made to use\n-@code{atomic_fetch_or@var{mode}}, @code{atomic_fetch_xor@var{mode}} or\n-@code{atomic_fetch_and@var{mode}} instruction patterns, or their @code{sync}\n-counterparts. If none of these are available a compare-and-swap\n-loop will be used.\n+@smallexample\n+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n+ op0[i] = op1[i] ? op2[i] @var{op} op3[i] : op4[i];\n+@end smallexample\n \n-@cindex @code{atomic_add_fetch_cmp_0@var{mode}} instruction pattern\n-@cindex @code{atomic_sub_fetch_cmp_0@var{mode}} instruction pattern\n-@cindex @code{atomic_and_fetch_cmp_0@var{mode}} instruction pattern\n-@cindex @code{atomic_or_fetch_cmp_0@var{mode}} instruction pattern\n-@cindex @code{atomic_xor_fetch_cmp_0@var{mode}} instruction pattern\n-@item @samp{atomic_add_fetch_cmp_0@var{mode}}\n-@itemx @samp{atomic_sub_fetch_cmp_0@var{mode}}\n-@itemx @samp{atomic_and_fetch_cmp_0@var{mode}}\n-@itemx @samp{atomic_or_fetch_cmp_0@var{mode}}\n-@itemx @samp{atomic_xor_fetch_cmp_0@var{mode}}\n-These patterns emit code for an atomic operation on memory with memory\n-model semantics if the fetch result is used only in a comparison against\n-zero.\n-Operand 0 is an output operand which contains a boolean result of comparison\n-of the value after the operation against zero. Operand 1 is the memory on\n-which the atomic operation is performed. 
Operand 2 is the second operand\n-to the binary operator. Operand 3 is the memory model to be used by the\n-operation. Operand 4 is an integer holding the comparison code, one of\n-@code{EQ}, @code{NE}, @code{LT}, @code{GT}, @code{LE} or @code{GE}.\n+where, for example, @var{op} is @code{+} for @samp{cond_add@var{mode}}.\n \n-If these patterns are not defined, attempts will be made to use separate\n-atomic operation and fetch pattern followed by comparison of the result\n-against zero.\n+When defined for floating-point modes, the contents of @samp{op3[i]}\n+are not interpreted if @samp{op1[i]} is false, just like they would not\n+be in a normal C @samp{?:} condition.\n \n-@cindex @code{mem_thread_fence} instruction pattern\n-@item @samp{mem_thread_fence}\n-This pattern emits code required to implement a thread fence with\n-memory model semantics. Operand 0 is the memory model to be used.\n+Operands 0, 2, 3 and 4 all have mode @var{m}. Operand 1 is a scalar\n+integer if @var{m} is scalar, otherwise it has the mode returned by\n+@code{TARGET_VECTORIZE_GET_MASK_MODE}.\n+\n+@samp{cond_@var{op}@var{mode}} generally corresponds to a conditional\n+form of @samp{@var{op}@var{mode}3}. 
As an exception, the vector forms\n+of shifts correspond to patterns like @code{vashl@var{mode}3} rather\n+than patterns like @code{ashl@var{mode}3}.\n \n-For the @code{__ATOMIC_RELAXED} model no instructions need to be issued\n-and this expansion is not invoked.\n+@samp{cond_copysign@var{mode}} is only defined for floating point modes.\n \n-The compiler always emits a compiler memory barrier regardless of what\n-expanding this pattern produced.\n+@mdindex @code{cond_fma@var{mode}}\n+@mdindex @code{cond_fms@var{mode}}\n+@mdindex @code{cond_fnma@var{mode}}\n+@mdindex @code{cond_fnms@var{mode}}\n+@item @samp{cond_fma@var{mode}}\n+@itemx @samp{cond_fms@var{mode}}\n+@itemx @samp{cond_fnma@var{mode}}\n+@itemx @samp{cond_fnms@var{mode}}\n+Like @samp{cond_add@var{m}}, except that the conditional operation\n+takes three operands rather than two. For example, the vector form of\n+@samp{cond_fma@var{mode}} is equivalent to:\n \n-If this pattern is not defined, the compiler falls back to expanding the\n-@code{memory_barrier} pattern, then to emitting @code{__sync_synchronize}\n-library call, and finally to just placing a compiler memory barrier.\n+@smallexample\n+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n+ op0[i] = op1[i] ? fma (op2[i], op3[i], op4[i]) : op5[i];\n+@end smallexample\n \n-@cindex @code{get_thread_pointer@var{mode}} instruction pattern\n-@cindex @code{set_thread_pointer@var{mode}} instruction pattern\n-@item @samp{get_thread_pointer@var{mode}}\n-@itemx @samp{set_thread_pointer@var{mode}}\n-These patterns emit code that reads/sets the TLS thread pointer. 
Currently,\n-these are only needed if the target needs to support the\n-@code{__builtin_thread_pointer} and @code{__builtin_set_thread_pointer}\n-builtins.\n+@mdindex @code{cond_len_neg@var{mode}}\n+@mdindex @code{cond_len_one_cmpl@var{mode}}\n+@mdindex @code{cond_len_sqrt@var{mode}}\n+@mdindex @code{cond_len_ceil@var{mode}}\n+@mdindex @code{cond_len_floor@var{mode}}\n+@mdindex @code{cond_len_round@var{mode}}\n+@mdindex @code{cond_len_rint@var{mode}}\n+@item @samp{cond_len_neg@var{mode}}\n+@itemx @samp{cond_len_one_cmpl@var{mode}}\n+@itemx @samp{cond_len_sqrt@var{mode}}\n+@itemx @samp{cond_len_ceil@var{mode}}\n+@itemx @samp{cond_len_floor@var{mode}}\n+@itemx @samp{cond_len_round@var{mode}}\n+@itemx @samp{cond_len_rint@var{mode}}\n+When operand 1 is true and element index < operand 4 + operand 5, perform an operation on operand 2 and\n+store the result in operand 0, otherwise store operand 3 in operand 0.\n+The operation only works when the operands are vectors.\n \n-The get/set patterns have a single output/input operand respectively,\n-with @var{mode} intended to be @code{Pmode}.\n+@smallexample\n+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n+ op0[i] = (i < ops[4] + ops[5] && op1[i]\n+ ? @var{op} op2[i]\n+ : op3[i]);\n+@end smallexample\n \n-@cindex @code{stack_protect_combined_set} instruction pattern\n-@item @samp{stack_protect_combined_set}\n-This pattern, if defined, moves a @code{ptr_mode} value from an address\n-whose declaration RTX is given in operand 1 to the memory in operand 0\n-without leaving the value in a register afterward. If several\n-instructions are needed by the target to perform the operation (eg. to\n-load the address from a GOT entry then load the @code{ptr_mode} value\n-and finally store it), it is the backend's responsibility to ensure no\n-intermediate result gets spilled. 
This is to avoid leaking the value\n-some place that an attacker might use to rewrite the stack guard slot\n-after having clobbered it.\n+where, for example, @var{op} is @code{~} for @samp{cond_len_one_cmpl@var{mode}}.\n \n-If this pattern is not defined, then the address declaration is\n-expanded first in the standard way and a @code{stack_protect_set}\n-pattern is then generated to move the value from that address to the\n-address in operand 0.\n+When defined for floating-point modes, the contents of @samp{op2[i]}\n+are not interpreted if @samp{op1[i]} is false, just like they would not\n+be in a normal C @samp{?:} condition.\n \n-@cindex @code{stack_protect_set} instruction pattern\n-@item @samp{stack_protect_set}\n-This pattern, if defined, moves a @code{ptr_mode} value from the valid\n-memory location in operand 1 to the memory in operand 0 without leaving\n-the value in a register afterward. This is to avoid leaking the value\n-some place that an attacker might use to rewrite the stack guard slot\n-after having clobbered it.\n+Operands 0, 2, and 3 all have mode @var{m}. Operand 1 is a scalar\n+integer if @var{m} is scalar, otherwise it has the mode returned by\n+@code{TARGET_VECTORIZE_GET_MASK_MODE}. 
Operand 4 has whichever\n+integer mode the target prefers.\n \n-Note: on targets where the addressing modes do not allow to load\n-directly from stack guard address, the address is expanded in a standard\n-way first which could cause some spills.\n+@samp{cond_len_@var{op}@var{mode}} generally corresponds to a conditional\n+form of @samp{@var{op}@var{mode}2}.\n \n-If this pattern is not defined, then a plain move pattern is generated.\n \n-@cindex @code{stack_protect_combined_test} instruction pattern\n-@item @samp{stack_protect_combined_test}\n-This pattern, if defined, compares a @code{ptr_mode} value from an\n-address whose declaration RTX is given in operand 1 with the memory in\n-operand 0 without leaving the value in a register afterward and\n-branches to operand 2 if the values were equal. If several\n-instructions are needed by the target to perform the operation (eg. to\n-load the address from a GOT entry then load the @code{ptr_mode} value\n-and finally store it), it is the backend's responsibility to ensure no\n-intermediate result gets spilled. 
This is to avoid leaking the value\n-some place that an attacker might use to rewrite the stack guard slot\n-after having clobbered it.\n+@mdindex @code{cond_len_add@var{mode}}\n+@mdindex @code{cond_len_sub@var{mode}}\n+@mdindex @code{cond_len_mul@var{mode}}\n+@mdindex @code{cond_len_div@var{mode}}\n+@mdindex @code{cond_len_udiv@var{mode}}\n+@mdindex @code{cond_len_mod@var{mode}}\n+@mdindex @code{cond_len_umod@var{mode}}\n+@mdindex @code{cond_len_and@var{mode}}\n+@mdindex @code{cond_len_ior@var{mode}}\n+@mdindex @code{cond_len_xor@var{mode}}\n+@mdindex @code{cond_len_smin@var{mode}}\n+@mdindex @code{cond_len_smax@var{mode}}\n+@mdindex @code{cond_len_umin@var{mode}}\n+@mdindex @code{cond_len_umax@var{mode}}\n+@mdindex @code{cond_len_copysign@var{mode}}\n+@mdindex @code{cond_len_fmin@var{mode}}\n+@mdindex @code{cond_len_fmax@var{mode}}\n+@mdindex @code{cond_len_ashl@var{mode}}\n+@mdindex @code{cond_len_ashr@var{mode}}\n+@mdindex @code{cond_len_lshr@var{mode}}\n+@item @samp{cond_len_add@var{mode}}\n+@itemx @samp{cond_len_sub@var{mode}}\n+@itemx @samp{cond_len_mul@var{mode}}\n+@itemx @samp{cond_len_div@var{mode}}\n+@itemx @samp{cond_len_udiv@var{mode}}\n+@itemx @samp{cond_len_mod@var{mode}}\n+@itemx @samp{cond_len_umod@var{mode}}\n+@itemx @samp{cond_len_and@var{mode}}\n+@itemx @samp{cond_len_ior@var{mode}}\n+@itemx @samp{cond_len_xor@var{mode}}\n+@itemx @samp{cond_len_smin@var{mode}}\n+@itemx @samp{cond_len_smax@var{mode}}\n+@itemx @samp{cond_len_umin@var{mode}}\n+@itemx @samp{cond_len_umax@var{mode}}\n+@itemx @samp{cond_len_copysign@var{mode}}\n+@itemx @samp{cond_len_fmin@var{mode}}\n+@itemx @samp{cond_len_fmax@var{mode}}\n+@itemx @samp{cond_len_ashl@var{mode}}\n+@itemx @samp{cond_len_ashr@var{mode}}\n+@itemx @samp{cond_len_lshr@var{mode}}\n+When operand 1 is true and element index < operand 5 + operand 6, perform an operation on operands 2 and 3 and\n+store the result in operand 0, otherwise store operand 4 in operand 0.\n+The operation only works when the operands 
are vectors.\n \n-If this pattern is not defined, then the address declaration is\n-expanded first in the standard way and a @code{stack_protect_test}\n-pattern is then generated to compare the value from that address to the\n-value at the memory in operand 0.\n+@smallexample\n+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n+ op0[i] = (i < ops[5] + ops[6] && op1[i]\n+ ? op2[i] @var{op} op3[i]\n+ : op4[i]);\n+@end smallexample\n \n-@cindex @code{stack_protect_test} instruction pattern\n-@item @samp{stack_protect_test}\n-This pattern, if defined, compares a @code{ptr_mode} value from the\n-valid memory location in operand 1 with the memory in operand 0 without\n-leaving the value in a register afterward and branches to operand 2 if\n-the values were equal.\n+where, for example, @var{op} is @code{+} for @samp{cond_len_add@var{mode}}.\n \n-If this pattern is not defined, then a plain compare pattern and\n-conditional branch pattern is used.\n+When defined for floating-point modes, the contents of @samp{op3[i]}\n+are not interpreted if @samp{op1[i]} is false, just like they would not\n+be in a normal C @samp{?:} condition.\n \n-@cindex @code{tag_memory} instruction pattern\n-This pattern tags an object that begins at the address specified by\n-operand 0, has the byte size indicated by the operand 2, and uses the\n-tag from operand 1.\n+Operands 0, 2, 3 and 4 all have mode @var{m}. Operand 1 is a scalar\n+integer if @var{m} is scalar, otherwise it has the mode returned by\n+@code{TARGET_VECTORIZE_GET_MASK_MODE}. Operand 5 has whichever\n+integer mode the target prefers.\n \n-@cindex @code{compose_tag} instruction pattern\n-This pattern composes a tagged address specified by operand 1 with\n-mode @code{ptr_mode}, with an integer operand 2 representing the tag\n-offset. It returns the result in operand 0 with mode @code{ptr_mode}.\n+@samp{cond_len_@var{op}@var{mode}} generally corresponds to a conditional\n+form of @samp{@var{op}@var{mode}3}. 
As an exception, the vector forms\n+of shifts correspond to patterns like @code{vashl@var{mode}3} rather\n+than patterns like @code{ashl@var{mode}3}.\n \n-@cindex @code{clear_cache} instruction pattern\n-@item @samp{clear_cache}\n-This pattern, if defined, flushes the instruction cache for a region of\n-memory. The region is bounded to by the Pmode pointers in operand 0\n-inclusive and operand 1 exclusive.\n+@samp{cond_len_copysign@var{mode}} is only defined for floating point modes.\n \n-If this pattern is not defined, a call to the library function\n-@code{__clear_cache} is used.\n+@mdindex @code{cond_len_fma@var{mode}}\n+@mdindex @code{cond_len_fms@var{mode}}\n+@mdindex @code{cond_len_fnma@var{mode}}\n+@mdindex @code{cond_len_fnms@var{mode}}\n+@item @samp{cond_len_fma@var{mode}}\n+@itemx @samp{cond_len_fms@var{mode}}\n+@itemx @samp{cond_len_fnma@var{mode}}\n+@itemx @samp{cond_len_fnms@var{mode}}\n+Like @samp{cond_len_add@var{m}}, except that the conditional operation\n+takes three operands rather than two. 
For example, the vector form of\n+@samp{cond_len_fma@var{mode}} is equivalent to:\n \n-@cindex @code{spaceship@var{m}4} instruction pattern\n-@item @samp{spaceship@var{m}4}\n-Initialize output operand 0 with mode of integer type to -1, 0, 1 or -128\n-if operand 1 with mode @var{m} compares less than operand 2, equal to\n-operand 2, greater than operand 2 or is unordered with operand 2.\n-Operand 3 should be @code{const0_rtx} if the result is used in comparisons,\n-@code{const1_rtx} if the result is used as integer value and the comparison\n-is integral unsigned, @code{constm1_rtx} if the result is used as integer\n-value and the comparison is integral signed and some other @code{CONST_INT}\n-if the result is used as integer value and the comparison is floating point.\n-In the last case, instead of setting output operand 0 to -128 for unordered,\n-set it to operand 3.\n-@var{m} should be a scalar floating point mode.\n+@smallexample\n+for (i = 0; i < GET_MODE_NUNITS (@var{m}); i++)\n+ op0[i] = (i < ops[6] + ops[7] && op1[i]\n+ ? fma (op2[i], op3[i], op4[i])\n+ : op5[i]);\n+@end smallexample\n \n-This pattern is not allowed to @code{FAIL}.\n+@mdindex @code{cbranch@var{mode}4}\n+@item @samp{cbranch@var{mode}4}\n+Conditional branch instruction combined with a compare instruction.\n+Operand 0 is a comparison operator. Operand 1 and operand 2 are the\n+first and second operands of the comparison, respectively. Operand 3\n+is the @code{code_label} to jump to. For vectors this optab is only used for\n+comparisons of VECTOR_BOOLEAN_TYPE_P values and it is never called for\n+data registers. Data vector operands should use one of the patterns below\n+instead.\n \n-@cindex @code{isfinite@var{m}2} instruction pattern\n-@item @samp{isfinite@var{m}2}\n-Return 1 if operand 1 is a finite floating point number and 0\n-otherwise. @var{m} is a scalar floating point mode. 
Operand 0\n-has mode @code{SImode}, and operand 1 has mode @var{m}.\n+@mdindex @code{vec_cbranch_any@var{mode}}\n+@item @samp{vec_cbranch_any@var{mode}}\n+Conditional branch instruction based on a vector compare that branches\n+when at least one of the elementwise comparisons of the two input\n+vectors is true.\n+Operand 0 is a comparison operator. Operand 1 and operand 2 are the\n+first and second operands of the comparison, respectively. Operand 3\n+is the @code{code_label} to jump to.\n \n-@cindex @code{isnan@var{m}2} instruction pattern\n-@item @samp{isnan@var{m}2}\n-Return 1 if operand 1 is a @code{NaN} and 0 otherwise.\n-@var{m} is a scalar floating point mode. Operand 0\n-has mode @code{SImode}, and operand 1 has mode @var{m}.\n+@mdindex @code{vec_cbranch_all@var{mode}}\n+@item @samp{vec_cbranch_all@var{mode}}\n+Conditional branch instruction based on a vector compare that branches\n+when all of the elementwise comparisons of the two input vectors are true.\n+Operand 0 is a comparison operator. Operand 1 and operand 2 are the\n+first and second operands of the comparison, respectively. Operand 3\n+is the @code{code_label} to jump to.\n \n-@cindex @code{isnormal@var{m}2} instruction pattern\n-@item @samp{isnormal@var{m}2}\n-Return 1 if operand 1 is a normal floating point number and 0\n-otherwise. @var{m} is a scalar floating point mode. Operand 0\n-has mode @code{SImode}, and operand 1 has mode @var{m}.\n+@mdindex @code{cond_vec_cbranch_any@var{mode}}\n+@item @samp{cond_vec_cbranch_any@var{mode}}\n+Masked conditional branch instruction based on a vector compare that branches\n+when at least one of the elementwise comparisons of the two input\n+vectors is true.\n+Operand 0 is a comparison operator. Operand 1 is the mask operand.\n+Operand 2 and operand 3 are the first and second operands of the comparison,\n+respectively. Operand 4 is the @code{code_label} to jump to. 
Inactive lanes in\n+the mask operand should not influence the decision to branch.\n \n-@cindex @code{crc@var{m}@var{n}4} instruction pattern\n-@item @samp{crc@var{m}@var{n}4}\n-Calculate a bit-forward CRC using operands 1, 2 and 3,\n-then store the result in operand 0.\n-Operands 1 is the initial CRC, operands 2 is the data and operands 3 is the\n-polynomial without leading 1.\n-Operands 0, 1 and 3 have mode @var{n} and operand 2 has mode @var{m}, where\n-both modes are integers. The size of CRC to be calculated is determined by the\n-mode; for example, if @var{n} is @code{HImode}, a CRC16 is calculated.\n+@mdindex @code{cond_vec_cbranch_all@var{mode}}\n+@item @samp{cond_vec_cbranch_all@var{mode}}\n+Masked conditional branch instruction based on a vector compare that branches\n+when all of the elementwise comparisons of the two input vectors are true.\n+Operand 0 is a comparison operator. Operand 1 is the mask operand.\n+Operand 2 and operand 3 are the first and second operands of the comparison,\n+respectively. Operand 4 is the @code{code_label} to jump to. Inactive lanes in\n+the mask operand should not influence the decision to branch.\n \n-@cindex @code{crc_rev@var{m}@var{n}4} instruction pattern\n-@item @samp{crc_rev@var{m}@var{n}4}\n-Similar to @samp{crc@var{m}@var{n}4}, but calculates a bit-reversed CRC.\n+@mdindex @code{cond_len_vec_cbranch_any@var{mode}}\n+@item @samp{cond_len_vec_cbranch_any@var{mode}}\n+Len-based conditional branch instruction based on a vector compare that branches\n+when at least one of the elementwise comparisons of the two input\n+vectors is true.\n+Operand 0 is a comparison operator. Operand 1 is the mask operand. Operand 2\n+and operand 3 are the first and second operands of the comparison, respectively.\n+Operand 4 is the len operand and operand 5 is the bias operand. Operand 6 is\n+the @code{code_label} to jump to. 
Inactive lanes in the mask operand should not\n+influence the decision to branch.\n+\n+@mdindex @code{cond_len_vec_cbranch_all@var{mode}}\n+@item @samp{cond_len_vec_cbranch_all@var{mode}}\n+Len-based conditional branch instruction based on a vector compare that branches\n+when all of the elementwise comparisons of the two input vectors are true.\n+Operand 0 is a comparison operator. Operand 1 is the mask operand. Operand 2\n+and operand 3 are the first and second operands of the comparison, respectively.\n+Operand 4 is the len operand and operand 5 is the bias operand. Operand 6 is\n+the @code{code_label} to jump to. Inactive lanes in the mask operand should not\n+influence the decision to branch.\n \n @end table\n \n", "prefixes": [] }