From patchwork Thu Apr  9 16:51:00 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Sandiford <richard.sandiford@arm.com>
X-Patchwork-Id: 1268686
From: Richard Sandiford <richard.sandiford@arm.com>
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Subject: [committed, wwwdocs] aarch64: Document SVE changes
Date: Thu, 09 Apr 2020 17:51:00 +0100
Message-ID: <mptd08gppqj.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

As per $SUBJECT

This seemed to flow more naturally if we organised things as:

- improvements to existing features
- new options
- new extensions
- new CPUs

The patch also fixes up some missing tags flagged by xmllint.

Pushed.
---
 htdocs/gcc-10/changes.html | 100 +++++++++++++++++++++++++++++++------
 1 file changed, 85 insertions(+), 15 deletions(-)
diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index 1c8d7a9f..61c767f4 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -237,6 +237,7 @@ a work-in-progress.</p>
 	with the new attribute <code>access</code>.
       </li>
     </ul>
+  </li>
 </ul>
 
 <h3 id="c">C</h3>
@@ -336,6 +337,7 @@ a work-in-progress.</p>
 	causing an syntactic ambiguity.
       </li>
     </ul>
+  </li>
   <li>
     G++ can now detect modifying constant objects in constexpr evaluation
     (which is undefined behavior).
@@ -452,6 +454,7 @@ a work-in-progress.</p>
         For formatted input/output, if the explicit widths after the data-edit
         descriptors <code>I</code>, <code>F</code> and <code>G</code> have been
         omitted, default widths are used.
+      </li>
       <li>
         A blank format item at the end of a format specification, i.e. nothing
         following the final comma, is allowed.  Use the option
@@ -478,6 +481,7 @@ a work-in-progress.</p>
         <code>CHARACTER</code> expressions. Use the option <code>-fdec</code>.
       </li>
     </ul>
+  </li>
   <li>
     Character type names in errors and warnings now include <code>len</code>
     in addition to <code>kind</code>; <code>*</code> is used for assumed
@@ -516,38 +520,104 @@ a work-in-progress.</p>
 
 <h3 id="aarch64">AArch64</h3>
 <ul>
-  <li> The <code>-mbranch-protection=pac-ret</code> option now accepts the
+  <li>There have been several improvements related to the Scalable
+  Vector Extension (SVE):
+    <ul>
+      <li>The SVE ACLE types and intrinsics are now supported.  They can
+      be accessed using the header file <code>arm_sve.h</code>.
+      </li>
+      <li>It is now possible to create fixed-length SVE types using
+      the <code>arm_sve_vector_bits</code> attribute.  For example:
+<pre>#if __ARM_FEATURE_SVE_BITS==512
+typedef svint32_t vec512 __attribute__((arm_sve_vector_bits(512)));
+typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512)));
+#endif</pre>
+      </li>
+      <li><code>-mlow-precision-div</code>, <code>-mlow-precision-sqrt</code>
+      and <code>-mlow-precision-recip-sqrt</code> now work for SVE.
+      </li>
+      <li><code>-msve-vector-bits=128</code> now generates
+      vector-length-specific code for little-endian targets.  It continues
+      to generate vector-length-agnostic code for big-endian targets,
+      just as previous releases did for all targets.
+      </li>
+      <li>The vectorizer is now able to use extending loads and truncating
+      stores, including gather loads and scatter stores.
+      </li>
+      <li>The vectorizer now compares the cost of vectorizing with SVE
+      and vectorizing with Advanced SIMD and tries to pick the best one.
+      Previously it would always use SVE if possible.
+      </li>
+      <li>If a vector loop uses Advanced SIMD rather than SVE, the vectorizer
+      now considers using SVE to vectorize the left-over elements (the
+      “scalar tail” or “epilog”).
+      </li>
+      <li>Besides these specific points, there have been many general
+      improvements to the way that the vectorizer uses SVE.
+      </li>
+    </ul>
+  </li>
+  <li>The <code>-mbranch-protection=pac-ret</code> option now accepts the
   optional argument <code>+b-key</code> extension to perform return address
   signing with the B-key instead of the A-key.
   </li>
+  <li>The option <code>-moutline-atomics</code> has been added to aid
+  deployment of the Large System Extensions (LSE) on GNU/Linux systems built
+  with a baseline architecture targeting Armv8-A.  When the option is
+  specified, code is emitted to detect the presence of LSE instructions at
+  runtime and use them for standard atomic operations.
+  For more information please refer to the documentation.
+  </li>
   <li>The Transactional Memory Extension is now supported through ACLE
   intrinsics.  It can be enabled through the <code>+tme</code> option
   extension (for example, <code>-march=armv8.5-a+tme</code>).
   </li>
-  <li>Initial autovectorization support for SVE2 has been added and can be
-  enabled through the   <code>+sve2</code> option extension (for example,
-  <code>-march=armv8.5-a+sve2</code>).  Additional extensions can be enabled
-  through <code>+sve2-sm4</code>, <code>+sve2=aes</code>,
-  <code>+sve2-sha3</code>, <code>+sve2-bitperm</code>.
-  </li>
-  <li> A number of features from the Armv8.5-a are now supported through ACLE
+  <li>A number of features from Armv8.5-A are now supported through ACLE
   intrinsics.  These include:
     <ul>
 	<li>The random number instructions that can be enabled
 	through the (already present in GCC 9.1) <code>+rng</code> option
 	extension.</li>
 	<li>Floating-point intrinsics to round to integer instructions from
-	Armv8.5-a when targeting <code>-march=armv8.5-a</code> or later.</li>
+	Armv8.5-A when targeting <code>-march=armv8.5-a</code> or later.</li>
 	<li>Memory Tagging Extension intrinsics enabled through the
 	<code>+memtag</code> option extension.</li>
     </ul>
   </li>
-  <li> The option <code>-moutline-atomics</code> has been added to aid
-  deployment of the Large System Extensions (LSE) on GNU/Linux systems built
-  with a baseline architecture targeting Armv8-A.  When the option is
-  specified code is emitted to detect the presence of LSE instructions at
-  runtime and use them for standard atomic operations.
-  For more information please refer to the documentation.
+  <li>Similarly, the following Armv8.6-A features are now supported
+  through ACLE intrinsics:
+    <ul>
+      <li>The bfloat16 extension.  This extension is enabled automatically
+      when Armv8.6-A is selected (such as by <code>-march=armv8.6-a</code>).
+      It can also be enabled for Armv8.2-A and later using the
+      <code>+bf16</code> option extension.
+      </li>
+      <li>The Matrix Multiply extension.  This extension is split into
+      three parts, one for each supported data type:
+	<ul>
+	  <li>Support for 8-bit integer matrix multiply instructions.
+	  This extension is enabled automatically when Armv8.6-A is
+	  selected.  It can also be enabled for Armv8.2-A and later using
+	  the <code>+i8mm</code> option extension.
+	  </li>
+	  <li>Support for 32-bit floating-point matrix multiply instructions.
+	  This extension can be enabled using the <code>+f32mm</code>
+	  option extension, which also has the effect of enabling SVE.
+	  </li>
+	  <li>Support for 64-bit floating-point matrix multiply instructions.
+	  This extension can be enabled using the <code>+f64mm</code>
+	  option extension, which likewise has the effect of enabling SVE.
+	  </li>
+	</ul>
+      </li>
+    </ul>
+  </li>
+  <li>SVE2 is now supported through ACLE intrinsics and (to a limited extent)
+  through autovectorization.  It can be enabled through the <code>+sve2</code>
+  option extension (for example, <code>-march=armv8.5-a+sve2</code>).
+  Additional extensions can be enabled through <code>+sve2-sm4</code>,
+  <code>+sve2-aes</code>, <code>+sve2-sha3</code> and
+  <code>+sve2-bitperm</code>.
   </li>
   <li>
        Support has been added for the following processors