[ARM] Dot Product NEON intrinsics [Patch (3/8)]

Message ID	20171106165336.GA12409@arm.com
State	New
Date	Mon, 6 Nov 2017 16:53:39 +0000
From	Tamar Christina <tamar.christina@arm.com>
To	gcc-patches@gcc.gnu.org
Cc	nd@arm.com, Ramana.Radhakrishnan@arm.com, Richard.Earnshaw@arm.com, nickc@redhat.com, Kyrylo.Tkachov@arm.com
Series	[ARM] Dot Product NEON intrinsics [Patch (3/8)]

Tamar Christina Nov. 6, 2017, 4:53 p.m. UTC

Hi All,

This patch adds the NEON intrinsics for Dot product.

Dot product is available from ARMv8.2-a and onwards.
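
For readers unfamiliar with the operation, here is a minimal scalar sketch of what the 64-bit unsigned form is expected to compute, assuming the usual Armv8.2-A UDOT semantics: each 32-bit lane accumulates the sum of four 8-bit products. The name `ref_vdot_u32` and the plain-array signature are illustrative only and are not part of the patch; the real `vdot_u32` intrinsic operates on NEON vector types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vdot_u32 (r, a, b); not part of the patch.
   R holds two 32-bit accumulators, A and B hold eight bytes each.  */
void
ref_vdot_u32 (uint32_t r[2], const uint8_t a[8], const uint8_t b[8])
{
  assert (r && a && b);

  for (int lane = 0; lane < 2; lane++)
    {
      uint32_t sum = 0;

      /* Each 32-bit lane consumes its own group of four byte pairs.  */
      for (int j = 0; j < 4; j++)
        sum += (uint32_t) a[4 * lane + j] * b[4 * lane + j];

      r[lane] += sum;
    }
}
```

The widening to 32 bits happens before the multiply, so four products of 255 * 255 fit comfortably in one lane.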

Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?

gcc/
2017-11-06  Tamar Christina  <tamar.christina@arm.com>

	* config/aarch64/arm_neon.h (vdot_u32, vdotq_u32)
	(vdot_s32, vdotq_s32): New.
	(vdot_lane_u32, vdotq_lane_u32): New.
	(vdot_lane_s32, vdotq_lane_s32): New.


gcc/testsuite/
2017-11-06  Tamar Christina  <tamar.christina@arm.com>

	* gcc.target/arm/simd/vdot-compile.c: New.
	* gcc.target/arm/simd/vect-dot-qi.h: New.
	* gcc.target/arm/simd/vect-dot-s8.c: New.
	* gcc.target/arm/simd/vect-dot-u8.c: New

--

Tamar Christina Nov. 21, 2017, 5:28 p.m. UTC | #1

Ping


Kyrill Tkachov Nov. 22, 2017, 11:26 a.m. UTC | #2

Hi Tamar,

On 06/11/17 16:53, Tamar Christina wrote:
> Hi All,
>
> This patch adds the NEON intrinsics for Dot product.
>
> Dot product is available from ARMv8.2-a and onwards.
>
> Regtested on arm-none-eabi, armeb-none-eabi,
> aarch64-none-elf and aarch64_be-none-elf with no issues found.
>
> Ok for trunk?
>
> gcc/
> 2017-11-06  Tamar Christina  <tamar.christina@arm.com>
>
>         * config/aarch64/arm_neon.h (vdot_u32, vdotq_u32)

This should be config/arm/arm_neon.h

>         (vdot_s32, vdotq_s32): New.
>         (vdot_lane_u32, vdotq_lane_u32): New.
>         (vdot_lane_s32, vdotq_lane_s32): New.
>
>
> gcc/testsuite/
> 2017-11-06  Tamar Christina  <tamar.christina@arm.com>
>
>         * gcc.target/arm/simd/vdot-compile.c: New.
>         * gcc.target/arm/simd/vect-dot-qi.h: New.
>         * gcc.target/arm/simd/vect-dot-s8.c: New.
>         * gcc.target/arm/simd/vect-dot-u8.c: New
>
> -- 

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 0d436e83d0f01f0c86f8d6a25f84466c841c7e11..419080417901f343737741e334cbff818bb1e70a 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -18034,6 +18034,72 @@ vzipq_f16 (float16x8_t __a, float16x8_t __b)
  
  #endif
  
+/* Adv.SIMD Dot Product intrinsics.  */

Please no full stop: "AdvSIMD".

+
+#pragma GCC push_options
+#if __ARM_ARCH >= 8
+#pragma GCC target ("arch=armv8.2-a+dotprod")

<snip>

diff --git a/gcc/testsuite/gcc.target/arm/simd/vect-dot-qi.h b/gcc/testsuite/gcc.target/arm/simd/vect-dot-qi.h
new file mode 100644
index 0000000000000000000000000000000000000000..90b00aff95cfef96d1963be17673dc191cc71169
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vect-dot-qi.h
@@ -0,0 +1,15 @@
+TYPE char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
+TYPE char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
+
+__attribute__ ((noinline)) int
+foo1(int len) {
+  int i;
+  TYPE int result = 0;
+  TYPE short prod;
+
+  for (i=0; i<len; i++) {
+    prod = X[i] * Y[i];
+    result += prod;
+  }
+  return result;
+}
\ No newline at end of file

Please add new lines at the end of the new test files.
This applies to a few more new files in this patch.
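
The header above relies on the including file to define `TYPE` and `N` before it is included. A minimal sketch of how a signed variant might instantiate it follows; the values chosen for `N` and `TYPE` and the `run_dot_s8` driver are assumptions for illustration (the contents of vect-dot-s8.c are not shown in this message), and the header body is inlined so the sketch stands alone.

```c
#include <assert.h>

/* Assumed parameters; the real test files may use different values.  */
#define N 64
#define TYPE signed

/* Body of vect-dot-qi.h, inlined here instead of #include "vect-dot-qi.h".  */
TYPE char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
TYPE char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));

__attribute__ ((noinline)) int
foo1 (int len)
{
  int i;
  TYPE int result = 0;
  TYPE short prod;

  for (i = 0; i < len; i++)
    {
      prod = X[i] * Y[i];
      result += prod;
    }
  return result;
}

/* Hypothetical driver: fill the inputs, check the reduction against a
   straightforward expected value, and return it.  */
int
run_dot_s8 (void)
{
  int expected = 0;

  for (int i = 0; i < N; i++)
    {
      X[i] = (TYPE char) (i - 32);
      Y[i] = 2;
      expected += (i - 32) * 2;
    }

  int result = foo1 (N);
  assert (result == expected);
  return result;
}
```

An unsigned variant would presumably differ only in the `TYPE` definition.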

Ok with these nits fixed.

Thanks,
Kyrill

Christophe Lyon Nov. 23, 2017, 11:26 p.m. UTC | #3

On 22 November 2017 at 12:26, Kyrill  Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi Tamar,
>
> On 06/11/17 16:53, Tamar Christina wrote:
>>
>> Hi All,
>>
>> This patch adds the NEON intrinsics for Dot product.
>>
>> Dot product is available from ARMv8.2-a and onwards.
>>
>> Regtested on arm-none-eabi, armeb-none-eabi,
>> aarch64-none-elf and aarch64_be-none-elf with no issues found.
>>
>> Ok for trunk?
>>
>> gcc/
>> 2017-11-06  Tamar Christina  <tamar.christina@arm.com>
>>
>>         * config/aarch64/arm_neon.h (vdot_u32, vdotq_u32)
>
>
> This should be config/arm/arm_neon.h
>
>>         (vdot_s32, vdotq_s32): New.
>>         (vdot_lane_u32, vdotq_lane_u32): New.
>>         (vdot_lane_s32, vdotq_lane_s32): New.
>>
>>
>> gcc/testsuite/
>> 2017-11-06  Tamar Christina  <tamar.christina@arm.com>
>>
>>         * gcc.target/arm/simd/vdot-compile.c: New.
>>         * gcc.target/arm/simd/vect-dot-qi.h: New.
>>         * gcc.target/arm/simd/vect-dot-s8.c: New.
>>         * gcc.target/arm/simd/vect-dot-u8.c: New
>>
>> --
>
>
> diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
> index
> 0d436e83d0f01f0c86f8d6a25f84466c841c7e11..419080417901f343737741e334cbff818bb1e70a
> 100644
> --- a/gcc/config/arm/arm_neon.h
> +++ b/gcc/config/arm/arm_neon.h
> @@ -18034,6 +18034,72 @@ vzipq_f16 (float16x8_t __a, float16x8_t __b)
>   #endif
>  +/* Adv.SIMD Dot Product intrinsics.  */
>
> Please no full stop: "AdvSIMD".
>
> +
> +#pragma GCC push_options
> +#if __ARM_ARCH >= 8
> +#pragma GCC target ("arch=armv8.2-a+dotprod")
>
> <snip>
>
Not sure if Kyrill actually meant to comment about the three lines
above, but they have a bug:
#if should be before #pragma GCC push_options.

Indeed, after this patch was committed (r255064), I've noticed many
regressions; for instance, p64_p128 is now unsupported. This is because
the arm_crypto_ok effective target now fails with this message:
XXX/arm_neon.h:16911:1: error: inlining failed in call to
always_inline 'vaeseq_u8': target specific option mismatch

Not sure why this wasn't noticed in validations earlier.

Fixed as obvious (r255126).

Christophe

2017-11-24  Christophe Lyon  <christophe.lyon@linaro.org>

	* config/arm/arm_neon.h: Fix pragma GCC push_options before
	vdot_u32.
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 3c9a8d9..d2e936c 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -18036,8 +18036,8 @@ vzipq_f16 (float16x8_t __a, float16x8_t __b)
 
 /* AdvSIMD Dot Product intrinsics.  */
 
-#pragma GCC push_options
 #if __ARM_ARCH >= 8
+#pragma GCC push_options
 #pragma GCC target ("arch=armv8.2-a+dotprod")
 
 __extension__ extern __inline uint32x2_t
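
The reason the ordering matters can be sketched generically. Assuming the matching `#pragma GCC pop_options` sits inside the `#if` block (the rest of the hunk is snipped in this thread, so this is an assumption), a push placed outside the conditional is left unmatched whenever the condition is false. The sketch below uses a hypothetical `HAVE_FEATURE` macro and `#pragma GCC optimize` in place of the ARM-specific condition and target string, so it is not tied to an ARM toolchain; the function names are illustrative too.

```c
#include <assert.h>

/* HAVE_FEATURE is a hypothetical stand-in for the __ARM_ARCH >= 8 test.  */
#ifndef HAVE_FEATURE
#define HAVE_FEATURE 0
#endif

#if HAVE_FEATURE
/* Push, option change and pop all live under the same condition, so the
   option stack is balanced whether or not this block is compiled.  */
#pragma GCC push_options
#pragma GCC optimize ("O2")

int
feature_sum (int a, int b)
{
  return a + b;
}

#pragma GCC pop_options
#endif

/* Code after the block is compiled with the options in effect before it.  */
int
plain_sum (int a, int b)
{
  return a + b;
}

/* Hypothetical self-check covering whichever paths were compiled.  */
void
check_sums (void)
{
#if HAVE_FEATURE
  assert (feature_sum (2, 3) == 5);
#endif
  assert (plain_sum (2, 3) == 5);
}
```

Building with `-DHAVE_FEATURE=1` compiles the guarded block as well; in both cases every push has its pop.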

Kyrill Tkachov Nov. 24, 2017, 9:33 a.m. UTC | #4

Hi Christophe,

On 23/11/17 23:26, Christophe Lyon wrote:
> On 22 November 2017 at 12:26, Kyrill  Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> Hi Tamar,
>>
>> On 06/11/17 16:53, Tamar Christina wrote:
>>> Hi All,
>>>
>>> This patch adds the NEON intrinsics for Dot product.
>>>
>>> Dot product is available from ARMv8.2-a and onwards.
>>>
>>> Regtested on arm-none-eabi, armeb-none-eabi,
>>> aarch64-none-elf and aarch64_be-none-elf with no issues found.
>>>
>>> Ok for trunk?
>>>
>>> gcc/
>>> 2017-11-06  Tamar Christina  <tamar.christina@arm.com>
>>>
>>>          * config/aarch64/arm_neon.h (vdot_u32, vdotq_u32)
>>
>> This should be config/arm/arm_neon.h
>>
>>>          (vdot_s32, vdotq_s32): New.
>>>          (vdot_lane_u32, vdotq_lane_u32): New.
>>>          (vdot_lane_s32, vdotq_lane_s32): New.
>>>
>>>
>>> gcc/testsuite/
>>> 2017-11-06  Tamar Christina  <tamar.christina@arm.com>
>>>
>>>          * gcc.target/arm/simd/vdot-compile.c: New.
>>>          * gcc.target/arm/simd/vect-dot-qi.h: New.
>>>          * gcc.target/arm/simd/vect-dot-s8.c: New.
>>>          * gcc.target/arm/simd/vect-dot-u8.c: New
>>>
>>> --
>>
>> diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
>> index
>> 0d436e83d0f01f0c86f8d6a25f84466c841c7e11..419080417901f343737741e334cbff818bb1e70a
>> 100644
>> --- a/gcc/config/arm/arm_neon.h
>> +++ b/gcc/config/arm/arm_neon.h
>> @@ -18034,6 +18034,72 @@ vzipq_f16 (float16x8_t __a, float16x8_t __b)
>>    #endif
>>   +/* Adv.SIMD Dot Product intrinsics.  */
>>
>> Please no full stop: "AdvSIMD".
>>
>> +
>> +#pragma GCC push_options
>> +#if __ARM_ARCH >= 8
>> +#pragma GCC target ("arch=armv8.2-a+dotprod")
>>
>> <snip>
>>
> Not sure if Kyrill actually meant to comment about the three lines
> above, but they have a bug:
> #if should be before #pragma GCC push_options.

You're right, sorry for missing this :(


> Indeed, after this patch was committed (r255064), I've noticed many
> regressions, for instance
> p64_p128 is now unsupported. This is because the arm_crypto_ok
> effective target now fails
> with this message:
> XXX/arm_neon.h:16911:1: error: inlining failed in call to
> always_inline 'vaeseq_u8': target specific option mismatch
>
> Not sure why this wasn't noticed in validations earlier?
>
> Fixed as obvious (r255126).

Thank you for fixing this up.

Kyrill

Tamar Christina Nov. 24, 2017, 10:31 a.m. UTC | #5

> >
> Not sure if Kyrill actually meant to comment about the three lines above, but
> they have a bug:
> #if should be before #pragma GCC push_options.
>
> Indeed, after this patch was committed (r255064), I've noticed many
> regressions, for instance
> p64_p128 is now unsupported. This is because the arm_crypto_ok effective
> target now fails with this message:
> XXX/arm_neon.h:16911:1: error: inlining failed in call to always_inline
> 'vaeseq_u8': target specific option mismatch
>
> Not sure why this wasn't noticed in validations earlier?


I still have the log files for these runs.

It seems that I was comparing the .log files instead of the .sum files, and the .log comparison does not show this difference:

/d/t/g/s/gcc (dot-product-arm ↩☡=) contrib/dg-cmp-results.sh -v -v "" ../../build-arm-none-eabi/results.clean/vanilla/gcc.log ../../build-arm-none-eabi/results.dotprod/vanilla/gcc.log  | grep p64_p128
/d/t/g/s/gcc (dot-product-arm ↩☡=) contrib/dg-cmp-results.sh -v -v "" ../../build-arm-none-eabi/results.clean/vanilla/gcc.sum ../../build-arm-none-eabi/results.dotprod/vanilla/gcc.sum  | grep p64_p128
NA->UNSUPPORTED: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O0 
PASS->NA: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O0  execution test
PASS->NA: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O0  (test for excess errors)
NA->UNSUPPORTED: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O1 
PASS->NA: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O1  execution test
PASS->NA: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O1  (test for excess errors)
NA->UNSUPPORTED: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O2

Sorry for missing this. I don't even know why these scripts accept the .log files if they're always going to do the wrong thing.

Anyway, thanks for fixing this; I'll make sure I'm using the .sum files in the future.
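
For context, the .sum files hold one result record per line in the form `STATUS: test name`, which is the shape the transitions above (`PASS->NA`, `NA->UNSUPPORTED`) are derived from. The sketch below is a hypothetical helper, not part of contrib/dg-cmp-results.sh, that pulls the status out of such a line; it only illustrates the record format, and the list of statuses is limited to a few common ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the status of a .sum-style result line
   ("PASS: foo.c execution test") into STATUS and return a pointer to
   the test name, or NULL if LINE is not a recognised result record.  */
const char *
sum_line_status (const char *line, char *status, size_t status_size)
{
  static const char *const known[] = {
    "PASS", "FAIL", "XPASS", "XFAIL",
    "UNSUPPORTED", "UNRESOLVED", "UNTESTED"
  };

  assert (line != NULL && status != NULL);

  for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
    {
      size_t n = strlen (known[i]);

      /* A record starts with the status word followed by ": ".  */
      if (strncmp (line, known[i], n) == 0
          && strncmp (line + n, ": ", 2) == 0
          && n < status_size)
        {
          memcpy (status, known[i], n + 1);
          return line + n + 2;
        }
    }
  return NULL;
}
```

Lines that do not start with a known status, such as compiler output or DejaGnu progress messages, make the helper return NULL.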

Tamar


Christophe Lyon Nov. 24, 2017, 4:15 p.m. UTC | #6

On 24 November 2017 at 11:31, Tamar Christina <Tamar.Christina@arm.com> wrote:
>> >
>> Not sure if Kyrill actually meant to comment about the three lines above, but
>> they have a bug:
>> #if should be before #pragma GCC push_options.
>>
>> Indeed, after this patch was committed (r255064), I've noticed many
>> regressions, for instance
>> p64_p128 is now unsupported. This is because the arm_crypto_ok effective
>> target now fails with this message:
>> XXX/arm_neon.h:16911:1: error: inlining failed in call to always_inline
>> 'vaeseq_u8': target specific option mismatch
>>
>> Not sure why this wasn't noticed in validations earlier?
>
> I still have the log files for these runs:
>
> It seems that I was comparing the .log files instead of the .sum files, and the .log comparison does not show this difference:
>
> /d/t/g/s/gcc (dot-product-arm ↩☡=) contrib/dg-cmp-results.sh -v -v "" ../../build-arm-none-eabi/results.clean/vanilla/gcc.log ../../build-arm-none-eabi/results.dotprod/vanilla/gcc.log  | grep p64_p128
> /d/t/g/s/gcc (dot-product-arm ↩☡=) contrib/dg-cmp-results.sh -v -v "" ../../build-arm-none-eabi/results.clean/vanilla/gcc.sum ../../build-arm-none-eabi/results.dotprod/vanilla/gcc.sum  | grep p64_p128
> NA->UNSUPPORTED: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O0
> PASS->NA: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O0  execution test
> PASS->NA: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O0  (test for excess errors)
> NA->UNSUPPORTED: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O1
> PASS->NA: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O1  execution test
> PASS->NA: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O1  (test for excess errors)
> NA->UNSUPPORTED: gcc.target/aarch64/advsimd-intrinsics/p64_p128.c   -O2
>
> Sorry for missing this. I don't even know why these scripts accept the .log files if they're always going to do the wrong thing.
>
> Anyway, thanks for fixing this; I'll make sure I'm using the .sum files in the future.
>

Thanks for checking why you missed it.

That being said, I think there are a few more problems with your
patch, but there is a lot of "noise" in the reports.

After your commit, I have these reports:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html

After my commit, I have these reports:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255126/report-build-info.html

I haven't fully checked that my patch fixes all the regressions
reported at r255064, but I don't see why my patch would introduce
regressions. So I think your patch is causing problems:
* on armeb --with-fpu=neon-fp16: (the 2 "REGRESSED" entries):
    gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
    gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
    gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]

* on arm-none-linux-gnueabihf --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
and armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu vfpv3-d16-fp16
(the 2 "BIG-REGR" entries)

where a few tests fail:
(arm-none-linux-gnueabihf cortex-a5 vfpv3-d16-fp16):
    gcc.dg/vect/pr65947-14.c -flto -ffat-lto-objects execution test
    gcc.dg/vect/pr65947-14.c execution test

(armeb-none-linux-gnueabihf cortex-a9 vfpv3-d16-fp16):
  Executed from: gcc.dg/vect/vect.exp
    gcc.dg/vect/pr51074.c -flto -ffat-lto-objects execution test
    gcc.dg/vect/pr51074.c execution test
    gcc.dg/vect/pr64252.c -flto -ffat-lto-objects execution test
    gcc.dg/vect/pr65947-14.c -flto -ffat-lto-objects execution test
    gcc.dg/vect/pr65947-14.c execution test
    gcc.dg/vect/vect-cond-4.c -flto -ffat-lto-objects execution test
    gcc.dg/vect/vect-nb-iter-ub-2.c execution test
    gcc.dg/vect/vect-nb-iter-ub-3.c -flto -ffat-lto-objects execution test
    gcc.dg/vect/vect-nb-iter-ub-3.c execution test
    gcc.dg/vect/vect-strided-shift-1.c -flto -ffat-lto-objects execution test
    gcc.dg/vect/vect-strided-shift-1.c execution test
    gcc.dg/vect/vect-strided-u16-i3.c -flto -ffat-lto-objects execution test
    gcc.dg/vect/vect-strided-u16-i3.c execution test
  Executed from: gcc.target/arm/arm.exp
    gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
    gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
    gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]
    gcc.target/arm/neon-vmla-1.c scan-assembler vmla\\.i32
    gcc.target/arm/neon-vmls-1.c scan-assembler vmls\\.i32
    gcc.target/arm/vect-copysignf.c scan-tree-dump-times vect "vectorized 1 loops" 1 (found 0 times)

I haven't checked whether these tests were already failing before your
patch and are just reported as new failures because they failed to
compile in the meantime.

Not sure I am clear :-)

Sorry for the delay and potentially hard to parse reports, I'm
struggling with infrastructure problems.

Thanks,

Christophe


Tamar Christina Nov. 24, 2017, 6:05 p.m. UTC | #7

Hi Christophe,

> After your commit, I have these reports:
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
>
> After my commit, I have these reports:
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255126/report-build-info.html
>
> I haven't fully checked that my patch fixes all the regressions reported at
> r255064, but I don't see why my patch would introduce regressions. So I
> think your patch is causing problems:
> * on armeb --with-fpu=neon-fp16: (the 2 "REGRESSED" entries):
>     gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
>     gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
>     gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]
>
> * on arm-none-linux-gnueabihf --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
> and armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu vfpv3-d16-fp16
> (the 2 "BIG-REGR" entries)


This patch only introduced a few NEON intrinsics in arm_neon.h, and most of these files don't use the header.

gcc.dg/vect/pr65947-14.c doesn't exist in my tree, so it's a relatively new test.

I will run some regressions over the weekend on an updated tree, but I can't understand how a header that is not included can cause execution failures 😊
However, most of those are vectorizer tests. It seems much more likely to me that vectorization itself is broken.

Thanks,
Tamar


Christophe Lyon Nov. 24, 2017, 7:38 p.m. UTC | #8

On 24 November 2017 at 19:05, Tamar Christina <Tamar.Christina@arm.com> wrote:
> Hi Christophe,
>
>>
>> After your commit, I have these reports:
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
>>
>> After my commit, I have these reports:
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255126/report-build-info.html
>>
>> I haven't fully checked that my patch fixes all the regressions reported at
>> r255064, but I don't see why my patch would introduce regressions. So I
>> think your patch is causing problems:
>> * on armeb --with-fpu=neon-fp16: (the 2 "REGRESSED" entries):
>>     gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
>>     gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
>>     gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]
>>
>> * on arm-none-linux-gnueabihf --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
>> and armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu vfpv3-d16-fp16
>> (the 2 "BIG-REGR" entries)
>
> This patch only introduced a few NEON intrinsics in arm_neon.h, and most of these files don't use the header.
>
> gcc.dg/vect/pr65947-14.c doesn't exist in my tree, so it's a relatively new test.
>
> I will run some regressions over the weekend on an updated tree, but I can't understand how a header that is not included can cause execution failures 😊
> However, most of those are vectorizer tests. It seems much more likely to me that vectorization itself is broken.

Agreed. But note that many regressions are reported for the
configurations --with-fpu vfpv3-d16-fp16
at: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
Maybe that's just a matter of arm_neon.h being included by some
effective-target tests?


>
> Thanks,
> Tamar
>
>> where a few tests fail:
>> (arm-none-linux-gnueabihf cortex-a5 vfpv3-d16-fp16):
>>     gcc.dg/vect/pr65947-14.c -flto -ffat-lto-objects execution test
>>     gcc.dg/vect/pr65947-14.c execution test
>>
>> (armeb-none-linux-gnueabihf cortex-a9 vfpv3-d16-fp16):
>>   Executed from: gcc.dg/vect/vect.exp
>>     gcc.dg/vect/pr51074.c -flto -ffat-lto-objects execution test
>>     gcc.dg/vect/pr51074.c execution test
>>     gcc.dg/vect/pr64252.c -flto -ffat-lto-objects execution test
>>     gcc.dg/vect/pr65947-14.c -flto -ffat-lto-objects execution test
>>     gcc.dg/vect/pr65947-14.c execution test
>>     gcc.dg/vect/vect-cond-4.c -flto -ffat-lto-objects execution test
>>     gcc.dg/vect/vect-nb-iter-ub-2.c execution test
>>     gcc.dg/vect/vect-nb-iter-ub-3.c -flto -ffat-lto-objects execution test
>>     gcc.dg/vect/vect-nb-iter-ub-3.c execution test
>>     gcc.dg/vect/vect-strided-shift-1.c -flto -ffat-lto-objects execution test
>>     gcc.dg/vect/vect-strided-shift-1.c execution test
>>     gcc.dg/vect/vect-strided-u16-i3.c -flto -ffat-lto-objects execution test
>>     gcc.dg/vect/vect-strided-u16-i3.c execution test
>>   Executed from: gcc.target/arm/arm.exp
>>     gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
>>     gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
>>     gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]
>>     gcc.target/arm/neon-vmla-1.c scan-assembler vmla\\.i32
>>     gcc.target/arm/neon-vmls-1.c scan-assembler vmls\\.i32
>>     gcc.target/arm/vect-copysignf.c scan-tree-dump-times vect "vectorized 1 loops" 1 (found 0 times)
>>
>> I haven't checked whether these tests were already failing before your patch,
>> and are just reported as new failures because they failed to compile in the
>> meantime.
>>
>> Not sure I am clear :-)
>>
>> Sorry for the delay and potentially hard to parse reports, I'm struggling with
>> infrastructure problems.
>>
>> Thanks,
>>
>> Christophe
>>
>> > Tamar
>> >
>> >>
>> >> Fixed as obvious (r255126).
>> >>
>> >> Christophe
>> >>
>> >> > diff --git a/gcc/testsuite/gcc.target/arm/simd/vect-dot-qi.h b/gcc/testsuite/gcc.target/arm/simd/vect-dot-qi.h
>> >> > new file mode 100644
>> >> > index 0000000000000000000000000000000000000000..90b00aff95cfef96d1963be17673dc191cc71169
>> >> > --- /dev/null
>> >> > +++ b/gcc/testsuite/gcc.target/arm/simd/vect-dot-qi.h
>> >> > @@ -0,0 +1,15 @@
>> >> > +TYPE char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
>> >> > +TYPE char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
>> >> > +
>> >> > +__attribute__ ((noinline)) int
>> >> > +foo1(int len) {
>> >> > +  int i;
>> >> > +  TYPE int result = 0;
>> >> > +  TYPE short prod;
>> >> > +
>> >> > +  for (i=0; i<len; i++) {
>> >> > +    prod = X[i] * Y[i];
>> >> > +    result += prod;
>> >> > +  }
>> >> > +  return result;
>> >> > +}
>> >> > \ No newline at end of file
>> >> >
>> >> > Please add new lines at the end of the new test files.
>> >> > This applies to a few more new files in this patch.
>> >> >
>> >> > Ok with these nits fixed.
>> >> >
>> >> > Thanks,
>> >> > Kyrill
>> >> >
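
For readers following the quoted diff: vect-dot-qi.h uses TYPE and N without defining them, so the including test has to supply both. The block below is a minimal sketch of one possible instantiation, not the tests from the patch. The values N = 64 and TYPE = signed and the helper dot_of_ramp are assumptions made purely for illustration, and the code assumes GCC or Clang, which provide __BIGGEST_ALIGNMENT__.

```c
#include <assert.h>

/* Hypothetical instantiation parameters (not taken from the patch);
   the real tests pick their own TYPE and N before including the header.  */
#define N 64
#define TYPE signed

/* Body as quoted from vect-dot-qi.h.  */
TYPE char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
TYPE char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));

__attribute__ ((noinline)) int
foo1 (int len)
{
  int i;
  TYPE int result = 0;
  TYPE short prod;

  for (i = 0; i < len; i++)
    {
      /* 8-bit by 8-bit multiply, accumulated into a wider integer.  */
      prod = X[i] * Y[i];
      result += prod;
    }
  return result;
}

/* Hypothetical helper: fill X with the repeating ramp 0..7 and Y with 2,
   then return the dot product of the first LEN elements.  */
int
dot_of_ramp (int len)
{
  int i;

  assert (len >= 0 && len <= N);
  for (i = 0; i < len; i++)
    {
      X[i] = (TYPE char) (i % 8);
      Y[i] = 2;
    }
  return foo1 (len);
}
```

Calling dot_of_ramp (8) sums 2 * (0 + 1 + ... + 7). The narrow multiply accumulated into a wider integer appears to be the loop shape this test header is meant to present to the vectorizer.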

Christophe Lyon Nov. 26, 2017, 12:56 p.m. UTC | #9

On 24 November 2017 at 20:38, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> On 24 November 2017 at 19:05, Tamar Christina <Tamar.Christina@arm.com> wrote:
>> Hi Christophe,
>>
>>>
>>> After your commit, I have these reports:
>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
>>>
>>> After my commit, I have these reports:
>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255126/report-build-info.html
>>>
>>> I haven't fully checked that my patch fixes all the regressions reported at
>>> r255064, but I don't see why my patch would introduce regressions.... So I
>>> think your patch is causing problems:
>>> * on armeb --with-fpu=neon-fp16: (the 2 "REGRESSED" entries):
>>>     gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
>>>     gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
>>>     gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]
>>>
>>> * on arm-none-linux-gnueabihf --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
>>> and armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu vfpv3-d16-fp16
>>> (the 2 "BIG-REGR" entries)
>>
>> This patch only introduced a few NEON intrinsics in arm_neon.h, and most of these files don't use the header.
>>
>> gcc.dg/vect/pr65947-14.c doesn't exist in my tree, so it's a relatively new test.
>>
>> I will run some regressions over the weekend on an updated tree, but I can't understand how a header that isn't included can cause execution failures 😊
>> However, most of those are vectorizer tests. It seems much more likely to me that vectorization itself is broken.
>
> Agreed. But note that many regressions are reported for the
> configurations --with-fpu vfpv3-d16-fp16
> at: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
> Maybe that's just a matter of arm_neon.h being included by some
> effective-target tests?
>
>
Hi Tamar,

Good news: I have confirmed what you suspected. I ran validations of
r255063 plus the fixed version of your patch, and the results are clean:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/255063-r255064-fixed.patch/report-build-info.html

I have also compared r255063 to r255126 (that is, I applied all the patches
between yours and mine):
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/255063-r255063-255126.patch/report-build-info.html
which confirms that some regressions were introduced in between,
hidden by the problem in your patch.

Some may be obvious to bisect, some less so.

Christophe

Christophe Lyon Nov. 26, 2017, 8:01 p.m. UTC | #10

On 26 November 2017 at 13:56, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> On 24 November 2017 at 20:38, Christophe Lyon
> <christophe.lyon@linaro.org> wrote:
>> On 24 November 2017 at 19:05, Tamar Christina <Tamar.Christina@arm.com> wrote:
>>> Hi Christophe,
>>>
>>>>
>>>> After your commit, I have these reports:
>>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
>>>>
>>>> After my commit, I have these reports:
>>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255126/report-build-info.html
>>>>
>>>> I haven't fully checked that my patch fixes all the regressions reported at
>>>> r255064, but I don't see why my patch would introduce regressions.... So I
>>>> think your patch is causing problems:
>>>> * on armeb --with-fpu=neon-fp16: (the 2 "REGRESSED" entries):
>>>>     gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
>>>>     gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
>>>>     gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]
>>>>
>>>> * on arm-none-linux-gnueabihf --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
>>>> and armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu vfpv3-d16-fp16
>>>> (the 2 "BIG-REGR" entries)
>>>
>>> This patch only introduced a few NEON intrinsics in arm_neon.h, and most of these files don't use the header.
>>>
>>> gcc.dg/vect/pr65947-14.c doesn't exist in my tree, so it's a relatively new test.
>>>
>>> I will run some regressions over the weekend on an updated tree, but I can't understand how a header that isn't included can cause execution failures 😊
>>> However, most of those are vectorizer tests. It seems much more likely to me that vectorization itself is broken.
>>
>> Agreed. But note that many regressions are reported for the
>> configurations --with-fpu vfpv3-d16-fp16
>> at: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
>> Maybe that's just a matter of arm_neon.h being included by some
>> effective-target tests?
>>
>>
> Hi Tamar,
>
> Good news: I have confirmed what you suspected. I ran validations of
> r255063 plus the fixed version of your patch, and the results are clean:
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/255063-r255064-fixed.patch/report-build-info.html
>
> I have also compared r255063 to r255126 (that is, I applied all the patches
> between yours and mine):
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/255063-r255063-255126.patch/report-build-info.html
> which confirms that some regressions were introduced in between,
> hidden by the problem in your patch.
>
> Some may be obvious to bisect, some less so.
>
OK, so for gcc:
FAIL: gcc.dg/ipa/inline-1.c scan-ipa-dump inline "op2 change 9.990000. of time"
after r255103, which updated the test

several failures for gcc.target/arm/addr-modes-float.c which was
introduced at r255111 (Charles is aware of that, probably just a
matter of adding the right effective-target)

I'm still trying to reproduce the regression:
FAIL: gcc.dg/vect/vect-nb-iter-ub-2.c execution test
on armeb

and for g++:
g++.dg/ipa/devirt-22.C  -std=gnu++11  scan-ipa-dump-times cp
"Discovered a virtual call to a known target" 1 (found 2 times)
g++.dg/ipa/devirt-22.C  -std=gnu++14  scan-ipa-dump-times cp
"Discovered a virtual call to a known target" 1 (found 2 times)
g++.dg/ipa/devirt-22.C  -std=gnu++98  scan-ipa-dump-times cp
"Discovered a virtual call to a known target" 1 (found 2 times)
g++.dg/pr79095-4.C  -std=gnu++98  scan-tree-dump-times vrp2
"__builtin_memset \\(_[0-9]+, 0, [0-9]+\\)" 1 (found 0 times)
g++.dg/pr79095-4.C  -std=gnu++98  (test for warnings, line )

Christophe


Kyrill Tkachov Nov. 27, 2017, 10:48 a.m. UTC | #11

Hi Christophe,

On 26/11/17 20:01, Christophe Lyon wrote:
> On 26 November 2017 at 13:56, Christophe Lyon
> <christophe.lyon@linaro.org> wrote:
>> On 24 November 2017 at 20:38, Christophe Lyon
>> <christophe.lyon@linaro.org> wrote:
>>> On 24 November 2017 at 19:05, Tamar Christina <Tamar.Christina@arm.com> wrote:
>>>> Hi Christophe,
>>>>
>>>>> After your commit, I have these reports:
>>>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
>>>>>
>>>>> After my commit, I have these reports:
>>>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255126/report-build-info.html
>>>>>
>>>>> I haven't fully checked that my patch fixes all the regressions reported at
>>>>> r255064, but I don't see why my patch would introduce regressions.... So I
>>>>> think your patch is causing problems:
>>>>> * on armeb --with-fpu=neon-fp16: (the 2 "REGRESSED" entries):
>>>>>      gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
>>>>>      gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
>>>>>      gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]
>>>>>
>>>>> * on arm-none-linux-gnueabihf --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
>>>>> and armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu vfpv3-d16-fp16
>>>>> (the 2 "BIG-REGR" entries)
>>>> This patch only introduced a few NEON intrinsics in arm_neon.h, and most of these files don't use the header.
>>>>
>>>> gcc.dg/vect/pr65947-14.c doesn't exist in my tree, so it's a relatively new test.
>>>>
>>>> I will run some regressions over the weekend on an updated tree, but I can't understand how a header that isn't included can cause execution failures 😊
>>>> However, most of those are vectorizer tests. It seems much more likely to me that vectorization itself is broken.
>>> Agreed. But note that many regressions are reported for the
>>> configurations --with-fpu vfpv3-d16-fp16
>>> at: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
>>> Maybe that's just a matter of arm_neon.h being included by some
>>> effective-target tests?
>>>
>>>
>> Hi Tamar,
>>
>> Good news: I have confirmed what you suspected. I ran validations of
>> r255063 plus the fixed version of your patch, and the results are clean:
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/255063-r255064-fixed.patch/report-build-info.html
>>
>> I have also compared r255063 to r255126 (that is, I applied all the patches
>> between yours and mine):
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/255063-r255063-255126.patch/report-build-info.html
>> which confirms that some regressions were introduced in between,
>> hidden by the problem in your patch.
>>
>> Some may be obvious to bisect, some less so.
>>

Thank you very much for tracking these down.

> OK, so for gcc:
> FAIL: gcc.dg/ipa/inline-1.c scan-ipa-dump inline "op2 change 9.990000. of time"
> after r255103, which updated the test

This might be related to the various profile update cleanups that have been
going on over the last few weeks.

> several failures for gcc.target/arm/addr-modes-float.c which was
> introduced at r255111 (Charles is aware of that, probably just a
> matter of adding the right effective-target)

I agree.

> I'm still trying to reproduce the regression:
> FAIL: gcc.dg/vect/vect-nb-iter-ub-2.c execution test
> on armeb

Hmm, maybe something to do with the check_vect check that these tests do?
Or model flakiness...

>
> and for g++:
> g++.dg/ipa/devirt-22.C  -std=gnu++11  scan-ipa-dump-times cp
> "Discovered a virtual call to a known target" 1 (found 2 times)
> g++.dg/ipa/devirt-22.C  -std=gnu++14  scan-ipa-dump-times cp
> "Discovered a virtual call to a known target" 1 (found 2 times)
> g++.dg/ipa/devirt-22.C  -std=gnu++98  scan-ipa-dump-times cp
> "Discovered a virtual call to a known target" 1 (found 2 times)
> g++.dg/pr79095-4.C  -std=gnu++98  scan-tree-dump-times vrp2
> "__builtin_memset \\(_[0-9]+, 0, [0-9]+\\)" 1 (found 0 times)
> g++.dg/pr79095-4.C  -std=gnu++98  (test for warnings, line )

I'd guess these are related to the profile update improvements as well.

Kyrill

Tamar Christina Nov. 27, 2017, 10:49 a.m. UTC | #12

Hi Christophe,

> -----Original Message-----
> From: Christophe Lyon [mailto:christophe.lyon@linaro.org]
> Sent: Sunday, November 26, 2017 20:01
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>; gcc-patches@gcc.gnu.org;
> nd <nd@arm.com>; Ramana Radhakrishnan
> <Ramana.Radhakrishnan@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; nickc@redhat.com
> Subject: Re: [PATCH][GCC][ARM] Dot Product NEON intrinsics [Patch (3/8)]
>
> On 26 November 2017 at 13:56, Christophe Lyon <christophe.lyon@linaro.org>
> wrote:
> > On 24 November 2017 at 20:38, Christophe Lyon
> > <christophe.lyon@linaro.org> wrote:
> >> On 24 November 2017 at 19:05, Tamar Christina
> >> <Tamar.Christina@arm.com> wrote:
> >>> Hi Christophe,
> >>>
> >>>> After your commit, I have these reports:
> >>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
> >>>>
> >>>> After my commit, I have these reports:
> >>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255126/report-build-info.html
> >>>>
> >>>> I haven't fully checked that my patch fixes all the regressions
> >>>> reported at r255064, but I don't see why my patch would introduce
> >>>> regressions.... So I think your patch is causing problems:
> >>>> * on armeb --with-fpu=neon-fp16: (the 2 "REGRESSED" entries):
> >>>>     gcc.target/arm/attr-neon3.c scan-assembler-times vld1 1 (found 2 times)
> >>>>     gcc.target/arm/neon-vfma-1.c scan-assembler vfma\\.f32[\t]+[dDqQ]
> >>>>     gcc.target/arm/neon-vfms-1.c scan-assembler vfms\\.f32[\t]+[dDqQ]
> >>>>
> >>>> * on arm-none-linux-gnueabihf --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16
> >>>> and armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu vfpv3-d16-fp16
> >>>> (the 2 "BIG-REGR" entries)
> >>>
> >>> This patch only introduced a few NEON intrinsics in arm_neon.h, and
> >>> most of these files don't use the header.
> >>>
> >>> gcc.dg/vect/pr65947-14.c doesn't exist in my tree, so it's a relatively
> >>> new test.
> >>>
> >>> I will run some regressions over the weekend on an updated tree, but
> >>> I can't understand how a header that isn't included can cause execution
> >>> failures 😊
> >>> However, most of those are vectorizer tests. It seems much more likely
> >>> to me that vectorization itself is broken.
> >>
> >> Agreed. But note that many regressions are reported for the
> >> configurations --with-fpu vfpv3-d16-fp16
> >> at: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/255064/report-build-info.html
> >> Maybe that's just a matter of arm_neon.h being included by some
> >> effective-target tests?
> >>
> > Hi Tamar,
> >
> > Good news: I have confirmed what you suspected. I ran validations of
> > r255063 plus the fixed version of your patch, and the results are clean:
> > http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/255063-r255064-fixed.patch/report-build-info.html


Thanks for confirming! My own run finished as well. Sorry again for the breakage; I've
updated my scripts to exclude log files, so this shouldn't happen again.

