From patchwork Wed Aug 14 08:26:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1146835 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-506878-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="mR3FNVfJ"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 467jPG0LRvz9sDQ for ; Wed, 14 Aug 2019 18:27:12 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:mime-version:content-type; q=dns; s= default; b=P9V1zq3DR5Px/ooBPF7BTZ7BFS6zBV6MRJDmrJ5i6FFJxU6zVLOsp z+8SCpP8NEihbcbZWo/u4Cczw2hd5D8tjjLDnfhLRIHKGPCzjcV67RYwrKz5RWkr Tp11hbADvnCTNmDmj/uCBFdvLYe7pt+n+2XK0KzwDRlgWxsQNdsoGM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:mime-version:content-type; s= default; bh=9gxNLdVXDjkZ2Ix2XJiXwWmup/I=; b=mR3FNVfJII4Vu3dESB5C gn5+ckzaZA9DRz4BrVTA2b4DpfjOcgdfCgR5DtZz/0eGUr2z2K+noGXibG7t1b2c vY0s4TjLtPiT/OE1b1HMJ3XaNJHQcCprzWks9i2FKA+gxQN+c+lWDIp29J61t0jf gOOVlBWHaLELJpximxnDthw= Received: (qmail 77979 invoked by alias); 14 Aug 2019 08:27:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 77969 invoked by uid 89); 14 Aug 2019 08:27:04 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-8.4 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.110.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 14 Aug 2019 08:27:02 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C68BF337 for ; Wed, 14 Aug 2019 01:27:00 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.99.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6E8373F694 for ; Wed, 14 Aug 2019 01:27:00 -0700 (PDT) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [committed][AArch64] Add support for SVE HF vconds Date: Wed, 14 Aug 2019 09:26:59 +0100 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-IsSubscribed: yes We were missing vcond patterns that had HF comparisons and HI or HF data. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r274420. Richard 2019-08-14 Richard Sandiford gcc/ * config/aarch64/iterators.md (SVE_HSD): New mode iterator. (V_FP_EQUIV, v_fp_equiv): Handle VNx8HI and VNx8HF. * config/aarch64/aarch64-sve.md (vcond): Use SVE_HSD instead of SVE_SD. gcc/testsuite/ * gcc.target/aarch64/sve/vcond_17.c: New test. * gcc.target/aarch64/sve/vcond_17_run.c: Likewise. Index: gcc/config/aarch64/iterators.md =================================================================== --- gcc/config/aarch64/iterators.md 2019-08-14 09:20:57.547610677 +0100 +++ gcc/config/aarch64/iterators.md 2019-08-14 09:23:37.626427347 +0100 @@ -301,6 +301,9 @@ (define_mode_iterator SVE_HSDI [VNx16QI ;; All SVE floating-point vector modes that have 16-bit or 32-bit elements. (define_mode_iterator SVE_HSF [VNx8HF VNx4SF]) +;; All SVE vector modes that have 16-bit, 32-bit or 64-bit elements. +(define_mode_iterator SVE_HSD [VNx8HI VNx4SI VNx2DI VNx8HF VNx4SF VNx2DF]) + ;; All SVE vector modes that have 32-bit or 64-bit elements. (define_mode_iterator SVE_SD [VNx4SI VNx2DI VNx4SF VNx2DF]) @@ -928,9 +931,11 @@ (define_mode_attr v_int_equiv [(V8QI "v8 ]) ;; Floating-point equivalent of selected modes. -(define_mode_attr V_FP_EQUIV [(VNx4SI "VNx4SF") (VNx4SF "VNx4SF") +(define_mode_attr V_FP_EQUIV [(VNx8HI "VNx8HF") (VNx8HF "VNx8HF") + (VNx4SI "VNx4SF") (VNx4SF "VNx4SF") (VNx2DI "VNx2DF") (VNx2DF "VNx2DF")]) -(define_mode_attr v_fp_equiv [(VNx4SI "vnx4sf") (VNx4SF "vnx4sf") +(define_mode_attr v_fp_equiv [(VNx8HI "vnx8hf") (VNx8HF "vnx8hf") + (VNx4SI "vnx4sf") (VNx4SF "vnx4sf") (VNx2DI "vnx2df") (VNx2DF "vnx2df")]) ;; Mode for vector conditional operations where the comparison has Index: gcc/config/aarch64/aarch64-sve.md =================================================================== --- gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:20:57.547610677 +0100 +++ gcc/config/aarch64/aarch64-sve.md 2019-08-14 09:23:37.626427347 +0100 @@ -2884,13 +2884,13 @@ (define_expand "vcondu" - [(set (match_operand:SVE_SD 0 "register_operand") - (if_then_else:SVE_SD + [(set (match_operand:SVE_HSD 0 "register_operand") + (if_then_else:SVE_HSD (match_operator 3 "comparison_operator" [(match_operand: 4 "register_operand") (match_operand: 5 "aarch64_simd_reg_or_zero")]) - (match_operand:SVE_SD 1 "register_operand") - (match_operand:SVE_SD 2 "register_operand")))] + (match_operand:SVE_HSD 1 "register_operand") + (match_operand:SVE_HSD 2 "register_operand")))] "TARGET_SVE" { aarch64_expand_sve_vcond (mode, mode, operands); Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_17.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_17.c 2019-08-14 09:23:37.626427347 +0100 @@ -0,0 +1,94 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +#include + +#define eq(A, B) ((A) == (B)) +#define ne(A, B) ((A) != (B)) +#define olt(A, B) ((A) < (B)) +#define ole(A, B) ((A) <= (B)) +#define oge(A, B) ((A) >= (B)) +#define ogt(A, B) ((A) > (B)) +#define ordered(A, B) (!__builtin_isunordered (A, B)) +#define unordered(A, B) (__builtin_isunordered (A, B)) +#define ueq(A, B) (!__builtin_islessgreater (A, B)) +#define ult(A, B) (__builtin_isless (A, B)) +#define ule(A, B) (__builtin_islessequal (A, B)) +#define uge(A, B) (__builtin_isgreaterequal (A, B)) +#define ugt(A, B) (__builtin_isgreater (A, B)) +#define nueq(A, B) (__builtin_islessgreater (A, B)) +#define nult(A, B) (!__builtin_isless (A, B)) +#define nule(A, B) (!__builtin_islessequal (A, B)) +#define nuge(A, B) (!__builtin_isgreaterequal (A, B)) +#define nugt(A, B) (!__builtin_isgreater (A, B)) + +#define DEF_LOOP(CMP, EXPECT_INVALID) \ + void __attribute__ ((noinline, noclone)) \ + test_##CMP##_var (__fp16 *restrict dest, __fp16 *restrict src, \ + __fp16 fallback, __fp16 *restrict a, \ + __fp16 *restrict b, int count) \ + { \ + for (int i = 0; i < count; ++i) \ + dest[i] = CMP (a[i], b[i]) ? src[i] : fallback; \ + } \ + \ + void __attribute__ ((noinline, noclone)) \ + test_##CMP##_zero (__fp16 *restrict dest, __fp16 *restrict src, \ + __fp16 fallback, __fp16 *restrict a, \ + int count) \ + { \ + for (int i = 0; i < count; ++i) \ + dest[i] = CMP (a[i], (__fp16) 0) ? src[i] : fallback; \ + } \ + \ + void __attribute__ ((noinline, noclone)) \ + test_##CMP##_sel (__fp16 *restrict dest, __fp16 if_true, \ + __fp16 if_false, __fp16 *restrict a, \ + __fp16 b, int count) \ + { \ + for (int i = 0; i < count; ++i) \ + dest[i] = CMP (a[i], b) ? if_true : if_false; \ + } + +#define TEST_ALL(T) \ + T (eq, 0) \ + T (ne, 0) \ + T (olt, 1) \ + T (ole, 1) \ + T (oge, 1) \ + T (ogt, 1) \ + T (ordered, 0) \ + T (unordered, 0) \ + T (ueq, 0) \ + T (ult, 0) \ + T (ule, 0) \ + T (uge, 0) \ + T (ugt, 0) \ + T (nueq, 0) \ + T (nult, 0) \ + T (nule, 0) \ + T (nuge, 0) \ + T (nugt, 0) + +TEST_ALL (DEF_LOOP) + +/* { dg-final { scan-assembler {\tfcmeq\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, #0\.0\n} { xfail *-*-* } } } */ +/* { dg-final { scan-assembler {\tfcmeq\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */ + +/* { dg-final { scan-assembler {\tfcmne\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, #0\.0\n} } } */ +/* { dg-final { scan-assembler {\tfcmne\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */ + +/* { dg-final { scan-assembler {\tfcmlt\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, #0\.0\n} } } */ +/* { dg-final { scan-assembler {\tfcmlt\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */ + +/* { dg-final { scan-assembler {\tfcmle\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, #0\.0\n} } } */ +/* { dg-final { scan-assembler {\tfcmle\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */ + +/* { dg-final { scan-assembler {\tfcmgt\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, #0\.0\n} } } */ +/* { dg-final { scan-assembler {\tfcmgt\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */ + +/* { dg-final { scan-assembler {\tfcmge\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, #0\.0\n} } } */ +/* { dg-final { scan-assembler {\tfcmge\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */ + +/* { dg-final { scan-assembler-not {\tfcmuo\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, #0\.0\n} } } */ +/* { dg-final { scan-assembler {\tfcmuo\tp[0-9]+\.h, p[0-7]/z, z[0-9]+\.h, z[0-9]+\.h\n} } } */ Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_17_run.c =================================================================== --- /dev/null 2019-07-30 08:53:31.317691683 +0100 +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_17_run.c 2019-08-14 09:23:37.626427347 +0100 @@ -0,0 +1,54 @@ +/* { dg-do run { target aarch64_sve_hw } } */ +/* { dg-options "-O2 -ftree-vectorize" } */ +/* { dg-require-effective-target fenv_exceptions } */ + +#include + +#include "vcond_17.c" + +#define N 401 + +#define TEST_LOOP(CMP, EXPECT_INVALID) \ + { \ + __fp16 dest1[N], dest2[N], dest3[N], src[N]; \ + __fp16 a[N], b[N]; \ + for (int i = 0; i < N; ++i) \ + { \ + src[i] = i * i; \ + if (i % 5 == 0) \ + a[i] = 0; \ + else if (i % 3) \ + a[i] = i * 0.1; \ + else \ + a[i] = i; \ + if (i % 7 == 0) \ + b[i] = __builtin_nan (""); \ + else if (i % 6) \ + b[i] = i * 0.1; \ + else \ + b[i] = i; \ + asm volatile ("" ::: "memory"); \ + } \ + feclearexcept (FE_ALL_EXCEPT); \ + test_##CMP##_var (dest1, src, 11, a, b, N); \ + test_##CMP##_zero (dest2, src, 22, a, N); \ + test_##CMP##_sel (dest3, 33, 44, a, 9, N); \ + if (!fetestexcept (FE_INVALID) != !(EXPECT_INVALID)) \ + __builtin_abort (); \ + for (int i = 0; i < N; ++i) \ + { \ + if (dest1[i] != (CMP (a[i], b[i]) ? src[i] : 11)) \ + __builtin_abort (); \ + if (dest2[i] != (CMP (a[i], 0) ? src[i] : 22)) \ + __builtin_abort (); \ + if (dest3[i] != (CMP (a[i], 9) ? 33 : 44)) \ + __builtin_abort (); \ + } \ + } + +int __attribute__ ((optimize (1))) +main (void) +{ + TEST_ALL (TEST_LOOP) + return 0; +}