From patchwork Wed Nov 5 15:58:44 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Tocar
X-Patchwork-Id: 407066
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Wed, 5 Nov 2014 18:58:44 +0300
From: Ilya Tocar
To: Jakub Jelinek
Cc: Uros Bizjak, GCC Patches
Subject: Re: [PATCH AVX512] Fix dg.torture tests with avx512
Message-ID: <20141105155844.GA22764@msticlxl7.ims.intel.com>
References: <20141030145532.GA56974@msticlxl7.ims.intel.com> <20141103102156.GQ5026@tucnak.redhat.com>
In-Reply-To: <20141103102156.GQ5026@tucnak.redhat.com>
User-Agent: Mutt/1.5.23 (2014-03-12)

On 03 Nov 11:21, Jakub Jelinek wrote:
> On Fri, Oct 31, 2014 at 11:17:07AM +0100, Uros Bizjak wrote:
> > I'd like to ask Jakub for a review of the above two parts, other parts
> > are OK with a rename (as mentioned above).
>
> Looks ok to me. Where the ICEs discovered just by normal make check or only
> with GCC_TEST_RUN_EXPENSIVE ? If the latter, can you promote one of the
> permutations that caused the ICEs to normal tests? If not and
> GCC_TEST_RUN_EXPENSIVE has not been tested, can you try that?
>

The ICEs were discovered without GCC_TEST_RUN_EXPENSIVE, but I've also tested
the patch with it enabled and didn't see any failures.
I've committed the version below.

---
 gcc/config/i386/i386.c | 59 ++++++++++++++++++++++++++++++++++++++++++++------
 gcc/config/i386/sse.md | 54 ++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 98 insertions(+), 15 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c528599..aaffe9d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -45943,6 +45943,42 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
             {
               if (!TARGET_AVX512BW)
                 return false;
+
+              /* If vpermq didn't work, vpshufb won't work either. */
+              if (d->vmode == V8DFmode || d->vmode == V8DImode)
+                return false;
+
+              vmode = V64QImode;
+              if (d->vmode == V16SImode
+                  || d->vmode == V32HImode
+                  || d->vmode == V64QImode)
+                {
+                  /* First see if vpermq can be used for
+                     V16SImode/V32HImode/V64QImode. */
+                  if (valid_perm_using_mode_p (V8DImode, d))
+                    {
+                      for (i = 0; i < 8; i++)
+                        perm[i] = (d->perm[i * nelt / 8] * 8 / nelt) & 7;
+                      if (d->testing_p)
+                        return true;
+                      target = gen_reg_rtx (V8DImode);
+                      if (expand_vselect (target, gen_lowpart (V8DImode, d->op0),
+                                          perm, 8, false))
+                        {
+                          emit_move_insn (d->target,
+                                          gen_lowpart (d->vmode, target));
+                          return true;
+                        }
+                      return false;
+                    }
+
+                  /* Next see if vpermd can be used. */
+                  if (valid_perm_using_mode_p (V16SImode, d))
+                    vmode = V16SImode;
+                }
+              /* Or if vpermps can be used. */
+              else if (d->vmode == V16SFmode)
+                vmode = V16SImode;
               if (vmode == V64QImode)
                 {
                   /* vpshufb only works intra lanes, it is not
@@ -45962,6 +45998,9 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
       if (vmode == V8SImode)
         for (i = 0; i < 8; ++i)
           rperm[i] = GEN_INT ((d->perm[i * nelt / 8] * 8 / nelt) & 7);
+      else if (vmode == V16SImode)
+        for (i = 0; i < 16; ++i)
+          rperm[i] = GEN_INT ((d->perm[i * nelt / 16] * 16 / nelt) & 15);
       else
         {
           eltsz = GET_MODE_SIZE (GET_MODE_INNER (d->vmode));
@@ -46000,8 +46039,14 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
             emit_insn (gen_avx512bw_pshufbv64qi3 (target, op0, vperm));
           else if (vmode == V8SFmode)
             emit_insn (gen_avx2_permvarv8sf (target, op0, vperm));
-          else
+          else if (vmode == V8SImode)
             emit_insn (gen_avx2_permvarv8si (target, op0, vperm));
+          else if (vmode == V16SFmode)
+            emit_insn (gen_avx512f_permvarv16sf (target, op0, vperm));
+          else if (vmode == V16SImode)
+            emit_insn (gen_avx512f_permvarv16si (target, op0, vperm));
+          else
+            gcc_unreachable ();
         }
       else
         {
@@ -46055,21 +46100,21 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
             {
             case V64QImode:
               if (TARGET_AVX512BW)
-                gen = gen_avx512bw_vec_dupv64qi;
+                gen = gen_avx512bw_vec_dupv64qi_1;
               break;
             case V32QImode:
               gen = gen_avx2_pbroadcastv32qi_1;
               break;
             case V32HImode:
               if (TARGET_AVX512BW)
-                gen = gen_avx512bw_vec_dupv32hi;
+                gen = gen_avx512bw_vec_dupv32hi_1;
               break;
             case V16HImode:
               gen = gen_avx2_pbroadcastv16hi_1;
               break;
             case V16SImode:
               if (TARGET_AVX512F)
-                gen = gen_avx512f_vec_dupv16si;
+                gen = gen_avx512f_vec_dupv16si_1;
               break;
             case V8SImode:
               gen = gen_avx2_pbroadcastv8si_1;
@@ -46082,18 +46127,18 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
               break;
             case V16SFmode:
               if (TARGET_AVX512F)
-                gen = gen_avx512f_vec_dupv16sf;
+                gen = gen_avx512f_vec_dupv16sf_1;
               break;
             case V8SFmode:
               gen = gen_avx2_vec_dupv8sf_1;
               break;
             case V8DFmode:
               if (TARGET_AVX512F)
-                gen = gen_avx512f_vec_dupv8df;
+                gen = gen_avx512f_vec_dupv8df_1;
               break;
             case V8DImode:
               if (TARGET_AVX512F)
-                gen = gen_avx512f_vec_dupv8di;
+                gen = gen_avx512f_vec_dupv8di_1;
               break;
             /* For other modes prefer other shuffles this function creates. */
             default: break;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 2757a1e..13ddd29 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6297,6 +6297,18 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "V8SF")])
 
+(define_insn "avx512f_vec_dup_1"
+  [(set (match_operand:VF_512 0 "register_operand" "=v")
+        (vec_duplicate:VF_512
+          (vec_select:
+            (match_operand:VF_512 1 "register_operand" "v")
+            (parallel [(const_int 0)]))))]
+  "TARGET_AVX512F"
+  "vbroadcast\t{%x1, %0|%0, %x1}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "vec_dupv4sf"
   [(set (match_operand:V4SF 0 "register_operand" "=x,x,x")
         (vec_duplicate:V4SF
@@ -15922,22 +15934,35 @@
           (match_operand:VI48_256 2 "nonimmediate_operand")))]
   "TARGET_AVX2")
 
-(define_expand "vashr3"
-  [(set (match_operand:VI12_128 0 "register_operand")
-        (ashiftrt:VI12_128
-          (match_operand:VI12_128 1 "register_operand")
-          (match_operand:VI12_128 2 "nonimmediate_operand")))]
+(define_expand "vashrv8hi3"
+  [(set (match_operand:V8HI 0 "register_operand")
+        (ashiftrt:V8HI
+          (match_operand:V8HI 1 "register_operand")
+          (match_operand:V8HI 2 "nonimmediate_operand")))]
   "TARGET_XOP || (TARGET_AVX512BW && TARGET_AVX512VL)"
 {
   if (TARGET_XOP)
     {
-      rtx neg = gen_reg_rtx (mode);
-      emit_insn (gen_neg2 (neg, operands[2]));
-      emit_insn (gen_xop_sha3 (operands[0], operands[1], neg));
+      rtx neg = gen_reg_rtx (V8HImode);
+      emit_insn (gen_negv8hi2 (neg, operands[2]));
+      emit_insn (gen_xop_shav8hi3 (operands[0], operands[1], neg));
       DONE;
     }
 })
 
+(define_expand "vashrv16qi3"
+  [(set (match_operand:V16QI 0 "register_operand")
+        (ashiftrt:V16QI
+          (match_operand:V16QI 1 "register_operand")
+          (match_operand:V16QI 2 "nonimmediate_operand")))]
+  "TARGET_XOP"
+{
+   rtx neg = gen_reg_rtx (V16QImode);
+   emit_insn (gen_negv16qi2 (neg, operands[2]));
+   emit_insn (gen_xop_shav16qi3 (operands[0], operands[1], neg));
+   DONE;
+})
+
 (define_expand "vashrv2di3"
   [(set (match_operand:V2DI 0 "register_operand")
         (ashiftrt:V2DI
@@ -16531,6 +16556,19 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "V4DF")])
 
+(define_insn "_vec_dup_1"
+  [(set (match_operand:VI_AVX512BW 0 "register_operand" "=v,v")
+        (vec_duplicate:VI_AVX512BW
+          (vec_select:VI_AVX512BW
+            (match_operand:VI_AVX512BW 1 "nonimmediate_operand" "v,m")
+            (parallel [(const_int 0)]))))]
+  "TARGET_AVX512F"
+  "vpbroadcast\t{%1, %0|%0, %1}
+   vpbroadcast\t{%x1, %0|%0, %x1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "_vec_dup"
   [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")
         (vec_duplicate:V48_AVX512VL