From patchwork Wed Oct 9 10:30:14 2013
X-Patchwork-Submitter: Kirill Yukhin
X-Patchwork-Id: 281806
Date: Wed, 9 Oct 2013 14:30:14 +0400
From: Kirill Yukhin
To: Richard Henderson
Cc: Uros Bizjak, Vladimir Makarov, Jakub Jelinek, GCC Patches
Subject: Re: [PATCH i386 3/8] [AVX512] [16/n] Add AVX-512 patterns: VI48_512 and VI4F_128 iterators.
Message-ID: <20131009103014.GO52466@msticlxl57.ims.intel.com>
References: <20130808112524.GA40277@msticlxl57.ims.intel.com> <20130814072638.GD52726@msticlxl57.ims.intel.com> <52129604.6040305@redhat.com>
In-Reply-To: <52129604.6040305@redhat.com>

Hello,

> This patch is still far too large.
>
> I think you should split it up based on every single mode iterator that
> you need to add or change.

Here's the 1st subpatch. It extends VI4F_128 and introduces the VI48_512 iterator.

Is it OK?

Testing:
  1. Bootstrap passes.
  2. make check shows no regressions.
  3. Spec 2000 & 2006 builds show no regressions both with and without the -mavx512f option.
  4. Spec 2000 & 2006 runs show no stability regressions without the -mavx512f option.

---
Thanks, K

PS. If it is OK, I am going to strip the ChangeLog lines out of the big patch.

---
 gcc/config/i386/predicates.md |   5 +
 gcc/config/i386/sse.md        | 344 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 348 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 18f425c..eff82eb 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1332,3 +1332,8 @@
 (define_predicate "general_vector_operand"
   (ior (match_operand 0 "nonimmediate_operand")
        (match_code "const_vector")))
+
+;; Return true if OP is either -1 constant or stored in register.
+(define_predicate "register_or_constm1_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_test "op == constm1_rtx")))
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 2364ccc..8221d61 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -87,7 +87,19 @@
   ;; For AVX512F support
   UNSPEC_VPERMI2
   UNSPEC_VPERMT2
+  UNSPEC_UNSIGNED_PCMP
+  UNSPEC_TESTM
+  UNSPEC_TESTNM
   UNSPEC_SCATTER
+  UNSPEC_VTERNLOG
+  UNSPEC_ALIGN
+  UNSPEC_CONFLICT
+  UNSPEC_MASKED_EQ
+  UNSPEC_MASKED_GT
+
+  ;; For AVX512PF support
+  UNSPEC_GATHER_PREFETCH
+  UNSPEC_SCATTER_PREFETCH
 ])
 
 (define_c_enum "unspecv" [
@@ -364,6 +376,7 @@
 (define_mode_iterator VI124_256_48_512
   [V32QI V16HI V8SI (V8DI "TARGET_AVX512F") (V16SI "TARGET_AVX512F")])
 (define_mode_iterator VI48_256 [V8SI V4DI])
+(define_mode_iterator VI48_512 [V16SI V8DI])
 
 ;; Int-float size matches
 (define_mode_iterator VI4F_128 [V4SI V4SF])
@@ -1741,7 +1754,9 @@
   [(V32QI "TARGET_AVX2") (V16HI "TARGET_AVX2")
    (V8SI "TARGET_AVX2") (V4DI "TARGET_AVX2")
    (V8SF "TARGET_AVX") (V4DF "TARGET_AVX")
-   (V4SF "TARGET_SSE")])
+   (V4SF "TARGET_SSE") (V16SI "TARGET_AVX512F")
+   (V8DI "TARGET_AVX512F") (V16SF "TARGET_AVX512F")
+   (V8DF "TARGET_AVX512F")])
 
 (define_expand "reduc__"
   [(smaxmin:REDUC_SMINMAX_MODE
@@ -1754,6 +1769,16 @@
 })
 
 (define_expand "reduc__"
+  [(umaxmin:VI48_512
+     (match_operand:VI48_512 0 "register_operand")
+     (match_operand:VI48_512 1 "register_operand"))]
+  "TARGET_AVX512F"
+{
+  ix86_expand_reduc (gen_3, operands[0], operands[1]);
+  DONE;
+})
+
+(define_expand "reduc__"
   [(umaxmin:VI_256
      (match_operand:VI_256 0 "register_operand")
      (match_operand:VI_256 1 "register_operand"))]
@@ -1877,6 +1902,20 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])
 
+(define_insn "avx512f_ucmp3"
+  [(set (match_operand: 0 "register_operand" "=k")
+	(unspec:
+	  [(match_operand:VI48_512 1 "register_operand" "v")
+	   (match_operand:VI48_512 2 "nonimmediate_operand" "vm")
+	   (match_operand:SI 3 "const_0_to_7_operand" "n")]
+	  UNSPEC_UNSIGNED_PCMP))]
+  "TARGET_AVX512F"
+  "vpcmpu\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "type" "ssecmp")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "_comi"
   [(set (reg:CCFP FLAGS_REG)
 	(compare:CCFP
@@ -5113,6 +5152,31 @@
   operands[1] = adjust_address (operands[1], DFmode, INTVAL (operands[2]) * 8);
 })
 
+(define_insn "avx512f_vternlog"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(unspec:VI48_512
+	  [(match_operand:VI48_512 1 "register_operand" "0")
+	   (match_operand:VI48_512 2 "register_operand" "v")
+	   (match_operand:VI48_512 3 "nonimmediate_operand" "vm")
+	   (match_operand:SI 4 "const_0_to_255_operand")]
+	  UNSPEC_VTERNLOG))]
+  "TARGET_AVX512F"
+  "vpternlog\t{%4, %3, %2, %0|%0, %2, %3, %4}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "avx512f_align"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(unspec:VI48_512 [(match_operand:VI48_512 1 "register_operand" "v")
+			  (match_operand:VI48_512 2 "nonimmediate_operand" "vm")
+			  (match_operand:SI 3 "const_0_to_255_operand")]
+			 UNSPEC_ALIGN))]
+  "TARGET_AVX512F"
+  "valign\t{%3, %2, %1, %0|%0, %1, %2, %3}";
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "avx512f_rndscale"
   [(set (match_operand:VF_512 0 "register_operand" "=v")
 	(unspec:VF_512
@@ -6137,6 +6201,22 @@
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "")])
 
+(define_insn "3"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v,v")
+	(any_lshift:VI48_512
+	  (match_operand:VI48_512 1 "register_operand" "v,m")
+	  (match_operand:SI 2 "nonmemory_operand" "vN,N")))]
+  "TARGET_AVX512F"
+  "vp\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "avx512f")
+   (set_attr "type" "sseishft")
+   (set (attr "length_immediate")
+     (if_then_else (match_operand 2 "const_int_operand")
+       (const_string "1")
+       (const_string "0")))
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_expand "vec_shl_"
   [(set (match_operand:VI_128 0 "register_operand")
 	(ashift:V1TI
@@ -6212,6 +6292,25 @@
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "")])
 
+(define_insn "avx512f_v"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(any_rotate:VI48_512
+	  (match_operand:VI48_512 1 "register_operand" "v")
+	  (match_operand:VI48_512 2 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512F"
+  "vpv\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "avx512f_"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(any_rotate:VI48_512
+	  (match_operand:VI48_512 1 "nonimmediate_operand" "vm")
+	  (match_operand:SI 2 "const_0_to_255_operand")))]
+  "TARGET_AVX512F"
+  "vp\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
 
 (define_expand "3"
   [(set (match_operand:VI124_256_48_512 0 "register_operand")
@@ -6445,6 +6544,28 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "OI")])
 
+(define_expand "avx512f_eq3"
+  [(set (match_operand: 0 "register_operand")
+	(unspec:
+	  [(match_operand:VI48_512 1 "register_operand")
+	   (match_operand:VI48_512 2 "nonimmediate_operand")]
+	  UNSPEC_MASKED_EQ))]
+  "TARGET_AVX512F"
+  "ix86_fixup_binary_operands_no_copy (EQ, mode, operands);")
+
+(define_insn "avx512f_eq3_1"
+  [(set (match_operand: 0 "register_operand" "=k")
+	(unspec:
+	  [(match_operand:VI48_512 1 "register_operand" "%v")
+	   (match_operand:VI48_512 2 "nonimmediate_operand" "vm")]
+	  UNSPEC_MASKED_EQ))]
+  "TARGET_AVX512F && ix86_binary_operator_ok (EQ, mode, operands)"
+  "vpcmpeq\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "ssecmp")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "*sse4_1_eqv2di3"
   [(set (match_operand:V2DI 0 "register_operand" "=x,x")
 	(eq:V2DI
@@ -6519,6 +6640,18 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "OI")])
 
+(define_insn "avx512f_gt3"
+  [(set (match_operand: 0 "register_operand" "=k")
+	(unspec:
+	  [(match_operand:VI48_512 1 "register_operand" "v")
+	   (match_operand:VI48_512 2 "nonimmediate_operand" "vm")] UNSPEC_MASKED_GT))]
+  "TARGET_AVX512F"
+  "vpcmpgt\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "ssecmp")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "sse2_gt3"
   [(set (match_operand:VI124_128 0 "register_operand" "=x,x")
 	(gt:VI124_128
@@ -6903,6 +7036,28 @@
 	      ]
 	      (const_string "")))])
 
+(define_insn "avx512f_testm3"
+  [(set (match_operand: 0 "register_operand" "=k")
+	(unspec:
+	  [(match_operand:VI48_512 1 "register_operand" "v")
+	   (match_operand:VI48_512 2 "nonimmediate_operand" "vm")]
+	  UNSPEC_TESTM))]
+  "TARGET_AVX512F"
+  "vptestm\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "avx512f_testnm3"
+  [(set (match_operand: 0 "register_operand" "=k")
+	(unspec:
+	  [(match_operand:VI48_512 1 "register_operand" "v")
+	   (match_operand:VI48_512 2 "nonimmediate_operand" "vm")]
+	  UNSPEC_TESTNM))]
+  "TARGET_AVX512CD"
+  "%vptestnm\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;; Parallel integral element swizzling
@@ -9859,6 +10014,148 @@
    (set_attr "btver2_decode" "vector,vector,vector,vector")
    (set_attr "mode" "TI")])
 
+(define_expand "avx512pf_gatherpf"
+  [(unspec
+     [(match_operand: 0 "register_or_constm1_operand")
+      (mem:
+	(match_par_dup 5
+	  [(match_operand 2 "vsib_address_operand")
+	   (match_operand:VI48_512 1 "register_operand")
+	   (match_operand:SI 3 "const1248_operand")]))
+      (match_operand:SI 4 "const_0_to_1_operand")]
+     UNSPEC_GATHER_PREFETCH)]
+  "TARGET_AVX512PF"
+{
+  operands[5]
+    = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, operands[2], operands[1],
+					operands[3]), UNSPEC_VSIBADDR);
+})
+
+(define_insn "*avx512pf_gatherpf_mask"
+  [(unspec
+     [(match_operand: 0 "register_operand" "k")
+      (match_operator: 5 "vsib_mem_operator"
+	[(unspec:P
+	   [(match_operand:P 2 "vsib_address_operand" "p")
+	    (match_operand:VI48_512 1 "register_operand" "v")
+	    (match_operand:SI 3 "const1248_operand" "n")]
+	   UNSPEC_VSIBADDR)])
+      (match_operand:SI 4 "const_0_to_1_operand" "n")]
+     UNSPEC_GATHER_PREFETCH)]
+  "TARGET_AVX512PF"
+{
+  switch (INTVAL (operands[4]))
+    {
+    case 0:
+      return "vgatherpf0ps\t{%5%{%0%}|%5%{%0%}}";
+    case 1:
+      return "vgatherpf1ps\t{%5%{%0%}|%5%{%0%}}";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "sse")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])
+
+(define_insn "*avx512pf_gatherpf"
+  [(unspec
+     [(const_int -1)
+      (match_operator: 4 "vsib_mem_operator"
+	[(unspec:P
+	   [(match_operand:P 1 "vsib_address_operand" "p")
+	    (match_operand:VI48_512 0 "register_operand" "v")
+	    (match_operand:SI 2 "const1248_operand" "n")]
+	   UNSPEC_VSIBADDR)])
+      (match_operand:SI 3 "const_0_to_1_operand" "n")]
+     UNSPEC_GATHER_PREFETCH)]
+  "TARGET_AVX512PF"
+{
+  switch (INTVAL (operands[3]))
+    {
+    case 0:
+      return "vgatherpf0ps\t{%4|%4}";
+    case 1:
+      return "vgatherpf1ps\t{%4|%4}";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "sse")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])
+
+(define_expand "avx512pf_scatterpf"
+  [(unspec
+     [(match_operand: 0 "register_or_constm1_operand")
+      (mem:
+	(match_par_dup 5
+	  [(match_operand 2 "vsib_address_operand")
+	   (match_operand:VI48_512 1 "register_operand")
+	   (match_operand:SI 3 "const1248_operand")]))
+      (match_operand:SI 4 "const_0_to_1_operand")]
+     UNSPEC_SCATTER_PREFETCH)]
+  "TARGET_AVX512PF"
+{
+  operands[5]
+    = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, operands[2], operands[1],
+					operands[3]), UNSPEC_VSIBADDR);
+})
+
+(define_insn "*avx512pf_scatterpf_mask"
+  [(unspec
+     [(match_operand: 0 "register_operand" "k")
+      (match_operator: 5 "vsib_mem_operator"
+	[(unspec:P
+	   [(match_operand:P 2 "vsib_address_operand" "p")
+	    (match_operand:VI48_512 1 "register_operand" "v")
+	    (match_operand:SI 3 "const1248_operand" "n")]
+	   UNSPEC_VSIBADDR)])
+      (match_operand:SI 4 "const_0_to_1_operand" "n")]
+     UNSPEC_SCATTER_PREFETCH)]
+  "TARGET_AVX512PF"
+{
+  switch (INTVAL (operands[4]))
+    {
+    case 0:
+      return "vscatterpf0ps\t{%5%{%0%}|%5%{%0%}}";
+    case 1:
+      return "vscatterpf1ps\t{%5%{%0%}|%5%{%0%}}";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "sse")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])
+
+(define_insn "*avx512pf_scatterpf"
+  [(unspec
+     [(const_int -1)
+      (match_operator: 4 "vsib_mem_operator"
+	[(unspec:P
+	   [(match_operand:P 1 "vsib_address_operand" "p")
+	    (match_operand:VI48_512 0 "register_operand" "v")
+	    (match_operand:SI 2 "const1248_operand" "n")]
+	   UNSPEC_VSIBADDR)])
+      (match_operand:SI 3 "const_0_to_1_operand" "n")]
+     UNSPEC_SCATTER_PREFETCH)]
+  "TARGET_AVX512PF"
+{
+  switch (INTVAL (operands[3]))
+    {
+    case 0:
+      return "vscatterpf0ps\t{%4|%4}";
+    case 1:
+      return "vscatterpf1ps\t{%4|%4}";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "sse")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;; XOP instructions
@@ -10406,6 +10703,13 @@
 })
 
 (define_expand "vlshr3"
+  [(set (match_operand:VI48_512 0 "register_operand")
+	(lshiftrt:VI48_512
+	  (match_operand:VI48_512 1 "register_operand")
+	  (match_operand:VI48_512 2 "nonimmediate_operand")))]
+  "TARGET_AVX512F")
+
+(define_expand "vlshr3"
   [(set (match_operand:VI48_256 0 "register_operand")
 	(lshiftrt:VI48_256
 	  (match_operand:VI48_256 1 "register_operand")
@@ -10473,6 +10777,13 @@
 })
 
 (define_expand "vashl3"
+  [(set (match_operand:VI48_512 0 "register_operand")
+	(ashift:VI48_512
+	  (match_operand:VI48_512 1 "register_operand")
+	  (match_operand:VI48_512 2 "nonimmediate_operand")))]
+  "TARGET_AVX512F")
+
+(define_expand "vashl3"
   [(set (match_operand:VI48_256 0 "register_operand")
 	(ashift:VI48_256
 	  (match_operand:VI48_256 1 "register_operand")
@@ -10990,6 +11301,16 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])
 
+(define_insn "avx512f_vec_dup_gpr"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(vec_duplicate:VI48_512
+	  (match_operand: 1 "register_operand" "r")))]
+  "TARGET_AVX512F && (mode != V8DImode || TARGET_64BIT)"
+  "vpbroadcast\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "avx512f_vec_dup_mem"
   [(set (match_operand:VI48F_512 0 "register_operand" "=x")
 	(vec_duplicate:VI48F_512
@@ -12136,3 +12457,24 @@
   [(set_attr "type" "ssemov")
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])
+
+(define_insn "clz2"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(clz:VI48_512
+	  (match_operand:VI48_512 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512CD"
+  "vplzcnt\t{%1, %0|%0, %1}"
+  [(set_attr "type" "sse")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "conflict"
+  [(set (match_operand:VI48_512 0 "register_operand" "=v")
+	(unspec:VI48_512
+	  [(match_operand:VI48_512 1 "nonimmediate_operand" "vm")]
+	  UNSPEC_CONFLICT))]
+  "TARGET_AVX512CD"
+  "vpconflict\t{%1, %0|%0, %1}"
+  [(set_attr "type" "sse")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])