From patchwork Tue Sep 16 10:10:26 2014
X-Patchwork-Submitter: Kirill Yukhin
X-Patchwork-Id: 390027
Date: Tue, 16 Sep 2014 14:10:26 +0400
From: Kirill Yukhin
To: Uros Bizjak
Cc: Jakub Jelinek, Richard Henderson, GCC Patches
Subject: Re: [PATCH i386 AVX512] [41/n] Extend extract insn patterns.
Message-ID: <20140916101024.GC64926@msticlxl57.ims.intel.com>
References: <20140915180501.GA27164@msticlxl57.ims.intel.com>

Hello Uroš,
On 16 Sep 09:47, Uros Bizjak wrote:
> > +  "TARGET_AVX512DQ && (INTVAL (operands[2]) = INTVAL (operands[3]) - 1)"
>
> Ouch, you have assignment instead of comparison here!
Thanks, fixed!

> > +   (set (attr "memory")
> > +     (if_then_else (match_test "MEM_P (operands[0])")
> > +       (const_string "store")
> > +       (const_string "none")))
>
> Set the type attribute to sselog1 to automatically calculate memory
> attribute. Please see the definition of the attribute in i386.md.
Fixed.

> > +  "TARGET_AVX512DQ"
> > +  "vextract32x8\t{$0x1, %1, %0%{%3%}|%0%{%3%}, %1, 0x1}"
> > +  [(set_attr "type" "sselog")
> > +   (set_attr "prefix_extra" "1")
> > +   (set_attr "length_immediate" "1")
> > +   (set (attr "memory")
> > +     (if_then_else (match_test "MEM_P (operands[0])")
> > +       (const_string "store")
> > +       (const_string "none")))
>
> Set the type to sselog1 and remove memory attribute calculation (as above).
Fixed.

> > +}
> > +  [(set_attr "type" "sselog")
> > +   (set_attr "prefix_extra" "1")
> > +   (set_attr "length_immediate" "1")
> > +   (set_attr "memory" "none,store")
>
> Set the type to sselog1 and remove memory attribute calculation.
Fixed.

> > -  "TARGET_AVX"
> > -  "vextract\t{$0x1, %1, %0|%0, %1, 0x1}"
> > +  "TARGET_AVX && (! || (TARGET_AVX512VL && TARGET_AVX512F))"
>
> Please split the pattern to avoid too complex insn constraints.
Condition simplified.

Updated ChangeLog entry:
gcc/
        * config/i386/i386.c (ix86_expand_vector_extract): Handle V32HI
        and V64QI modes.
        * config/i386/sse.md (define_mode_iterator VI48F_256): New.
        (define_mode_attr extract_type): Ditto.
        (define_mode_attr extract_suf): Ditto.
        (define_mode_iterator AVX512_VEC): Ditto.
        (define_expand "_vextract_mask"): Use AVX512_VEC.
        (define_insn "avx512dq_vextract64x2_1_maskm"): New.
        (define_insn "avx512dq_vextract64x2_1"): Ditto.
        (define_mode_attr extract_type_2): Ditto.
        (define_mode_attr extract_suf_2): Ditto.
        (define_mode_iterator AVX512_VEC_2): Ditto.
        (define_expand "_vextract_mask"): Use AVX512_VEC_2 mode iterator.
        (define_insn "vec_extract_hi__maskm"): Ditto.
        (define_expand "avx512vl_vextractf128"): Ditto.
        (define_insn_and_split "vec_extract_lo_"): Delete.
        (define_insn "vec_extract_lo_"): New.
        (define_split for V16FI mode): Ditto.
        (define_insn_and_split "vec_extract_lo_"): Delete.
        (define_insn "vec_extract_lo_"): New.
        (define_split for VI8F_256 mode): Ditto.
        (define_insn "vec_extract_hi_"): Add masking.
        (define_insn_and_split "vec_extract_lo_"): Delete.
        (define_insn "vec_extract_lo_"): New.
        (define_split for VI4F_256 mode): Ditto.
        (define_insn "vec_extract_lo__maskm"): Ditto.
        (define_insn "vec_extract_hi__maskm"): Ditto.
        (define_insn "vec_extract_hi_"): Add masking.
        (define_mode_iterator VEC_EXTRACT_MODE): Add V64QI and V32HI modes.
        (define_insn "vcvtph2ps"): Fix pattern condition.
        (define_insn "avx512f_vextract32x4_1_maskm"): Ditto.
        (define_insn "avx512f_vextract32x4_1"): Update `type' attribute,
        remove explicit `memory' attribute calculation.

Is it ok for trunk?

---
Thanks, K

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 30120a5..ccfd47d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -40979,6 +40979,32 @@ ix86_expand_vector_extract (bool mmx_ok, rtx target, rtx vec, int elt)
         }
       break;
 
+    case V32HImode:
+      if (TARGET_AVX512BW)
+        {
+          tmp = gen_reg_rtx (V16HImode);
+          if (elt < 16)
+            emit_insn (gen_vec_extract_lo_v32hi (tmp, vec));
+          else
+            emit_insn (gen_vec_extract_hi_v32hi (tmp, vec));
+          ix86_expand_vector_extract (false, target, tmp, elt & 15);
+          return;
+        }
+      break;
+
+    case V64QImode:
+      if (TARGET_AVX512BW)
+        {
+          tmp = gen_reg_rtx (V32QImode);
+          if (elt < 32)
+            emit_insn (gen_vec_extract_lo_v64qi (tmp, vec));
+          else
+            emit_insn (gen_vec_extract_hi_v64qi (tmp, vec));
+          ix86_expand_vector_extract (false, target, tmp, elt & 31);
+          return;
+        }
+      break;
+
     case V16SFmode:
       tmp = gen_reg_rtx (V8SFmode);
       if (elt < 8)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index bd321fc..561fdbb 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -534,6 +534,7 @@
    (V4DI "TARGET_AVX512VL") (V4DF "TARGET_AVX512VL")
    (V4SI "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")
    (V2DI "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")])
+(define_mode_iterator VI48F_256 [V8SI V8SF V4DI V4DF])
 
 ;; Mapping from float mode to required SSE level
 (define_mode_attr sse
@@ -6319,44 +6320,64 @@
   operands[1] = adjust_address (operands[1], SFmode, INTVAL (operands[2]) * 4);
 })
 
-(define_expand "avx512f_vextract32x4_mask"
+(define_mode_attr extract_type
+  [(V16SF "avx512f") (V16SI "avx512f") (V8DF "avx512dq") (V8DI "avx512dq")])
+
+(define_mode_attr extract_suf
+  [(V16SF "32x4") (V16SI "32x4") (V8DF "64x2") (V8DI "64x2")])
+
+(define_mode_iterator AVX512_VEC
+  [(V8DF "TARGET_AVX512DQ") (V8DI "TARGET_AVX512DQ") V16SF V16SI])
+
+(define_expand "_vextract_mask"
   [(match_operand: 0 "nonimmediate_operand")
-   (match_operand:V16FI 1 "register_operand")
+   (match_operand:AVX512_VEC 1 "register_operand")
    (match_operand:SI 2 "const_0_to_3_operand")
    (match_operand: 3 "nonimmediate_operand")
    (match_operand:QI 4 "register_operand")]
   "TARGET_AVX512F"
 {
+  int mask;
+  mask = INTVAL (operands[2]);
+
   if (MEM_P (operands[0]) && GET_CODE (operands[3]) == CONST_VECTOR)
     operands[0] = force_reg (mode, operands[0]);
-  switch (INTVAL (operands[2]))
-    {
-    case 0:
-      emit_insn (gen_avx512f_vextract32x4_1_mask (operands[0],
-          operands[1], GEN_INT (0), GEN_INT (1), GEN_INT (2),
-          GEN_INT (3), operands[3], operands[4]));
-      break;
-    case 1:
-      emit_insn (gen_avx512f_vextract32x4_1_mask (operands[0],
-          operands[1], GEN_INT (4), GEN_INT (5), GEN_INT (6),
-          GEN_INT (7), operands[3], operands[4]));
-      break;
-    case 2:
-      emit_insn (gen_avx512f_vextract32x4_1_mask (operands[0],
-          operands[1], GEN_INT (8), GEN_INT (9), GEN_INT (10),
-          GEN_INT (11), operands[3], operands[4]));
-      break;
-    case 3:
-      emit_insn (gen_avx512f_vextract32x4_1_mask (operands[0],
-          operands[1], GEN_INT (12), GEN_INT (13), GEN_INT (14),
-          GEN_INT (15), operands[3], operands[4]));
-      break;
-    default:
-      gcc_unreachable ();
-    }
+
+  if (mode == V16SImode || mode == V16SFmode)
+    emit_insn (gen_avx512f_vextract32x4_1_mask (operands[0],
+        operands[1], GEN_INT (mask * 4), GEN_INT (mask * 4 + 1),
+        GEN_INT (mask * 4 + 2), GEN_INT (mask * 4 + 3), operands[3],
+        operands[4]));
+  else
+    emit_insn (gen_avx512dq_vextract64x2_1_mask (operands[0],
+        operands[1], GEN_INT (mask * 2), GEN_INT (mask * 2 + 1), operands[3],
+        operands[4]));
   DONE;
 })
 
+(define_insn "avx512dq_vextract64x2_1_maskm"
+  [(set (match_operand: 0 "memory_operand" "=m")
+        (vec_merge:
+          (vec_select:
+            (match_operand:V8FI 1 "register_operand" "v")
+            (parallel [(match_operand 2 "const_0_to_7_operand")
+              (match_operand 3 "const_0_to_7_operand")]))
+          (match_operand: 4 "memory_operand" "0")
+          (match_operand:QI 5 "register_operand" "k")))]
+  "TARGET_AVX512DQ
+   && (INTVAL (operands[2]) % 2 == 0)
+   && (INTVAL (operands[2]) == INTVAL (operands[3]) - 1 )"
+{
+  operands[2] = GEN_INT ((INTVAL (operands[2])) >> 1);
+  return "vextract64x2\t{%2, %1, %0%{%5%}|%0%{%5%}, %1, %2}";
+}
+  [(set_attr "type" "sselog")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "memory" "store")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "avx512f_vextract32x4_1_maskm"
   [(set (match_operand: 0 "memory_operand" "=m")
         (vec_merge:
@@ -6369,7 +6390,8 @@
           (match_operand: 6 "memory_operand" "0")
           (match_operand:QI 7 "register_operand" "Yk")))]
   "TARGET_AVX512F
-   && (INTVAL (operands[2]) == (INTVAL (operands[3]) - 1)
+   && ((INTVAL (operands[2]) % 4 == 0)
+       && INTVAL (operands[2]) == (INTVAL (operands[3]) - 1)
        && INTVAL (operands[3]) == (INTVAL (operands[4]) - 1)
        && INTVAL (operands[4]) == (INTVAL (operands[5]) - 1))"
 {
@@ -6383,6 +6405,23 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])
 
+(define_insn "avx512dq_vextract64x2_1"
+  [(set (match_operand: 0 "" "=")
+        (vec_select:
+          (match_operand:V8FI 1 "register_operand" "v")
+          (parallel [(match_operand 2 "const_0_to_7_operand")
+            (match_operand 3 "const_0_to_7_operand")])))]
+  "TARGET_AVX512DQ && (INTVAL (operands[2]) == INTVAL (operands[3]) - 1)"
+{
+  operands[2] = GEN_INT ((INTVAL (operands[2])) >> 1);
+  return "vextract64x2\t{%2, %1, %0|%0, %1, %2}";
+}
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "avx512f_vextract32x4_1"
   [(set (match_operand: 0 "" "=")
         (vec_select:
@@ -6399,19 +6438,24 @@
   operands[2] = GEN_INT ((INTVAL (operands[2])) >> 2);
   return "vextract32x4\t{%2, %1, %0|%0, %1, %2}";
 }
-  [(set_attr "type" "sselog")
+  [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set (attr "memory")
-     (if_then_else (match_test "MEM_P (operands[0])")
-       (const_string "store")
-       (const_string "none")))
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])
 
-(define_expand "avx512f_vextract64x4_mask"
+(define_mode_attr extract_type_2
+  [(V16SF "avx512dq") (V16SI "avx512dq") (V8DF "avx512f") (V8DI "avx512f")])
+
+(define_mode_attr extract_suf_2
+  [(V16SF "32x8") (V16SI "32x8") (V8DF "64x4") (V8DI "64x4")])
+
+(define_mode_iterator AVX512_VEC_2
+  [(V16SF "TARGET_AVX512DQ") (V16SI "TARGET_AVX512DQ") V8DF V8DI])
+
+(define_expand "_vextract_mask"
   [(match_operand: 0 "nonimmediate_operand")
-   (match_operand:V8FI 1 "register_operand")
+   (match_operand:AVX512_VEC_2 1 "register_operand")
    (match_operand:SI 2 "const_0_to_1_operand")
    (match_operand: 3 "nonimmediate_operand")
    (match_operand:QI 4 "register_operand")]
@@ -6467,8 +6511,8 @@
           (match_operand: 2 "memory_operand" "0")
           (match_operand:QI 3 "register_operand" "Yk")))]
   "TARGET_AVX512F"
-"vextract64x4\t{$0x0, %1, %0%{%3%}|%0%{%3%}, %1, 0x0}"
-  [(set_attr "type" "sselog")
+  "vextract64x4\t{$0x0, %1, %0%{%3%}|%0%{%3%}, %1, 0x0}"
+  [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "evex")
@@ -6487,13 +6531,9 @@
   else
     return "#";
 }
-  [(set_attr "type" "sselog")
+  [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set (attr "memory")
-     (if_then_else (match_test "MEM_P (operands[0])")
-       (const_string "store")
-       (const_string "none")))
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])
 
@@ -6523,13 +6563,28 @@
           (const_int 6) (const_int 7)])))]
   "TARGET_AVX512F"
   "vextract64x4\t{$0x1, %1, %0|%0, %1, 0x1}"
-  [(set_attr "type" "sselog")
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "vec_extract_hi__maskm"
+  [(set (match_operand: 0 "memory_operand" "=m")
+        (vec_merge:
+          (vec_select:
+            (match_operand:V16FI 1 "register_operand" "v")
+            (parallel [(const_int 8) (const_int 9)
+              (const_int 10) (const_int 11)
+              (const_int 12) (const_int 13)
+              (const_int 14) (const_int 15)]))
+          (match_operand: 2 "memory_operand" "0")
+          (match_operand:QI 3 "register_operand" "k")))]
+  "TARGET_AVX512DQ"
+  "vextract32x8\t{$0x1, %1, %0%{%3%}|%0%{%3%}, %1, 0x1}"
+  [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set (attr "memory")
-     (if_then_else (match_test "MEM_P (operands[0])")
-       (const_string "store")
-       (const_string "none")))
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])
 
@@ -6541,7 +6596,7 @@
             (const_int 10) (const_int 11)
             (const_int 12) (const_int 13)
             (const_int 14) (const_int 15)])))]
-  "TARGET_AVX512F && (! || TARGET_AVX512DQ)"
+  "TARGET_AVX512F && "
   "@
    vextract32x8\t{$0x1, %1, %0|%0, %1, 0x1}
    vextracti64x4\t{$0x1, %1, %0|%0, %1, 0x1}"
@@ -6552,6 +6607,35 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])
 
+(define_expand "avx512vl_vextractf128"
+  [(match_operand: 0 "nonimmediate_operand")
+   (match_operand:VI48F_256 1 "register_operand")
+   (match_operand:SI 2 "const_0_to_1_operand")
+   (match_operand: 3 "vector_move_operand")
+   (match_operand:QI 4 "register_operand")]
+  "TARGET_AVX512DQ && TARGET_AVX512VL"
+{
+  rtx (*insn)(rtx, rtx, rtx, rtx);
+
+  if (MEM_P (operands[0]) && GET_CODE (operands[3]) == CONST_VECTOR)
+    operands[0] = force_reg (mode, operands[0]);
+
+  switch (INTVAL (operands[2]))
+    {
+    case 0:
+      insn = gen_vec_extract_lo__mask;
+      break;
+    case 1:
+      insn = gen_vec_extract_hi__mask;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  emit_insn (insn (operands[0], operands[1], operands[3], operands[4]));
+  DONE;
+})
+
 (define_expand "avx_vextractf128"
   [(match_operand: 0 "nonimmediate_operand")
    (match_operand:V_256 1 "register_operand")
@@ -6576,7 +6660,7 @@
   DONE;
 })
 
-(define_insn_and_split "vec_extract_lo_"
+(define_insn "vec_extract_lo_"
   [(set (match_operand: 0 "nonimmediate_operand" "=v,m")
         (vec_select:
           (match_operand:V16FI 1 "nonimmediate_operand" "vm,v")
@@ -6584,11 +6668,28 @@
             (const_int 2) (const_int 3)
             (const_int 4) (const_int 5)
             (const_int 6) (const_int 7)])))]
-  "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
+  "TARGET_AVX512F
+   &&
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
+  if ()
+    return "vextract32x8\t{$0x0, %1, %0|%0, %1, 0x0}";
+  else
+    return "#";
+})
+
+(define_split
+  [(set (match_operand: 0 "nonimmediate_operand")
+        (vec_select:
+          (match_operand:V16FI 1 "nonimmediate_operand")
+          (parallel [(const_int 0) (const_int 1)
+            (const_int 2) (const_int 3)
+            (const_int 4) (const_int 5)
+            (const_int 6) (const_int 7)])))]
+  "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))
+   && reload_completed"
+  [(const_int 0)]
+{
   rtx op1 = operands[1];
   if (REG_P (op1))
     op1 = gen_rtx_REG (mode, REGNO (op1));
@@ -6598,29 +6699,57 @@
   DONE;
 })
 
-(define_insn_and_split "vec_extract_lo_"
-  [(set (match_operand: 0 "nonimmediate_operand" "=x,m")
+(define_insn "vec_extract_lo_"
+  [(set (match_operand: 0 "" "=v,m")
         (vec_select:
-          (match_operand:VI8F_256 1 "nonimmediate_operand" "xm,x")
+          (match_operand:VI8F_256 1 "nonimmediate_operand" "vm,v")
           (parallel [(const_int 0) (const_int 1)])))]
-  "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0) (match_dup 1))]
+  "TARGET_AVX
+   && &&
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
-  if (REG_P (operands[1]))
-    operands[1] = gen_rtx_REG (mode, REGNO (operands[1]));
+  if ()
+    return "vextract64x2\t{$0x0, %1, %0%{%3%}|%0%{%3%}, %1, 0x0}";
   else
-    operands[1] = adjust_address (operands[1], mode, 0);
+    return "#";
+}
+  [(set_attr "type" "sselog")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "memory" "none,store")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])
+
+(define_split
+  [(set (match_operand: 0 "nonimmediate_operand")
+        (vec_select:
+          (match_operand:VI8F_256 1 "nonimmediate_operand")
+          (parallel [(const_int 0) (const_int 1)])))]
+  "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))
+   && reload_completed"
+  [(const_int 0)]
+{
+  rtx op1 = operands[1];
+  if (REG_P (op1))
+    op1 = gen_rtx_REG (mode, REGNO (op1));
+  else
+    op1 = gen_lowpart (mode, op1);
+  emit_move_insn (operands[0], op1);
+  DONE;
 })
 
-(define_insn "vec_extract_hi_"
-  [(set (match_operand: 0 "nonimmediate_operand" "=x,m")
+(define_insn "vec_extract_hi_"
+  [(set (match_operand: 0 "" "=v,")
         (vec_select:
-          (match_operand:VI8F_256 1 "register_operand" "x,x")
+          (match_operand:VI8F_256 1 "register_operand" "v,v")
           (parallel [(const_int 2) (const_int 3)])))]
   "TARGET_AVX"
-  "vextract\t{$0x1, %1, %0|%0, %1, 0x1}"
+{
+  if (TARGET_AVX512DQ && TARGET_AVX512VL)
+    return "vextract64x2\t{$0x1, %1, %0|%0, %1, 0x1}";
+  else
+    return "vextract\t{$0x1, %1, %0|%0, %1, 0x1}";
+}
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
@@ -6628,36 +6757,101 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
 
-(define_insn_and_split "vec_extract_lo_"
-  [(set (match_operand: 0 "nonimmediate_operand" "=x,m")
+(define_split
+  [(set (match_operand: 0 "nonimmediate_operand")
         (vec_select:
-          (match_operand:VI4F_256 1 "nonimmediate_operand" "xm,x")
+          (match_operand:VI4F_256 1 "nonimmediate_operand")
           (parallel [(const_int 0) (const_int 1)
             (const_int 2) (const_int 3)])))]
-  "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0) (match_dup 1))]
+  "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1])) && reload_completed"
+  [(const_int 0)]
 {
-  if (REG_P (operands[1]))
-    operands[1] = gen_rtx_REG (mode, REGNO (operands[1]));
+  rtx op1 = operands[1];
+  if (REG_P (op1))
+    op1 = gen_rtx_REG (mode, REGNO (op1));
   else
-    operands[1] = adjust_address (operands[1], mode, 0);
+    op1 = gen_lowpart (mode, op1);
+  emit_move_insn (operands[0], op1);
+  DONE;
 })
 
-(define_insn "vec_extract_hi_"
-  [(set (match_operand: 0 "nonimmediate_operand" "=x,m")
+
+(define_insn "vec_extract_lo_"
+  [(set (match_operand: 0 "" "=")
+        (vec_select:
+          (match_operand:VI4F_256 1 "nonimmediate_operand" "v")
+          (parallel [(const_int 0) (const_int 1)
+            (const_int 2) (const_int 3)])))]
+  "TARGET_AVX && && "
+{
+  if ()
+    return "vextract32x4\t{$0x0, %1, %0|%0, %1, 0x0}";
+  else
+    return "#";
+}
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "vec_extract_lo__maskm"
+  [(set (match_operand: 0 "memory_operand" "=m")
+        (vec_merge:
+          (vec_select:
+            (match_operand:VI4F_256 1 "register_operand" "v")
+            (parallel [(const_int 0) (const_int 1)
+              (const_int 2) (const_int 3)]))
+          (match_operand: 2 "memory_operand" "0")
+          (match_operand:QI 3 "register_operand" "k")))]
+  "TARGET_AVX512VL && TARGET_AVX512F"
+  "vextract32x4\t{$0x0, %1, %0%{%3%}|%0%{%3%}, %1, 0x0}"
+  [(set_attr "type" "sselog")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "vec_extract_hi__maskm"
+  [(set (match_operand: 0 "memory_operand" "=m")
+        (vec_merge:
+          (vec_select:
+            (match_operand:VI4F_256 1 "register_operand" "v")
+            (parallel [(const_int 4) (const_int 5)
+              (const_int 6) (const_int 7)]))
+          (match_operand: 2 "memory_operand" "0")
+          (match_operand: 3 "register_operand" "k")))]
+  "TARGET_AVX512F && TARGET_AVX512VL"
+{
+  return "vextract32x4\t{$0x1, %1, %0%{%3%}|%0%{%3%}, %1, 0x1}";
+}
+  [(set_attr "type" "sselog")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "memory" "store")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "vec_extract_hi_"
+  [(set (match_operand: 0 "" "=")
         (vec_select:
-          (match_operand:VI4F_256 1 "register_operand" "x,x")
+          (match_operand:VI4F_256 1 "register_operand" "v")
           (parallel [(const_int 4) (const_int 5)
             (const_int 6) (const_int 7)])))]
-  "TARGET_AVX"
-  "vextract\t{$0x1, %1, %0|%0, %1, 0x1}"
-  [(set_attr "type" "sselog")
+  "TARGET_AVX && "
+{
+  if (TARGET_AVX512VL)
+    return "vextract32x4\t{$0x1, %1, %0|%0, %1, 0x1}";
+  else
+    return "vextract\t{$0x1, %1, %0|%0, %1, 0x1}";
+}
+  [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "memory" "none,store")
-   (set_attr "prefix" "vex")
+   (set (attr "prefix")
+     (if_then_else
+       (match_test "TARGET_AVX512VL")
+       (const_string "evex")
+       (const_string "vex")))
    (set_attr "mode" "")])
 
 (define_insn_and_split "vec_extract_lo_v32hi"
@@ -6846,8 +7040,8 @@
 
 ;; Modes handled by vec_extract patterns.
 (define_mode_iterator VEC_EXTRACT_MODE
-  [(V32QI "TARGET_AVX") V16QI
-   (V16HI "TARGET_AVX") V8HI
+  [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX") V16QI
+   (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX") V8HI
    (V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX") V4SI
    (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX") V2DI
    (V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
@@ -16498,7 +16692,7 @@
             (match_operand:SI 2 "const_0_to_255_operand" "N")]
             UNSPEC_VCVTPS2PH)
           (match_operand:V4HI 3 "const0_operand")))]
-  "TARGET_F16C && "
+  "(TARGET_F16C || TARGET_AVX512VL) && "
   "vcvtps2ph\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecvt")
    (set_attr "prefix" "maybe_evex")
diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
index b05cb17..91228c8 100644
--- a/gcc/config/i386/subst.md
+++ b/gcc/config/i386/subst.md
@@ -57,6 +57,7 @@
 (define_subst_attr "mask_mode512bit_condition" "mask" "1" "( == 64 || TARGET_AVX512VL)")
 (define_subst_attr "mask_avx512vl_condition" "mask" "1" "TARGET_AVX512VL")
 (define_subst_attr "mask_avx512bw_condition" "mask" "1" "TARGET_AVX512BW")
+(define_subst_attr "mask_avx512dq_condition" "mask" "1" "TARGET_AVX512DQ")
 (define_subst_attr "store_mask_constraint" "mask" "vm" "v")
 (define_subst_attr "store_mask_predicate" "mask" "nonimmediate_operand" "register_operand")
 (define_subst_attr "mask_prefix" "mask" "vex" "evex")
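
For readers following the review, the selector checks in the vextract32x4/vextract64x2 conditions can be modelled outside GCC. The sketch below is only an illustration on plain ints, not code from the patch: the names selectors_valid, lane_selectors, lane_immediate and consecutive_buggy are hypothetical, and the lane-boundary test is applied in every case here, whereas the patch adds it only to the _maskm patterns shown. It also reproduces the "=" versus "==" slip from the review, assuming (as the first revision evidently compiled) that INTVAL yields an lvalue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: SEL holds the N element indices carried by the
   vec_select parallel (N = 4 for the 32x4 form, N = 2 for 64x2).  */

/* True when the selectors start on a lane boundary and are consecutive,
   mirroring "INTVAL (operands[2]) == INTVAL (operands[3]) - 1".  */
static bool
selectors_valid (const int *sel, int n)
{
  if (sel[0] % n != 0)
    return false;
  for (int i = 1; i < n; i++)
    if (!(sel[i - 1] == sel[i] - 1))
      return false;
  return true;
}

/* The reviewed slip on plain ints: "=" overwrites *LO with HI - 1 and
   yields that value, so the test passes whenever HI != 1.  */
static bool
consecutive_buggy (int *lo, int hi)
{
  return (*lo = hi - 1) != 0;
}

/* Selectors the expander generates for lane LANE:
   lane * n, lane * n + 1, ..., lane * n + n - 1.  */
static void
lane_selectors (int lane, int n, int *sel)
{
  for (int i = 0; i < n; i++)
    sel[i] = lane * n + i;
}

/* Immediate operand of the instruction: the lane index, i.e. the first
   selector divided by N (the patch uses >> 2 and >> 1).  */
static int
lane_immediate (const int *sel, int n)
{
  assert (selectors_valid (sel, n));
  return sel[0] / n;
}
```

With n = 4, lane 2 produces selectors 8, 9, 10, 11 and maps back to immediate 2, which matches the GEN_INT (mask * 4 + k) arguments in the expander and the ">> 2" in the output code.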
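
The new V32HI and V64QI cases in ix86_expand_vector_extract pick the low or high half of the vector and then recurse with the index reduced by "elt & 15" or "elt & 31". A minimal sketch of that index arithmetic follows; struct half_pick and split_index are hypothetical names used only for illustration, and no RTL is involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: which half holds the element, and the
   element's index inside that half.  */
struct half_pick
{
  bool high;
  int elt;
};

/* Split index ELT of an NELTS-element vector (NELTS a power of two,
   at least 2) the way the new cases do: compare against NELTS / 2,
   then mask with NELTS / 2 - 1 before recursing on the half.  */
static struct half_pick
split_index (int elt, int nelts)
{
  int half = nelts / 2;
  assert (nelts >= 2 && (nelts & (nelts - 1)) == 0);
  assert (elt >= 0 && elt < nelts);
  struct half_pick pick = { elt >= half, elt & (half - 1) };
  return pick;
}
```

For example, element 21 of a 32-element vector lives in the high half at index 5, which corresponds to the gen_vec_extract_hi_v32hi call followed by the recursive call with "elt & 15".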