From patchwork Tue Sep 23 11:30:07 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kirill Yukhin
X-Patchwork-Id: 392425
Date: Tue, 23 Sep 2014 15:30:07 +0400
From: Kirill Yukhin
To: Uros Bizjak
Cc: Jakub Jelinek, Richard Henderson, GCC Patches, kirill.yukhin@gmail.com
Subject: [PATCH i386 AVX512] [49/n] Add vpshuf[lh]w insn patterns.
Message-ID: <20140923113005.GD41516@msticlxl57.ims.intel.com>

Hello,

Bootstrapped.
AVX-512* tests on top of patch-set all pass under simulator.

Is it ok for trunk?

gcc/
	* config/i386/sse.md
	(define_c_enum "unspec"): Add UNSPEC_PSHUFHW, UNSPEC_PSHUFLW.
	(define_insn "avx512bw_pshuflwv32hi"): New.
	(define_expand "avx512vl_pshuflwv3_mask"): Ditto.
	(define_insn "avx2_pshuflw_1"): Add masking.
	(define_expand "avx512vl_pshuflw_mask"): New.
	(define_insn "sse2_pshuflw_1"): Add masking.
	(define_insn "avx512bw_pshufhwv32hi"): New.
	(define_expand "avx512vl_pshufhwv3_mask"): Ditto.
	(define_insn "avx2_pshufhw_1"): Add masking.
	(define_expand "avx512vl_pshufhw_mask"): New.
	(define_insn "sse2_pshufhw_1"): Add masking.

---
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f377a5e..133ba1e 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -129,6 +129,10 @@
   UNSPEC_SHA256MSG2
   UNSPEC_SHA256RNDS2
 
+  ;; For AVX512BW support
+  UNSPEC_PSHUFHW
+  UNSPEC_PSHUFLW
+
   ;; For AVX512DQ support
   UNSPEC_REDUCE
   UNSPEC_FPCLASS
@@ -11789,6 +11793,40 @@
    (set_attr "length_immediate" "1")
    (set_attr "mode" "TI")])
 
+(define_insn "avx512bw_pshuflwv32hi"
+  [(set (match_operand:V32HI 0 "register_operand" "=v")
+        (unspec:V32HI
+          [(match_operand:V32HI 1 "nonimmediate_operand" "vm")
+           (match_operand:SI 2 "const_0_to_255_operand" "n")]
+          UNSPEC_PSHUFLW))]
+  "TARGET_AVX512BW"
+  "vpshuflw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])
+
+(define_expand "avx512vl_pshuflwv3_mask"
+  [(match_operand:V16HI 0 "register_operand")
+   (match_operand:V16HI 1 "nonimmediate_operand")
+   (match_operand:SI 2 "const_0_to_255_operand")
+   (match_operand:V16HI 3 "register_operand")
+   (match_operand:HI 4 "register_operand")]
+  "TARGET_AVX512VL && TARGET_AVX512BW"
+{
+  int mask = INTVAL (operands[2]);
+  emit_insn (gen_avx2_pshuflw_1_mask (operands[0], operands[1],
+                                      GEN_INT ((mask >> 0) & 3),
+                                      GEN_INT ((mask >> 2) & 3),
+                                      GEN_INT ((mask >> 4) & 3),
+                                      GEN_INT ((mask >> 6) & 3),
+                                      GEN_INT (((mask >> 0) & 3) + 8),
+                                      GEN_INT (((mask >> 2) & 3) + 8),
+                                      GEN_INT (((mask >> 4) & 3) + 8),
+                                      GEN_INT (((mask >> 6) & 3) + 8),
+                                      operands[3], operands[4]));
+  DONE;
+})
+
 (define_expand "avx2_pshuflwv3"
   [(match_operand:V16HI 0 "register_operand")
    (match_operand:V16HI 1 "nonimmediate_operand")
@@ -11808,10 +11846,10 @@
   DONE;
 })
 
-(define_insn "avx2_pshuflw_1"
-  [(set (match_operand:V16HI 0 "register_operand" "=x")
+(define_insn "avx2_pshuflw_1"
+  [(set (match_operand:V16HI 0 "register_operand" "=v")
         (vec_select:V16HI
-          (match_operand:V16HI 1 "nonimmediate_operand" "xm")
+          (match_operand:V16HI 1 "nonimmediate_operand" "vm")
           (parallel [(match_operand 2 "const_0_to_3_operand")
                      (match_operand 3 "const_0_to_3_operand")
                      (match_operand 4 "const_0_to_3_operand")
@@ -11832,7 +11870,8 @@
    && INTVAL (operands[2]) + 8 == INTVAL (operands[6])
    && INTVAL (operands[3]) + 8 == INTVAL (operands[7])
    && INTVAL (operands[4]) + 8 == INTVAL (operands[8])
-   && INTVAL (operands[5]) + 8 == INTVAL (operands[9])"
+   && INTVAL (operands[5]) + 8 == INTVAL (operands[9])
+   && && "
 {
   int mask = 0;
   mask |= INTVAL (operands[2]) << 0;
@@ -11841,13 +11880,31 @@
   mask |= INTVAL (operands[5]) << 6;
   operands[2] = GEN_INT (mask);
 
-  return "vpshuflw\t{%2, %1, %0|%0, %1, %2}";
+  return "vpshuflw\t{%2, %1, %0|%0, %1, %2}";
 }
   [(set_attr "type" "sselog")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
    (set_attr "length_immediate" "1")
    (set_attr "mode" "OI")])
 
+(define_expand "avx512vl_pshuflw_mask"
+  [(match_operand:V8HI 0 "register_operand")
+   (match_operand:V8HI 1 "nonimmediate_operand")
+   (match_operand:SI 2 "const_0_to_255_operand")
+   (match_operand:V8HI 3 "register_operand")
+   (match_operand:QI 4 "register_operand")]
+  "TARGET_AVX512VL && TARGET_AVX512BW"
+{
+  int mask = INTVAL (operands[2]);
+  emit_insn (gen_sse2_pshuflw_1_mask (operands[0], operands[1],
+                                      GEN_INT ((mask >> 0) & 3),
+                                      GEN_INT ((mask >> 2) & 3),
+                                      GEN_INT ((mask >> 4) & 3),
+                                      GEN_INT ((mask >> 6) & 3),
+                                      operands[3], operands[4]));
+  DONE;
+})
+
 (define_expand "sse2_pshuflw"
   [(match_operand:V8HI 0 "register_operand")
    (match_operand:V8HI 1 "nonimmediate_operand")
@@ -11863,10 +11920,10 @@
   DONE;
 })
 
-(define_insn "sse2_pshuflw_1"
-  [(set (match_operand:V8HI 0 "register_operand" "=x")
+(define_insn "sse2_pshuflw_1"
+  [(set (match_operand:V8HI 0 "register_operand" "=v")
         (vec_select:V8HI
-          (match_operand:V8HI 1 "nonimmediate_operand" "xm")
+          (match_operand:V8HI 1 "nonimmediate_operand" "vm")
           (parallel [(match_operand 2 "const_0_to_3_operand")
                      (match_operand 3 "const_0_to_3_operand")
                      (match_operand 4 "const_0_to_3_operand")
@@ -11875,7 +11932,7 @@
                      (const_int 5)
                      (const_int 6)
                      (const_int 7)])))]
-  "TARGET_SSE2"
+  "TARGET_SSE2 && && "
 {
   int mask = 0;
   mask |= INTVAL (operands[2]) << 0;
@@ -11884,7 +11941,7 @@
   mask |= INTVAL (operands[5]) << 6;
   operands[2] = GEN_INT (mask);
 
-  return "%vpshuflw\t{%2, %1, %0|%0, %1, %2}";
+  return "%vpshuflw\t{%2, %1, %0|%0, %1, %2}";
 }
   [(set_attr "type" "sselog")
    (set_attr "prefix_data16" "0")
@@ -11912,10 +11969,44 @@
   DONE;
 })
 
-(define_insn "avx2_pshufhw_1"
-  [(set (match_operand:V16HI 0 "register_operand" "=x")
+(define_insn "avx512bw_pshufhwv32hi"
+  [(set (match_operand:V32HI 0 "register_operand" "=v")
+        (unspec:V32HI
+          [(match_operand:V32HI 1 "nonimmediate_operand" "vm")
+           (match_operand:SI 2 "const_0_to_255_operand" "n")]
+          UNSPEC_PSHUFHW))]
+  "TARGET_AVX512BW"
+  "vpshufhw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])
+
+(define_expand "avx512vl_pshufhwv3_mask"
+  [(match_operand:V16HI 0 "register_operand")
+   (match_operand:V16HI 1 "nonimmediate_operand")
+   (match_operand:SI 2 "const_0_to_255_operand")
+   (match_operand:V16HI 3 "register_operand")
+   (match_operand:HI 4 "register_operand")]
+  "TARGET_AVX512VL && TARGET_AVX512BW"
+{
+  int mask = INTVAL (operands[2]);
+  emit_insn (gen_avx2_pshufhw_1_mask (operands[0], operands[1],
+                                      GEN_INT (((mask >> 0) & 3) + 4),
+                                      GEN_INT (((mask >> 2) & 3) + 4),
+                                      GEN_INT (((mask >> 4) & 3) + 4),
+                                      GEN_INT (((mask >> 6) & 3) + 4),
+                                      GEN_INT (((mask >> 0) & 3) + 12),
+                                      GEN_INT (((mask >> 2) & 3) + 12),
+                                      GEN_INT (((mask >> 4) & 3) + 12),
+                                      GEN_INT (((mask >> 6) & 3) + 12),
+                                      operands[3], operands[4]));
+  DONE;
+})
+
+(define_insn "avx2_pshufhw_1"
+  [(set (match_operand:V16HI 0 "register_operand" "=v")
         (vec_select:V16HI
-          (match_operand:V16HI 1 "nonimmediate_operand" "xm")
+          (match_operand:V16HI 1 "nonimmediate_operand" "vm")
           (parallel [(const_int 0)
                      (const_int 1)
                      (const_int 2)
@@ -11936,7 +12027,8 @@
    && INTVAL (operands[2]) + 8 == INTVAL (operands[6])
    && INTVAL (operands[3]) + 8 == INTVAL (operands[7])
    && INTVAL (operands[4]) + 8 == INTVAL (operands[8])
-   && INTVAL (operands[5]) + 8 == INTVAL (operands[9])"
+   && INTVAL (operands[5]) + 8 == INTVAL (operands[9])
+   && && "
 {
   int mask = 0;
   mask |= (INTVAL (operands[2]) - 4) << 0;
@@ -11945,13 +12037,31 @@
   mask |= (INTVAL (operands[5]) - 4) << 6;
   operands[2] = GEN_INT (mask);
 
-  return "vpshufhw\t{%2, %1, %0|%0, %1, %2}";
+  return "vpshufhw\t{%2, %1, %0|%0, %1, %2}";
 }
   [(set_attr "type" "sselog")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
    (set_attr "length_immediate" "1")
    (set_attr "mode" "OI")])
 
+(define_expand "avx512vl_pshufhw_mask"
+  [(match_operand:V8HI 0 "register_operand")
+   (match_operand:V8HI 1 "nonimmediate_operand")
+   (match_operand:SI 2 "const_0_to_255_operand")
+   (match_operand:V8HI 3 "register_operand")
+   (match_operand:QI 4 "register_operand")]
+  "TARGET_AVX512VL && TARGET_AVX512BW"
+{
+  int mask = INTVAL (operands[2]);
+  emit_insn (gen_sse2_pshufhw_1_mask (operands[0], operands[1],
+                                      GEN_INT (((mask >> 0) & 3) + 4),
+                                      GEN_INT (((mask >> 2) & 3) + 4),
+                                      GEN_INT (((mask >> 4) & 3) + 4),
+                                      GEN_INT (((mask >> 6) & 3) + 4),
+                                      operands[3], operands[4]));
+  DONE;
+})
+
 (define_expand "sse2_pshufhw"
   [(match_operand:V8HI 0 "register_operand")
    (match_operand:V8HI 1 "nonimmediate_operand")
@@ -11967,10 +12077,10 @@
   DONE;
 })
 
-(define_insn "sse2_pshufhw_1"
-  [(set (match_operand:V8HI 0 "register_operand" "=x")
+(define_insn "sse2_pshufhw_1"
+  [(set (match_operand:V8HI 0 "register_operand" "=v")
         (vec_select:V8HI
-          (match_operand:V8HI 1 "nonimmediate_operand" "xm")
+          (match_operand:V8HI 1 "nonimmediate_operand" "vm")
           (parallel [(const_int 0)
                      (const_int 1)
                      (const_int 2)
@@ -11979,7 +12089,7 @@
                      (match_operand 3 "const_4_to_7_operand")
                      (match_operand 4 "const_4_to_7_operand")
                      (match_operand 5 "const_4_to_7_operand")])))]
-  "TARGET_SSE2"
+  "TARGET_SSE2 && && "
 {
   int mask = 0;
   mask |= (INTVAL (operands[2]) - 4) << 0;
@@ -11988,7 +12098,7 @@
   mask |= (INTVAL (operands[5]) - 4) << 6;
   operands[2] = GEN_INT (mask);
 
-  return "%vpshufhw\t{%2, %1, %0|%0, %1, %2}";
+  return "%vpshufhw\t{%2, %1, %0|%0, %1, %2}";
 }
   [(set_attr "type" "sselog")
    (set_attr "prefix_rep" "1")