From patchwork Wed Sep 24 08:54:21 2014
X-Patchwork-Submitter: Kirill Yukhin
X-Patchwork-Id: 392800
Date: Wed, 24 Sep 2014 12:54:21 +0400
From: Kirill Yukhin
To: Uros Bizjak
Cc: Jakub Jelinek, Richard Henderson, GCC Patches, kirill.yukhin@gmail.com
Subject: [PATCH i386 AVX512] [52/n] Add convert ps2pd and ps2dq.
Message-ID: <20140924085420.GB18703@msticlxl57.ims.intel.com>

Hello,
The patch below adds support for ps2dq and ps2pd conversions.

Bootstrapped.
All AVX-512* tests on top of the patch set pass under the simulator.

Is it ok for trunk?

gcc/
	* config/i386/sse.md
	(define_c_enum "unspec"): Add UNSPEC_CVTINT2MASK.
	(define_insn "fix_trunc2"): New.
	(define_insn "fix_truncv2sfv2di2"): Ditto.
	(define_insn "ufix_trunc2"): Ditto.
	(define_insn "sse2_cvtss2sd"): Change "nonimmediate_operand" to "".
	(define_insn "avx_cvtpd2ps256"): Add masking.
	(define_expand "sse2_cvtpd2ps_mask"): New.
	(define_insn "*sse2_cvtpd2ps"): Add masking.
	(define_insn "_cvt2mask"): New.
	(define_insn "_cvtmask2"): Ditto.
	(define_insn "sse2_cvtps2pd"): Add masking.

---
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index b2e1d4f..c9d6e00 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -132,6 +132,7 @@
   ;; For AVX512BW support
   UNSPEC_PSHUFHW
   UNSPEC_PSHUFLW
+  UNSPEC_CVTINT2MASK

   ;; For AVX512DQ support
   UNSPEC_REDUCE
@@ -4659,6 +4660,38 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])

+(define_insn "fix_trunc2"
+  [(set (match_operand: 0 "register_operand" "=v")
+        (any_fix:
+          (match_operand:VF1_128_256VL 1 "" "")))]
+  "TARGET_AVX512DQ && "
+  "vcvttps2qq\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssecvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "fix_truncv2sfv2di2"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (any_fix:V2DI
+          (vec_select:V2SF
+            (match_operand:V4SF 1 "nonimmediate_operand" "vm")
+            (parallel [(const_int 0) (const_int 1)]))))]
+  "TARGET_AVX512DQ && TARGET_AVX512VL"
+  "vcvttps2qq\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssecvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "TI")])
+
+(define_insn "ufix_trunc2"
+  [(set (match_operand: 0 "register_operand" "=v")
+        (unsigned_fix:
+          (match_operand:VF1_128_256VL 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX512VL"
+  "vcvttps2udq\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssecvt")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_expand "avx_cvttpd2dq256_2"
   [(set (match_operand:V8SI 0 "register_operand")
         (vec_concat:V8SI
@@ -4713,7 +4746,7 @@
         (vec_merge:V2DF
           (float_extend:V2DF
             (vec_select:V2SF
-              (match_operand:V4SF 2 "nonimmediate_operand" "x,m,")
+              (match_operand:V4SF 2 "" "x,m,")
               (parallel [(const_int 0) (const_int 1)])))
           (match_operand:V2DF 1 "register_operand" "0,0,v")
           (const_int 1)))]
@@ -4741,14 +4774,14 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "V8SF")])

-(define_insn "avx_cvtpd2ps256"
-  [(set (match_operand:V4SF 0 "register_operand" "=x")
+(define_insn "avx_cvtpd2ps256"
+  [(set (match_operand:V4SF 0 "register_operand" "=v")
         (float_truncate:V4SF
-          (match_operand:V4DF 1 "nonimmediate_operand" "xm")))]
-  "TARGET_AVX"
-  "vcvtpd2ps{y}\t{%1, %0|%0, %1}"
+          (match_operand:V4DF 1 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX && "
+  "vcvtpd2ps{y}\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
    (set_attr "btver2_decode" "vector")
    (set_attr "mode" "V4SF")])

@@ -4761,16 +4794,28 @@
   "TARGET_SSE2"
   "operands[2] = CONST0_RTX (V2SFmode);")

-(define_insn "*sse2_cvtpd2ps"
-  [(set (match_operand:V4SF 0 "register_operand" "=x")
+(define_expand "sse2_cvtpd2ps_mask"
+  [(set (match_operand:V4SF 0 "register_operand")
+        (vec_merge:V4SF
+          (vec_concat:V4SF
+            (float_truncate:V2SF
+              (match_operand:V2DF 1 "nonimmediate_operand"))
+            (match_dup 4))
+          (match_operand:V4SF 2 "register_operand")
+          (match_operand:QI 3 "register_operand")))]
+  "TARGET_SSE2"
+  "operands[4] = CONST0_RTX (V2SFmode);")
+
+(define_insn "*sse2_cvtpd2ps"
+  [(set (match_operand:V4SF 0 "register_operand" "=v")
         (vec_concat:V4SF
           (float_truncate:V2SF
-            (match_operand:V2DF 1 "nonimmediate_operand" "xm"))
+            (match_operand:V2DF 1 "nonimmediate_operand" "vm"))
           (match_operand:V2SF 2 "const0_operand")))]
-  "TARGET_SSE2"
+  "TARGET_SSE2 && "
 {
   if (TARGET_AVX)
-    return "vcvtpd2ps{x}\t{%1, %0|%0, %1}";
+    return "vcvtpd2ps{x}\t{%1, %0|%0, %1}";
   else
     return "cvtpd2ps\t{%1, %0|%0, %1}";
 }
@@ -4824,14 +4869,54 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "V8DF")])

-(define_insn "sse2_cvtps2pd"
-  [(set (match_operand:V2DF 0 "register_operand" "=x")
+(define_insn "_cvt2mask"
+  [(set (match_operand: 0 "register_operand" "=Yk")
+        (unspec:
+          [(match_operand:VI12_AVX512VL 1 "register_operand" "v")]
+          UNSPEC_CVTINT2MASK))]
+  "TARGET_AVX512BW"
+  "vpmov2m\t{%1, %0|%0, %1}"
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "_cvt2mask"
+  [(set (match_operand: 0 "register_operand" "=Yk")
+        (unspec:
+          [(match_operand:VI48_AVX512VL 1 "register_operand" "v")]
+          UNSPEC_CVTINT2MASK))]
+  "TARGET_AVX512DQ"
+  "vpmov2m\t{%1, %0|%0, %1}"
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "_cvtmask2"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
+        (unspec:VI12_AVX512VL
+          [(match_operand: 1 "register_operand" "Yk")]
+          UNSPEC_CVTINT2MASK))]
+  "TARGET_AVX512BW"
+  "vpmovm2\t{%1, %0|%0, %1}"
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "_cvtmask2"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
+        (unspec:VI48_AVX512VL
+          [(match_operand: 1 "register_operand" "Yk")]
+          UNSPEC_CVTINT2MASK))]
+  "TARGET_AVX512DQ"
+  "vpmovm2\t{%1, %0|%0, %1}"
+  [(set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "sse2_cvtps2pd"
+  [(set (match_operand:V2DF 0 "register_operand" "=v")
         (float_extend:V2DF
           (vec_select:V2SF
-            (match_operand:V4SF 1 "nonimmediate_operand" "xm")
+            (match_operand:V4SF 1 "nonimmediate_operand" "vm")
             (parallel [(const_int 0) (const_int 1)]))))]
-  "TARGET_SSE2"
-  "%vcvtps2pd\t{%1, %0|%0, %q1}"
+  "TARGET_SSE2 && "
+  "%vcvtps2pd\t{%1, %0|%0, %q1}"
   [(set_attr "type" "ssecvt")
    (set_attr "amdfam10_decode" "direct")
    (set_attr "athlon_decode" "double")