From: Kirill Yukhin
To: Uros Bizjak
Cc: Jakub Jelinek, Richard Henderson, GCC Patches
Date: Tue, 14 Oct 2014 11:18:28 +0400
Subject: Re: [PATCH i386 AVX512] [56/n] Add plus/minus/abs/neg/andnot insn patterns.
Message-ID: <20141014071820.GA59591@msticlxl57.ims.intel.com>
References: <20140925141206.GB27825@msticlxl57.ims.intel.com>
X-Patchwork-Id: 399386

Hello Uroš,

It seems I forgot to post the updated patch.

On 25 Sep 20:11, Uros Bizjak wrote:
> I'd rather go with the second approach, it is less confusing from the
> maintainer POV. All other patterns with masking use some consistent
> template, so I'd suggest using the same approach for everything. If it
> is indeed too many patterns, then please split the patch to smaller
> pieces.

The goal was not to decrease the size of the patch; I wanted to make the
patterns look simpler by hiding the masking details behind `subst'.

Anyway, I've updated the patch. Here it is (bootstrapped and regtested).

Is it OK for trunk?

gcc/
        * config/i386/sse.md
        (define_mode_iterator VI_AVX2): Extend to support AVX-512BW.
        (define_mode_iterator VI124_AVX2_48_AVX512F): Remove.
        (define_expand "3"): Remove masking support.
        (define_insn "*3"): Ditto.
        (define_expand "3_mask"): New.
        (define_expand "3_mask"): Ditto.
        (define_insn "*3_mask"): Ditto.
        (define_insn "*3_mask"): Ditto.
        (define_expand "_andnot3"): Remove masking support.
        (define_insn "*andnot3"): Ditto.
        (define_expand "_andnot3_mask"): New.
        (define_expand "_andnot3_mask"): Ditto.
        (define_insn "*andnot3"): Ditto.
        (define_insn "*andnot3"): Ditto.
        (define_insn "*abs2"): Remove masking support.
        (define_insn "abs2_mask"): New.
        (define_insn "abs2_mask"): Ditto.
        (define_expand "abs2"): Use VI_AVX2 mode iterator.

---
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index ffc831f..9edfebc 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -268,8 +268,8 @@
    (V4DI "TARGET_AVX") V2DI])
 
 (define_mode_iterator VI_AVX2
-  [(V32QI "TARGET_AVX2") V16QI
-   (V16HI "TARGET_AVX2") V8HI
+  [(V64QI "TARGET_AVX512BW") (V32QI "TARGET_AVX2") V16QI
+   (V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI
    (V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX2") V4SI
    (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX2") V2DI])
 
@@ -359,12 +359,6 @@
   [(V16HI "TARGET_AVX2") V8HI
    (V8SI "TARGET_AVX2") V4SI])
 
-(define_mode_iterator VI124_AVX2_48_AVX512F
-  [(V32QI "TARGET_AVX2") V16QI
-   (V16HI "TARGET_AVX2") V8HI
-   (V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX2") V4SI
-   (V8DI "TARGET_AVX512F")])
-
 (define_mode_iterator VI124_AVX512F
   [(V32QI "TARGET_AVX2") V16QI
    (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX2") V8HI
@@ -9051,20 +9045,43 @@
   "TARGET_SSE2"
   "operands[2] = force_reg (mode, CONST0_RTX (mode));")
 
-(define_expand "3"
+(define_expand "3"
   [(set (match_operand:VI_AVX2 0 "register_operand")
         (plusminus:VI_AVX2
           (match_operand:VI_AVX2 1 "nonimmediate_operand")
           (match_operand:VI_AVX2 2 "nonimmediate_operand")))]
-  "TARGET_SSE2 && "
+  "TARGET_SSE2"
+  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
+
+(define_expand "3_mask"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand")
+        (vec_merge:VI48_AVX512VL
+          (plusminus:VI48_AVX512VL
+            (match_operand:VI48_AVX512VL 1 "nonimmediate_operand")
+            (match_operand:VI48_AVX512VL 2 "nonimmediate_operand"))
+          (match_operand:VI48_AVX512VL 3 "vector_move_operand")
+          (match_operand: 4 "register_operand")))]
+  "TARGET_AVX512F"
+  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
+
+(define_expand "3_mask"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand")
+        (vec_merge:VI12_AVX512VL
+          (plusminus:VI12_AVX512VL
+            (match_operand:VI12_AVX512VL 1 "nonimmediate_operand")
+            (match_operand:VI12_AVX512VL 2 "nonimmediate_operand"))
+          (match_operand:VI12_AVX512VL 3 "vector_move_operand")
+          (match_operand: 4 "register_operand")))]
+  "TARGET_AVX512BW"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
-(define_insn "*3"
+(define_insn "*3"
   [(set (match_operand:VI_AVX2 0 "register_operand" "=x,v")
         (plusminus:VI_AVX2
           (match_operand:VI_AVX2 1 "nonimmediate_operand" "0,v")
           (match_operand:VI_AVX2 2 "nonimmediate_operand" "xm,vm")))]
-  "TARGET_SSE2 && ix86_binary_operator_ok (, mode, operands) && "
+  "TARGET_SSE2
+   && ix86_binary_operator_ok (, mode, operands)"
   "@
    p\t{%2, %0|%0, %2}
    vp\t{%2, %1, %0|%0, %1, %2}"
@@ -9074,6 +9091,35 @@
    (set_attr "prefix" "")
    (set_attr "mode" "")])
 
+(define_insn "*3_mask"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
+        (vec_merge:VI48_AVX512VL
+          (plusminus:VI48_AVX512VL
+            (match_operand:VI48_AVX512VL 1 "nonimmediate_operand" "v")
+            (match_operand:VI48_AVX512VL 2 "nonimmediate_operand" "vm"))
+          (match_operand:VI48_AVX512VL 3 "vector_move_operand" "0C")
+          (match_operand: 4 "register_operand" "Yk")))]
+  "TARGET_AVX512F
+   && ix86_binary_operator_ok (, mode, operands)"
+  "vp\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}"
+  [(set_attr "type" "sseiadd")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "*3_mask"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
+        (vec_merge:VI12_AVX512VL
+          (plusminus:VI12_AVX512VL
+            (match_operand:VI12_AVX512VL 1 "nonimmediate_operand" "v")
+            (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm"))
+          (match_operand:VI12_AVX512VL 3 "vector_move_operand" "0C")
+          (match_operand: 4 "register_operand" "Yk")))]
+  "TARGET_AVX512BW && ix86_binary_operator_ok (, mode, operands)"
+  "vp\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}"
+  [(set_attr "type" "sseiadd")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_expand "_3"
   [(set (match_operand:VI12_AVX2 0 "register_operand")
         (sat_plusminus:VI12_AVX2
@@ -10489,19 +10535,41 @@
   operands[2] = force_reg (mode, gen_rtx_CONST_VECTOR (mode, v));
 })
 
-(define_expand "_andnot3"
+(define_expand "_andnot3"
   [(set (match_operand:VI_AVX2 0 "register_operand")
         (and:VI_AVX2
           (not:VI_AVX2 (match_operand:VI_AVX2 1 "register_operand"))
           (match_operand:VI_AVX2 2 "nonimmediate_operand")))]
-  "TARGET_SSE2 && ")
+  "TARGET_SSE2")
 
-(define_insn "*andnot3"
+(define_expand "_andnot3_mask"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand")
+        (vec_merge:VI48_AVX512VL
+          (and:VI48_AVX512VL
+            (not:VI48_AVX512VL
+              (match_operand:VI48_AVX512VL 1 "register_operand"))
+            (match_operand:VI48_AVX512VL 2 "nonimmediate_operand"))
+          (match_operand:VI48_AVX512VL 3 "vector_move_operand")
+          (match_operand: 4 "register_operand")))]
+  "TARGET_AVX512F")
+
+(define_expand "_andnot3_mask"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand")
+        (vec_merge:VI12_AVX512VL
+          (and:VI12_AVX512VL
+            (not:VI12_AVX512VL
+              (match_operand:VI12_AVX512VL 1 "register_operand"))
+            (match_operand:VI12_AVX512VL 2 "nonimmediate_operand"))
+          (match_operand:VI12_AVX512VL 3 "vector_move_operand")
+          (match_operand: 4 "register_operand")))]
+  "TARGET_AVX512BW")
+
+(define_insn "*andnot3"
   [(set (match_operand:VI 0 "register_operand" "=x,v")
         (and:VI
           (not:VI (match_operand:VI 1 "register_operand" "0,v"))
           (match_operand:VI 2 "nonimmediate_operand" "xm,vm")))]
-  "TARGET_SSE && "
+  "TARGET_SSE"
 {
   static char buf[64];
   const char *ops;
@@ -10560,7 +10628,7 @@
             (eq_attr "mode" "TI"))
        (const_string "1")
        (const_string "*")))
-   (set_attr "prefix" "")
+   (set_attr "prefix" "orig,vex")
    (set (attr "mode")
         (cond [(and (match_test " == 16")
                     (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
@@ -10578,6 +10646,36 @@
               ]
               (const_string "")))])
 
+(define_insn "*andnot3_mask"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
+        (vec_merge:VI48_AVX512VL
+          (and:VI48_AVX512VL
+            (not:VI48_AVX512VL
+              (match_operand:VI48_AVX512VL 1 "register_operand" "v"))
+            (match_operand:VI48_AVX512VL 2 "nonimmediate_operand" "vm"))
+          (match_operand:VI48_AVX512VL 3 "vector_move_operand" "0C")
+          (match_operand: 4 "register_operand" "Yk")))]
+  "TARGET_AVX512F"
+  "vpandn\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}";
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "*andnot3_mask"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
+        (vec_merge:VI12_AVX512VL
+          (and:VI12_AVX512VL
+            (not:VI12_AVX512VL
+              (match_operand:VI12_AVX512VL 1 "register_operand" "v"))
+            (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm"))
+          (match_operand:VI12_AVX512VL 3 "vector_move_operand" "0C")
+          (match_operand: 4 "register_operand" "Yk")))]
+  "TARGET_AVX512BW"
+  "vpandn\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}";
+  [(set_attr "type" "sselog")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_expand "3"
   [(set (match_operand:VI 0 "register_operand")
         (any_logic:VI
@@ -13361,22 +13459,48 @@
    (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
    (set_attr "mode" "DI")])
 
-(define_insn "abs2"
-  [(set (match_operand:VI124_AVX2_48_AVX512F 0 "register_operand" "=v")
-        (abs:VI124_AVX2_48_AVX512F
-          (match_operand:VI124_AVX2_48_AVX512F 1 "nonimmediate_operand" "vm")))]
-  "TARGET_SSSE3 && "
-  "%vpabs\t{%1, %0|%0, %1}"
+(define_insn "*abs2"
+  [(set (match_operand:VI_AVX2 0 "register_operand" "=v")
+        (abs:VI_AVX2
+          (match_operand:VI_AVX2 1 "nonimmediate_operand" "vm")))]
+  "TARGET_SSSE3"
+  "%vpabs\t{%1, %0|%0, %1}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_data16" "1")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "")])
 
+(define_insn "abs2_mask"
+  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
+        (vec_merge:VI48_AVX512VL
+          (abs:VI48_AVX512VL
+            (match_operand:VI48_AVX512VL 1 "nonimmediate_operand" "vm"))
+          (match_operand:VI48_AVX512VL 2 "vector_move_operand" "0C")
+          (match_operand: 3 "register_operand" "Yk")))]
+  "TARGET_AVX512F"
+  "vpabs\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "abs2_mask"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
+        (vec_merge:VI12_AVX512VL
+          (abs:VI12_AVX512VL
+            (match_operand:VI12_AVX512VL 1 "nonimmediate_operand" "vm"))
+          (match_operand:VI12_AVX512VL 2 "vector_move_operand" "0C")
+          (match_operand: 3 "register_operand" "Yk")))]
+  "TARGET_AVX512BW"
+  "vpabs\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_expand "abs2"
-  [(set (match_operand:VI124_AVX2_48_AVX512F 0 "register_operand")
-        (abs:VI124_AVX2_48_AVX512F
-          (match_operand:VI124_AVX2_48_AVX512F 1 "nonimmediate_operand")))]
+  [(set (match_operand:VI_AVX2 0 "register_operand")
+        (abs:VI_AVX2
+          (match_operand:VI_AVX2 1 "nonimmediate_operand")))]
   "TARGET_SSE2"
 {
   if (!TARGET_SSSE3)