From patchwork Fri Dec 2 16:30:55 2016
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 702033
Date: Fri, 2 Dec 2016 17:30:55 +0100
From: Jakub Jelinek
To: Uros Bizjak
Cc: "gcc-patches@gcc.gnu.org"
Subject: Re: [PATCH] Handle andn and ~ in 32-bit stv pass (PR target/70322)
Message-ID: <20161202163055.GA3541@tucnak.redhat.com>
Reply-To: Jakub Jelinek
References: <20161202142126.GW3541@tucnak.redhat.com>

On Fri, Dec 02, 2016 at 05:12:20PM +0100, Uros Bizjak wrote:
> >> This patch:
> >> 1) adds one_cmpldi2 pattern for stv purposes (which splits into two
> >> one_cmplsi2 after reload)
> >> 2) teaches the 32-bit stv pass to handle NOT (as xor all-ones)
> >> 3) renames the old *andndi3_doubleword to *andndi3_doubleword_bmi, as it
> >> is for -mbmi only, and adds another *andndi3_doubleword pattern that is
> >> meant to live just from combine till the stv pass, or worse case till
> >> following split1 pass when it is split back into not followed by and;
> >> this change makes it possible to use pandn in stv pass, even without
> >> -mbmi
> >
> > Please use attached (lightly tested) patch to implement point 3)
> > above. The patch splits insn after reload, as is the case with all STV
> > patterns.
>
> Now attached for real.

OK, I've checked in the following patch (compared to your notes, I just
added an xfail to the pr70322-2.c test's scan-assembler); feel free to
test your patch and remove the xfail again.

2016-12-02  Jakub Jelinek

	PR target/70322
	* config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Handle
	NOT.
	(dimode_scalar_chain::compute_convert_gain): Likewise.
	(dimode_scalar_chain::convert_insn): Likewise.
	* config/i386/i386.md (*one_cmpldi2_doubleword): New
	define_insn_and_split.
	(one_cmpl<mode>2): Use SWIM1248x iterator instead of SWIM.

	* gcc.target/i386/pr70322-1.c: New test.
	* gcc.target/i386/pr70322-2.c: New test.
	* gcc.target/i386/pr70322-3.c: New test.

	Jakub

--- gcc/config/i386/i386.c.jj	2016-12-02 11:17:40.702995111 +0100
+++ gcc/config/i386/i386.c	2016-12-02 12:01:31.656469089 +0100
@@ -2826,6 +2826,9 @@ dimode_scalar_to_vector_candidate_p (rtx
 	return false;
       break;
 
+    case NOT:
+      break;
+
     case REG:
       return true;
 
@@ -2848,7 +2851,8 @@ dimode_scalar_to_vector_candidate_p (rtx
 
   if ((GET_MODE (XEXP (src, 0)) != DImode
        && !CONST_INT_P (XEXP (src, 0)))
-      || (GET_MODE (XEXP (src, 1)) != DImode
+      || (GET_CODE (src) != NOT
+	  && GET_MODE (XEXP (src, 1)) != DImode
 	  && !CONST_INT_P (XEXP (src, 1))))
     return false;
 
@@ -3415,6 +3419,8 @@ dimode_scalar_chain::compute_convert_gai
 	  if (CONST_INT_P (XEXP (src, 1)))
 	    gain -= vector_const_cost (XEXP (src, 1));
 	}
+      else if (GET_CODE (src) == NOT)
+	gain += ix86_cost->add - COSTS_N_INSNS (1);
       else if (GET_CODE (src) == COMPARE)
 	{
 	  /* Assume comparison cost is the same.  */
@@ -3770,6 +3776,14 @@ dimode_scalar_chain::convert_insn (rtx_i
       PUT_MODE (src, V2DImode);
       break;
 
+    case NOT:
+      src = XEXP (src, 0);
+      convert_op (&src, insn);
+      subreg = gen_reg_rtx (V2DImode);
+      emit_insn_before (gen_move_insn (subreg, CONSTM1_RTX (V2DImode)), insn);
+      src = gen_rtx_XOR (V2DImode, src, subreg);
+      break;
+
     case MEM:
       if (!REG_P (dst))
 	convert_op (&src, insn);
--- gcc/config/i386/i386.md.jj	2016-12-01 23:24:51.663157486 +0100
+++ gcc/config/i386/i386.md	2016-12-02 12:50:27.616829191 +0100
@@ -9312,9 +9312,22 @@
 
 ;; One complement instructions
 
+(define_insn_and_split "*one_cmpldi2_doubleword"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
+	(not:DI (match_operand:DI 1 "nonimmediate_operand" "0")))]
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && ix86_unary_operator_ok (NOT, DImode, operands)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+	(not:SI (match_dup 1)))
+   (set (match_dup 2)
+	(not:SI (match_dup 3)))]
+  "split_double_mode (DImode, &operands[0], 2, &operands[0], &operands[2]);")
+
 (define_expand "one_cmpl<mode>2"
-  [(set (match_operand:SWIM 0 "nonimmediate_operand")
-	(not:SWIM (match_operand:SWIM 1 "nonimmediate_operand")))]
+  [(set (match_operand:SWIM1248x 0 "nonimmediate_operand")
+	(not:SWIM1248x (match_operand:SWIM1248x 1 "nonimmediate_operand")))]
   ""
   "ix86_expand_unary_operator (NOT, <MODE>mode, operands); DONE;")
 
--- gcc/testsuite/gcc.target/i386/pr70322-1.c.jj	2016-12-02 12:52:47.193051745 +0100
+++ gcc/testsuite/gcc.target/i386/pr70322-1.c	2016-12-02 12:52:24.708338078 +0100
@@ -0,0 +1,12 @@
+/* PR target/70322 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -msse2 -mstv -mbmi" } */
+/* { dg-final { scan-assembler "pandn" } } */
+
+extern long long z;
+
+void
+foo (long long x, long long y)
+{
+  z = ~x & y;
+}
--- gcc/testsuite/gcc.target/i386/pr70322-2.c.jj	2016-12-02 12:52:50.165013898 +0100
+++ gcc/testsuite/gcc.target/i386/pr70322-2.c	2016-12-02 12:52:39.302152232 +0100
@@ -0,0 +1,12 @@
+/* PR target/70322 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -msse2 -mstv -mno-bmi" } */
+/* { dg-final { scan-assembler "pandn" { xfail *-*-* } } } */
+
+extern long long z;
+
+void
+foo (long long x, long long y)
+{
+  z = ~x & y;
+}
--- gcc/testsuite/gcc.target/i386/pr70322-3.c.jj	2016-12-02 13:07:27.658796578 +0100
+++ gcc/testsuite/gcc.target/i386/pr70322-3.c	2016-12-02 13:08:11.899229225 +0100
@@ -0,0 +1,13 @@
+/* PR target/70322 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -msse2 -mstv" } */
+/* { dg-final { scan-assembler "pxor" } } */
+/* { dg-final { scan-assembler "por" } } */
+
+extern long long z;
+
+void
+foo (long long x, long long y)
+{
+  z = ~x | y;
+}
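
For anyone reading along, here is a small standalone C sketch of the bitwise identities the patch leans on. It is not part of the patch, and all function names below are made up for illustration. It shows (a) 64-bit NOT expressed as XOR with all-ones, which is the form convert_insn emits, (b) 64-bit NOT done as two independent 32-bit NOTs, as in the post-reload split of *one_cmpldi2_doubleword, and (c) and-not, the operation the pr70322-1.c and pr70322-2.c tests expect to see as pandn.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: 64-bit NOT rewritten as
   XOR with an all-ones constant.  */
static uint64_t
not_as_xor (uint64_t x)
{
  return x ^ UINT64_MAX;
}

/* Hypothetical helper: 64-bit NOT computed as two independent 32-bit
   NOTs on the low and high halves, then recombined.  */
static uint64_t
not_as_two_halves (uint64_t x)
{
  uint32_t lo = ~(uint32_t) x;
  uint32_t hi = ~(uint32_t) (x >> 32);
  return ((uint64_t) hi << 32) | lo;
}

/* Hypothetical helper: and-not, i.e. ~x & y.  */
static uint64_t
andn (uint64_t x, uint64_t y)
{
  return ~x & y;
}

/* Assert that all three formulations agree for one pair of inputs.  */
static void
check_identities (uint64_t x, uint64_t y)
{
  assert (not_as_xor (x) == ~x);
  assert (not_as_two_halves (x) == ~x);
  assert (andn (x, y) == ((x ^ UINT64_MAX) & y));
}
```

Calling check_identities (x, y) from any small driver should pass silently for arbitrary inputs; it only illustrates the arithmetic and says nothing about which instructions the compiler ends up choosing.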