From patchwork Wed Jan 9 19:20:06 2019
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1022584
From: Uros Bizjak
Date: Wed, 9 Jan 2019 20:20:06 +0100
Subject: [PATCH, i386]: Add xorsign support
To: "gcc-patches@gcc.gnu.org"

Recent discussions on the mailing list reminded me of this long-forgotten
patch. I hope it is still OK to commit it, so that gcc-9 supports an
optimization that benefits SPEC on x86 targets.

2019-01-09  Uroš Bizjak

	* config/i386/i386-protos.h (ix86_expand_xorsign): New prototype.
	(ix86_split_xorsign): Ditto.
	* config/i386/i386.c (ix86_expand_xorsign): New function.
	(ix86_split_xorsign): Ditto.
	* config/i386/i386.md (UNSPEC_XORSIGN): New unspec.
	(xorsign<mode>3): New expander.
	(xorsign<mode>3_1): New insn_and_split pattern.
	* config/i386/sse.md (xorsign<mode>3): New expander.

testsuite/ChangeLog:

2019-01-09  Uroš Bizjak

	* lib/target-supports.exp (check_effective_target_xorsign):
	Add i?86-*-* and x86_64-*-* targets.
	* gcc.target/i386/xorsign.c: New test.

Bootstrapped and regression tested on x86_64.

Committed to mainline SVN.

Uros.

Index: config/i386/i386-protos.h
===================================================================
--- config/i386/i386-protos.h	(revision 267776)
+++ config/i386/i386-protos.h	(working copy)
@@ -124,6 +124,8 @@ extern void ix86_expand_fp_absneg_operator (enum r
 extern void ix86_expand_copysign (rtx []);
 extern void ix86_split_copysign_const (rtx []);
 extern void ix86_split_copysign_var (rtx []);
+extern void ix86_expand_xorsign (rtx []);
+extern void ix86_split_xorsign (rtx []);
 extern bool ix86_unary_operator_ok (enum rtx_code, machine_mode, rtx[]);
 extern bool ix86_match_ccmode (rtx, machine_mode);
 extern void ix86_expand_branch (enum rtx_code, rtx, rtx, rtx);
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 267776)
+++ config/i386/i386.c	(working copy)
@@ -21860,6 +21860,63 @@ ix86_split_copysign_var (rtx operands[])
   emit_insn (gen_rtx_SET (dest, x));
 }
 
+/* Expand an xorsign operation.  */
+
+void
+ix86_expand_xorsign (rtx operands[])
+{
+  rtx (*xorsign_insn)(rtx, rtx, rtx, rtx);
+  machine_mode mode, vmode;
+  rtx dest, op0, op1, mask;
+
+  dest = operands[0];
+  op0 = operands[1];
+  op1 = operands[2];
+
+  mode = GET_MODE (dest);
+
+  if (mode == SFmode)
+    {
+      xorsign_insn = gen_xorsignsf3_1;
+      vmode = V4SFmode;
+    }
+  else if (mode == DFmode)
+    {
+      xorsign_insn = gen_xorsigndf3_1;
+      vmode = V2DFmode;
+    }
+  else
+    gcc_unreachable ();
+
+  mask = ix86_build_signbit_mask (vmode, 0, 0);
+
+  emit_insn (xorsign_insn (dest, op0, op1, mask));
+}
+
+/* Deconstruct an xorsign operation into bit masks.  */
+
+void
+ix86_split_xorsign (rtx operands[])
+{
+  machine_mode mode, vmode;
+  rtx dest, op0, mask, x;
+
+  dest = operands[0];
+  op0 = operands[1];
+  mask = operands[3];
+
+  mode = GET_MODE (dest);
+  vmode = GET_MODE (mask);
+
+  dest = lowpart_subreg (vmode, dest, mode);
+  x = gen_rtx_AND (vmode, dest, mask);
+  emit_insn (gen_rtx_SET (dest, x));
+
+  op0 = lowpart_subreg (vmode, op0, mode);
+  x = gen_rtx_XOR (vmode, dest, op0);
+  emit_insn (gen_rtx_SET (dest, x));
+}
+
 /* Return TRUE or FALSE depending on whether the first SET in INSN
    has source and destination with matching CC modes, and that the
    CC mode is at least as constrained as REQ_MODE.  */
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 267776)
+++ config/i386/i386.md	(working copy)
@@ -124,6 +124,7 @@
 
   ;; Generic math support
   UNSPEC_COPYSIGN
+  UNSPEC_XORSIGN
   UNSPEC_IEEE_MIN	; not commutative
   UNSPEC_IEEE_MAX	; not commutative
 
@@ -9784,6 +9785,26 @@
    && reload_completed"
   [(const_int 0)]
   "ix86_split_copysign_var (operands); DONE;")
+
+(define_expand "xorsign<mode>3"
+  [(match_operand:MODEF 0 "register_operand")
+   (match_operand:MODEF 1 "register_operand")
+   (match_operand:MODEF 2 "register_operand")]
+  "SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH"
+  "ix86_expand_xorsign (operands); DONE;")
+
+(define_insn_and_split "xorsign<mode>3_1"
+  [(set (match_operand:MODEF 0 "register_operand" "=Yv")
+	(unspec:MODEF
+	  [(match_operand:MODEF 1 "register_operand" "Yv")
+	   (match_operand:MODEF 2 "register_operand" "0")
+	   (match_operand:<ssevecmode> 3 "nonimmediate_operand" "Yvm")]
+	  UNSPEC_XORSIGN))]
+  "SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+  "ix86_split_xorsign (operands); DONE;")
 
 ;; One complement instructions
 
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md	(revision 267776)
+++ config/i386/sse.md	(working copy)
@@ -3423,6 +3423,20 @@
   operands[5] = gen_reg_rtx (<MODE>mode);
 })
 
+(define_expand "xorsign<mode>3"
+  [(set (match_dup 4)
+	(and:VF (match_dup 3)
+		(match_operand:VF 2 "vector_operand")))
+   (set (match_operand:VF 0 "register_operand")
+	(xor:VF (match_dup 4)
+		(match_operand:VF 1 "vector_operand")))]
+  "TARGET_SSE"
+{
+  operands[3] = ix86_build_signbit_mask (<MODE>mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (<MODE>mode);
+})
+
 ;; Also define scalar versions.  These are used for abs, neg, and
 ;; conditional move.  Using subregs into vector modes causes register
 ;; allocation lossage.  These patterns do not allow memory operands
Index: testsuite/gcc.target/i386/xorsign.c
===================================================================
--- testsuite/gcc.target/i386/xorsign.c	(nonexistent)
+++ testsuite/gcc.target/i386/xorsign.c	(working copy)
@@ -0,0 +1,57 @@
+/* { dg-do run { target sse2_runtime } } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
+
+extern void abort ();
+
+#define N 16
+float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
+	      -12.5f, -15.6f, -18.7f, -21.8f,
+	      24.9f, 27.1f, 30.2f, 33.3f,
+	      36.4f, 39.5f, 42.6f, 45.7f};
+float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
+	      -9.0f, 1.0f, -2.0f, 3.0f,
+	      -4.0f, -5.0f, 6.0f, 7.0f,
+	      -8.0f, -9.0f, 10.0f, 11.0f};
+float r[N];
+
+double ad[N] = {-0.1d, -3.2d, -6.3d, -9.4d,
+		-12.5d, -15.6d, -18.7d, -21.8d,
+		24.9d, 27.1d, 30.2d, 33.3d,
+		36.4d, 39.5d, 42.6d, 45.7d};
+double bd[N] = {-1.2d, 3.4d, -5.6d, 7.8d,
+		-9.0d, 1.0d, -2.0d, 3.0d,
+		-4.0d, -5.0d, 6.0d, 7.0d,
+		-8.0d, -9.0d, 10.0d, 11.0d};
+double rd[N];
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    r[i] = a[i] * __builtin_copysignf (1.0f, b[i]);
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
+      abort ();
+
+  for (i = 0; i < N; i++)
+    rd[i] = ad[i] * __builtin_copysign (1.0d, bd[i]);
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+    if (rd[i] != ad[i] * __builtin_copysign (1.0d, bd[i]))
+      abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
+/* { dg-final { scan-assembler "\[ \t\]xor" } } */
+/* { dg-final { scan-assembler "\[ \t\]and" } } */
+/* { dg-final { scan-assembler-not "copysign" } } */
+/* { dg-final { scan-assembler-not "\[ \t\]fxam" } } */
+/* { dg-final { scan-assembler-not "\[ \t\]or" } } */
+/* { dg-final { scan-assembler-not "\[ \t\]mul" } } */
Index: testsuite/lib/target-supports.exp
===================================================================
--- testsuite/lib/target-supports.exp	(revision 267776)
+++ testsuite/lib/target-supports.exp	(working copy)
@@ -5730,7 +5730,8 @@ proc check_effective_target_vect_perm3_short { } {
 
 proc check_effective_target_xorsign { } {
     return [check_cached_effective_target_indexed xorsign {
-      expr { [istarget aarch64*-*-*] || [istarget arm*-*-*] }}]
+      expr { [istarget i?86-*-*] || [istarget x86_64-*-*]
+	     || [istarget aarch64*-*-*] || [istarget arm*-*-*] }}]
 }
 
 # Return 1 if the target plus current options supports a vector
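
For readers unfamiliar with the idiom: the new test expects
a[i] * copysign (1.0, b[i]) to be computed with an AND and an XOR instead
of a multiply. A minimal standalone sketch of that identity in plain C
follows. It is not part of the patch; the helper names xorsign_ref,
xorsign_bits and xorsign_self_check are made up for illustration, and the
code assumes IEEE 754 single precision with the sign in bit 31.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative helpers only; none of these names exist in GCC.  */

/* Reference form, as written in the test: multiply X by +1.0f or -1.0f,
   where the sign is taken from Y.  */
static float
xorsign_ref (float x, float y)
{
  return x * __builtin_copysignf (1.0f, y);
}

/* Bitwise form: isolate the sign bit of Y with an AND, then flip the
   sign of X with an XOR.  Assumes a 32-bit IEEE 754 float.  */
static float
xorsign_bits (float x, float y)
{
  uint32_t xb, yb;

  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  xb ^= yb & UINT32_C (0x80000000);
  memcpy (&x, &xb, sizeof x);
  return x;
}

/* Compare both forms on a few value pairs taken from the test arrays.  */
static void
xorsign_self_check (void)
{
  static const float a[] = { -0.1f, -3.2f, 24.9f, 45.7f };
  static const float b[] = { -1.2f, 3.4f, -4.0f, 11.0f };
  int i;

  for (i = 0; i < 4; i++)
    assert (xorsign_ref (a[i], b[i]) == xorsign_bits (a[i], b[i]));
}
```

Calling xorsign_self_check () from main exercises both forms. The AND
followed by the XOR in xorsign_bits is the same sequence that
ix86_split_xorsign emits on the vector-mode subregs.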