From patchwork Wed May 12 06:18:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Uros Bizjak X-Patchwork-Id: 1477437 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=EO9lyAnk; dkim-atps=neutral Received: from sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Fg4Nk1XCsz9sWC for ; Wed, 12 May 2021 16:18:28 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1E557384607B; Wed, 12 May 2021 06:18:25 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1E557384607B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1620800305; bh=+FUWOKmpPTNrktAg8ulFUNdEqIPZtpjKIOCWMa5J6hQ=; h=Date:Subject:To:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=EO9lyAnkF8iIUSix74mFaq2GEjyqeImZ2spXNt6atNkyqzNyTS7KTPpX8MLFb4JC3 9I6QzwX4lDFrfDfk8YmLZdY6bM2w2UyHABNctFEcwkYobBEChRxBWYwZPiYOMIYm0e UzAkdQYv+qhIgoksbEGMKnzUJyYIphSCDOvxwd4M= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-qk1-x72b.google.com (mail-qk1-x72b.google.com [IPv6:2607:f8b0:4864:20::72b]) by sourceware.org (Postfix) with ESMTPS id 586FD385780A for ; Wed, 12 May 2021 06:18:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 586FD385780A Received: by mail-qk1-x72b.google.com with SMTP id a2so21189223qkh.11 for ; Tue, 11 May 2021 23:18:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=+FUWOKmpPTNrktAg8ulFUNdEqIPZtpjKIOCWMa5J6hQ=; b=RIvylChZpu7gsqRqgqtslBJmGCG3yuURrmRD3b5q79/sF3DOGb+zUGm5ep34ID9rI+ ztTYj/Ao5EzNBRb8sHU771rtWqINjDSLxvKk3/IZFA5MoWMtbwQoBOamXUJhMznWUp6S 0bRb0MUj+kIGCRT19NT6qA2i/9gnWRzLZqBWfRcKz7jS1BBZVsmzlW0gCOiyCRY0HCbq 17hY0u2T5QMuRB+2FZb55Ay6j/yW0tzU6ePypz5tU6nyGsMihM7azVktuh4ST/4NbOIW IrjLGuYrMRC9l7lFa3jh1G6SalDJHL0e8Jf0bnpgvi+Bb+/tW6I9JsP0If/nova0TO57 eVzg== X-Gm-Message-State: AOAM5330fwVB2H/VCQL8FmQjUi0xnUcxG8qqLigOCPZsFuW7nnfSzCvj BLm0Jb0/OcAIEn/ghcDEBx+MopXdbuxUuqHPy3IhVLTWgk4MSQ== X-Google-Smtp-Source: ABdhPJyRYBZuSrBFrR+GDWdFO4Zq5smPV4v8o3Qm/apRr31X37ImBnRqEHh1N13X+DAvgBXrKG4YCeVIcwQG+sDVTYc= X-Received: by 2002:ae9:e706:: with SMTP id m6mr21784980qka.74.1620800301658; Tue, 11 May 2021 23:18:21 -0700 (PDT) MIME-Version: 1.0 Date: Wed, 12 May 2021 08:18:10 +0200 Message-ID: Subject: [PATCH] i386: Implement FP vector compares for V2SFmode [PR98218] To: "gcc-patches@gcc.gnu.org" X-Spam-Status: No, score=-9.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Uros Bizjak via Gcc-patches From: Uros Bizjak Reply-To: Uros Bizjak Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Implement FP vector compares for V2SFmode for TARGET_MMX_WITH_SSE. 2021-05-12 Uroš Bizjak gcc/ PR target/98218 * config/i386/i386-expand.c (ix86_expand_sse_movcc): Handle V2SF mode. * config/i386/mmx.md (MMXMODE124): New mode iterator. (V2FI): Ditto. (mmxintvecmode): New mode attribute. (mmxintvecmodelower): Ditto. (*mmx_maskcmpv2sf3_comm): New insn pattern. (*mmx_maskcmpv2sf3): Ditto. (vec_cmpv2sfv2si): New expander. (vcondv2si): Ditto. (mmx_vlendvps): New insn pattern. (vcond): Also handle V2SFmode. (vcondu): Ditto. (vcond_mask_): Ditto. gcc/testsuite/ PR target/98218 * g++.target/i386/pr98218-1.C: Ditto. * gcc.target/i386/pr98218-4.c: New test. * gcc.target/i386/pr98218-1.c: Correct PR number. * gcc.target/i386/pr98218-1a.c: Ditto. * gcc.target/i386/pr98218-2.c: Ditto. * gcc.target/i386/pr98218-2a.c: Ditto. * gcc.target/i386/pr98218-3.c: Ditto. * gcc.target/i386/pr98218-3a.c: Ditto. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to master. Uros. diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 5cfde5b3d30..dd230081b16 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -3680,6 +3680,13 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) switch (mode) { + case E_V2SFmode: + if (TARGET_SSE4_1) + { + gen = gen_mmx_blendvps; + op_true = force_reg (mode, op_true); + } + break; case E_V4SFmode: if (TARGET_SSE4_1) gen = gen_sse4_1_blendvps; diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index f08570856f9..d433c524652 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -49,6 +49,7 @@ (define_mode_iterator MMXMODEI8 [V8QI V4HI V2SI (V1DI "TARGET_SSE2")]) ;; All 8-byte vector modes handled by MMX (define_mode_iterator MMXMODE [V8QI V4HI V2SI V1DI V2SF]) +(define_mode_iterator MMXMODE124 [V8QI V4HI V2SI V2SF]) ;; Mix-n-match (define_mode_iterator MMXMODE12 [V8QI V4HI]) @@ -56,12 +57,22 @@ (define_mode_iterator MMXMODE14 [V8QI V2SI]) (define_mode_iterator MMXMODE24 [V4HI V2SI]) (define_mode_iterator MMXMODE248 [V4HI V2SI V1DI]) +;; All V2S* modes +(define_mode_iterator V2FI [V2SF V2SI]) + ;; Mapping from integer vector mode to mnemonic suffix (define_mode_attr mmxvecsize [(V8QI "b") (V4HI "w") (V2SI "d") (V1DI "q")]) (define_mode_attr mmxdoublemode [(V8QI "V8HI") (V4HI "V4SI")]) +;; Mapping of vector float modes to an integer mode of the same size +(define_mode_attr mmxintvecmode + [(V2SF "V2SI") (V2SI "V2SI") (V4HI "V4HI") (V8QI "V8QI")]) + +(define_mode_attr mmxintvecmodelower + [(V2SF "v2si") (V2SI "v2si") (V4HI "v4hi") (V8QI "v8qi")]) + (define_mode_attr Yv_Yw [(V8QI "Yw") (V4HI "Yw") (V2SI "Yv") (V1DI "Yv") (V2SF "Yv")]) @@ -714,6 +725,85 @@ (define_insn "mmx_gev2sf3" (set_attr "prefix_extra" "1") (set_attr "mode" "V2SF")]) +(define_insn "*mmx_maskcmpv2sf3_comm" + [(set (match_operand:V2SF 0 "register_operand" "=x,x") + (match_operator:V2SF 3 "sse_comparison_operator" + [(match_operand:V2SF 1 "register_operand" "%0,x") + (match_operand:V2SF 2 "register_operand" "x,x")]))] + "TARGET_MMX_WITH_SSE + && GET_RTX_CLASS (GET_CODE (operands[3])) == RTX_COMM_COMPARE" + "@ + cmp%D3ps\t{%2, %0|%0, %2} + vcmp%D3ps\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssecmp") + (set_attr "length_immediate" "1") + (set_attr "prefix" "orig,vex") + (set_attr "mode" "V4SF")]) + +(define_insn "*mmx_maskcmpv2sf3" + [(set (match_operand:V2SF 0 "register_operand" "=x,x") + (match_operator:V2SF 3 "sse_comparison_operator" + [(match_operand:V2SF 1 "register_operand" "0,x") + (match_operand:V2SF 2 "register_operand" "x,x")]))] + "TARGET_MMX_WITH_SSE" + "@ + cmp%D3ps\t{%2, %0|%0, %2} + vcmp%D3ps\t{%2, %1, %0|%0, %1, %2}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssecmp") + (set_attr "length_immediate" "1") + (set_attr "prefix" "orig,vex") + (set_attr "mode" "V4SF")]) + +(define_expand "vec_cmpv2sfv2si" + [(set (match_operand:V2SI 0 "register_operand") + (match_operator:V2SI 1 "" + [(match_operand:V2SF 2 "register_operand") + (match_operand:V2SF 3 "register_operand")]))] + "TARGET_MMX_WITH_SSE" +{ + bool ok = ix86_expand_fp_vec_cmp (operands); + gcc_assert (ok); + DONE; +}) + +(define_expand "vcondv2sf" + [(set (match_operand:V2FI 0 "register_operand") + (if_then_else:V2FI + (match_operator 3 "" + [(match_operand:V2SF 4 "register_operand") + (match_operand:V2SF 5 "register_operand")]) + (match_operand:V2FI 1) + (match_operand:V2FI 2)))] + "TARGET_MMX_WITH_SSE" +{ + bool ok = ix86_expand_fp_vcond (operands); + gcc_assert (ok); + DONE; +}) + +(define_insn "mmx_blendvps" + [(set (match_operand:V2SF 0 "register_operand" "=Yr,*x,x") + (unspec:V2SF + [(match_operand:V2SF 1 "register_operand" "0,0,x") + (match_operand:V2SF 2 "register_operand" "Yr,*x,x") + (match_operand:V2SF 3 "register_operand" "Yz,Yz,x")] + UNSPEC_BLENDV))] + "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE" + "@ + blendvps\t{%3, %2, %0|%0, %2, %3} + blendvps\t{%3, %2, %0|%0, %2, %3} + vblendvps\t{%3, %2, %1, %0|%0, %1, %2, %3}" + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "type" "ssemov") + (set_attr "length_immediate" "1") + (set_attr "prefix_data16" "1,1,*") + (set_attr "prefix_extra" "1") + (set_attr "prefix" "orig,orig,vex") + (set_attr "btver2_decode" "vector") + (set_attr "mode" "V4SF")]) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; ;; Parallel single-precision floating point logical operations @@ -1657,42 +1747,46 @@ (define_expand "vec_cmpu" DONE; }) -(define_expand "vcond" - [(set (match_operand:MMXMODEI 0 "register_operand") - (if_then_else:MMXMODEI +(define_expand "vcond" + [(set (match_operand:MMXMODE124 0 "register_operand") + (if_then_else:MMXMODE124 (match_operator 3 "" [(match_operand:MMXMODEI 4 "register_operand") (match_operand:MMXMODEI 5 "register_operand")]) - (match_operand:MMXMODEI 1) - (match_operand:MMXMODEI 2)))] - "TARGET_MMX_WITH_SSE" + (match_operand:MMXMODE124 1) + (match_operand:MMXMODE124 2)))] + "TARGET_MMX_WITH_SSE + && (GET_MODE_NUNITS (mode) + == GET_MODE_NUNITS (mode))" { bool ok = ix86_expand_int_vcond (operands); gcc_assert (ok); DONE; }) -(define_expand "vcondu" - [(set (match_operand:MMXMODEI 0 "register_operand") - (if_then_else:MMXMODEI +(define_expand "vcondu" + [(set (match_operand:MMXMODE124 0 "register_operand") + (if_then_else:MMXMODE124 (match_operator 3 "" [(match_operand:MMXMODEI 4 "register_operand") (match_operand:MMXMODEI 5 "register_operand")]) - (match_operand:MMXMODEI 1) - (match_operand:MMXMODEI 2)))] - "TARGET_MMX_WITH_SSE" + (match_operand:MMXMODE124 1) + (match_operand:MMXMODE124 2)))] + "TARGET_MMX_WITH_SSE + && (GET_MODE_NUNITS (mode) + == GET_MODE_NUNITS (mode))" { bool ok = ix86_expand_int_vcond (operands); gcc_assert (ok); DONE; }) -(define_expand "vcond_mask_" - [(set (match_operand:MMXMODEI 0 "register_operand") - (vec_merge:MMXMODEI - (match_operand:MMXMODEI 1 "register_operand") - (match_operand:MMXMODEI 2 "register_operand") - (match_operand:MMXMODEI 3 "register_operand")))] +(define_expand "vcond_mask_" + [(set (match_operand:MMXMODE124 0 "register_operand") + (vec_merge:MMXMODE124 + (match_operand:MMXMODE124 1 "register_operand") + (match_operand:MMXMODE124 2 "register_operand") + (match_operand: 3 "register_operand")))] "TARGET_MMX_WITH_SSE" { ix86_expand_sse_movcc (operands[0], operands[3], diff --git a/gcc/testsuite/g++.target/i386/pr98218-1.C b/gcc/testsuite/g++.target/i386/pr98218-1.C new file mode 100644 index 00000000000..61ea4bf9008 --- /dev/null +++ b/gcc/testsuite/g++.target/i386/pr98218-1.C @@ -0,0 +1,20 @@ +/* PR target/98218 */ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -msse2" } */ + +typedef unsigned int __attribute__((__vector_size__ (8))) v64u32; +typedef int __attribute__((__vector_size__ (8))) v64s32; +typedef float __attribute__((__vector_size__ (8))) v64f32; + +v64u32 au, bu; +v64s32 as, bs; +v64f32 af, bf; + +v64u32 tu (v64f32 a, v64f32 b) { return (a > b) ? au : bu; } +v64s32 ts (v64f32 a, v64f32 b) { return (a > b) ? as : bs; } +v64f32 fu (v64u32 a, v64u32 b) { return (a > b) ? af : bf; } +v64f32 fs (v64s32 a, v64s32 b) { return (a > b) ? af : bf; } +v64f32 ff (v64f32 a, v64f32 b) { return (a > b) ? af : bf; } + +/* { dg-final { scan-assembler-times "cmpltps" 3 } } */ +/* { dg-final { scan-assembler-times "pcmpgtd" 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr98218-1.c b/gcc/testsuite/gcc.target/i386/pr98218-1.c index 48407dabc2a..9d6602c08a2 100644 --- a/gcc/testsuite/gcc.target/i386/pr98218-1.c +++ b/gcc/testsuite/gcc.target/i386/pr98218-1.c @@ -1,4 +1,4 @@ -/* PR target/98522 */ +/* PR target/98218 */ /* { dg-do compile { target { ! ia32 } } } */ /* { dg-options "-O2 -msse2" } */ diff --git a/gcc/testsuite/gcc.target/i386/pr98218-1a.c b/gcc/testsuite/gcc.target/i386/pr98218-1a.c index 3470c87cdc3..2610438b24a 100644 --- a/gcc/testsuite/gcc.target/i386/pr98218-1a.c +++ b/gcc/testsuite/gcc.target/i386/pr98218-1a.c @@ -1,4 +1,4 @@ -/* PR target/98522 */ +/* PR target/98218 */ /* { dg-do compile { target { ! ia32 } } } */ /* { dg-options "-O2 -ftree-vectorize -msse2" } */ diff --git a/gcc/testsuite/gcc.target/i386/pr98218-2.c b/gcc/testsuite/gcc.target/i386/pr98218-2.c index 0b716126413..948bf4f5978 100644 --- a/gcc/testsuite/gcc.target/i386/pr98218-2.c +++ b/gcc/testsuite/gcc.target/i386/pr98218-2.c @@ -1,4 +1,4 @@ -/* PR target/98522 */ +/* PR target/98218 */ /* { dg-do compile { target { ! ia32 } } } */ /* { dg-options "-O2 -msse2" } */ diff --git a/gcc/testsuite/gcc.target/i386/pr98218-2a.c b/gcc/testsuite/gcc.target/i386/pr98218-2a.c index 6afd0a412d7..73c7226044f 100644 --- a/gcc/testsuite/gcc.target/i386/pr98218-2a.c +++ b/gcc/testsuite/gcc.target/i386/pr98218-2a.c @@ -1,4 +1,4 @@ -/* PR target/98522 */ +/* PR target/98218 */ /* { dg-do compile { target { ! ia32 } } } */ /* { dg-options "-O2 -ftree-vectorize -msse2" } */ diff --git a/gcc/testsuite/gcc.target/i386/pr98218-3.c b/gcc/testsuite/gcc.target/i386/pr98218-3.c index 83a8c298640..1b40d0cee36 100644 --- a/gcc/testsuite/gcc.target/i386/pr98218-3.c +++ b/gcc/testsuite/gcc.target/i386/pr98218-3.c @@ -1,4 +1,4 @@ -/* PR target/98522 */ +/* PR target/98218 */ /* { dg-do compile { target { ! ia32 } } } */ /* { dg-options "-O2 -msse2" } */ diff --git a/gcc/testsuite/gcc.target/i386/pr98218-3a.c b/gcc/testsuite/gcc.target/i386/pr98218-3a.c index 272d54e5b34..cf1d4972807 100644 --- a/gcc/testsuite/gcc.target/i386/pr98218-3a.c +++ b/gcc/testsuite/gcc.target/i386/pr98218-3a.c @@ -1,4 +1,4 @@ -/* PR target/98522 */ +/* PR target/98218 */ /* { dg-do compile { target { ! ia32 } } } */ /* { dg-options "-O2 -ftree-vectorize -msse2" } */ diff --git a/gcc/testsuite/gcc.target/i386/pr98218-4.c b/gcc/testsuite/gcc.target/i386/pr98218-4.c new file mode 100644 index 00000000000..647bdb1171b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr98218-4.c @@ -0,0 +1,16 @@ +/* PR target/98218 */ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -msse2" } */ + +typedef unsigned int __attribute__((__vector_size__ (8))) v64u32; +typedef int __attribute__((__vector_size__ (8))) v64s32; +typedef float __attribute__((__vector_size__ (8))) v64f32; + +v64u32 tu (v64f32 a, v64f32 b) { return a > b; } +v64s32 ts (v64f32 a, v64f32 b) { return a > b; } +v64f32 fu (v64u32 a, v64u32 b) { return a > b; } +v64f32 fs (v64s32 a, v64s32 b) { return a > b; } +v64f32 ff (v64f32 a, v64f32 b) { return a > b; } + +/* { dg-final { scan-assembler-times "cmpltps" 3 } } */ +/* { dg-final { scan-assembler-times "pcmpgtd" 2 } } */