From patchwork Wed May 26 18:49:27 2021
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1484241
From: Uros Bizjak
Date: Wed, 26 May 2021 20:49:27 +0200
Subject: [PATCH] i386: Autovectorize 4-byte vectors
To: "gcc-patches@gcc.gnu.org"

2021-05-26  Uroš Bizjak

gcc/
	* config/i386/i386.c
	(ix86_autovectorize_vector_modes): Add V4QImode and V16QImode
	for TARGET_SSE2.
	* doc/sourcebuild.texi (Vector-specific attributes): Add
	vect64 and vect32 description.

gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_vect32): New.
	(available_vector_sizes): Append 32 for x86 targets.
	* gcc.dg/vect/pr71264.c (dg-final): Xfail scan dump for vect32 targets.
	* gcc.dg/vect/slp-28.c (dg-final): Adjust dumps for vect32 targets.
	* gcc.dg/vect/slp-3.c (dg-final): Ditto.
	* gcc.target/i386/pr100637-3b.c: New test.
	* gcc.target/i386/pr100637-3w.c: Ditto.
	* gcc.target/i386/pr100637-4b.c: Ditto.
	* gcc.target/i386/pr100637-4w.c: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 28e6113a609..04649b42122 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22190,12 +22190,15 @@ ix86_autovectorize_vector_modes (vector_modes *modes, bool all)
       modes->safe_push (V16QImode);
       modes->safe_push (V32QImode);
     }
-  else if (TARGET_MMX_WITH_SSE)
+  else if (TARGET_SSE2)
     modes->safe_push (V16QImode);
 
   if (TARGET_MMX_WITH_SSE)
     modes->safe_push (V8QImode);
 
+  if (TARGET_SSE2)
+    modes->safe_push (V4QImode);
+
   return 0;
 }
 
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index cf3098749c0..16c6a3b8e99 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1740,6 +1740,12 @@ circumstances.
 @item vect_variable_length
 Target has variable-length vectors.
 
+@item vect64
+Target supports vectors of 64 bits.
+
+@item vect32
+Target supports vectors of 32 bits.
+
 @item vect_widen_sum_hi_to_si
 Target supports a vector widening summation of @code{short} operands
 into @code{int} results, or can promote (unpack) from @code{short}
diff --git a/gcc/testsuite/gcc.dg/vect/pr71264.c b/gcc/testsuite/gcc.dg/vect/pr71264.c
index dc849bf2797..1381e0ed132 100644
--- a/gcc/testsuite/gcc.dg/vect/pr71264.c
+++ b/gcc/testsuite/gcc.dg/vect/pr71264.c
@@ -19,5 +19,4 @@ void test(uint8_t *ptr, uint8_t *mask)
   }
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" { xfail s390*-*-* sparc*-*-* } } } */
-
+/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" { xfail { { s390*-*-* sparc*-*-* } || vect32 } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-28.c b/gcc/testsuite/gcc.dg/vect/slp-28.c
index 7778bad4465..0bb5f0eb0e4 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-28.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-28.c
@@ -88,6 +88,7 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! vect32 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target vect32 } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect32 } } } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/slp-3.c b/gcc/testsuite/gcc.dg/vect/slp-3.c
index 46ab584419a..80ded1840ad 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-3.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-3.c
@@ -141,8 +141,8 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_partial_vectors } } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target vect_partial_vectors } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { target { ! vect_partial_vectors } } } }*/
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { target vect_partial_vectors } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! { vect_partial_vectors || vect32 } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target { vect_partial_vectors || vect32 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { target { ! { vect_partial_vectors || vect32 } } } } }*/
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { target { vect_partial_vectors || vect32 } } } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-3b.c b/gcc/testsuite/gcc.target/i386/pr100637-3b.c
new file mode 100644
index 00000000000..16df70059a9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100637-3b.c
@@ -0,0 +1,56 @@
+/* PR target/100637 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse4" } */
+
+char r[4], a[4], b[4];
+unsigned char ur[4], ua[4], ub[4];
+
+void maxs (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    r[i] = a[i] > b[i] ? a[i] : b[i];
+}
+
+/* { dg-final { scan-assembler "pmaxsb" } } */
+
+void maxu (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    ur[i] = ua[i] > ub[i] ? ua[i] : ub[i];
+}
+
+/* { dg-final { scan-assembler "pmaxub" } } */
+
+void mins (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    r[i] = a[i] < b[i] ? a[i] : b[i];
+}
+
+/* { dg-final { scan-assembler "pminsb" } } */
+
+void minu (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    ur[i] = ua[i] < ub[i] ? ua[i] : ub[i];
+}
+
+/* { dg-final { scan-assembler "pminub" } } */
+
+void _abs (void)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+    r[i] = a[i] < 0 ? -a[i] : a[i];
+}
+
+/* { dg-final { scan-assembler "pabsb" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-3w.c b/gcc/testsuite/gcc.target/i386/pr100637-3w.c
new file mode 100644
index 00000000000..7f1882e7a56
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100637-3w.c
@@ -0,0 +1,86 @@
+/* PR target/100637 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse4" } */
+
+short r[2], a[2], b[2];
+unsigned short ur[2], ua[2], ub[2];
+
+void mulh (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = ((int) a[i] * b[i]) >> 16;
+}
+
+/* { dg-final { scan-assembler "pmulhw" { xfail *-*-* } } } */
+
+void mulhu (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    ur[i] = ((unsigned int) ua[i] * ub[i]) >> 16;
+}
+
+/* { dg-final { scan-assembler "pmulhuw" { xfail *-*-* } } } */
+
+void mulhrs (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = ((((int) a[i] * b[i]) >> 14) + 1) >> 1;
+}
+
+/* { dg-final { scan-assembler "pmulhrsw" } } */
+
+void maxs (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = a[i] > b[i] ? a[i] : b[i];
+}
+
+/* { dg-final { scan-assembler "pmaxsw" } } */
+
+void maxu (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    ur[i] = ua[i] > ub[i] ? ua[i] : ub[i];
+}
+
+/* { dg-final { scan-assembler "pmaxuw" } } */
+
+void mins (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = a[i] < b[i] ? a[i] : b[i];
+}
+
+/* { dg-final { scan-assembler "pminsw" } } */
+
+void minu (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    ur[i] = ua[i] < ub[i] ? ua[i] : ub[i];
+}
+
+/* { dg-final { scan-assembler "pminuw" } } */
+
+void _abs (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+    r[i] = a[i] < 0 ? -a[i] : a[i];
+}
+
+/* { dg-final { scan-assembler "pabsw" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-4b.c b/gcc/testsuite/gcc.target/i386/pr100637-4b.c
new file mode 100644
index 00000000000..198e3dd3352
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100637-4b.c
@@ -0,0 +1,19 @@
+/* PR target/100637 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
+
+typedef char T;
+
+#define M 4
+
+extern T a[M], b[M], s1[M], s2[M], r[M];
+
+void foo (void)
+{
+  int j;
+
+  for (j = 0; j < M; j++)
+    r[j] = (a[j] < b[j]) ? s1[j] : s2[j];
+}
+
+/* { dg-final { scan-assembler "pcmpgtb" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-4w.c b/gcc/testsuite/gcc.target/i386/pr100637-4w.c
new file mode 100644
index 00000000000..0f5dacce906
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100637-4w.c
@@ -0,0 +1,19 @@
+/* PR target/100637 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
+
+typedef short T;
+
+#define M 2
+
+extern T a[M], b[M], s1[M], s2[M], r[M];
+
+void foo (void)
+{
+  int j;
+
+  for (j = 0; j < M; j++)
+    r[j] = (a[j] < b[j]) ? s1[j] : s2[j];
+}
+
+/* { dg-final { scan-assembler "pcmpgtw" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 849f1bbeda5..7f78c5593ac 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7626,6 +7626,7 @@ proc available_vector_sizes { } {
         if { ![is-effective-target ia32] } {
             lappend result 64
         }
+        lappend result 32
     } elseif { [istarget sparc*-*-*] } {
         lappend result 64
     } elseif { [istarget amdgcn*-*-*] } {
@@ -7655,6 +7656,12 @@ proc check_effective_target_vect64 { } {
     return [expr { [lsearch -exact [available_vector_sizes] 64] >= 0 }]
 }
 
+# Return 1 if the target supports vectors of 32 bits.
+
+proc check_effective_target_vect32 { } {
+    return [expr { [lsearch -exact [available_vector_sizes] 32] >= 0 }]
+}
+
 # Return 1 if the target supports vector copysignf calls.
 
 proc check_effective_target_vect_call_copysignf { } {