From patchwork Tue May 26 13:21:32 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1298076
Date: Tue, 26 May 2020 15:21:32 +0200
Subject: [committed] i386: Implement V2SI and V4HI shuffles
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Uros Bizjak via Gcc-patches
From: Uros Bizjak
Reply-To: Uros Bizjak
Errors-To:
 gcc-patches-bounces@gcc.gnu.org
Sender: "Gcc-patches"

2020-05-26  Uroš Bizjak

gcc/ChangeLog:
        * config/i386/mmx.md (*mmx_pshufd_1): New insn pattern.
        * config/i386/i386-expand.c (ix86_vectorize_vec_perm_const):
        Handle E_V2SImode and E_V4HImode.
        (expand_vec_perm_even_odd_1): Handle E_V4HImode.
        Assert that E_V2SImode is already handled.
        (expand_vec_perm_broadcast_1): Assert that E_V2SImode
        is already handled by standard shuffle patterns.

gcc/testsuite/ChangeLog:
        * gcc.target/i386/vperm-v2si.c: New test.
        * gcc.target/i386/vperm-v4hi.c: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 79f827fd653..338b4f7cf4f 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -18634,10 +18634,26 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
     case E_V2DFmode:
     case E_V4SFmode:
     case E_V2DImode:
+    case E_V2SImode:
     case E_V4SImode:
       /* These are always directly implementable by expand_vec_perm_1.  */
       gcc_unreachable ();
 
+    case E_V4HImode:
+      if (d->testing_p)
+        break;
+      /* We need 2*log2(N)-1 operations to achieve odd/even
+         with interleave.  */
+      t1 = gen_reg_rtx (V4HImode);
+      emit_insn (gen_mmx_punpckhwd (t1, d->op0, d->op1));
+      emit_insn (gen_mmx_punpcklwd (d->target, d->op0, d->op1));
+      if (odd)
+        t2 = gen_mmx_punpckhwd (d->target, d->target, t1);
+      else
+        t2 = gen_mmx_punpcklwd (d->target, d->target, t1);
+      emit_insn (t2);
+      break;
+
     case E_V8HImode:
       if (TARGET_SSE4_1)
         return expand_vec_perm_even_odd_pack (d);
@@ -18820,6 +18836,7 @@ expand_vec_perm_broadcast_1 (struct expand_vec_perm_d *d)
     case E_V2DFmode:
     case E_V2DImode:
     case E_V4SFmode:
+    case E_V2SImode:
     case E_V4SImode:
       /* These are always implementable using standard shuffle patterns.  */
       gcc_unreachable ();
@@ -19312,6 +19329,11 @@ ix86_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0,
       if (d.testing_p && TARGET_SSSE3)
         return true;
       break;
+    case E_V2SImode:
+    case E_V4HImode:
+      if (!TARGET_MMX_WITH_SSE)
+        return false;
+      break;
     case E_V2DImode:
     case E_V2DFmode:
       if (!TARGET_SSE)
@@ -19344,7 +19366,9 @@ ix86_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0,
   d.one_operand_p = (which != 3);
 
   /* Implementable with shufps or pshufd.  */
-  if (d.one_operand_p && (d.vmode == V4SFmode || d.vmode == V4SImode))
+  if (d.one_operand_p
+      && (d.vmode == V4SFmode
+          || d.vmode == V4SImode || d.vmode == V2SImode))
     return true;
 
   /* Otherwise we have to go through the motions and see if we can
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index b5564711aa4..c31b4f81079 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1988,6 +1988,28 @@
    (set_attr "length_immediate" "1")
    (set_attr "mode" "DI,TI")])
 
+(define_insn "*mmx_pshufd_1"
+  [(set (match_operand:V2SI 0 "register_operand" "=Yv")
+        (vec_select:V2SI
+          (match_operand:V2SI 1 "register_operand" "Yv")
+          (parallel [(match_operand 2 "const_0_to_1_operand")
+                     (match_operand 3 "const_0_to_1_operand")])))]
+  "TARGET_MMX_WITH_SSE"
+{
+  int mask = 0;
+  mask |= INTVAL (operands[2]) << 0;
+  mask |= INTVAL (operands[3]) << 2;
+  mask |= 2 << 4;
+  mask |= 3 << 6;
+  operands[2] = GEN_INT (mask);
+
+  return "%vpshufd\t{%2, %1, %0|%0, %1, %2}";
+}
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_data16" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "mode" "TI")])
+
 (define_insn "mmx_pswapdv2si2"
   [(set (match_operand:V2SI 0 "register_operand" "=y")
         (vec_select:V2SI
diff --git a/gcc/testsuite/gcc.target/i386/vperm-v2si.c b/gcc/testsuite/gcc.target/i386/vperm-v2si.c
new file mode 100644
index 00000000000..5b38b316e3b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vperm-v2si.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-options "-O -msse2" } */
+/* { dg-require-effective-target sse2 } */
+
+#include "isa-check.h"
+#include "sse-os-support.h"
+
+typedef int S;
+typedef int V __attribute__((vector_size(8)));
+typedef int IV __attribute__((vector_size(8)));
+typedef union { S s[2]; V v; } U;
+
+static U i[2], b, c;
+
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
+#define assert(T) ((T) || (__builtin_trap (), 0))
+
+#define TEST(E0, E1) \
+  b.v = __builtin_shuffle (i[0].v, i[1].v, (IV){E0, E1}); \
+  c.s[0] = i[0].s[E0]; \
+  c.s[1] = i[0].s[E1]; \
+  __asm__("" : : : "memory"); \
+  assert (memcmp (&b, &c, sizeof(c)) == 0);
+
+#include "vperm-2-2.inc"
+
+int main()
+{
+  check_isa ();
+
+  if (!sse_os_support ())
+    exit (0);
+
+  i[0].s[0] = 0;
+  i[0].s[1] = 1;
+  i[0].s[2] = 2;
+  i[0].s[3] = 3;
+
+  check();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/vperm-v4hi.c b/gcc/testsuite/gcc.target/i386/vperm-v4hi.c
new file mode 100644
index 00000000000..bff6512672d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vperm-v4hi.c
@@ -0,0 +1,47 @@
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-options "-O -msse2" } */
+/* { dg-require-effective-target sse2 } */
+
+#include "isa-check.h"
+#include "sse-os-support.h"
+
+typedef short S;
+typedef short V __attribute__((vector_size(8)));
+typedef short IV __attribute__((vector_size(8)));
+typedef union { S s[4]; V v; } U;
+
+static U i[2], b, c;
+
+extern int memcmp (const void *, const void *, __SIZE_TYPE__);
+#define assert(T) ((T) || (__builtin_trap (), 0))
+
+#define TEST(E0, E1, E2, E3) \
+  b.v = __builtin_shuffle (i[0].v, i[1].v, (IV){E0, E1, E2, E3}); \
+  c.s[0] = i[0].s[E0]; \
+  c.s[1] = i[0].s[E1]; \
+  c.s[2] = i[0].s[E2]; \
+  c.s[3] = i[0].s[E3]; \
+  __asm__("" : : : "memory"); \
+  assert (memcmp (&b, &c, sizeof(c)) == 0);
+
+#include "vperm-4-2.inc"
+
+int main()
+{
+  check_isa ();
+
+  if (!sse_os_support ())
+    exit (0);
+
+  i[0].s[0] = 0;
+  i[0].s[1] = 1;
+  i[0].s[2] = 2;
+  i[0].s[3] = 3;
+  i[0].s[4] = 4;
+  i[0].s[5] = 5;
+  i[0].s[6] = 6;
+  i[0].s[7] = 7;
+
+  check();
+  return 0;
+}