From patchwork Thu Jun 17 13:22:17 2021
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1493514
Date: Thu, 17 Jun 2021 15:22:17 +0200
Subject: [PATCH] i386: Add variable vec_set for 64bit vectors [PR97194]
To: "gcc-patches@gcc.gnu.org"
From: Uros Bizjak
Reply-To: Uros Bizjak

To generate sane code a SSE4.1
variable PBLENDV instruction is needed.

2021-06-17  Uroš Bizjak

gcc/
        PR target/97194
        * config/i386/i386-expand.c (ix86_expand_vector_set_var):
        Handle V2SF mode remapping.  Pass TARGET_MMX_WITH_SSE to
        ix86_expand_vector_init_duplicate.
        (ix86_expand_vector_init_duplicate): Emit insv_1 for
        QImode for !TARGET_PARTIAL_REG_STALL.
        * config/i386/predicates.md (vec_setm_mmx_operand): New predicate.
        * config/i386/mmx.md (vec_setv2sf): Use vec_setm_mmx_operand
        as operand 2 predicate.  Call ix86_expand_vector_set_var
        for non-constant index operand.
        (vec_setv2si): Ditto.
        (vec_setv4hi): Ditto.
        (vec_setv8qi): Ditto.

gcc/testsuite/
        PR target/97194
        * gcc.target/i386/sse4_1-vec-set-1.c: New test.
        * gcc.target/i386/sse4_1-vec-set-2.c: Ditto.

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index eb6f9b0684e..8f4e4e4d884 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -13811,10 +13811,17 @@ ix86_expand_vector_init_duplicate (bool mmx_ok, machine_mode mode,
             wsmode = GET_MODE_INNER (wvmode);
 
             val = convert_modes (wsmode, smode, val, true);
-            x = expand_simple_binop (wsmode, ASHIFT, val,
-                                     GEN_INT (GET_MODE_BITSIZE (smode)),
-                                     NULL_RTX, 1, OPTAB_LIB_WIDEN);
-            val = expand_simple_binop (wsmode, IOR, val, x, x, 1, OPTAB_LIB_WIDEN);
+
+            if (smode == QImode && !TARGET_PARTIAL_REG_STALL)
+              emit_insn (gen_insv_1 (wsmode, val, val));
+            else
+              {
+                x = expand_simple_binop (wsmode, ASHIFT, val,
+                                         GEN_INT (GET_MODE_BITSIZE (smode)),
+                                         NULL_RTX, 1, OPTAB_LIB_WIDEN);
+                val = expand_simple_binop (wsmode, IOR, val, x, x, 1,
+                                           OPTAB_LIB_WIDEN);
+              }
 
             x = gen_reg_rtx (wvmode);
             ok = ix86_expand_vector_init_duplicate (mmx_ok, wvmode, x, val);
@@ -14788,6 +14795,9 @@ ix86_expand_vector_set_var (rtx target, rtx val, rtx idx)
     case E_V8DFmode:
       cmp_mode = V8DImode;
       break;
+    case E_V2SFmode:
+      cmp_mode = V2SImode;
+      break;
     case E_V4SFmode:
       cmp_mode = V4SImode;
       break;
@@ -14809,9 +14819,11 @@ ix86_expand_vector_set_var (rtx target, rtx val, rtx idx)
   idxv = gen_reg_rtx (cmp_mode);
   idx_tmp = convert_to_mode (GET_MODE_INNER (cmp_mode), idx, 1);
 
-  ok = ix86_expand_vector_init_duplicate (false, mode, valv, val);
+  ok = ix86_expand_vector_init_duplicate (TARGET_MMX_WITH_SSE,
+                                          mode, valv, val);
   gcc_assert (ok);
-  ok = ix86_expand_vector_init_duplicate (false, cmp_mode, idxv, idx_tmp);
+  ok = ix86_expand_vector_init_duplicate (TARGET_MMX_WITH_SSE,
+                                          cmp_mode, idxv, idx_tmp);
   gcc_assert (ok);
   vec[0] = target;
   vec[1] = valv;
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 59a16f4cd50..a107ac5ccb4 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1279,11 +1279,14 @@ (define_insn "*mmx_concatv2sf"
 (define_expand "vec_setv2sf"
   [(match_operand:V2SF 0 "register_operand")
    (match_operand:SF 1 "register_operand")
-   (match_operand 2 "const_int_operand")]
+   (match_operand 2 "vec_setm_mmx_operand")]
   "TARGET_MMX || TARGET_MMX_WITH_SSE"
 {
-  ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
-                          INTVAL (operands[2]));
+  if (CONST_INT_P (operands[2]))
+    ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
+                            INTVAL (operands[2]));
+  else
+    ix86_expand_vector_set_var (operands[0], operands[1], operands[2]);
   DONE;
 })
 
@@ -2989,11 +2992,14 @@ (define_insn "*mmx_concatv2si"
 (define_expand "vec_setv2si"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:SI 1 "register_operand")
-   (match_operand 2 "const_int_operand")]
+   (match_operand 2 "vec_setm_mmx_operand")]
   "TARGET_MMX || TARGET_MMX_WITH_SSE"
 {
-  ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
-                          INTVAL (operands[2]));
+  if (CONST_INT_P (operands[2]))
+    ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
+                            INTVAL (operands[2]));
+  else
+    ix86_expand_vector_set_var (operands[0], operands[1], operands[2]);
   DONE;
 })
 
@@ -3145,11 +3151,14 @@ (define_expand "vec_initv2sisi"
 (define_expand "vec_setv4hi"
   [(match_operand:V4HI 0 "register_operand")
    (match_operand:HI 1 "register_operand")
-   (match_operand 2 "const_int_operand")]
+   (match_operand 2 "vec_setm_mmx_operand")]
   "TARGET_MMX || TARGET_MMX_WITH_SSE"
 {
-  ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
-                          INTVAL (operands[2]));
+  if (CONST_INT_P (operands[2]))
+    ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
+                            INTVAL (operands[2]));
+  else
+    ix86_expand_vector_set_var (operands[0], operands[1], operands[2]);
   DONE;
 })
 
@@ -3177,11 +3186,14 @@ (define_expand "vec_initv4hihi"
 (define_expand "vec_setv8qi"
   [(match_operand:V8QI 0 "register_operand")
    (match_operand:QI 1 "register_operand")
-   (match_operand 2 "const_int_operand")]
+   (match_operand 2 "vec_setm_mmx_operand")]
   "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
 {
-  ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
-                          INTVAL (operands[2]));
+  if (CONST_INT_P (operands[2]))
+    ix86_expand_vector_set (TARGET_MMX_WITH_SSE, operands[0], operands[1],
+                            INTVAL (operands[2]));
+  else
+    ix86_expand_vector_set_var (operands[0], operands[1], operands[2]);
   DONE;
 })
 
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 3dd134e7f22..e7a896874d6 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1026,6 +1026,12 @@ (define_predicate "vec_setm_operand"
             (match_test "TARGET_AVX2"))
        (match_code "const_int")))
 
+(define_predicate "vec_setm_mmx_operand"
+  (ior (and (match_operand 0 "register_operand")
+            (match_test "TARGET_SSE4_1")
+            (match_test "TARGET_MMX_WITH_SSE"))
+       (match_code "const_int")))
+
 ;; True for registers, or 1 or -1.  Used to optimize double-word shifts.
 (define_predicate "reg_or_pm1_operand"
   (ior (match_operand 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-vec-set-1.c b/gcc/testsuite/gcc.target/i386/sse4_1-vec-set-1.c
new file mode 100644
index 00000000000..7c7fd34bbc1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse4_1-vec-set-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-msse4.1 -O2" } */
+/* { dg-final { scan-assembler-times {(?n)v?pcmpeq[bwd]} 4 } } */
+/* { dg-final { scan-assembler-times {(?n)v?p?blendv} 4 } } */
+
+typedef char v8qi __attribute__ ((vector_size (8)));
+typedef short v4hi __attribute__ ((vector_size (8)));
+typedef int v2si __attribute__ ((vector_size (8)));
+typedef float v2sf __attribute__ ((vector_size (8)));
+
+#define FOO(VTYPE, TYPE)                        \
+  VTYPE                                         \
+  __attribute__ ((noipa))                       \
+  foo_##VTYPE (VTYPE a, TYPE b, unsigned int c) \
+  {                                             \
+    a[c] = b;                                   \
+    return a;                                   \
+  }                                             \
+
+FOO (v8qi, char);
+
+FOO (v4hi, short);
+
+FOO (v2si, int);
+
+FOO (v2sf, float);
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-vec-set-2.c b/gcc/testsuite/gcc.target/i386/sse4_1-vec-set-2.c
new file mode 100644
index 00000000000..24f80414761
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse4_1-vec-set-2.c
@@ -0,0 +1,45 @@
+/* { dg-do run { target { ! ia32 } } } */
+/* { dg-require-effective-target sse4 } */
+/* { dg-options "-O2 -msse4.1" } */
+
+
+#ifndef CHECK
+#define CHECK "sse4_1-check.h"
+#endif
+
+#ifndef TEST
+#define TEST sse4_1_test
+#endif
+
+#include CHECK
+
+#include "sse4_1-vec-set-1.c"
+
+#define CALC_TEST(vtype, type, N, idx)                  \
+do                                                      \
+  {                                                     \
+    int i,val = idx * idx - idx * 3 + 16;               \
+    type res[N],exp[N];                                 \
+    vtype resv;                                         \
+    for (i = 0; i < N; i++)                             \
+      {                                                 \
+        res[i] = i * i - i * 3 + 15;                    \
+        exp[i] = res[i];                                \
+      }                                                 \
+    exp[idx] = val;                                     \
+    resv = foo_##vtype (*(vtype *)&res[0], val, idx);   \
+    for (i = 0; i < N; i++)                             \
+      {                                                 \
+        if (resv[i] != exp[i])                          \
+          abort ();                                     \
+      }                                                 \
+  }                                                     \
+while (0)
+
+static void
+TEST (void)
+{
+  CALC_TEST (v8qi, char, 8, 5);
+  CALC_TEST (v4hi, short, 4, 2);
+  CALC_TEST (v2si, int, 2, 1);
+}
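
For readers unfamiliar with the expansion, the variable-index store exercised by the tests can be modelled in plain C with GCC vector extensions.  This is an illustrative sketch only and is not code from the patch.  The names set_var_v4hi and check_set_var_v4hi are hypothetical.  The broadcast/compare/select sequence is an assumption inferred from the two ix86_expand_vector_init_duplicate calls and from the pcmpeq and blendv instructions that sse4_1-vec-set-1.c scans for.

```c
#include <assert.h>

typedef short v4hi __attribute__ ((vector_size (8)));

/* Hypothetical model of a[c] = b for a run-time index c.
   Not part of the patch; the real expander emits RTL instead.  */
static v4hi
set_var_v4hi (v4hi a, short b, unsigned int c)
{
  short idx = (short) c;
  v4hi valv = { b, b, b, b };          /* new value broadcast to all lanes */
  v4hi idxv = { idx, idx, idx, idx };  /* index broadcast to all lanes */
  v4hi lanes = { 0, 1, 2, 3 };         /* constant lane numbers */
  v4hi mask = (idxv == lanes);         /* all-ones only in the lane to replace */

  /* Select valv where the mask is set and a elsewhere.  */
  return (valv & mask) | (a & ~mask);
}

/* Small usage example: replace lane 2 and leave the others alone.  */
void
check_set_var_v4hi (void)
{
  v4hi a = { 10, 11, 12, 13 };
  v4hi r = set_var_v4hi (a, 99, 2);
  assert (r[0] == 10 && r[1] == 11 && r[2] == 99 && r[3] == 13);
}
```

In this model an index that matches no lane leaves the vector unchanged, because the comparison mask is then all zeros.  Whether the generated code behaves the same way for out-of-range indices is not stated in the patch.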