From patchwork Mon Apr 6 22:51:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Pan2 via Gcc-patches"
X-Patchwork-Id: 1267102
Date: Tue, 7 Apr 2020 00:51:18 +0200
To: Uros Bizjak
Subject: [PATCH] i386: Fix emit_reduc_half on V{64Q,32H}Imode [PR94500]
Message-ID: <20200406225118.GX2212@tucnak>
User-Agent: Mutt/1.11.3 (2019-02-01)
Content-Disposition: inline
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Jakub Jelinek via Gcc-patches
From: "Li, Pan2 via Gcc-patches"
Reply-To: Jakub Jelinek
Cc: gcc-patches@gcc.gnu.org

Hi!

The following testcase is miscompiled in 8.x, because for 512-bit modes
emit_reduc_half is prepared to handle only i equal to 512, 256, 128 and
64.  V32HImode also needs i equal to 32 and V64QImode i equal to 32 and
16, but emit_reduc_half in that case performs a redundant permutation
exactly like i == 64.
In 9+ the testcase works because Richard in r9-3393 changed the reduc_*
expanders so that they actually don't call ix86_expand_reduc on 512-bit
modes, but only 128-bit ones.

The patch fixes emit_reduc_half to handle also i of 32 and 16, similarly
to how V32QImode/V16HImode are handled for AVX2.  I think it shouldn't
hurt to fix the function even on the trunk and 9 branch, even when
nothing uses it ATM.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/9
and primarily for 8.5 (obviously in that case s/i386-expand/i386/)?

2020-04-06  Jakub Jelinek

	PR target/94500
	* config/i386/i386-expand.c (emit_reduc_half): For V{64QI,32HI}mode
	handle i < 64 using avx512bw_lshrv4ti3.  Formatting fixes.

	* gcc.target/i386/avx512bw-pr94500.c: New test.

	Jakub

--- gcc/config/i386/i386-expand.c.jj	2020-03-29 19:26:31.748561262 +0200
+++ gcc/config/i386/i386-expand.c	2020-04-06 17:18:44.906242980 +0200
@@ -14891,43 +14891,51 @@ emit_reduc_half (rtx dest, rtx src, int
       break;
     case E_V64QImode:
     case E_V32HImode:
+      if (i < 64)
+        {
+          d = gen_reg_rtx (V4TImode);
+          tem = gen_avx512bw_lshrv4ti3 (d, gen_lowpart (V4TImode, src),
+                                        GEN_INT (i / 2));
+          break;
+        }
+      /* FALLTHRU */
     case E_V16SImode:
     case E_V16SFmode:
     case E_V8DImode:
     case E_V8DFmode:
       if (i > 128)
         tem = gen_avx512f_shuf_i32x4_1 (gen_lowpart (V16SImode, dest),
-                                      gen_lowpart (V16SImode, src),
-                                      gen_lowpart (V16SImode, src),
-                                      GEN_INT (0x4 + (i == 512 ? 4 : 0)),
-                                      GEN_INT (0x5 + (i == 512 ? 4 : 0)),
-                                      GEN_INT (0x6 + (i == 512 ? 4 : 0)),
-                                      GEN_INT (0x7 + (i == 512 ? 4 : 0)),
-                                      GEN_INT (0xC), GEN_INT (0xD),
-                                      GEN_INT (0xE), GEN_INT (0xF),
-                                      GEN_INT (0x10), GEN_INT (0x11),
-                                      GEN_INT (0x12), GEN_INT (0x13),
-                                      GEN_INT (0x14), GEN_INT (0x15),
-                                      GEN_INT (0x16), GEN_INT (0x17));
+                                        gen_lowpart (V16SImode, src),
+                                        gen_lowpart (V16SImode, src),
+                                        GEN_INT (0x4 + (i == 512 ? 4 : 0)),
+                                        GEN_INT (0x5 + (i == 512 ? 4 : 0)),
+                                        GEN_INT (0x6 + (i == 512 ? 4 : 0)),
+                                        GEN_INT (0x7 + (i == 512 ? 4 : 0)),
+                                        GEN_INT (0xC), GEN_INT (0xD),
+                                        GEN_INT (0xE), GEN_INT (0xF),
+                                        GEN_INT (0x10), GEN_INT (0x11),
+                                        GEN_INT (0x12), GEN_INT (0x13),
+                                        GEN_INT (0x14), GEN_INT (0x15),
+                                        GEN_INT (0x16), GEN_INT (0x17));
       else
         tem = gen_avx512f_pshufd_1 (gen_lowpart (V16SImode, dest),
-                                  gen_lowpart (V16SImode, src),
-                                  GEN_INT (i == 128 ? 0x2 : 0x1),
-                                  GEN_INT (0x3),
-                                  GEN_INT (0x3),
-                                  GEN_INT (0x3),
-                                  GEN_INT (i == 128 ? 0x6 : 0x5),
-                                  GEN_INT (0x7),
-                                  GEN_INT (0x7),
-                                  GEN_INT (0x7),
-                                  GEN_INT (i == 128 ? 0xA : 0x9),
-                                  GEN_INT (0xB),
-                                  GEN_INT (0xB),
-                                  GEN_INT (0xB),
-                                  GEN_INT (i == 128 ? 0xE : 0xD),
-                                  GEN_INT (0xF),
-                                  GEN_INT (0xF),
-                                  GEN_INT (0xF));
+                                    gen_lowpart (V16SImode, src),
+                                    GEN_INT (i == 128 ? 0x2 : 0x1),
+                                    GEN_INT (0x3),
+                                    GEN_INT (0x3),
+                                    GEN_INT (0x3),
+                                    GEN_INT (i == 128 ? 0x6 : 0x5),
+                                    GEN_INT (0x7),
+                                    GEN_INT (0x7),
+                                    GEN_INT (0x7),
+                                    GEN_INT (i == 128 ? 0xA : 0x9),
+                                    GEN_INT (0xB),
+                                    GEN_INT (0xB),
+                                    GEN_INT (0xB),
+                                    GEN_INT (i == 128 ? 0xE : 0xD),
+                                    GEN_INT (0xF),
+                                    GEN_INT (0xF),
+                                    GEN_INT (0xF));
       break;
     default:
       gcc_unreachable ();
--- gcc/testsuite/gcc.target/i386/avx512bw-pr94500.c.jj	2020-04-06 17:24:42.246904934 +0200
+++ gcc/testsuite/gcc.target/i386/avx512bw-pr94500.c	2020-04-06 17:26:03.721687840 +0200
@@ -0,0 +1,28 @@
+/* PR target/94500 */
+/* { dg-do run { target avx512bw } } */
+/* { dg-options "-O3 -mavx512bw -mprefer-vector-width=512" } */
+
+#define AVX512BW
+#include "avx512f-helper.h"
+
+__attribute__((noipa)) signed char
+foo (signed char *p)
+{
+  signed char r = 0;
+  int i;
+  for (i = 0; i < 256; i++)
+    if (p[i] > r) r = p[i];
+  return r;
+}
+
+signed char buf[256];
+
+static void
+TEST (void)
+{
+  int i;
+  for (i = 0; i < 256; i++)
+    buf[i] = i - 128;
+  if (foo (buf) != 127)
+    abort ();
+}
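
For anyone who wants to see the failure mode without building a compiler,
below is a rough scalar model of the halving loop described in the mail.
It is only a sketch under stated assumptions: reduc_half_model,
reduc_max_model and the buggy flag are hypothetical names made up for
illustration, not GCC functions, and the model treats the vector as a flat
byte array rather than four 128-bit lanes, which is enough for the low
element that the reduction finally reads.

```c
#include <string.h>

#define NBYTES 64 /* V64QImode: 64 signed chars, 512 bits.  */

/* Hypothetical model of one emit_reduc_half step: DEST becomes SRC moved
   down by I / 2 bits.  Bytes shifted in at the top are zero; they are
   don't-care for the reduction.  With BUGGY set, I < 64 is handled
   exactly like I == 64, which models the pre-patch behavior.  */
static void
reduc_half_model (signed char *dest, const signed char *src, int i, int buggy)
{
  int shift_bits = (buggy && i < 64) ? 32 : i / 2;
  int shift_bytes = shift_bits / 8;
  int k;

  for (k = 0; k < NBYTES; k++)
    dest[k] = (k + shift_bytes < NBYTES) ? src[k + shift_bytes] : 0;
}

/* Hypothetical model of the max reduction loop: halve I from the full
   vector width down to twice the element width, combining with max.  */
static signed char
reduc_max_model (const signed char *in, int buggy)
{
  signed char vec[NBYTES], half[NBYTES];
  int i, k;

  memcpy (vec, in, NBYTES);
  for (i = NBYTES * 8; i > 8; i >>= 1)
    {
      reduc_half_model (half, vec, i, buggy);
      for (k = 0; k < NBYTES; k++)
        if (half[k] > vec[k])
          vec[k] = half[k];
    }
  return vec[0];
}
```

With a ramp input such as in[k] = k - 32, reduc_max_model (in, 0) should
return the true maximum, while reduc_max_model (in, 1) can only ever see
bytes whose index is a multiple of 4 and so misses a maximum that sits at
any other index.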