From patchwork Thu Oct 2 18:34:40 2014
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 396055
Date: Thu, 2 Oct 2014 20:34:40 +0200
Subject: [RFC, RFH PATCH, i386] Fix gcc.target/i386/pr61403.c FAIL with -mavx2
From: Uros Bizjak
To: Jakub Jelinek
Cc: Evgeny Stupachenko, "H.J. Lu", GCC Patches

On Wed, Oct 1, 2014 at 9:03 PM, Uros Bizjak wrote:

>> And now the expand_vec_perm_palignr improvement, tested
>> with GCC_TEST_RUN_EXPENSIVE=1 make check-gcc \
>> RUNTESTFLAGS='--target_board=unix/-mavx2 dg-torture.exp=vshuf*.c'
>> E.g.
>> typedef unsigned long long V __attribute__ ((vector_size (32)));
>> extern void abort (void);
>> V a, b, c, d;
>> void test_14 (void)
>> {
>>   V mask = { 6, 1, 3, 4 };
>>   int i;
>>   c = __builtin_shuffle (a, mask);
>>   d = __builtin_shuffle (a, b, mask);
>> }
>> (distilled from test 15 in vshuf-v4di.c) results in:
>> -     vmovdqa a(%rip), %ymm0
>> -     vpermq  $54, %ymm0, %ymm1
>> -     vpshufb .LC1(%rip), %ymm0, %ymm0
>> -     vmovdqa %ymm1, c(%rip)
>> -     vmovdqa b(%rip), %ymm1
>> -     vpshufb .LC0(%rip), %ymm1, %ymm1
>> -     vpermq  $78, %ymm1, %ymm1
>> -     vpor    %ymm1, %ymm0, %ymm0
>> +     vmovdqa a(%rip), %ymm1
>> +     vpermq  $54, %ymm1, %ymm0
>> +     vmovdqa %ymm0, c(%rip)
>> +     vmovdqa b(%rip), %ymm0
>> +     vpalignr        $8, %ymm1, %ymm0, %ymm0
>> +     vpermq  $99, %ymm0, %ymm0
>>       vmovdqa %ymm0, d(%rip)
>>       vzeroupper
>>       ret
>> change (and two fewer .rodata constants).
>
> On a related note, I would like to point out that
> gcc.target/i386/pr61403.c also fails to generate blend insn with
> -mavx2. The new insn sequence includes lots of new vpshufb insns with
> memory access.

The following patch fixes the failure:

--cut here--
--cut here--

The comment above expand_vec_perm_pblendv claims that:

  /* Use the same checks as in expand_vec_perm_blend, but skipping
     AVX and AVX2 as they require more than 2 instructions.  */

However, with the patch I see a significant reduction in vpshufb and
vpor instructions (33->16 and 22->11, respectively), and 6 new
vblendps insns.

BTW: I have no access to an AVX2 target, so I can't test the patch
with runtime tests. OTOH, it doesn't ICE for "GCC_TEST_RUN_EXPENSIVE=1
make check-gcc RUNTESTFLAGS='--target_board=unix/-mavx2
dg-torture.exp=vshuf*.c'".

Jakub, what do you think?

Uros.

Index: i386.c
===================================================================
--- i386.c	(revision 215802)
+++ i386.c	(working copy)
@@ -43407,8 +43407,10 @@ expand_vec_perm_pblendv (struct expand_vec_perm_d
      AVX and AVX2 as they require more than 2 instructions.  */
   if (d->one_operand_p)
     return false;
-  if (TARGET_SSE4_1 && GET_MODE_SIZE (vmode) == 16)
+  if (TARGET_AVX2 && GET_MODE_SIZE (vmode) == 32)
     ;
+  else if (TARGET_SSE4_1 && GET_MODE_SIZE (vmode) == 16)
+    ;
   else
     return false;
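
For readers without AVX2 hardware, the quoted test_14 example can be sanity-checked with a small portable C model. The sketch below is not GCC code: the function names (model_shuffle2, model_vpermq, model_vpalignr8, vpermq_imm) are hypothetical, and the instruction models are simplified assumptions covering only 64-bit elements and the $8 shift used in the quoted assembly. It models the two-operand __builtin_shuffle, vpermq with an immediate, and the per-lane behaviour of 256-bit vpalignr $8, so the vpalignr + vpermq $99 sequence can be compared against the shuffle it is meant to implement.

```c
#include <assert.h>
#include <stdint.h>

#define NELT 4  /* four 64-bit elements in one 256-bit vector */

/* Model of two-operand __builtin_shuffle (a, b, mask): each mask element,
   taken modulo 2*NELT, indexes the concatenation of a and b.  */
void
model_shuffle2 (const uint64_t a[NELT], const uint64_t b[NELT],
                const unsigned mask[NELT], uint64_t out[NELT])
{
  for (int i = 0; i < NELT; i++)
    {
      unsigned e = mask[i] % (2 * NELT);
      out[i] = e < NELT ? a[e] : b[e - NELT];
    }
}

/* Model of vpermq $imm: result element i is source element
   (imm >> 2*i) & 3.  OUT must not alias SRC.  */
void
model_vpermq (const uint64_t src[NELT], unsigned imm, uint64_t out[NELT])
{
  assert (imm < 256);
  for (int i = 0; i < NELT; i++)
    out[i] = src[(imm >> (2 * i)) & 3];
}

/* Model of 256-bit vpalignr $8 on qword elements: within each 128-bit
   lane, HI:LO is shifted right by 8 bytes, leaving the high qword of LO
   followed by the low qword of HI.  OUT must not alias the inputs.  */
void
model_vpalignr8 (const uint64_t hi[NELT], const uint64_t lo[NELT],
                 uint64_t out[NELT])
{
  for (int lane = 0; lane < NELT; lane += 2)
    {
      out[lane] = lo[lane + 1];
      out[lane + 1] = hi[lane];
    }
}

/* Encode a 4-element one-operand permutation as a vpermq immediate.  */
unsigned
vpermq_imm (const unsigned perm[NELT])
{
  unsigned imm = 0;
  for (int i = 0; i < NELT; i++)
    imm |= (perm[i] % NELT) << (2 * i);
  return imm;
}
```

With mask { 6, 1, 3, 4 }, vpermq_imm yields the $54 seen in both sequences, and model_vpalignr8 (b, a, t) followed by model_vpermq (t, 99, d) produces the same elements as model_shuffle2 (a, b, mask, d). This only checks the arithmetic of the example and is no substitute for a runtime test on AVX2 hardware.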