From patchwork Mon May 9 16:47:59 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 619966 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3r3Sxs5h05z9t3f for ; Tue, 10 May 2016 02:48:33 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=eQFeUAhy; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; q=dns; s=default; b=tWyE67gtJk34btzOrDRwobc9IYVS7 B53q64XoMAWMfCMtefVJ2yy+oAvyn/q44Dg6EdF2lmoQnu+tgXGJ5qC36op4Ulb0 SNtDHPiW3MOLU8zZbgDqV0MZVfqudvAjhHmaSI3r4dPq0bVGQnO9saNAskUx2/rF 2OgX+By/KBkaic= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; s=default; bh=qbgqANvfQnOIIG42jWrSHEZEaZQ=; b=eQF eUAhyFyIyeV/IVp1FLP2F9MabTwBRDk1FdShD2gb6o/HRRFk5ZQ1ZXQ2zSx5eeDO a0+6SKzibmm1M9BSpo/7T/Ip8dEBBh9yXemWWdvudlKh6Okc2lk6rc4Nc3sck5bs /AOIzILiQP1hpVGB2ASakHUmtzCwfo+FruX2c9Yk= Received: (qmail 31689 invoked by alias); 9 May 2016 16:48:15 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 31668 invoked by uid 89); 9 May 2016 16:48:15 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=1.8 required=5.0 tests=BAYES_00, KAM_LAZY_DOMAIN_SECURITY, LIKELY_SPAM_BODY, RP_MATCHES_RCVD, SPF_HELO_PASS, UNWANTED_LANGUAGE_BODY autolearn=no version=3.3.2 spammy=origorig, orig, orig, TARGET_SSE2, target_sse2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Mon, 09 May 2016 16:48:05 +0000 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id CF7833B72A; Mon, 9 May 2016 16:48:03 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-116-17.ams2.redhat.com [10.36.116.17]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u49Gm2UT012833 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 9 May 2016 12:48:03 -0400 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id u49Gm02X020477; Mon, 9 May 2016 18:48:00 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id u49GlxNB020476; Mon, 9 May 2016 18:47:59 +0200 Date: Mon, 9 May 2016 18:47:59 +0200 From: Jakub Jelinek To: Uros Bizjak , Kirill Yukhin Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] vinsertps XMM16-XMM31 fixes Message-ID: <20160509164759.GF28550@tucnak.redhat.com> Reply-To: Jakub Jelinek MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.24 (2015-08-30) X-IsSubscribed: yes Hi! The testcases show that we emit AVX512BW instructions even when AVX512BW is disabled. Additionally, two of the 4 patterns were using weirdo constraint for the output (x instead of v, while they used v for input). Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-05-09 Jakub Jelinek PR target/71019 * config/i386/sse.md (_packssdw, _packusdw): Make sure EVEX encoded insn is not emitted unless TARGET_AVX512BW. (_packuswb, _packsswb): Likewise. For TARGET_AVX512BW, use "=v" constraint instead of "=x" for the result operand. * gcc.target/i386/avx512vl-pack-1.c: New test. * gcc.target/i386/avx512vl-pack-2.c: New test. * gcc.target/i386/avx512bw-pack-2.c: New test. Jakub --- gcc/config/i386/sse.md.jj 2016-05-09 11:38:36.000000000 +0200 +++ gcc/config/i386/sse.md 2016-05-09 12:34:58.839865460 +0200 @@ -11500,54 +11500,57 @@ (define_expand "vec_pack_trunc_" }) (define_insn "_packsswb" - [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x") + [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x,v") (vec_concat:VI1_AVX512 (ss_truncate: - (match_operand: 1 "register_operand" "0,v")) + (match_operand: 1 "register_operand" "0,x,v")) (ss_truncate: - (match_operand: 2 "vector_operand" "xBm,vm"))))] + (match_operand: 2 "vector_operand" "xBm,xm,vm"))))] "TARGET_SSE2 && && " "@ packsswb\t{%2, %0|%0, %2} + vpacksswb\t{%2, %1, %0|%0, %1, %2} vpacksswb\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,avx") + [(set_attr "isa" "noavx,avx,avx512bw") (set_attr "type" "sselog") - (set_attr "prefix_data16" "1,*") - (set_attr "prefix" "orig,maybe_evex") + (set_attr "prefix_data16" "1,*,*") + (set_attr "prefix" "orig,,evex") (set_attr "mode" "")]) (define_insn "_packssdw" - [(set (match_operand:VI2_AVX2 0 "register_operand" "=x,v") + [(set (match_operand:VI2_AVX2 0 "register_operand" "=x,x,v") (vec_concat:VI2_AVX2 (ss_truncate: - (match_operand: 1 "register_operand" "0,v")) + (match_operand: 1 "register_operand" "0,x,v")) (ss_truncate: - (match_operand: 2 "vector_operand" "xBm,vm"))))] + (match_operand: 2 "vector_operand" "xBm,xm,vm"))))] "TARGET_SSE2 && && " "@ packssdw\t{%2, %0|%0, %2} + vpackssdw\t{%2, %1, %0|%0, %1, %2} vpackssdw\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,avx") + [(set_attr "isa" "noavx,avx,avx512bw") (set_attr "type" "sselog") - (set_attr "prefix_data16" "1,*") - (set_attr "prefix" "orig,vex") + (set_attr "prefix_data16" "1,*,*") + (set_attr "prefix" "orig,,evex") (set_attr "mode" "")]) (define_insn "_packuswb" - [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x") + [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x,v") (vec_concat:VI1_AVX512 (us_truncate: - (match_operand: 1 "register_operand" "0,v")) + (match_operand: 1 "register_operand" "0,x,v")) (us_truncate: - (match_operand: 2 "vector_operand" "xBm,vm"))))] + (match_operand: 2 "vector_operand" "xBm,xm,vm"))))] "TARGET_SSE2 && && " "@ packuswb\t{%2, %0|%0, %2} + vpackuswb\t{%2, %1, %0|%0, %1, %2} vpackuswb\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,avx") + [(set_attr "isa" "noavx,avx,avx512bw") (set_attr "type" "sselog") - (set_attr "prefix_data16" "1,*") - (set_attr "prefix" "orig,vex") + (set_attr "prefix_data16" "1,*,*") + (set_attr "prefix" "orig,,evex") (set_attr "mode" "")]) (define_insn "avx512bw_interleave_highv64qi" @@ -14572,21 +14575,22 @@ (define_insn "_mpsadbw" (set_attr "mode" "")]) (define_insn "_packusdw" - [(set (match_operand:VI2_AVX2 0 "register_operand" "=Yr,*x,v") + [(set (match_operand:VI2_AVX2 0 "register_operand" "=Yr,*x,x,v") (vec_concat:VI2_AVX2 (us_truncate: - (match_operand: 1 "register_operand" "0,0,v")) + (match_operand: 1 "register_operand" "0,0,x,v")) (us_truncate: - (match_operand: 2 "vector_operand" "YrBm,*xBm,vm"))))] + (match_operand: 2 "vector_operand" "YrBm,*xBm,xm,vm"))))] "TARGET_SSE4_1 && && " "@ packusdw\t{%2, %0|%0, %2} packusdw\t{%2, %0|%0, %2} + vpackusdw\t{%2, %1, %0|%0, %1, %2} vpackusdw\t{%2, %1, %0|%0, %1, %2}" - [(set_attr "isa" "noavx,noavx,avx") + [(set_attr "isa" "noavx,noavx,avx,avx512bw") (set_attr "type" "sselog") (set_attr "prefix_extra" "1") - (set_attr "prefix" "orig,orig,maybe_evex") + (set_attr "prefix" "orig,orig,,evex") (set_attr "mode" "")]) (define_insn "_pblendvb" --- gcc/testsuite/gcc.target/i386/avx512vl-pack-1.c.jj 2016-05-09 12:16:52.062562903 +0200 +++ gcc/testsuite/gcc.target/i386/avx512vl-pack-1.c 2016-05-09 12:21:42.786628535 +0200 @@ -0,0 +1,68 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx512vl -mno-avx512bw" } */ + +#include + +__m128i +f1 (__m128i a, __m128i b) +{ + return _mm_packs_epi16 (a, b); +} + +/* { dg-final { scan-assembler-times "vpacksswb\[^\n\r\]*xmm\[0-9\]" 1 } } */ + +__m128i +f2 (__m128i a, __m128i b) +{ + return _mm_packs_epi32 (a, b); +} + +/* { dg-final { scan-assembler-times "vpackssdw\[^\n\r\]*xmm\[0-9\]" 1 } } */ + +__m128i +f3 (__m128i a, __m128i b) +{ + return _mm_packus_epi16 (a, b); +} + +/* { dg-final { scan-assembler-times "vpackuswb\[^\n\r\]*xmm\[0-9\]" 1 } } */ + +__m128i +f4 (__m128i a, __m128i b) +{ + return _mm_packus_epi32 (a, b); +} + +/* { dg-final { scan-assembler-times "vpackusdw\[^\n\r\]*xmm\[0-9\]" 1 } } */ + +__m256i +f5 (__m256i a, __m256i b) +{ + return _mm256_packs_epi16 (a, b); +} + +/* { dg-final { scan-assembler-times "vpacksswb\[^\n\r\]*ymm\[0-9\]" 1 } } */ + +__m256i +f6 (__m256i a, __m256i b) +{ + return _mm256_packs_epi32 (a, b); +} + +/* { dg-final { scan-assembler-times "vpackssdw\[^\n\r\]*ymm\[0-9\]" 1 } } */ + +__m256i +f7 (__m256i a, __m256i b) +{ + return _mm256_packus_epi16 (a, b); +} + +/* { dg-final { scan-assembler-times "vpackuswb\[^\n\r\]*ymm\[0-9\]" 1 } } */ + +__m256i +f8 (__m256i a, __m256i b) +{ + return _mm256_packus_epi32 (a, b); +} + +/* { dg-final { scan-assembler-times "vpackusdw\[^\n\r\]*ymm\[0-9\]" 1 } } */ --- gcc/testsuite/gcc.target/i386/avx512vl-pack-2.c.jj 2016-05-09 12:16:54.961523671 +0200 +++ gcc/testsuite/gcc.target/i386/avx512vl-pack-2.c 2016-05-09 12:24:13.532588490 +0200 @@ -0,0 +1,108 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -mavx512vl -mno-avx512bw" } */ + +#include + +__m128i +f1 (__m128i a, __m128i b) +{ + register __m128i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm_packs_epi16 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpacksswb\[^\n\r\]*xmm\[0-9\]" 1 } } */ +/* { dg-final { scan-assembler-not "vpacksswb\[^\n\r\]*xmm16" } } */ + +__m128i +f2 (__m128i a, __m128i b) +{ + register __m128i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm_packs_epi32 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackssdw\[^\n\r\]*xmm\[0-9\]" 1 } } */ +/* { dg-final { scan-assembler-not "vpackssdw\[^\n\r\]*xmm16" } } */ + +__m128i +f3 (__m128i a, __m128i b) +{ + register __m128i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm_packus_epi16 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackuswb\[^\n\r\]*xmm\[0-9\]" 1 } } */ +/* { dg-final { scan-assembler-not "vpackuswb\[^\n\r\]*xmm16" } } */ + +__m128i +f4 (__m128i a, __m128i b) +{ + register __m128i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm_packus_epi32 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackusdw\[^\n\r\]*xmm\[0-9\]" 1 } } */ +/* { dg-final { scan-assembler-not "vpackusdw\[^\n\r\]*xmm16" } } */ + +__m256i +f5 (__m256i a, __m256i b) +{ + register __m256i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm256_packs_epi16 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpacksswb\[^\n\r\]*ymm\[0-9\]" 1 } } */ +/* { dg-final { scan-assembler-not "vpacksswb\[^\n\r\]*ymm16" } } */ + +__m256i +f6 (__m256i a, __m256i b) +{ + register __m256i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm256_packs_epi32 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackssdw\[^\n\r\]*ymm\[0-9\]" 1 } } */ +/* { dg-final { scan-assembler-not "vpackssdw\[^\n\r\]*ymm16" } } */ + +__m256i +f7 (__m256i a, __m256i b) +{ + register __m256i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm256_packus_epi16 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackuswb\[^\n\r\]*ymm\[0-9\]" 1 } } */ +/* { dg-final { scan-assembler-not "vpackuswb\[^\n\r\]*ymm16" } } */ + +__m256i +f8 (__m256i a, __m256i b) +{ + register __m256i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm256_packus_epi32 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackusdw\[^\n\r\]*ymm\[0-9\]" 1 } } */ +/* { dg-final { scan-assembler-not "vpackusdw\[^\n\r\]*ymm16" } } */ --- gcc/testsuite/gcc.target/i386/avx512bw-pack-2.c.jj 2016-05-09 12:28:02.869486414 +0200 +++ gcc/testsuite/gcc.target/i386/avx512bw-pack-2.c 2016-05-09 12:29:06.941620616 +0200 @@ -0,0 +1,100 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -mavx512vl -mavx512bw" } */ + +#include + +__m128i +f1 (__m128i a, __m128i b) +{ + register __m128i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm_packs_epi16 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpacksswb\[^\n\r\]*xmm16" 1 } } */ + +__m128i +f2 (__m128i a, __m128i b) +{ + register __m128i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm_packs_epi32 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackssdw\[^\n\r\]*xmm16" 1 } } */ + +__m128i +f3 (__m128i a, __m128i b) +{ + register __m128i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm_packus_epi16 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackuswb\[^\n\r\]*xmm16" 1 } } */ + +__m128i +f4 (__m128i a, __m128i b) +{ + register __m128i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm_packus_epi32 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackusdw\[^\n\r\]*xmm16" 1 } } */ + +__m256i +f5 (__m256i a, __m256i b) +{ + register __m256i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm256_packs_epi16 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpacksswb\[^\n\r\]*ymm16" 1 } } */ + +__m256i +f6 (__m256i a, __m256i b) +{ + register __m256i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm256_packs_epi32 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackssdw\[^\n\r\]*ymm16" 1 } } */ + +__m256i +f7 (__m256i a, __m256i b) +{ + register __m256i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm256_packus_epi16 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackuswb\[^\n\r\]*ymm16" 1 } } */ + +__m256i +f8 (__m256i a, __m256i b) +{ + register __m256i c __asm ("xmm16") = a; + asm volatile ("" : "+v" (c)); + c = _mm256_packus_epi32 (c, b); + asm volatile ("" : "+v" (c)); + return c; +} + +/* { dg-final { scan-assembler-times "vpackusdw\[^\n\r\]*ymm16" 1 } } */