From patchwork Wed May 18 21:01:39 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 623729
Date: Wed, 18 May 2016 23:01:39 +0200
From: Jakub Jelinek
To: Uros Bizjak, Kirill Yukhin
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Improve XMM16+ handling in vec_set*
Message-ID: <20160518210139.GW28550@tucnak.redhat.com>
Reply-To: Jakub Jelinek

Hi!

vinserti32x4 is in AVX512VL.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-18  Jakub Jelinek

	* config/i386/sse.md (vec_set_lo_v16hi, vec_set_hi_v16hi,
	vec_set_lo_v32qi, vec_set_hi_v32qi): Add alternative with
	v constraint instead of x and vinserti32x4 insn.

	* gcc.target/i386/avx512vl-vinserti32x4-3.c: New test.

	Jakub

--- gcc/config/i386/sse.md.jj	2016-05-18 13:21:35.000000000 +0200
+++ gcc/config/i386/sse.md	2016-05-18 15:02:54.574685438 +0200
@@ -17899,47 +17899,50 @@ (define_insn "vec_set_hi_")])
 
 (define_insn "vec_set_lo_v16hi"
-  [(set (match_operand:V16HI 0 "register_operand" "=x")
+  [(set (match_operand:V16HI 0 "register_operand" "=x,v")
         (vec_concat:V16HI
-          (match_operand:V8HI 2 "nonimmediate_operand" "xm")
+          (match_operand:V8HI 2 "nonimmediate_operand" "xm,vm")
           (vec_select:V8HI
-            (match_operand:V16HI 1 "register_operand" "x")
+            (match_operand:V16HI 1 "register_operand" "x,v")
             (parallel [(const_int 8) (const_int 9)
                        (const_int 10) (const_int 11)
                        (const_int 12) (const_int 13)
                        (const_int 14) (const_int 15)]))))]
   "TARGET_AVX"
-  "vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
+  "@vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}
+   vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "vex,evex")
    (set_attr "mode" "OI")])
 
 (define_insn "vec_set_hi_v16hi"
-  [(set (match_operand:V16HI 0 "register_operand" "=x")
+  [(set (match_operand:V16HI 0 "register_operand" "=x,v")
         (vec_concat:V16HI
           (vec_select:V8HI
-            (match_operand:V16HI 1 "register_operand" "x")
+            (match_operand:V16HI 1 "register_operand" "x,v")
             (parallel [(const_int 0) (const_int 1)
                        (const_int 2) (const_int 3)
                        (const_int 4) (const_int 5)
                        (const_int 6) (const_int 7)]))
-          (match_operand:V8HI 2 "nonimmediate_operand" "xm")))]
+          (match_operand:V8HI 2 "nonimmediate_operand" "xm,vm")))]
   "TARGET_AVX"
-  "vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
+  "@
+   vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}
+   vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "vex,evex")
    (set_attr "mode" "OI")])
 
 (define_insn "vec_set_lo_v32qi"
-  [(set (match_operand:V32QI 0 "register_operand" "=x")
+  [(set (match_operand:V32QI 0 "register_operand" "=x,v")
         (vec_concat:V32QI
-          (match_operand:V16QI 2 "nonimmediate_operand" "xm")
+          (match_operand:V16QI 2 "nonimmediate_operand" "xm,v")
           (vec_select:V16QI
-            (match_operand:V32QI 1 "register_operand" "x")
+            (match_operand:V32QI 1 "register_operand" "x,v")
             (parallel [(const_int 16) (const_int 17)
                        (const_int 18) (const_int 19)
                        (const_int 20) (const_int 21)
@@ -17949,18 +17952,20 @@ (define_insn "vec_set_lo_v32qi"
                        (const_int 28) (const_int 29)
                        (const_int 30) (const_int 31)]))))]
   "TARGET_AVX"
-  "vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
+  "@
+   vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}
+   vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "vex,evex")
    (set_attr "mode" "OI")])
 
 (define_insn "vec_set_hi_v32qi"
-  [(set (match_operand:V32QI 0 "register_operand" "=x")
+  [(set (match_operand:V32QI 0 "register_operand" "=x,v")
         (vec_concat:V32QI
           (vec_select:V16QI
-            (match_operand:V32QI 1 "register_operand" "x")
+            (match_operand:V32QI 1 "register_operand" "x,v")
             (parallel [(const_int 0) (const_int 1)
                        (const_int 2) (const_int 3)
                        (const_int 4) (const_int 5)
@@ -17969,13 +17974,15 @@ (define_insn "vec_set_hi_v32qi"
                        (const_int 10) (const_int 11)
                        (const_int 12) (const_int 13)
                        (const_int 14) (const_int 15)]))
-          (match_operand:V16QI 2 "nonimmediate_operand" "xm")))]
+          (match_operand:V16QI 2 "nonimmediate_operand" "xm,vm")))]
   "TARGET_AVX"
-  "vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
+  "@
+   vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}
+   vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
   [(set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "vex,evex")
    (set_attr "mode" "OI")])
 
 (define_insn "_maskload"
--- gcc/testsuite/gcc.target/i386/avx512vl-vinserti32x4-3.c.jj	2016-05-18 15:06:44.517541398 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-vinserti32x4-3.c	2016-05-18 15:31:00.918492975 +0200
@@ -0,0 +1,49 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -masm=att" } */
+
+typedef char V1 __attribute__((vector_size (32)));
+typedef short V2 __attribute__((vector_size (32)));
+
+void
+f1 (V1 x, char y)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[7] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f2 (V1 x, char y)
+{
+  register V1 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[28] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f3 (V2 x, short y)
+{
+  register V2 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[3] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f4 (V2 x, short y)
+{
+  register V2 a __asm ("xmm16");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  a[14] = y;
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler-times "vinserti32x4\[^\n\r]*0x0\[^\n\r]*%ymm16" 2 } } */
+/* { dg-final { scan-assembler-times "vinserti32x4\[^\n\r]*0x1\[^\n\r]*%ymm16" 2 } } */
+/* { dg-final { scan-assembler-times "vextracti32x4\[^\n\r]*0x1\[^\n\r]*%\[yz]mm16" 2 } } */
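
For readers less used to the RTL, here is a minimal scalar sketch of what the vec_set_lo/vec_set_hi patterns above compute, read directly off their vec_concat/vec_select operands. It is only an illustration under stated assumptions: the names model_vec_set_lo_v16hi, model_vec_set_hi_v16hi and lane_imm are hypothetical helpers made up for this note, not GCC functions, and the model ignores registers and instruction encoding entirely. lane_imm only shows which 128-bit lane an element index falls into, which appears to be why the test expects two matches with 0x0 (a[7], a[3]) and two with 0x1 (a[28], a[14]).

```c
#include <assert.h>

/* Hypothetical scalar model, not GCC code: a 256-bit V16HI value is
   16 shorts, made of two 128-bit lanes of 8 shorts each.  */
enum { LANE_ELTS = 8, VEC_ELTS = 16 };

/* Models vec_set_lo_v16hi (immediate 0x0): operand 2 becomes
   elements 0..7, elements 8..15 are kept from operand 1.  */
static void
model_vec_set_lo_v16hi (short dst[VEC_ELTS], const short src[VEC_ELTS],
                        const short lane[LANE_ELTS])
{
  for (int i = 0; i < VEC_ELTS; i++)
    dst[i] = i < LANE_ELTS ? lane[i] : src[i];
}

/* Models vec_set_hi_v16hi (immediate 0x1): elements 0..7 are kept
   from operand 1, operand 2 becomes elements 8..15.  */
static void
model_vec_set_hi_v16hi (short dst[VEC_ELTS], const short src[VEC_ELTS],
                        const short lane[LANE_ELTS])
{
  for (int i = 0; i < VEC_ELTS; i++)
    dst[i] = i < LANE_ELTS ? src[i] : lane[i - LANE_ELTS];
}

/* Which 128-bit lane (0 = low, 1 = high) holds element INDEX when a
   lane contains ELTS_PER_LANE elements (16 for V32QI, 8 for V16HI).  */
static int
lane_imm (int index, int elts_per_lane)
{
  assert (elts_per_lane > 0 && index >= 0);
  return index / elts_per_lane;
}
```

The V32QI patterns follow the same shape with 16 chars per lane instead of 8 shorts.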