From patchwork Sat Oct 8 15:43:22 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 118549 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 2F1B6B70C2 for ; Sun, 9 Oct 2011 02:43:51 +1100 (EST) Received: (qmail 26521 invoked by alias); 8 Oct 2011 15:43:48 -0000 Received: (qmail 26513 invoked by uid 22791); 8 Oct 2011 15:43:46 -0000 X-SWARE-Spam-Status: No, hits=-6.8 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, SPF_HELO_PASS, TW_DQ, TW_VD X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sat, 08 Oct 2011 15:43:25 +0000 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p98FhORp006418 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 8 Oct 2011 11:43:24 -0400 Received: from tyan-ft48-01.lab.bos.redhat.com (tyan-ft48-01.lab.bos.redhat.com [10.16.42.4]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id p98FhNIF023246 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 8 Oct 2011 11:43:24 -0400 Received: from tyan-ft48-01.lab.bos.redhat.com (localhost.localdomain [127.0.0.1]) by tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4) with ESMTP id p98FhNJM028743; Sat, 8 Oct 2011 17:43:23 +0200 Received: (from jakub@localhost) by tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4/Submit) id p98FhM90028739; Sat, 8 Oct 2011 17:43:22 +0200 Date: Sat, 8 Oct 2011 17:43:22 +0200 From: Jakub Jelinek To: Richard Henderson , Uros Bizjak Cc: gcc-patches@gcc.gnu.org Subject: [RFC] Slightly fix up vgather* patterns Message-ID: <20111008154322.GQ19412@tyan-ft48-01.lab.bos.redhat.com> Reply-To: Jakub Jelinek MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi! The AVX2 docs say that the insns will #UD if any of the mask, src and index registers are the same, but e.g. on #include __m256 m; float f[1024]; __m256 foo (void) { __m256i mi = (__m256i) m; return _mm256_mask_i32gather_ps (m, f, mi, m, 4); } which is IMHO valid and should for m being zero vector just return a zero vector and clear mask (in this case it was already cleared) we compile it as vmovdqa m(%rip), %ymm1 vmovaps %ymm1, %ymm0 vgatherdps %ymm1, (%rax, %ymm1, 4), %ymm0 and thus IMHO it will #UD. Also, the insns should make it clear that the mask register is modified too (the patch clobbers it, perhaps we could instead say that it zeros the register (which is true if it doesn't segfault), but then what if a segfault handler chooses to continue with the next insn and doesn't clear the mask register?). Still, the insn description is imprecise, saying that it loads from mem at the address register is wrong and perhaps some DCE might delete what shouldn't be deleted. So, either it should (use (mem (scratch))) or something similar, or in the unspec list all the memory locations that are being read (mem: (plus:SI (reg:SI) (vec_select:SI (match_operand:V4SI) (parallel [(const_int N)])))) for N 0 through something (but it is complicated by Pmode size vs. the need to do nothing/truncate/sign_extend the vec_select to the right mode). What do you think? 2011-10-08 Jakub Jelinek * config/i386/sse.md (avx2_gathersi, avx2_gatherdi, avx2_gatherdi256): Add clobber of operand 4. (*avx2_gathersi, *avx2_gatherdi, *avx2_gatherdi256): Add clobber of the mask register, add earlyclobber to both output operands. Jakub --- gcc/config/i386/sse.md.jj 2011-10-07 10:03:27.000000000 +0200 +++ gcc/config/i386/sse.md 2011-10-08 17:14:50.000000000 +0200 @@ -12521,55 +12521,59 @@ (define_mode_attr VEC_GATHER_MODE (V8SI "V8SI") (V8SF "V8SI")]) (define_expand "avx2_gathersi" - [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "") - (unspec:VEC_GATHER_MODE - [(match_operand:VEC_GATHER_MODE 1 "register_operand" "") - (match_operand: 2 "memory_operand" "") - (match_operand: 3 "register_operand" "") - (match_operand:VEC_GATHER_MODE 4 "register_operand" "") - (match_operand:SI 5 "const1248_operand " "")] - UNSPEC_GATHER))] + [(parallel [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "") + (unspec:VEC_GATHER_MODE + [(match_operand:VEC_GATHER_MODE 1 "register_operand" "") + (match_operand: 2 "memory_operand" "") + (match_operand: 3 "register_operand" "") + (match_operand:VEC_GATHER_MODE 4 "register_operand" "") + (match_operand:SI 5 "const1248_operand " "")] + UNSPEC_GATHER)) + (clobber (match_dup 4))])] "TARGET_AVX2") (define_insn "*avx2_gathersi" - [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "=x") + [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "=&x") (unspec:VEC_GATHER_MODE - [(match_operand:VEC_GATHER_MODE 1 "register_operand" "0") + [(match_operand:VEC_GATHER_MODE 2 "register_operand" "0") (mem: - (match_operand:P 2 "register_operand" "r")) - (match_operand: 3 "register_operand" "x") - (match_operand:VEC_GATHER_MODE 4 "register_operand" "x") - (match_operand:SI 5 "const1248_operand" "n")] - UNSPEC_GATHER))] + (match_operand:P 3 "register_operand" "r")) + (match_operand: 4 "register_operand" "x") + (match_operand:VEC_GATHER_MODE 5 "register_operand" "1") + (match_operand:SI 6 "const1248_operand" "n")] + UNSPEC_GATHER)) + (clobber (match_operand:VEC_GATHER_MODE 1 "register_operand" "=&x"))] "TARGET_AVX2" - "vgatherd\t{%4, (%2, %3, %c5), %0|%0, (%2, %3, %c5), %4}" + "vgatherd\t{%1, (%3, %4, %c6), %0|%0, (%3, %4, %c6), %1}" [(set_attr "type" "ssemov") (set_attr "prefix" "vex") (set_attr "mode" "")]) (define_expand "avx2_gatherdi" - [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "") - (unspec:VEC_GATHER_MODE - [(match_operand:VEC_GATHER_MODE 1 "register_operand" "") - (match_operand: 2 "memory_operand" "") - (match_operand: 3 "register_operand" "") - (match_operand:VEC_GATHER_MODE 4 "register_operand" "") - (match_operand:SI 5 "const1248_operand " "")] - UNSPEC_GATHER))] + [(parallel [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "") + (unspec:VEC_GATHER_MODE + [(match_operand:VEC_GATHER_MODE 1 "register_operand" "") + (match_operand: 2 "memory_operand" "") + (match_operand: 3 "register_operand" "") + (match_operand:VEC_GATHER_MODE 4 "register_operand" "") + (match_operand:SI 5 "const1248_operand " "")] + UNSPEC_GATHER)) + (clobber (match_dup 4))])] "TARGET_AVX2") (define_insn "*avx2_gatherdi" - [(set (match_operand:AVXMODE48P_DI 0 "register_operand" "=x") + [(set (match_operand:AVXMODE48P_DI 0 "register_operand" "=&x") (unspec:AVXMODE48P_DI - [(match_operand:AVXMODE48P_DI 1 "register_operand" "0") + [(match_operand:AVXMODE48P_DI 2 "register_operand" "0") (mem: - (match_operand:P 2 "register_operand" "r")) - (match_operand: 3 "register_operand" "x") - (match_operand:AVXMODE48P_DI 4 "register_operand" "x") - (match_operand:SI 5 "const1248_operand" "n")] - UNSPEC_GATHER))] + (match_operand:P 3 "register_operand" "r")) + (match_operand: 4 "register_operand" "x") + (match_operand:AVXMODE48P_DI 5 "register_operand" "1") + (match_operand:SI 6 "const1248_operand" "n")] + UNSPEC_GATHER)) + (clobber (match_operand:AVXMODE48P_DI 1 "register_operand" "=&x"))] "TARGET_AVX2" - "vgatherq\t{%4, (%2, %3, %c5), %0|%0, (%2, %3, %c5), %4}" + "vgatherq\t{%1, (%3, %4, %c6), %0|%0, (%3, %4, %c6), %1}" [(set_attr "type" "ssemov") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -12577,28 +12581,30 @@ (define_insn "*avx2_gatherdi" ;; Special handling for VEX.256 with float arguments ;; since there're still xmms as operands (define_expand "avx2_gatherdi256" - [(set (match_operand:VI4F_128 0 "register_operand" "") - (unspec:VI4F_128 - [(match_operand:VI4F_128 1 "register_operand" "") - (match_operand: 2 "memory_operand" "") - (match_operand:V4DI 3 "register_operand" "") - (match_operand:VI4F_128 4 "register_operand" "") - (match_operand:SI 5 "const1248_operand " "")] - UNSPEC_GATHER))] + [(parallel [(set (match_operand:VI4F_128 0 "register_operand" "") + (unspec:VI4F_128 + [(match_operand:VI4F_128 1 "register_operand" "") + (match_operand: 2 "memory_operand" "") + (match_operand:V4DI 3 "register_operand" "") + (match_operand:VI4F_128 4 "register_operand" "") + (match_operand:SI 5 "const1248_operand " "")] + UNSPEC_GATHER)) + (clobber (match_dup 4))])] "TARGET_AVX2") (define_insn "*avx2_gatherdi256" [(set (match_operand:VI4F_128 0 "register_operand" "=x") (unspec:VI4F_128 - [(match_operand:VI4F_128 1 "register_operand" "0") + [(match_operand:VI4F_128 2 "register_operand" "0") (mem: - (match_operand:P 2 "register_operand" "r")) - (match_operand:V4DI 3 "register_operand" "x") - (match_operand:VI4F_128 4 "register_operand" "x") - (match_operand:SI 5 "const1248_operand" "n")] - UNSPEC_GATHER))] + (match_operand:P 3 "register_operand" "r")) + (match_operand:V4DI 4 "register_operand" "x") + (match_operand:VI4F_128 5 "register_operand" "1") + (match_operand:SI 6 "const1248_operand" "n")] + UNSPEC_GATHER)) + (clobber (match_operand:VI4F_128 1 "register_operand" "=&x"))] "TARGET_AVX2" - "vgatherq\t{%4, (%2, %3, %c5), %0|%0, (%2, %3, %c5), %4}" + "vgatherq\t{%1, (%3, %4, %c6), %0|%0, (%3, %4, %c6), %1}" [(set_attr "type" "ssemov") (set_attr "prefix" "vex") (set_attr "mode" "")])