From patchwork Wed Jul 15 07:13:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 1329274
From: "Roger Sayle" <roger@nextmovesoftware.com>
To: "'GCC Patches'" <gcc-patches@gcc.gnu.org>
Subject: [PATCH] nvptx: Provide vec_set and vec_extract patterns.
Date: Wed, 15 Jul 2020 08:13:23 +0100
Message-ID: <000d01d65a77$6e813d60$4b83b820$@nextmovesoftware.com>

This patch provides standard vec_extract and vec_set patterns to the
nvptx backend, to extract an element from a PTX vector and set an
element of a PTX vector respectively.  PTX vectors (I hesitate to call
them SIMD vectors) may contain up to four elements, so vector modes up
to size four are supported by this patch even though the nvptx backend
currently only allows V2SI and V2DI, i.e. two out of the ten possible
vector modes.

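For readers less familiar with GNU C vector types, here is a minimal
sketch (an illustration only, not part of the patch) of the source-level
operations involved: reading and writing a single element of a
two-element vector with a constant index.  The type name v2si and the
helpers get_elem1 and set_elem0 are hypothetical names chosen for this
example, and whether a particular access is actually expanded through
vec_extract or vec_set depends on the compiler and the target.

```c
/* Two-element integer vector, using the GNU C vector extension
   (accepted by GCC and Clang).  */
typedef int v2si __attribute__((__vector_size__(8)));

/* Read element 1 of the vector (an element extract).  */
int get_elem1(v2si v)
{
    return v[1];
}

/* Return a copy of V with element 0 replaced by X (an element set).  */
v2si set_elem0(v2si v, int x)
{
    v[0] = x;
    return v;
}
```
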
As an example of the improvement, the following C function:

typedef int __v2si __attribute__((__vector_size__(8)));
int foo (__v2si arg) { return arg[0]+arg[1]; }

previously generated this code using a shift:

        mov.u64 %r25, %ar0;
        ld.v2.u32 %r26, [%r25];
        mov.b64 %r28, %r26;
        shr.s64 %r30, %r28, 32;
        cvt.u32.u32 %r31, %r26.x;
        cvt.u32.u64 %r32, %r30;
        add.u32 %value, %r31, %r32;

but with this patch now generates:

        mov.u64 %r25, %ar0;
        ld.v2.u32 %r26, [%r25];
        mov.u32 %r28, %r26.x;
        mov.u32 %r29, %r26.y;
        add.u32 %value, %r28, %r29;

I've implemented these getters and setters as their own instructions
instead of attempting the much more intrusive patch of changing the
backend's definition of register_operand.  Given the limited utility
of PTX vectors, I'm not convinced that attempting to support them as
operands in every instruction would be worth the effort involved.

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
with "make" and "make check" with no new regressions.
Ok for mainline?


2020-07-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog:
	* config/nvptx/nvptx.md (nvptx_vector_index_operand): New predicate.
	(VECELEM): New mode attribute for a vector's uppercase element mode.
	(Vecelem): New mode attribute for a vector's lowercase element mode.
	(*vec_set<mode>_0, *vec_set<mode>_1, *vec_set<mode>_2)
	(*vec_set<mode>_3): New instructions.
	(vec_set<mode>): New expander to generate one of the above insns.
	(vec_extract<mode><Vecelem>): New instruction.

Thanks in advance,
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 6545b81..b363277 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -118,6 +118,10 @@
 (define_predicate "nvptx_float_comparison_operator"
   (match_code "eq,ne,le,ge,lt,gt,uneq,unle,unge,unlt,ungt,unordered,ordered"))
 
+(define_predicate "nvptx_vector_index_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) < 4")))
+
 ;; Test for a valid operand for a call instruction.
 (define_predicate "call_insn_operand"
   (match_code "symbol_ref,reg")
@@ -194,6 +198,10 @@
 ;; pointer-sized quantities.  Exactly one of the two alternatives will match.
 (define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
 
+;; Define element mode for each vector mode.
+(define_mode_attr VECELEM [(V2SI "SI") (V2DI "DI")])
+(define_mode_attr Vecelem [(V2SI "si") (V2DI "di")])
+
 ;; We should get away with not defining memory alternatives, since we don't
 ;; get variables in this mode and pseudos are never spilled.
 (define_insn "movbi"
@@ -1051,6 +1059,79 @@
   ""
   "%.\\tcvt.s%T0%t1\\t%0, %1;")
 
+;; Vector operations
+
+(define_insn "*vec_set<mode>_0"
+  [(set (match_operand:VECIM 0 "nvptx_register_operand" "=R")
+	(vec_merge:VECIM
+	  (vec_duplicate:VECIM
+	    (match_operand:<VECELEM> 1 "nvptx_register_operand" "R"))
+	  (match_dup 0)
+	  (const_int 1)))]
+  ""
+  "%.\\tmov%t1\\t%0.x, %1;")
+
+(define_insn "*vec_set<mode>_1"
+  [(set (match_operand:VECIM 0 "nvptx_register_operand" "=R")
+	(vec_merge:VECIM
+	  (vec_duplicate:VECIM
+	    (match_operand:<VECELEM> 1 "nvptx_register_operand" "R"))
+	  (match_dup 0)
+	  (const_int 2)))]
+  ""
+  "%.\\tmov%t1\\t%0.y, %1;")
+
+(define_insn "*vec_set<mode>_2"
+  [(set (match_operand:VECIM 0 "nvptx_register_operand" "=R")
+	(vec_merge:VECIM
+	  (vec_duplicate:VECIM
+	    (match_operand:<VECELEM> 1 "nvptx_register_operand" "R"))
+	  (match_dup 0)
+	  (const_int 4)))]
+  ""
+  "%.\\tmov%t1\\t%0.z, %1;")
+
+(define_insn "*vec_set<mode>_3"
+  [(set (match_operand:VECIM 0 "nvptx_register_operand" "=R")
+	(vec_merge:VECIM
+	  (vec_duplicate:VECIM
+	    (match_operand:<VECELEM> 1 "nvptx_register_operand" "R"))
+	  (match_dup 0)
+	  (const_int 8)))]
+  ""
+  "%.\\tmov%t1\\t%0.w, %1;")
+
+(define_expand "vec_set<mode>"
+  [(match_operand:VECIM 0 "nvptx_register_operand")
+   (match_operand:<VECELEM> 1 "nvptx_register_operand")
+   (match_operand:SI 2 "nvptx_vector_index_operand")]
+  ""
+{
+  enum machine_mode mode = GET_MODE (operands[0]);
+  int mask = 1 << INTVAL (operands[2]);
+  rtx tmp = gen_rtx_VEC_DUPLICATE (mode, operands[1]);
+  tmp = gen_rtx_VEC_MERGE (mode, tmp, operands[0], GEN_INT (mask));
+  emit_insn (gen_rtx_SET (operands[0], tmp));
+  DONE;
+})
+
+(define_insn "vec_extract<mode><Vecelem>"
+  [(set (match_operand:<VECELEM> 0 "nvptx_register_operand" "=R")
+	(vec_select:<VECELEM>
+	  (match_operand:VECIM 1 "nvptx_register_operand" "R")
+	  (parallel [(match_operand:SI 2 "nvptx_vector_index_operand" "")])))]
+  ""
+{
+  static const char *const asms[4] =
+    {
+      "%.\\tmov%t0\\t%0, %1.x;",
+      "%.\\tmov%t0\\t%0, %1.y;",
+      "%.\\tmov%t0\\t%0, %1.z;",
+      "%.\\tmov%t0\\t%0, %1.w;"
+    };
+  return asms[INTVAL (operands[2])];
+})
+
 ;; Miscellaneous
 
 (define_insn "nop"