From patchwork Mon Feb 22 15:30:12 2016
X-Patchwork-Submitter: Cesar Philippidis
X-Patchwork-Id: 586317
From: Cesar Philippidis
Subject: [openacc] vector state propagation
To: "gcc-patches@gcc.gnu.org", Nathan Sidwell
Message-ID: <56CB2984.8010906@codesourcery.com>
Date: Mon, 22 Feb 2016 07:30:12 -0800

This patch teaches the nvptx vector state propagator how to handle
QImode and HImode variables. It zero-extends the 8- and 16-bit values
to 32 bits so that the shuffle broadcast can be used to propagate the
register, then truncates the result back to the original mode.

I'm not sure if my solution is the best way to resolve this problem. It
looks like the nvptx backend frequently assigns a larger .u16 or .u32
register to chars and shorts, which masks this problem at -O0. Because
a lot of the registers are already u32, the conversion to and from u8
and u16 seems like an unnecessary step when the nvptx backend should be
able to broadcast the original u32 register directly.

Is there a better way to resolve this issue, or is this patch OK for
trunk as-is?

Cesar

2016-02-22  Cesar Philippidis

	gcc/
	* config/nvptx/nvptx.c (nvptx_gen_shuffle): Add support for QImode
	and HImode registers.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/vprop.c: New test.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 3faacd5..728cb00 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -1306,6 +1306,20 @@ nvptx_gen_shuffle (rtx dst, rtx src, rtx idx, nvptx_shuffle_kind kind)
         end_sequence ();
       }
       break;
+    case QImode:
+    case HImode:
+      {
+        rtx tmp = gen_reg_rtx (SImode);
+
+        start_sequence ();
+        emit_insn (gen_rtx_SET (tmp, gen_rtx_fmt_e (ZERO_EXTEND, SImode, src)));
+        emit_insn (nvptx_gen_shuffle (tmp, tmp, idx, kind));
+        emit_insn (gen_rtx_SET (dst, gen_rtx_fmt_e (TRUNCATE, GET_MODE (dst),
+                                                    tmp)));
+        res = get_insns ();
+        end_sequence ();
+      }
+      break;
     default:
       gcc_unreachable ();
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vprop.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vprop.c
new file mode 100644
index 0000000..a9b63dc
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vprop.c
@@ -0,0 +1,34 @@
+#include <assert.h>
+
+#define test(type)                              \
+void                                            \
+test_##type ()                                  \
+{                                               \
+  type b[100];                                  \
+  type i, j, x = -1, y = -1;                    \
+                                                \
+  _Pragma("acc parallel loop copyout (b)")      \
+  for (j = 0; j > -5; j--)                      \
+    {                                           \
+      type c = x+y;                             \
+      _Pragma("acc loop vector")                \
+      for (i = 0; i < 20; i++)                  \
+        b[-j*20 + i] = c;                       \
+      b[5-j] = c;                               \
+    }                                           \
+                                                \
+  for (i = 0; i < 100; i++)                     \
+    assert (b[i] == -2);                        \
+}
+
+test(char)
+test(short)
+
+int
+main ()
+{
+  test_char ();
+  test_short ();
+
+  return 0;
+}
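
For readers unfamiliar with the RTL, the widen, shuffle, truncate sequence in the QImode/HImode case can be pictured with a small host-side model. This is only a sketch, not nvptx code: WARP_SIZE, broadcast_u32 and broadcast_i8 are hypothetical names invented for the illustration, and the "shuffle" is simulated with a plain array of per-lane values. It shows why zero-extending and then truncating is enough: the low bits survive the round trip, so a signed value such as -2 comes back unchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WARP_SIZE 32 /* assumed vector width, for illustration only */

/* Hypothetical stand-in for a 32-bit shuffle broadcast: every lane
   receives the value currently held by lane SRC_LANE.  */
static void
broadcast_u32 (uint32_t lanes[WARP_SIZE], unsigned src_lane)
{
  assert (src_lane < WARP_SIZE);
  uint32_t v = lanes[src_lane];
  for (unsigned i = 0; i < WARP_SIZE; i++)
    lanes[i] = v;
}

/* Broadcast an 8-bit value using only the 32-bit primitive.  */
static void
broadcast_i8 (int8_t lanes[WARP_SIZE], unsigned src_lane)
{
  uint32_t tmp[WARP_SIZE];

  /* Widen: reinterpret each byte as unsigned, then zero-extend.  */
  for (unsigned i = 0; i < WARP_SIZE; i++)
    {
      uint8_t bits;
      memcpy (&bits, &lanes[i], 1);
      tmp[i] = bits;
    }

  broadcast_u32 (tmp, src_lane);

  /* Narrow: keep the low 8 bits and copy them back bit-for-bit.  */
  for (unsigned i = 0; i < WARP_SIZE; i++)
    {
      uint8_t bits = (uint8_t) (tmp[i] & 0xffu);
      memcpy (&lanes[i], &bits, 1);
    }
}
```

A 16-bit variant would follow the same pattern with uint16_t and a 0xffff mask. The model says nothing about whether the extra conversions are avoidable in the backend, which is the open question raised above.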