From patchwork Wed Nov 15 13:49:34 2017
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 838180
To: Jakub Jelinek
From: Tom de Vries
Subject: Add libgomp.oacc-c-c++-common/f-asyncwait-{1,2,3}.c
CC: GCC Patches, Thomas Schwinge
Message-ID: <7e2bc007-f6e7-7647-5076-0b4030447c82@mentor.com>
Date: Wed, 15 Nov 2017 14:49:34 +0100

Hi,

I noticed that there is only one asyncwait testcase for C on trunk.

I've rewritten asyncwait-{1,2,3}.f90 into C (and changed the float math
into int math to keep things as simple as possible).

Tested on top of trunk for host.

Tested on top of trunk, gcc-7-branch, openacc-gcc-7-branch, gomp-4-branch
for nvptx.

On trunk for nvptx, I'm seeing execution failures at -O2.
I've verified that I see the same failures with all the async and wait
clauses removed. Also, it's not the only failure at -O2 for trunk, so
that's probably some orthogonal issue.

Committed as obvious.

Thanks,
- Tom

Add libgomp.oacc-c-c++-common/f-asyncwait-{1,2,3}.c

2017-11-15  Tom de Vries

	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: New test, copied
	from asyncwait-1.f90.  Rewrite into C.  Rewrite from float to int.
	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c: New test, copied
	from asyncwait-2.f90.  Rewrite into C.  Rewrite from float to int.
	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c: New test, copied
	from asyncwait-3.f90.  Rewrite into C.  Rewrite from float to int.

---
 .../libgomp.oacc-c-c++-common/f-asyncwait-1.c      | 297 +++++++++++++++++++++
 .../libgomp.oacc-c-c++-common/f-asyncwait-2.c      |  61 +++++
 .../libgomp.oacc-c-c++-common/f-asyncwait-3.c      |  63 +++++
 3 files changed, 421 insertions(+)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c
new file mode 100644
index 0000000..cf85170
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c
@@ -0,0 +1,297 @@
+/* { dg-do run } */
+
+/* Based on asyncwait-1.f90.  */
+
+#include <stdlib.h>
+
+#define N 64
+
+int
+main (void)
+{
+  int *a, *b, *c, *d, *e;
+
+  a = (int*)malloc (N * sizeof (*a));
+  b = (int*)malloc (N * sizeof (*b));
+  c = (int*)malloc (N * sizeof (*c));
+  d = (int*)malloc (N * sizeof (*d));
+  e = (int*)malloc (N * sizeof (*e));
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = 3;
+      b[i] = 0;
+    }
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+
+#pragma acc parallel async
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      b[i] = a[i];
+
+#pragma acc wait
+  }
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (a[i] != 3)
+        abort ();
+      if (b[i] != 3)
+        abort ();
+    }
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = 2;
+      b[i] = 0;
+    }
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+#pragma acc parallel async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      b[i] = a[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (a[i] != 2) abort ();
+      if (b[i] != 2) abort ();
+    }
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = 3;
+      b[i] = 0;
+      c[i] = 0;
+      d[i] = 0;
+    }
+
+#pragma acc data copy (a[0:N]) copy (b[0:N]) copy (c[0:N]) copy (d[0:N])
+  {
+
+#pragma acc parallel async (1)
+    for (int i = 0; i < N; ++i)
+      b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc parallel async (1)
+    for (int i = 0; i < N; ++i)
+      c[i] = (a[i] * 4) / a[i];
+
+
+#pragma acc parallel async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (a[i] != 3)
+        abort ();
+      if (b[i] != 9)
+        abort ();
+      if (c[i] != 4)
+        abort ();
+      if (d[i] != 1)
+        abort ();
+    }
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = 2;
+      b[i] = 0;
+      c[i] = 0;
+      d[i] = 0;
+      e[i] = 0;
+    }
+
+#pragma acc data copy (a[0:N], b[0:N], c[0:N], d[0:N], e[0:N])
+  {
+
+#pragma acc parallel async (1)
+    for (int i = 0; i < N; ++i)
+      b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc parallel async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      c[i] = (a[i] * 4) / a[i];
+
+#pragma acc parallel async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+
+#pragma acc parallel wait (1) async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      e[i] = a[i] + b[i] + c[i] + d[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (a[i] != 2)
+        abort ();
+      if (b[i] != 4)
+        abort ();
+      if (c[i] != 4)
+        abort ();
+      if (d[i] != 1)
+        abort ();
+      if (e[i] != 11)
+        abort ();
+    }
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = 3;
+      b[i] = 0;
+    }
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+
+#pragma acc kernels async
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      b[i] = a[i];
+
+#pragma acc wait
+  }
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (a[i] != 3)
+        abort ();
+      if (b[i] != 3)
+        abort ();
+    }
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = 2;
+      b[i] = 0;
+    }
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+#pragma acc kernels async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      b[i] = a[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (a[i] != 2)
+        abort ();
+      if (b[i] != 2)
+        abort ();
+    }
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = 3;
+      b[i] = 0;
+      c[i] = 0;
+      d[i] = 0;
+    }
+
+#pragma acc data copy (a[0:N]) copy (b[0:N]) copy (c[0:N]) copy (d[0:N])
+  {
+#pragma acc kernels async (1)
+    for (int i = 0; i < N; ++i)
+      b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc kernels async (1)
+    for (int i = 0; i < N; ++i)
+      c[i] = (a[i] * 4) / a[i];
+
+#pragma acc kernels async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (a[i] != 3)
+        abort ();
+      if (b[i] != 9)
+        abort ();
+      if (c[i] != 4)
+        abort ();
+      if (d[i] != 1)
+        abort ();
+    }
+
+  for (int i = 0; i < N; ++i)
+    {
+      a[i] = 2;
+      b[i] = 0;
+      c[i] = 0;
+      d[i] = 0;
+      e[i] = 0;
+    }
+
+#pragma acc data copy (a[0:N], b[0:N], c[0:N], d[0:N], e[0:N])
+  {
+#pragma acc kernels async (1)
+    for (int i = 0; i < N; ++i)
+      b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc kernels async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      c[i] = (a[i] * 4) / a[i];
+
+#pragma acc kernels async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+#pragma acc kernels wait (1) async (1)
+#pragma acc loop
+    for (int i = 0; i < N; ++i)
+      e[i] = a[i] + b[i] + c[i] + d[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+    {
+      if (a[i] != 2)
+        abort ();
+      if (b[i] != 4)
+        abort ();
+      if (c[i] != 4)
+        abort ();
+      if (d[i] != 1)
+        abort ();
+      if (e[i] != 11)
+        abort ();
+    }
+
+  free (a);
+  free (b);
+  free (c);
+  free (d);
+  free (e);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c
new file mode 100644
index 0000000..5298e4c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c
@@ -0,0 +1,61 @@
+/* { dg-do run } */
+
+/* Based on asyncwait-2.f90.  */
+
+#include <stdlib.h>
+
+#define N 64
+
+int *a, *b, *c;
+
+int
+main (void)
+{
+  a = (int *)malloc (N * sizeof (*a));
+  b = (int *)malloc (N * sizeof (*b));
+  c = (int *)malloc (N * sizeof (*c));
+
+#pragma acc parallel copy (a[0:N]) async (0)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    a[i] = 1;
+
+#pragma acc parallel copy (b[0:N]) async (1)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    b[i] = 1;
+
+#pragma acc parallel copy (a[0:N], b[0:N], c[0:N]) wait (0, 1)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    c[i] = a[i] + b[i];
+
+  for (int i = 0; i < N; ++i)
+    if (c[i] != 2)
+      abort ();
+
+#if 1
+#pragma acc kernels copy (a[0:N]) async (0)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    a[i] = 1;
+
+#pragma acc kernels copy (b[0:N]) async (1)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    b[i] = 1;
+
+#pragma acc kernels copy (a[0:N], b[0:N], c[0:N]) wait (0, 1)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    c[i] = a[i] + b[i];
+
+  for (int i = 0; i < N; ++i)
+    if (c[i] != 2)
+      abort ();
+#endif
+
+  free (a);
+  free (b);
+  free (c);
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c
new file mode 100644
index 0000000..319eea6
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+
+/* Based on asyncwait-3.f90.  */
+
+#include <stdlib.h>
+
+#define N 64
+
+int
+main (void)
+{
+  int *a, *b, *c;
+
+  a = (int *)malloc (N * sizeof (*a));
+  b = (int *)malloc (N * sizeof (*b));
+  c = (int *)malloc (N * sizeof (*c));
+
+#pragma acc parallel copy (a[0:N]) async (0)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    a[i] = 1;
+
+#pragma acc parallel copy (b[0:N]) async (1)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    b[i] = 1;
+
+#pragma acc wait (0, 1)
+
+#pragma acc parallel copy (a[0:N], b[0:N], c[0:N])
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    c[i] = a[i] + b[i];
+
+  for (int i = 0; i < N; ++i)
+    if (c[i] != 2)
+      abort ();
+
+#pragma acc kernels copy (a[0:N]) async (0)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    a[i] = 1;
+
+#pragma acc kernels copy (b[0:N]) async (1)
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    b[i] = 1;
+
+#pragma acc wait (0, 1)
+
+#pragma acc kernels copy (a[0:N], b[0:N], c[0:N])
+#pragma acc loop
+  for (int i = 0; i < N; ++i)
+    c[i] = a[i] + b[i];
+
+  for (int i = 0; i < N; ++i)
+    if (c[i] != 2)
+      abort ();
+
+  free (a);
+  free (b);
+  free (c);
+}
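
Note on the expected values: the constants checked in f-asyncwait-1.c (9, 4, 1 for a = 3, and 4, 4, 1, 11 for a = 2) follow directly from the integer expressions in the compute regions, so they can be sanity-checked on the host without any OpenACC support. The following is only a sketch of that arithmetic; the helper names (expect_b, expect_c, expect_d, expect_e, check_expected) are hypothetical and not part of the patch.

```c
#include <assert.h>

/* Hypothetical helpers, not part of the patch: the integer expressions
   evaluated by the parallel/kernels regions in f-asyncwait-1.c.  */
static int expect_b (int a) { return (a * a * a) / a; }
static int expect_c (int a) { return (a * 4) / a; }
static int expect_d (int a) { return ((a * a + a) / a) - a; }

/* e is the sum of a and the three values computed on async queue 1.  */
static int
expect_e (int a)
{
  return a + expect_b (a) + expect_c (a) + expect_d (a);
}

/* Check the constants the test compares against.  */
static void
check_expected (void)
{
  /* a = 3: b = 27 / 3, c = 12 / 3, d = (9 + 3) / 3 - 3.  */
  assert (expect_b (3) == 9);
  assert (expect_c (3) == 4);
  assert (expect_d (3) == 1);

  /* a = 2: b = 8 / 2, c = 8 / 2, d = (4 + 2) / 2 - 2, e = 2 + 4 + 4 + 1.  */
  assert (expect_b (2) == 4);
  assert (expect_c (2) == 4);
  assert (expect_d (2) == 1);
  assert (expect_e (2) == 11);
}
```

Calling check_expected () from a small main is enough to run it. Because the math is int rather than float, the comparisons are exact and need no tolerance.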