From patchwork Thu Oct 5 22:14:14 2017
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 822124
Date: Thu, 5 Oct 2017 18:14:14 -0400
From: Michael Meissner
To: GCC Patches, Segher Boessenkool, David Edelsohn, Bill Schmidt
Subject: [PATCH], Add PowerPC ISA 3.0 Atomic Memory Operation functions
Message-Id: <20171005221413.GA27169@ibm-tiger.the-meissners.org>

This patch adds support for most of the PowerPC ISA 3.0 Atomic Memory Operation
instructions, listed in section 4.5 of the ISA 3.0 manual.

Currently these functions are implemented via extended asm.  At some point in
the future, I will probably move the inner part of the patch to use new
built-in functions to replace the extended asm.

Some of the atomic memory operations are not currently provided, due to the
complexity of the operation (some require two adjacent memory locations to be
accessed, some require 3 adjacent GPR registers).

I have checked this patch on a little endian power8 and there were no
regressions in the bootstrap or make check.  I verified that the compile-only
test (amo1.c) was run during the test run.  I have verified that the runtime
test (amo2.c) does work on a power9 prototype system.  I am doing a full build
and make check on the power9 system.

Can I check these files into the trunk assuming the power9 bootstrap/make check
passes without regressions?

[gcc]
2017-10-05  Michael Meissner

        * config/rs6000/amo.h: New include file to provide ISA 3.0 atomic
        memory operation instruction support.
        * config.gcc (powerpc*-*-*): Include amo.h as an extra header.
        (rs6000-ibm-aix[789]*): Likewise.
        * doc/extend.texi (PowerPC Atomic Memory Operation Functions):
        Document new functions.

[gcc/testsuite]
2017-10-05  Michael Meissner

        * gcc.target/powerpc/amo1.c: New test.
        * gcc.target/powerpc/amo2.c: Likewise.

Index: gcc/config/rs6000/amo.h
===================================================================
--- gcc/config/rs6000/amo.h (revision 0)
+++ gcc/config/rs6000/amo.h (revision 0)
@@ -0,0 +1,158 @@
+/* Power ISA 3.0 atomic memory operation include file.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   Contributed by Michael Meissner.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _AMO_H
+#define _AMO_H
+
+#if !defined(_ARCH_PWR9) || !defined(_ARCH_PPC64)
+#error "The atomic memory operations require Power 64-bit ISA 3.0"
+
+#else
+#include <stdint.h>
+
+/* Enumeration of the LWAT/LDAT sub-opcodes.  */
+enum _AMO_LD {
+  _AMO_LD_ADD = 0x00,          /* Fetch and Add.  */
+  _AMO_LD_XOR = 0x01,          /* Fetch and Xor.  */
+  _AMO_LD_IOR = 0x02,          /* Fetch and Ior.  */
+  _AMO_LD_AND = 0x03,          /* Fetch and And.  */
+  _AMO_LD_UMAX = 0x04,         /* Fetch and Unsigned Maximum.  */
+  _AMO_LD_SMAX = 0x05,         /* Fetch and Signed Maximum.  */
+  _AMO_LD_UMIN = 0x06,         /* Fetch and Unsigned Minimum.  */
+  _AMO_LD_SMIN = 0x07,         /* Fetch and Signed Minimum.  */
+  _AMO_LD_SWAP = 0x08,         /* Swap.  */
+  _AMO_LD_CS_NE = 0x10,        /* Compare and Swap Not Equal.  */
+  _AMO_LD_INC_BOUNDED = 0x18,  /* Fetch and Increment Bounded.  */
+  _AMO_LD_INC_EQUAL = 0x19,    /* Fetch and Increment Equal.  */
+  _AMO_LD_DEC_BOUNDED = 0x1A   /* Fetch and Decrement Bounded.  */
+};
+
+/* Implementation of the simple LWAT/LDAT operations that take one register and
+   modify one word or double-word of memory and return the value that was
+   previously in the memory location.
+
+   The LWAT/LDAT opcode requires the address to be in a single register that
+   points to a suitably aligned memory location.
+
+   In order to indicate to the compiler that the memory location may be
+   modified, the memory location is passed both as a memory reference and the
+   address to put in a register.  */
+
+#define _AMO_LD_SIMPLE(NAME, TYPE, OPCODE, FC) \
+static __inline__ TYPE \
+NAME (TYPE *_PTR, TYPE _VALUE) \
+{ \
+  unsigned __int128 _TMP; \
+  TYPE _RET; \
+  __asm__ volatile ("mr %L1,%3\n" \
+                    "\t" OPCODE " %1,%4,%5\t\t# %0\n" \
+                    "\tmr %2,%1\n" \
+                    : "+Q" (_PTR[0]), "=&r" (_TMP), "=r" (_RET) \
+                    : "r" (_VALUE), "b" (&_PTR[0]), "n" (FC)); \
+  return _RET; \
+}
+
+_AMO_LD_SIMPLE (amo_lwat_add, uint32_t, "lwat", _AMO_LD_ADD)
+_AMO_LD_SIMPLE (amo_lwat_xor, uint32_t, "lwat", _AMO_LD_XOR)
+_AMO_LD_SIMPLE (amo_lwat_ior, uint32_t, "lwat", _AMO_LD_IOR)
+_AMO_LD_SIMPLE (amo_lwat_and, uint32_t, "lwat", _AMO_LD_AND)
+_AMO_LD_SIMPLE (amo_lwat_umax, uint32_t, "lwat", _AMO_LD_UMAX)
+_AMO_LD_SIMPLE (amo_lwat_umin, uint32_t, "lwat", _AMO_LD_UMIN)
+_AMO_LD_SIMPLE (amo_lwat_swap, uint32_t, "lwat", _AMO_LD_SWAP)
+
+_AMO_LD_SIMPLE (amo_lwat_sadd, int32_t, "lwat", _AMO_LD_ADD)
+_AMO_LD_SIMPLE (amo_lwat_smax, int32_t, "lwat", _AMO_LD_SMAX)
+_AMO_LD_SIMPLE (amo_lwat_smin, int32_t, "lwat", _AMO_LD_SMIN)
+_AMO_LD_SIMPLE (amo_lwat_sswap, int32_t, "lwat", _AMO_LD_SWAP)
+
+_AMO_LD_SIMPLE (amo_ldat_add, uint64_t, "ldat", _AMO_LD_ADD)
+_AMO_LD_SIMPLE (amo_ldat_xor, uint64_t, "ldat", _AMO_LD_XOR)
+_AMO_LD_SIMPLE (amo_ldat_ior, uint64_t, "ldat", _AMO_LD_IOR)
+_AMO_LD_SIMPLE (amo_ldat_and, uint64_t, "ldat", _AMO_LD_AND)
+_AMO_LD_SIMPLE (amo_ldat_umax, uint64_t, "ldat", _AMO_LD_UMAX)
+_AMO_LD_SIMPLE (amo_ldat_umin, uint64_t, "ldat", _AMO_LD_UMIN)
+_AMO_LD_SIMPLE (amo_ldat_swap, uint64_t, "ldat", _AMO_LD_SWAP)
+
+_AMO_LD_SIMPLE (amo_ldat_sadd, int64_t, "ldat", _AMO_LD_ADD)
+_AMO_LD_SIMPLE (amo_ldat_smax, int64_t, "ldat", _AMO_LD_SMAX)
+_AMO_LD_SIMPLE (amo_ldat_smin, int64_t, "ldat", _AMO_LD_SMIN)
+_AMO_LD_SIMPLE (amo_ldat_sswap, int64_t, "ldat", _AMO_LD_SWAP)
+
+/* Enumeration of the STWAT/STDAT sub-opcodes.  */
+enum _AMO_ST {
+  _AMO_ST_ADD = 0x00,   /* Store Add.  */
+  _AMO_ST_XOR = 0x01,   /* Store Xor.  */
+  _AMO_ST_IOR = 0x02,   /* Store Ior.  */
+  _AMO_ST_AND = 0x03,   /* Store And.  */
+  _AMO_ST_UMAX = 0x04,  /* Store Unsigned Maximum.  */
+  _AMO_ST_SMAX = 0x05,  /* Store Signed Maximum.  */
+  _AMO_ST_UMIN = 0x06,  /* Store Unsigned Minimum.  */
+  _AMO_ST_SMIN = 0x07,  /* Store Signed Minimum.  */
+  _AMO_ST_TWIN = 0x18   /* Store Twin.  */
+};
+
+/* Implementation of the simple STWAT/STDAT operations that take one register
+   and modify one word or double-word of memory.  No value is returned.
+
+   The STWAT/STDAT opcode requires the address to be in a single register
+   that points to a suitably aligned memory location.
+
+   In order to indicate to the compiler that the memory location may be
+   modified, the memory location is passed both as a memory reference and the
+   address to put in a register.  */
+
+#define _AMO_ST_SIMPLE(NAME, TYPE, OPCODE, FC) \
+static __inline__ void \
+NAME (TYPE *_PTR, TYPE _VALUE) \
+{ \
+  __asm__ volatile (OPCODE " %1,%2,%3\t\t# %0" \
+                    : "+Q" (_PTR[0]) \
+                    : "r" (_VALUE), "b" (&_PTR[0]), "n" (FC)); \
+  return; \
+}
+
+_AMO_ST_SIMPLE (amo_stwat_add, uint32_t, "stwat", _AMO_ST_ADD)
+_AMO_ST_SIMPLE (amo_stwat_xor, uint32_t, "stwat", _AMO_ST_XOR)
+_AMO_ST_SIMPLE (amo_stwat_ior, uint32_t, "stwat", _AMO_ST_IOR)
+_AMO_ST_SIMPLE (amo_stwat_and, uint32_t, "stwat", _AMO_ST_AND)
+_AMO_ST_SIMPLE (amo_stwat_umax, uint32_t, "stwat", _AMO_ST_UMAX)
+_AMO_ST_SIMPLE (amo_stwat_umin, uint32_t, "stwat", _AMO_ST_UMIN)
+
+_AMO_ST_SIMPLE (amo_stwat_sadd, int32_t, "stwat", _AMO_ST_ADD)
+_AMO_ST_SIMPLE (amo_stwat_smax, int32_t, "stwat", _AMO_ST_SMAX)
+_AMO_ST_SIMPLE (amo_stwat_smin, int32_t, "stwat", _AMO_ST_SMIN)
+
+_AMO_ST_SIMPLE (amo_stdat_add, uint64_t, "stdat", _AMO_ST_ADD)
+_AMO_ST_SIMPLE (amo_stdat_xor, uint64_t, "stdat", _AMO_ST_XOR)
+_AMO_ST_SIMPLE (amo_stdat_ior, uint64_t, "stdat", _AMO_ST_IOR)
+_AMO_ST_SIMPLE (amo_stdat_and, uint64_t, "stdat", _AMO_ST_AND)
+_AMO_ST_SIMPLE (amo_stdat_umax, uint64_t, "stdat", _AMO_ST_UMAX)
+_AMO_ST_SIMPLE (amo_stdat_umin, uint64_t, "stdat", _AMO_ST_UMIN)
+
+_AMO_ST_SIMPLE (amo_stdat_sadd, int64_t, "stdat", _AMO_ST_ADD)
+_AMO_ST_SIMPLE (amo_stdat_smax, int64_t, "stdat", _AMO_ST_SMAX)
+_AMO_ST_SIMPLE (amo_stdat_smin, int64_t, "stdat", _AMO_ST_SMIN)
+#endif /* _ARCH_PWR9 && _ARCH_PPC64.  */
+#endif /* _AMO_H.  */
Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc (revision 253429)
+++ gcc/config.gcc (working copy)
@@ -461,6 +461,7 @@ powerpc*-*-*)
         extra_headers="${extra_headers} mmintrin.h x86intrin.h"
         extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h si2vmx.h"
         extra_headers="${extra_headers} paired.h"
+        extra_headers="${extra_headers} amo.h"
         case x$with_cpu in
             xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
                 cpu_is_64bit=yes
@@ -2627,7 +2628,7 @@ rs6000-ibm-aix[789].* | powerpc-ibm-aix[
         use_collect2=yes
         thread_file='aix'
         use_gcc_stdint=wrap
-        extra_headers=altivec.h
+        extra_headers="altivec.h amo.h"
         default_use_cxa_atexit=yes
         ;;
 rl78-*-elf*)
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi (revision 253429)
+++ gcc/doc/extend.texi (working copy)
@@ -12041,6 +12041,7 @@ instructions, but allow the compiler to
 * PowerPC Built-in Functions::
 * PowerPC AltiVec/VSX Built-in Functions::
 * PowerPC Hardware Transactional Memory Built-in Functions::
+* PowerPC Atomic Memory Operation Functions::
 * RX Built-in Functions::
 * S/390 System z Built-in Functions::
 * SH Built-in Functions::
@@ -19126,6 +19127,67 @@ while (1)
   @}
 @end smallexample
 
+@node PowerPC Atomic Memory Operation Functions
+@subsection PowerPC Atomic Memory Operation Functions
+ISA 3.0 of the PowerPC added new atomic memory operation (amo)
+instructions.  GCC provides support for these instructions in 64-bit
+environments.  All of the functions are declared in the include file
+@code{amo.h}.
+
+The functions supported are:
+
+@smallexample
+#include <amo.h>
+
+uint32_t amo_lwat_add (uint32_t *, uint32_t);
+uint32_t amo_lwat_xor (uint32_t *, uint32_t);
+uint32_t amo_lwat_ior (uint32_t *, uint32_t);
+uint32_t amo_lwat_and (uint32_t *, uint32_t);
+uint32_t amo_lwat_umax (uint32_t *, uint32_t);
+uint32_t amo_lwat_umin (uint32_t *, uint32_t);
+uint32_t amo_lwat_swap (uint32_t *, uint32_t);
+
+int32_t amo_lwat_sadd (int32_t *, int32_t);
+int32_t amo_lwat_smax (int32_t *, int32_t);
+int32_t amo_lwat_smin (int32_t *, int32_t);
+int32_t amo_lwat_sswap (int32_t *, int32_t);
+
+uint64_t amo_ldat_add (uint64_t *, uint64_t);
+uint64_t amo_ldat_xor (uint64_t *, uint64_t);
+uint64_t amo_ldat_ior (uint64_t *, uint64_t);
+uint64_t amo_ldat_and (uint64_t *, uint64_t);
+uint64_t amo_ldat_umax (uint64_t *, uint64_t);
+uint64_t amo_ldat_umin (uint64_t *, uint64_t);
+uint64_t amo_ldat_swap (uint64_t *, uint64_t);
+
+int64_t amo_ldat_sadd (int64_t *, int64_t);
+int64_t amo_ldat_smax (int64_t *, int64_t);
+int64_t amo_ldat_smin (int64_t *, int64_t);
+int64_t amo_ldat_sswap (int64_t *, int64_t);
+
+void amo_stwat_add (uint32_t *, uint32_t);
+void amo_stwat_xor (uint32_t *, uint32_t);
+void amo_stwat_ior (uint32_t *, uint32_t);
+void amo_stwat_and (uint32_t *, uint32_t);
+void amo_stwat_umax (uint32_t *, uint32_t);
+void amo_stwat_umin (uint32_t *, uint32_t);
+
+void amo_stwat_sadd (int32_t *, int32_t);
+void amo_stwat_smax (int32_t *, int32_t);
+void amo_stwat_smin (int32_t *, int32_t);
+
+void amo_stdat_add (uint64_t *, uint64_t);
+void amo_stdat_xor (uint64_t *, uint64_t);
+void amo_stdat_ior (uint64_t *, uint64_t);
+void amo_stdat_and (uint64_t *, uint64_t);
+void amo_stdat_umax (uint64_t *, uint64_t);
+void amo_stdat_umin (uint64_t *, uint64_t);
+
+void amo_stdat_sadd (int64_t *, int64_t);
+void amo_stdat_smax (int64_t *, int64_t);
+void amo_stdat_smin (int64_t *, int64_t);
+@end smallexample
+
 @node RX Built-in Functions
 @subsection RX Built-in Functions
 GCC supports some of the RX instructions which cannot be expressed in
Index: gcc/testsuite/gcc.target/powerpc/amo1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/amo1.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/amo1.c (revision 0)
@@ -0,0 +1,253 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -mpower9-misc -O2" } */
+
+/* Verify P9 atomic memory operations.  */
+
+#include <amo.h>
+#include <stdint.h>
+
+uint32_t
+do_lw_add (uint32_t *mem, uint32_t value)
+{
+  return amo_lwat_add (mem, value);
+}
+
+int32_t
+do_lw_sadd (int32_t *mem, int32_t value)
+{
+  return amo_lwat_sadd (mem, value);
+}
+
+uint32_t
+do_lw_xor (uint32_t *mem, uint32_t value)
+{
+  return amo_lwat_xor (mem, value);
+}
+
+uint32_t
+do_lw_ior (uint32_t *mem, uint32_t value)
+{
+  return amo_lwat_ior (mem, value);
+}
+
+uint32_t
+do_lw_and (uint32_t *mem, uint32_t value)
+{
+  return amo_lwat_and (mem, value);
+}
+
+uint32_t
+do_lw_umax (uint32_t *mem, uint32_t value)
+{
+  return amo_lwat_umax (mem, value);
+}
+
+int32_t
+do_lw_smax (int32_t *mem, int32_t value)
+{
+  return amo_lwat_smax (mem, value);
+}
+
+uint32_t
+do_lw_umin (uint32_t *mem, uint32_t value)
+{
+  return amo_lwat_umin (mem, value);
+}
+
+int32_t
+do_lw_smin (int32_t *mem, int32_t value)
+{
+  return amo_lwat_smin (mem, value);
+}
+
+uint32_t
+do_lw_swap (uint32_t *mem, uint32_t value)
+{
+  return amo_lwat_swap (mem, value);
+}
+
+int32_t
+do_lw_sswap (int32_t *mem, int32_t value)
+{
+  return amo_lwat_sswap (mem, value);
+}
+
+uint64_t
+do_ld_add (uint64_t *mem, uint64_t value)
+{
+  return amo_ldat_add (mem, value);
+}
+
+int64_t
+do_ld_sadd (int64_t *mem, int64_t value)
+{
+  return amo_ldat_sadd (mem, value);
+}
+
+uint64_t
+do_ld_xor (uint64_t *mem, uint64_t value)
+{
+  return amo_ldat_xor (mem, value);
+}
+
+uint64_t
+do_ld_ior (uint64_t *mem, uint64_t value)
+{
+  return amo_ldat_ior (mem, value);
+}
+
+uint64_t
+do_ld_and (uint64_t *mem, uint64_t value)
+{
+  return amo_ldat_and (mem, value);
+}
+
+uint64_t
+do_ld_umax (uint64_t *mem, uint64_t value)
+{
+  return amo_ldat_umax (mem, value);
+}
+
+int64_t
+do_ld_smax (int64_t *mem, int64_t value)
+{
+  return amo_ldat_smax (mem, value);
+}
+
+uint64_t
+do_ld_umin (uint64_t *mem, uint64_t value)
+{
+  return amo_ldat_umin (mem, value);
+}
+
+int64_t
+do_ld_smin (int64_t *mem, int64_t value)
+{
+  return amo_ldat_smin (mem, value);
+}
+
+uint64_t
+do_ld_swap (uint64_t *mem, uint64_t value)
+{
+  return amo_ldat_swap (mem, value);
+}
+
+int64_t
+do_ld_sswap (int64_t *mem, int64_t value)
+{
+  return amo_ldat_sswap (mem, value);
+}
+
+void
+do_sw_add (uint32_t *mem, uint32_t value)
+{
+  amo_stwat_add (mem, value);
+}
+
+void
+do_sw_sadd (int32_t *mem, int32_t value)
+{
+  amo_stwat_sadd (mem, value);
+}
+
+void
+do_sw_xor (uint32_t *mem, uint32_t value)
+{
+  amo_stwat_xor (mem, value);
+}
+
+void
+do_sw_ior (uint32_t *mem, uint32_t value)
+{
+  amo_stwat_ior (mem, value);
+}
+
+void
+do_sw_and (uint32_t *mem, uint32_t value)
+{
+  amo_stwat_and (mem, value);
+}
+
+void
+do_sw_umax (uint32_t *mem, uint32_t value)
+{
+  amo_stwat_umax (mem, value);
+}
+
+void
+do_sw_smax (int32_t *mem, int32_t value)
+{
+  amo_stwat_smax (mem, value);
+}
+
+void
+do_sw_umin (uint32_t *mem, uint32_t value)
+{
+  amo_stwat_umin (mem, value);
+}
+
+void
+do_sw_smin (int32_t *mem, int32_t value)
+{
+  amo_stwat_smin (mem, value);
+}
+
+void
+do_sd_add (uint64_t *mem, uint64_t value)
+{
+  amo_stdat_add (mem, value);
+}
+
+void
+do_sd_sadd (int64_t *mem, int64_t value)
+{
+  amo_stdat_sadd (mem, value);
+}
+
+void
+do_sd_xor (uint64_t *mem, uint64_t value)
+{
+  amo_stdat_xor (mem, value);
+}
+
+void
+do_sd_ior (uint64_t *mem, uint64_t value)
+{
+  amo_stdat_ior (mem, value);
+}
+
+void
+do_sd_and (uint64_t *mem, uint64_t value)
+{
+  amo_stdat_and (mem, value);
+}
+
+void
+do_sd_umax (uint64_t *mem, uint64_t value)
+{
+  amo_stdat_umax (mem, value);
+}
+
+void
+do_sd_smax (int64_t *mem, int64_t value)
+{
+  amo_stdat_smax (mem, value);
+}
+
+void
+do_sd_umin (uint64_t *mem, uint64_t value)
+{
+  amo_stdat_umin (mem, value);
+}
+
+void
+do_sd_smin (int64_t *mem, int64_t value)
+{
+  amo_stdat_smin (mem, value);
+}
+
+/* { dg-final { scan-assembler-times {\mldat\M} 11 } } */
+/* { dg-final { scan-assembler-times {\mlwat\M} 11 } } */
+/* { dg-final { scan-assembler-times {\mstdat\M} 9 } } */
+/* { dg-final { scan-assembler-times {\mstwat\M} 9 } } */
Index: gcc/testsuite/gcc.target/powerpc/amo2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/amo2.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/amo2.c (revision 0)
@@ -0,0 +1,121 @@
+/* { dg-do run { target { powerpc*-*-linux* && { lp64 && p9vector_hw } } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mpower9-vector -mpower9-misc" } */
+
+#include <amo.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+/* Test whether the ISA 3.0 amo (atomic memory operations) functions perform as
+   expected.  */
+
+/* 32-bit tests.  */
+static uint32_t u32_ld[4] = {
+  9, /* add */
+  7, /* xor */
+  6, /* ior */
+  7, /* and */
+};
+
+static uint32_t u32_st[4] = {
+  9, /* add */
+  7, /* xor */
+  6, /* ior */
+  7, /* and */
+};
+
+static uint32_t u32_result[4];
+
+static uint32_t u32_update[4] = {
+  9 + 1, /* add */
+  7 ^ 1, /* xor */
+  6 | 1, /* ior */
+  7 & 1, /* and */
+};
+
+static uint32_t u32_prev[4] = {
+  9, /* add */
+  7, /* xor */
+  6, /* ior */
+  7, /* and */
+};
+
+/* 64-bit tests.  */
+static uint64_t u64_ld[4] = {
+  9, /* add */
+  7, /* xor */
+  6, /* ior */
+  7, /* and */
+};
+
+static uint64_t u64_st[4] = {
+  9, /* add */
+  7, /* xor */
+  6, /* ior */
+  7, /* and */
+};
+
+static uint64_t u64_result[4];
+
+static uint64_t u64_update[4] = {
+  9 + 1, /* add */
+  7 ^ 1, /* xor */
+  6 | 1, /* ior */
+  7 & 1, /* and */
+};
+
+static uint64_t u64_prev[4] = {
+  9, /* add */
+  7, /* xor */
+  6, /* ior */
+  7, /* and */
+};
+
+int
+main (void)
+{
+  size_t i;
+
+  u32_result[0] = amo_lwat_add (&u32_ld[0], 1);
+  u32_result[1] = amo_lwat_xor (&u32_ld[1], 1);
+  u32_result[2] = amo_lwat_ior (&u32_ld[2], 1);
+  u32_result[3] = amo_lwat_and (&u32_ld[3], 1);
+
+  u64_result[0] = amo_ldat_add (&u64_ld[0], 1);
+  u64_result[1] = amo_ldat_xor (&u64_ld[1], 1);
+  u64_result[2] = amo_ldat_ior (&u64_ld[2], 1);
+  u64_result[3] = amo_ldat_and (&u64_ld[3], 1);
+
+  amo_stwat_add (&u32_st[0], 1);
+  amo_stwat_xor (&u32_st[1], 1);
+  amo_stwat_ior (&u32_st[2], 1);
+  amo_stwat_and (&u32_st[3], 1);
+
+  amo_stdat_add (&u64_st[0], 1);
+  amo_stdat_xor (&u64_st[1], 1);
+  amo_stdat_ior (&u64_st[2], 1);
+  amo_stdat_and (&u64_st[3], 1);
+
+  for (i = 0; i < 4; i++)
+    {
+      if (u32_result[i] != u32_prev[i])
+        abort ();
+
+      if (u32_ld[i] != u32_update[i])
+        abort ();
+
+      if (u32_st[i] != u32_update[i])
+        abort ();
+
+      if (u64_result[i] != u64_prev[i])
+        abort ();
+
+      if (u64_ld[i] != u64_update[i])
+        abort ();
+
+      if (u64_st[i] != u64_update[i])
+        abort ();
+    }
+
+  return 0;
+}
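
For readers without power9 hardware, the contract that amo2.c checks can be modelled portably: a fetch operation returns the previous memory value and updates memory, while a store operation only updates memory. The following is a minimal sketch and is not part of the patch. The model_* names are hypothetical, and outside of a 64-bit ISA 3.0 build it substitutes GCC's generic __atomic built-ins for the lwat/stwat instructions, assuming relaxed ordering is enough for a single-threaded illustration.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_ARCH_PWR9) && defined(_ARCH_PPC64)
#include <amo.h>   /* Header added by this patch.  */
#endif

/* Hypothetical helper: fetch-and-add that returns the value previously
   in memory, the same contract as amo_lwat_add.  */
static inline uint32_t
model_fetch_add_u32 (uint32_t *ptr, uint32_t value)
{
#if defined(_ARCH_PWR9) && defined(_ARCH_PPC64)
  return amo_lwat_add (ptr, value);
#else
  /* Assumption: a generic atomic built-in stands in for lwat here.  */
  return __atomic_fetch_add (ptr, value, __ATOMIC_RELAXED);
#endif
}

/* Hypothetical helper: store-add that updates memory and returns
   nothing, the same contract as amo_stwat_add.  */
static inline void
model_store_add_u32 (uint32_t *ptr, uint32_t value)
{
#if defined(_ARCH_PWR9) && defined(_ARCH_PPC64)
  amo_stwat_add (ptr, value);
#else
  /* Assumption: the fetched value is simply discarded in the model.  */
  (void) __atomic_fetch_add (ptr, value, __ATOMIC_RELAXED);
#endif
}

/* Hypothetical helper: repeat the "add" row of the amo2.c tables.
   Returns 1 when every check passes; a failed check aborts via assert.  */
static inline int
model_check (void)
{
  uint32_t ld = 9;
  uint32_t st = 9;
  uint32_t prev = model_fetch_add_u32 (&ld, 1);

  model_store_add_u32 (&st, 1);

  assert (prev == 9);   /* Fetch form returns the old value.  */
  assert (ld == 9 + 1); /* Fetch form also updates memory.  */
  assert (st == 9 + 1); /* Store form updates memory only.  */
  return 1;
}
```

Calling model_check () from a small main () is enough to exercise the sketch. On a power9 system built with the patch applied, the same helpers would route through amo.h instead of the generic built-ins.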