From patchwork Tue Jun 8 01:11:33 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xionghu Luo
X-Patchwork-Id: 1489057
Subject: [PATCH v2] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]
To: Segher Boessenkool
References: <20210602081932.2683429-1-luoxhu@linux.ibm.com> <20210602222003.GJ18427@gate.crashing.org>
Message-ID: <7ed95783-2b61-487f-93c2-89124674accb@linux.ibm.com>
Date: Tue, 8 Jun 2021 09:11:33 +0800
In-Reply-To: <20210602222003.GJ18427@gate.crashing.org>
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Xionghu Luo via Gcc-patches
From: Xionghu Luo
Reply-To: Xionghu Luo
Cc: wschmidt@linux.ibm.com, gcc-patches@gcc.gnu.org, linkw@gcc.gnu.org, dje.gcc@gmail.com
Sender: "Gcc-patches"

Update the patch according to the comments.  Thanks.

On P8LE, extra rot64+rot64 load or store instructions are generated
in float128 to vector __int128 conversion.

This patch teaches the swaps pass to also handle such patterns so that
the extra swap instructions are removed.

(insn 7 6 8 2 (set (subreg:V1TI (reg:KF 123) 0)
        (rotate:V1TI (mem/u/c:V1TI (reg/f:DI 121) [0 S16 A128])
            (const_int 64 [0x40]))) {*vsx_le_permute_v1ti})
(insn 8 7 9 2 (set (subreg:V1TI (reg:KF 122) 0)
        (rotate:V1TI (subreg:V1TI (reg:KF 123) 0)
            (const_int 64 [0x40]))) {*vsx_le_permute_v1ti})
=>
(insn 22 6 23 2 (set (subreg:V1TI (reg:KF 123) 0)
        (mem/u/c:V1TI (and:DI (reg/f:DI 121)
                (const_int -16 [0xfffffffffffffff0])) [0 S16 A128])))
(insn 23 22 25 2 (set (subreg:V1TI (reg:KF 122) 0)
        (subreg:V1TI (reg:KF 123) 0)))

gcc/ChangeLog:

	* config/rs6000/rs6000-p8swap.c (pattern_is_rotate64_p): New.
	(insn_is_load_p): Use pattern_is_rotate64_p.
	(insn_is_swap_p): Likewise.
	(quad_aligned_load_p): Likewise.
	(const_load_sequence_p): Likewise.
	(replace_swapped_aligned_load): Likewise.
	(recombine_lvx_pattern): Likewise.
	(recombine_stvx_pattern): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/float128-call.c: Adjust.
	* gcc.target/powerpc/pr100085.c: New test.
---
 gcc/config/rs6000/rs6000-p8swap.c             | 37 +++++++++++++++----
 .../gcc.target/powerpc/float128-call.c        |  4 +-
 gcc/testsuite/gcc.target/powerpc/pr100085.c   | 24 ++++++++++++
 3 files changed, 56 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr100085.c

diff --git a/gcc/config/rs6000/rs6000-p8swap.c b/gcc/config/rs6000/rs6000-p8swap.c
index ec503ab742f..3b74e05e396 100644
--- a/gcc/config/rs6000/rs6000-p8swap.c
+++ b/gcc/config/rs6000/rs6000-p8swap.c
@@ -250,6 +250,20 @@ union_uses (swap_web_entry *insn_entry, rtx insn, df_ref def)
     }
 }
 
+/* Return true iff PAT is a rotate 64 bit expression; else return false.  */
+
+static bool
+pattern_is_rotate64_p (rtx pat)
+{
+  rtx rot = SET_SRC (pat);
+
+  if (GET_CODE (rot) == ROTATE && CONST_INT_P (XEXP (rot, 1))
+      && INTVAL (XEXP (rot, 1)) == 64)
+    return true;
+
+  return false;
+}
+
 /* Return 1 iff INSN is a load insn, including permuting loads that
    represent an lvxd2x instruction; else return 0.  */
 static unsigned int
@@ -266,6 +280,9 @@ insn_is_load_p (rtx insn)
           && MEM_P (XEXP (SET_SRC (body), 0)))
         return 1;
 
+      if (pattern_is_rotate64_p (body) && MEM_P (XEXP (SET_SRC (body), 0)))
+        return 1;
+
       return 0;
     }
 
@@ -305,6 +322,8 @@ insn_is_swap_p (rtx insn)
   if (GET_CODE (body) != SET)
     return 0;
   rtx rhs = SET_SRC (body);
+  if (pattern_is_rotate64_p (body))
+    return 1;
   if (GET_CODE (rhs) != VEC_SELECT)
     return 0;
   rtx parallel = XEXP (rhs, 1);
@@ -392,7 +411,8 @@ quad_aligned_load_p (swap_web_entry *insn_entry, rtx_insn *insn)
      false.  */
   rtx body = PATTERN (def_insn);
   if (GET_CODE (body) != SET
-      || GET_CODE (SET_SRC (body)) != VEC_SELECT
+      || !(GET_CODE (SET_SRC (body)) == VEC_SELECT
+           || pattern_is_rotate64_p (body))
       || !MEM_P (XEXP (SET_SRC (body), 0)))
     return false;
 
@@ -531,7 +551,8 @@ const_load_sequence_p (swap_web_entry *insn_entry, rtx insn)
          false.  */
       rtx body = PATTERN (def_insn);
       if (GET_CODE (body) != SET
-          || GET_CODE (SET_SRC (body)) != VEC_SELECT
+          || !(GET_CODE (SET_SRC (body)) == VEC_SELECT
+               || pattern_is_rotate64_p (body))
           || !MEM_P (XEXP (SET_SRC (body), 0)))
         return false;
 
@@ -1730,7 +1751,8 @@ replace_swapped_aligned_load (swap_web_entry *insn_entry, rtx swap_insn)
      swap (indicated by code VEC_SELECT).  */
   rtx body = PATTERN (def_insn);
   gcc_assert ((GET_CODE (body) == SET)
-              && (GET_CODE (SET_SRC (body)) == VEC_SELECT)
+              && (GET_CODE (SET_SRC (body)) == VEC_SELECT
+                  || pattern_is_rotate64_p (body))
               && MEM_P (XEXP (SET_SRC (body), 0)));
 
   rtx src_exp = XEXP (SET_SRC (body), 0);
@@ -2148,7 +2170,8 @@ recombine_lvx_pattern (rtx_insn *insn, del_info *to_delete)
 {
   rtx body = PATTERN (insn);
   gcc_assert (GET_CODE (body) == SET
-              && GET_CODE (SET_SRC (body)) == VEC_SELECT
+              && (GET_CODE (SET_SRC (body)) == VEC_SELECT
+                  || pattern_is_rotate64_p (body))
               && MEM_P (XEXP (SET_SRC (body), 0)));
 
   rtx mem = XEXP (SET_SRC (body), 0);
@@ -2223,9 +2246,9 @@ static void
 recombine_stvx_pattern (rtx_insn *insn, del_info *to_delete)
 {
   rtx body = PATTERN (insn);
-  gcc_assert (GET_CODE (body) == SET
-              && MEM_P (SET_DEST (body))
-              && GET_CODE (SET_SRC (body)) == VEC_SELECT);
+  gcc_assert (GET_CODE (body) == SET && MEM_P (SET_DEST (body))
+              && (GET_CODE (SET_SRC (body)) == VEC_SELECT
+                  || pattern_is_rotate64_p (body)));
   rtx mem = SET_DEST (body);
   rtx base_reg = XEXP (mem, 0);
 
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-call.c b/gcc/testsuite/gcc.target/powerpc/float128-call.c
index 5895416e985..a1f09df8a57 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-call.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-call.c
@@ -21,5 +21,5 @@ TYPE one (void) { return ONE; }
 
 void store (TYPE a, TYPE *p) { *p = a; }
 
-/* { dg-final { scan-assembler "lxvd2x 34" } } */
-/* { dg-final { scan-assembler "stxvd2x 34" } } */
+/* { dg-final { scan-assembler "lvx 2" } } */
+/* { dg-final { scan-assembler "stvx 2" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100085.c b/gcc/testsuite/gcc.target/powerpc/pr100085.c
new file mode 100644
index 00000000000..0a2e0feaf30
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100085.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_float128_sw_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+
+typedef __vector unsigned __int128 vui128_t;
+
+typedef union
+{
+  __float128 vf1;
+  vui128_t vx1;
+} __VF_128;
+
+vui128_t
+vec_xfer_bin128_2_vui128t (__float128 f128)
+{
+  __VF_128 vunion;
+  vunion.vf1 = f128;
+  return (vunion.vx1);
+}
+
+/* { dg-final { scan-assembler-not "xxpermdi" } } */
+/* { dg-final { scan-assembler-not "stxvd2x" } } */
+/* { dg-final { scan-assembler-not "lxvd2x" } } */
+
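
Illustration only, not part of the patch: a minimal standalone sketch of why the rot64+rot64 pair in the RTL dump above is redundant. It assumes a 128-bit value can be modelled as two 64-bit doublewords; the names v128, rot64, double_rot64_is_identity and check_rot64 are hypothetical and do not exist in GCC. Under that assumption, a rotate by 64 is simply a doubleword swap, and applying it twice gives back the original value, which is the property that lets the swaps pass treat a rotate-by-64 like the VEC_SELECT swaps it already handles.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register as two doublewords.
   This type exists only for the illustration; it is not a GCC type.  */
typedef struct
{
  uint64_t hi;
  uint64_t lo;
} v128;

/* Rotate V left by 64 bits.  On a 128-bit value this exchanges the two
   doublewords, so it behaves exactly like a doubleword swap.  */
static v128
rot64 (v128 v)
{
  v128 r = { v.lo, v.hi };
  return r;
}

/* Return 1 if applying rot64 twice to V yields V again, else 0.  */
static int
double_rot64_is_identity (v128 v)
{
  v128 r = rot64 (rot64 (v));
  return r.hi == v.hi && r.lo == v.lo;
}

/* Self-check on one sample value.  */
static void
check_rot64 (void)
{
  v128 v = { UINT64_C (0x0011223344556677), UINT64_C (0x8899aabbccddeeff) };
  v128 s = rot64 (v);

  /* A single rot64 swaps the halves.  */
  assert (s.hi == v.lo && s.lo == v.hi);
  /* Two in a row cancel out, like the swapping load plus swap pair.  */
  assert (double_rot64_is_identity (v));
}
```

Calling check_rot64 () from a small main is enough to exercise the sketch. The model ignores alignment and endianness details that the real pass has to check before it rewrites the load or store.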