From patchwork Mon Sep 14 09:53:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 1363481 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=gcc.gnu.org Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=I1+8BVGH; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4BqhWc1kYJz9sTM for ; Mon, 14 Sep 2020 19:53:30 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5666A3985825; Mon, 14 Sep 2020 09:53:28 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5666A3985825 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1600077208; bh=vCzYpKYN3u93qXU2ltuXTtZwJ0h7JWCQ58TxUcywTMI=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=I1+8BVGHD2Jz26XRiky95pZ4CKfzY4lT0LYltbTOKPZNKULHdqfBobuvuWpvzrm4A l3+3eWlBfJwCqXHHeVtLPbw8hSqeduLatpEQ4jCPj78UHmnGkmsjuia96cZWR/Mmti CTpTwatgdtrfEgvML952zVJKmiX6SyH/IhDkvuyA= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id E69043857800 for ; Mon, 14 Sep 2020 09:53:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org E69043857800 Received: from pps.filterd (m0098410.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 08E9ZTV6032107; Mon, 14 Sep 2020 05:53:23 -0400 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 33j3xsvd8t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 14 Sep 2020 05:53:23 -0400 Received: from m0098410.ppops.net (m0098410.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 08E9ZdDY032998; Mon, 14 Sep 2020 05:53:22 -0400 Received: from ppma06ams.nl.ibm.com (66.31.33a9.ip4.static.sl-reverse.com [169.51.49.102]) by mx0a-001b2d01.pphosted.com with ESMTP id 33j3xsvd83-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 14 Sep 2020 05:53:22 -0400 Received: from pps.filterd (ppma06ams.nl.ibm.com [127.0.0.1]) by ppma06ams.nl.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 08E9h8qp005226; Mon, 14 Sep 2020 09:53:20 GMT Received: from b06avi18626390.portsmouth.uk.ibm.com (b06avi18626390.portsmouth.uk.ibm.com [9.149.26.192]) by ppma06ams.nl.ibm.com with ESMTP id 33hb1j15wf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 14 Sep 2020 09:53:20 +0000 Received: from b06wcsmtp001.portsmouth.uk.ibm.com (b06wcsmtp001.portsmouth.uk.ibm.com [9.149.105.160]) by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 08E9piQe16384270 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 14 Sep 2020 09:51:44 GMT Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3FBD5A405B; Mon, 14 Sep 2020 09:53:18 +0000 (GMT) Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C660FA4062; Mon, 14 Sep 2020 09:53:15 +0000 (GMT) Received: from KewenLins-MacBook-Pro.local (unknown [9.200.58.180]) by b06wcsmtp001.portsmouth.uk.ibm.com (Postfix) with ESMTP; Mon, 14 Sep 2020 09:53:15 +0000 (GMT) To: GCC Patches Subject: [PATCH]rs6000: Remove useless insns fed into lvx/stvx [PR97019] Message-ID: Date: Mon, 14 Sep 2020 17:53:13 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:68.0) Gecko/20100101 Thunderbird/68.9.0 MIME-Version: 1.0 Content-Language: en-US X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.235, 18.0.687 definitions=2020-09-14_02:2020-09-10, 2020-09-14 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 mlxlogscore=999 impostorscore=0 suspectscore=0 clxscore=1015 priorityscore=1501 spamscore=0 bulkscore=0 phishscore=0 malwarescore=0 mlxscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2009140079 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Kewen.Lin via Gcc-patches" From: "Kewen.Lin" Reply-To: "Kewen.Lin" Cc: Bill Schmidt , David Edelsohn , Segher Boessenkool Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Hi, This patch is to extend the existing function find_alignment_op to check all defintions of base_reg are AND operations with mask -16B to force the alignment. If all are satifised, it passes all AND operations and instructions in one vector to recombine_lvx_pattern and recombine_stvx_pattern, they will remove all useless ANDs further. Bootstrapped/regtested on powerpc64le-linux-gnu P8. Is it OK for trunk? BR, Kewen ----- gcc/ChangeLog: * config/rs6000/rs6000-p8swap.c (insn_rtx_pair_t): New type. (find_alignment_op): Adjust to support multiple defintions which are all AND operations with the mask -16B. (recombine_lvx_pattern): Adjust to handle multiple AND operations from find_alignment_op. (recombine_stvx_pattern): Likewise. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr97019.c: New test. diff --git a/gcc/config/rs6000/rs6000-p8swap.c b/gcc/config/rs6000/rs6000-p8swap.c index 3d5dc7d8aae..2de2edeab67 100644 --- a/gcc/config/rs6000/rs6000-p8swap.c +++ b/gcc/config/rs6000/rs6000-p8swap.c @@ -183,6 +183,8 @@ class swap_web_entry : public web_entry_base unsigned int will_delete : 1; }; +typedef std::pair insn_rtx_pair_t; + enum special_handling_values { SH_NONE = 0, SH_CONST_VECTOR, @@ -2095,15 +2097,19 @@ alignment_mask (rtx_insn *insn) return alignment_with_canonical_addr (SET_SRC (body)); } -/* Given INSN that's a load or store based at BASE_REG, look for a - feeding computation that aligns its address on a 16-byte boundary. - Return the rtx and its containing AND_INSN. */ -static rtx -find_alignment_op (rtx_insn *insn, rtx base_reg, rtx_insn **and_insn) +/* Given INSN that's a load or store based at BASE_REG, check if + all of its feeding computations align its address on a 16-byte + boundary. If so, return true and put all the computations + information with the form of pair {and_operation rtx, and_insn} + into AND_INSN_VEC. Otherwise, return false. */ + +static bool +find_alignment_op (rtx_insn *insn, rtx base_reg, + vec *and_insn_vec) { df_ref base_use; struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn); - rtx and_operation = 0; + rtx and_operation = NULL_RTX; FOR_EACH_INSN_INFO_USE (base_use, insn_info) { @@ -2111,22 +2117,30 @@ find_alignment_op (rtx_insn *insn, rtx base_reg, rtx_insn **and_insn) continue; struct df_link *base_def_link = DF_REF_CHAIN (base_use); - if (!base_def_link || base_def_link->next) - break; + if (!base_def_link) + return false; - /* With stack-protector code enabled, and possibly in other - circumstances, there may not be an associated insn for - the def. */ - if (DF_REF_IS_ARTIFICIAL (base_def_link->ref)) - break; + while (base_def_link) + { + /* With stack-protector code enabled, and possibly in other + circumstances, there may not be an associated insn for + the def. */ + if (DF_REF_IS_ARTIFICIAL (base_def_link->ref)) + return false; - *and_insn = DF_REF_INSN (base_def_link->ref); - and_operation = alignment_mask (*and_insn); - if (and_operation != 0) - break; + rtx_insn *and_insn = DF_REF_INSN (base_def_link->ref); + and_operation = alignment_mask (and_insn); + + /* Stop if we find any one which doesn't align. */ + if (and_operation == NULL_RTX) + return false; + + and_insn_vec->safe_push (std::make_pair (and_insn, and_operation)); + base_def_link = base_def_link->next; + } } - return and_operation; + return and_operation != NULL_RTX; } struct del_info { bool replace; rtx_insn *replace_insn; }; @@ -2143,10 +2157,10 @@ recombine_lvx_pattern (rtx_insn *insn, del_info *to_delete) rtx mem = XEXP (SET_SRC (body), 0); rtx base_reg = XEXP (mem, 0); - rtx_insn *and_insn; - rtx and_operation = find_alignment_op (insn, base_reg, &and_insn); + auto_vec and_insn_vec; + bool all_and_p = find_alignment_op (insn, base_reg, &and_insn_vec); - if (and_operation != 0) + if (all_and_p) { df_ref def; struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn); @@ -2168,25 +2182,34 @@ recombine_lvx_pattern (rtx_insn *insn, del_info *to_delete) to_delete[INSN_UID (swap_insn)].replace = true; to_delete[INSN_UID (swap_insn)].replace_insn = swap_insn; - /* However, first we must be sure that we make the - base register from the AND operation available - in case the register has been overwritten. Copy - the base register to a new pseudo and use that - as the base register of the AND operation in - the new LVX instruction. */ - rtx and_base = XEXP (and_operation, 0); - rtx new_reg = gen_reg_rtx (GET_MODE (and_base)); - rtx copy = gen_rtx_SET (new_reg, and_base); - rtx_insn *new_insn = emit_insn_after (copy, and_insn); - set_block_for_insn (new_insn, BLOCK_FOR_INSN (and_insn)); - df_insn_rescan (new_insn); - - XEXP (mem, 0) = gen_rtx_AND (GET_MODE (and_base), new_reg, - XEXP (and_operation, 1)); + rtx new_reg = NULL_RTX; + rtx and_op = NULL_RTX; + for (unsigned i = 0; i < and_insn_vec.length (); ++i) + { + /* However, first we must be sure that we make the + base register from the AND operation available + in case the register has been overwritten. Copy + the base register to a new pseudo and use that + as the base register of the AND operation in + the new LVX instruction. */ + insn_rtx_pair_t and_pair = and_insn_vec[i]; + rtx_insn *and_insn = and_pair.first; + and_op = and_pair.second; + rtx and_base = XEXP (and_op, 0); + if (!new_reg) + new_reg = gen_reg_rtx (GET_MODE (and_base)); + rtx copy = gen_rtx_SET (new_reg, and_base); + rtx_insn *new_insn = emit_insn_after (copy, and_insn); + set_block_for_insn (new_insn, BLOCK_FOR_INSN (and_insn)); + df_insn_rescan (new_insn); + } + + XEXP (mem, 0) = gen_rtx_AND (GET_MODE (new_reg), new_reg, + XEXP (and_op, 1)); SET_SRC (body) = mem; INSN_CODE (insn) = -1; /* Force re-recognition. */ df_insn_rescan (insn); - + if (dump_file) fprintf (dump_file, "lvx opportunity found at %d\n", INSN_UID (insn)); @@ -2205,10 +2228,10 @@ recombine_stvx_pattern (rtx_insn *insn, del_info *to_delete) rtx mem = SET_DEST (body); rtx base_reg = XEXP (mem, 0); - rtx_insn *and_insn; - rtx and_operation = find_alignment_op (insn, base_reg, &and_insn); + auto_vec and_insn_vec; + bool all_and_p = find_alignment_op (insn, base_reg, &and_insn_vec); - if (and_operation != 0) + if (all_and_p) { rtx src_reg = XEXP (SET_SRC (body), 0); df_ref src_use; @@ -2234,25 +2257,34 @@ recombine_stvx_pattern (rtx_insn *insn, del_info *to_delete) to_delete[INSN_UID (swap_insn)].replace = true; to_delete[INSN_UID (swap_insn)].replace_insn = swap_insn; - /* However, first we must be sure that we make the - base register from the AND operation available - in case the register has been overwritten. Copy - the base register to a new pseudo and use that - as the base register of the AND operation in - the new STVX instruction. */ - rtx and_base = XEXP (and_operation, 0); - rtx new_reg = gen_reg_rtx (GET_MODE (and_base)); - rtx copy = gen_rtx_SET (new_reg, and_base); - rtx_insn *new_insn = emit_insn_after (copy, and_insn); - set_block_for_insn (new_insn, BLOCK_FOR_INSN (and_insn)); - df_insn_rescan (new_insn); - - XEXP (mem, 0) = gen_rtx_AND (GET_MODE (and_base), new_reg, - XEXP (and_operation, 1)); + rtx new_reg = NULL_RTX; + rtx and_op = NULL_RTX; + for (unsigned i = 0; i < and_insn_vec.length (); ++i) + { + /* However, first we must be sure that we make the + base register from the AND operation available + in case the register has been overwritten. Copy + the base register to a new pseudo and use that + as the base register of the AND operation in + the new STVX instruction. */ + insn_rtx_pair_t and_pair = and_insn_vec[i]; + rtx_insn *and_insn = and_pair.first; + and_op = and_pair.second; + rtx and_base = XEXP (and_op, 0); + if (!new_reg) + new_reg = gen_reg_rtx (GET_MODE (and_base)); + rtx copy = gen_rtx_SET (new_reg, and_base); + rtx_insn *new_insn = emit_insn_after (copy, and_insn); + set_block_for_insn (new_insn, BLOCK_FOR_INSN (and_insn)); + df_insn_rescan (new_insn); + } + + XEXP (mem, 0) = gen_rtx_AND (GET_MODE (new_reg), new_reg, + XEXP (and_op, 1)); SET_SRC (body) = src_reg; INSN_CODE (insn) = -1; /* Force re-recognition. */ df_insn_rescan (insn); - + if (dump_file) fprintf (dump_file, "stvx opportunity found at %d\n", INSN_UID (insn)); diff --git a/gcc/testsuite/gcc.target/powerpc/pr97019.c b/gcc/testsuite/gcc.target/powerpc/pr97019.c new file mode 100644 index 00000000000..d6bb08ad260 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr97019.c @@ -0,0 +1,79 @@ +/* { dg-do compile { target { powerpc_p8vector_ok && le } } } */ +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */ + +/* Test there are no useless instructions "rldicr x,y,0,59" + to align the addresses for lvx/stvx. */ + +extern int a, b, c; +extern vector unsigned long long ev5, ev6, ev7, ev8; +extern int dummy (vector unsigned long long); + +int test_vec_ld(unsigned char *pe) { + + vector unsigned long long v1, v2, v3, v4, v9; + vector unsigned long long v5 = ev5; + vector unsigned long long v6 = ev6; + vector unsigned long long v7 = ev7; + vector unsigned long long v8 = ev8; + + unsigned char *e = pe; + + do { + if (a) { + v1 = __builtin_vec_ld(16, (unsigned long long *)e); + v2 = __builtin_vec_ld(32, (unsigned long long *)e); + v3 = __builtin_vec_ld(48, (unsigned long long *)e); + e = e + 8; + for (int i = 0; i < a; i++) { + v4 = v5; + v5 = __builtin_crypto_vpmsumd(v1, v6); + v6 = __builtin_crypto_vpmsumd(v2, v7); + v7 = __builtin_crypto_vpmsumd(v3, v8); + e = e + 8; + } + } + v5 = __builtin_vec_ld(16, (unsigned long long *)e); + v6 = __builtin_vec_ld(32, (unsigned long long *)e); + v7 = __builtin_vec_ld(48, (unsigned long long *)e); + if (c) + b = 1; + } while (b); + + return dummy(v4); +} + +int test_vec_st(unsigned char *pe) { + + vector unsigned long long v1, v2, v3, v4; + vector unsigned long long v5 = ev5; + vector unsigned long long v6 = ev6; + vector unsigned long long v7 = ev7; + vector unsigned long long v8 = ev8; + + unsigned char *e = pe; + + do { + if (a) { + __builtin_vec_st(v1, 16, (unsigned long long *)e); + __builtin_vec_st(v2, 32, (unsigned long long *)e); + __builtin_vec_st(v3, 48, (unsigned long long *)e); + e = e + 8; + for (int i = 0; i < a; i++) { + v4 = v5; + v5 = __builtin_crypto_vpmsumd(v1, v6); + v6 = __builtin_crypto_vpmsumd(v2, v7); + v7 = __builtin_crypto_vpmsumd(v3, v8); + e = e + 8; + } + } + __builtin_vec_st(v5, 16, (unsigned long long *)e); + __builtin_vec_st(v6, 32, (unsigned long long *)e); + __builtin_vec_st(v7, 48, (unsigned long long *)e); + if (c) + b = 1; + } while (b); + + return dummy(v4); +} + +/* { dg-final { scan-assembler-not "rldicr\[ \t\]+\[0-9\]+,\[0-9\]+,0,59" } } */