From patchwork Fri Mar 8 08:14:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 1909543
Message-ID: <3d744087-644f-4426-8bc8-59f52c384024@linux.ibm.com>
Date: Fri, 8 Mar 2024 16:14:37 +0800
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
From: HAO CHEN GUI
Subject: [PATCHv2, rs6000] Add subreg patterns for SImode rotate and mask insert
List-Id: Gcc-patches mailing list

Hi,
  This patch fixes regression cases in gcc.target/powerpc/rlwimi-2.c. In
the combine pass, an SImode lshiftrt (on a subreg of a DImode register) is
converted to a DImode lshiftrt with an outer AND. The result matches a
DImode rotate and mask insert on rs6000.

Trying 2 -> 7:
    2: r122:DI=r129:DI
      REG_DEAD r129:DI
    7: r125:SI=r122:DI#0 0>>0x1f
      REG_DEAD r122:DI
Failed to match this instruction:
(set (subreg:DI (reg:SI 125 [ x ]) 0)
    (zero_extract:DI (reg:DI 129)
        (const_int 32 [0x20])
        (const_int 1 [0x1])))
Successfully matched this instruction:
(set (subreg:DI (reg:SI 125 [ x ]) 0)
    (and:DI (lshiftrt:DI (reg:DI 129)
            (const_int 31 [0x1f]))
        (const_int 4294967295 [0xffffffff])))

This conversion blocks the later combination that would otherwise produce
an SImode rotate and mask insert insn.

Trying 9, 7 -> 10:
    9: r127:SI=r130:DI#0&0xfffffffffffffffe
      REG_DEAD r130:DI
    7: r125:SI#0=r129:DI 0>>0x1f&0xffffffff
      REG_DEAD r129:DI
   10: r124:SI=r127:SI|r125:SI
      REG_DEAD r125:SI
      REG_DEAD r127:SI
Failed to match this instruction:
(set (reg:SI 124)
    (ior:SI (and:SI (subreg:SI (reg:DI 130) 0)
            (const_int -2 [0xfffffffffffffffe]))
        (subreg:SI (zero_extract:DI (reg:DI 129)
                (const_int 32 [0x20])
                (const_int 1 [0x1])) 0)))
Failed to match this instruction:
(set (reg:SI 124)
    (ior:SI (and:SI (subreg:SI (reg:DI 130) 0)
            (const_int -2 [0xfffffffffffffffe]))
        (subreg:SI (and:DI (lshiftrt:DI (reg:DI 129)
                    (const_int 31 [0x1f]))
                (const_int 4294967295 [0xffffffff])) 0)))

  The root cause is the question of whether lshiftrt needs to be widened
to the wider mode when the target already has an lshiftrt for the narrow
mode and its cost is not high. My earlier patch tried to fix this problem
but has not been accepted yet.
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624852.html

  As it's stage 4 now, I drafted this patch to fix the regression by adding
subreg patterns for the SImode rotate and mask insert. It does the reverse
of what combine did: it narrows the mode of the lshiftrt so that the insn
can match the SImode rotate and mask insert.

  The test case rlwimi-2.c is fixed, and the corresponding insn counts are
restored to their original values.

  Compared with the last version, the main change is dropping the changes
to a test case that was already fixed by another patch.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. Is it OK for the trunk?

Thanks
Gui Haochen

ChangeLog
rs6000: Add subreg patterns for SImode rotate and mask insert

In the combine pass, an SImode lshiftrt (on a subreg of a DImode register)
is converted to a DImode lshiftrt with an AND.  The new pattern matches
rotate and mask insert on rs6000, which blocks it from being further
combined into an SImode rotate and mask insert pattern.  This patch fixes
the problem by adding two subreg patterns for the SImode rotate and mask
insert patterns.

gcc/
	PR target/93738
	* config/rs6000/rs6000.md (*rotlsi3_insert_subreg): New.
	(*rotlsi3_insert_4_subreg): New.

gcc/testsuite/
	PR target/93738
	* gcc.target/powerpc/rlwimi-2.c: Adjust the number of 64bit and 32bit
	rotate instructions.

patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bc8bc6ab060..996d0740faf 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4253,6 +4253,36 @@ (define_insn "*rotl<mode>3_insert"
 ; difference between rlwimi and rldimi.  We also might want dot forms,
 ; but not for rlwimi on POWER4 and similar processors.
 
+; Subreg pattern of insn "*rotlsi3_insert"
+(define_insn_and_split "*rotlsi3_insert_subreg"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (ior:SI (and:SI
+                 (match_operator:SI 8 "lowpart_subreg_operator"
+                  [(and:DI (match_operator:DI 4 "rotate_mask_operator"
+                            [(match_operand:DI 1 "gpc_reg_operand" "r")
+                             (match_operand:SI 2 "const_int_operand" "n")])
+                           (match_operand:DI 3 "const_int_operand" "n"))])
+                 (match_operand:SI 5 "const_int_operand" "n"))
+                (and:SI (match_operand:SI 6 "gpc_reg_operand" "0")
+                        (match_operand:SI 7 "const_int_operand" "n"))))]
+  "rs6000_is_valid_insert_mask (operands[5], operands[4], SImode)
+   && GET_CODE (operands[4]) == LSHIFTRT
+   && INTVAL (operands[3]) == 0xffffffff
+   && UINTVAL (operands[5]) + UINTVAL (operands[7]) + 1 == 0"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (ior:SI (and:SI (lshiftrt:SI (match_dup 9)
+                                     (match_dup 2))
+                        (match_dup 5))
+                (and:SI (match_dup 6)
+                        (match_dup 7))))]
+{
+  int offset = BYTES_BIG_ENDIAN ? 4 : 0;
+  operands[9] = gen_rtx_SUBREG (SImode, operands[1], offset);
+}
+  [(set_attr "type" "insert")])
+
 (define_insn "*rotl<mode>3_insert_2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (ior:GPR (and:GPR (match_operand:GPR 5 "gpc_reg_operand" "0")
@@ -4331,6 +4361,31 @@ (define_insn "*rotlsi3_insert_4"
   "rlwimi %0,%1,32-%h2,%h2,31"
   [(set_attr "type" "insert")])
 
+; Subreg pattern of insn "*rotlsi3_insert_4"
+(define_insn_and_split "*rotlsi3_insert_4_subreg"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (ior:SI (and:SI (match_operand:SI 3 "gpc_reg_operand" "0")
+                        (match_operand:SI 4 "const_int_operand" "n"))
+                (match_operator:SI 6 "lowpart_subreg_operator"
+                 [(and:DI
+                   (lshiftrt:DI (match_operand:DI 1 "gpc_reg_operand" "r")
+                                (match_operand:SI 2 "const_int_operand" "n"))
+                   (match_operand:DI 5 "const_int_operand" "n"))])))]
+  "INTVAL (operands[2]) + exact_log2 (-UINTVAL (operands[4])) == 32
+   && INTVAL (operands[5]) == 0xffffffff"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (ior:SI (and:SI (match_dup 3)
+                        (match_dup 4))
+                (lshiftrt:SI (match_dup 7)
+                             (match_dup 2))))]
+{
+  int offset = BYTES_BIG_ENDIAN ? 4 : 0;
+  operands[7] = gen_rtx_SUBREG (SImode, operands[1], offset);
+}
+  [(set_attr "type" "insert")])
+
 (define_insn "*rotlsi3_insert_5"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
         (ior:SI (and:SI (match_operand:SI 1 "gpc_reg_operand" "0,r")
diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
index bafa371db73..62344a95aa0 100644
--- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
@@ -6,10 +6,9 @@
 /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 6728 { target lp64 } } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } } */
+/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 } } */
 /* { dg-final { scan-assembler-times {(?n)^\s+mulli} 5036 } } */
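
For readers checking the insn conditions above, here is a small standalone C sketch of the arithmetic they encode. It is not part of the patch and is not GCC code: every function name below is hypothetical, `exact_log2_u64` is only a stand-in for GCC's `exact_log2`, and 64-bit unsigned arithmetic is assumed to model `UINTVAL`. The last two helpers compare the DImode form matched by "*rotlsi3_insert_subreg" with the SImode form it is split into. They agree under the assumption that the insert mask has no bits in the top `sh` positions, which is my reading of what the insert-mask check is meant to guarantee for LSHIFTRT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models "UINTVAL (operands[5]) + UINTVAL (operands[7]) + 1 == 0":
   with wrap-around arithmetic this holds exactly when mask7 == ~mask5.  */
static bool masks_are_complementary (uint64_t mask5, uint64_t mask7)
{
  return mask5 + mask7 + 1 == 0;
}

/* Hypothetical stand-in for exact_log2: log2 of a power of two, else -1.  */
static int exact_log2_u64 (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Models "INTVAL (operands[2]) + exact_log2 (-UINTVAL (operands[4])) == 32".
   For the dump in this mail: shift 31 and keep mask -2 give 31 + 1 == 32.  */
static bool insert_4_condition (int64_t shift, int64_t keep_mask)
{
  return shift + exact_log2_u64 (-(uint64_t) keep_mask) == 32;
}

/* DImode form: shift the 64-bit source, AND with 0xffffffff, take the low
   word, then insert under MASK into DST.  Assumes sh < 32.  */
static uint32_t insert_wide (uint64_t src, unsigned sh, uint32_t mask,
                             uint32_t dst)
{
  uint32_t low = (uint32_t) ((src >> sh) & 0xffffffffu);
  return (low & mask) | (dst & ~mask);
}

/* SImode form after the split: shift only the low word of the source.  */
static uint32_t insert_narrow (uint64_t src, unsigned sh, uint32_t mask,
                               uint32_t dst)
{
  uint32_t low = (uint32_t) src >> sh;
  return (low & mask) | (dst & ~mask);
}
```

With src = 0xdeadbeefcafef00d, sh = 8 and mask = 0x00ffff00, both forms insert the same field. With mask = 0xff000000 the mask reaches into the top 8 bits and the two forms differ, which is why the narrowing needs the mask restriction.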