From patchwork Mon Jun 19 14:23:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Stefan Schulze Frielinghaus
X-Patchwork-Id: 1796621
To: gcc-patches@gcc.gnu.org
Cc: Stefan Schulze Frielinghaus
Subject: [PATCH v2] combine: Narrow comparison of memory and constant
Date: Mon, 19 Jun 2023 16:23:57 +0200
Message-Id: <20230619142356.345159-1-stefansf@linux.ibm.com>
X-Mailer: git-send-email 2.39.2
List-Id: Gcc-patches mailing list
From: Stefan Schulze Frielinghaus
Reply-To: Stefan Schulze Frielinghaus

Comparisons between memory and constants can sometimes be done in a
narrower mode.  The narrower constant is then more likely to end up as
an immediate instead of in the literal pool.

For example, on s390x a non-symmetric comparison like
  x <= 0x3fffffffffffffff
results in the constant being spilled to the literal pool and an 8 byte
memory comparison being emitted.  Ideally, an equivalent comparison
  x0 <= 0x3f
where x0 is the most significant byte of x, is emitted instead.  Its
constant is smaller and more likely to materialize as an immediate.

Similarly, comparisons of the form
  x >= 0x4000000000000000
can be shortened into x0 >= 0x40.

Bootstrapped and regtested on s390x, x86_64, aarch64, and powerpc64le.
Note that the new tests show that on the little-endian targets mentioned
above the optimization does not take effect, because the new
instructions either cost more or do not match.  Still ok for mainline?

gcc/ChangeLog:

	* combine.cc (simplify_compare_const): Narrow comparison of
	memory and constant.
	(try_combine): Adapt to new function signature.
	(simplify_comparison): Adapt to new function signature.

gcc/testsuite/ChangeLog:

	* gcc.dg/cmp-mem-const-1.c: New test.
	* gcc.dg/cmp-mem-const-2.c: New test.
	* gcc.dg/cmp-mem-const-3.c: New test.
	* gcc.dg/cmp-mem-const-4.c: New test.
	* gcc.dg/cmp-mem-const-5.c: New test.
	* gcc.dg/cmp-mem-const-6.c: New test.
	* gcc.target/s390/cmp-mem-const-1.c: New test.
---
 gcc/combine.cc                                | 79 +++++++++++++++++--
 gcc/testsuite/gcc.dg/cmp-mem-const-1.c        | 17 ++++
 gcc/testsuite/gcc.dg/cmp-mem-const-2.c        | 17 ++++
 gcc/testsuite/gcc.dg/cmp-mem-const-3.c        | 17 ++++
 gcc/testsuite/gcc.dg/cmp-mem-const-4.c        | 17 ++++
 gcc/testsuite/gcc.dg/cmp-mem-const-5.c        | 17 ++++
 gcc/testsuite/gcc.dg/cmp-mem-const-6.c        | 17 ++++
 .../gcc.target/s390/cmp-mem-const-1.c         | 24 ++++++
 8 files changed, 200 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/cmp-mem-const-1.c
 create mode 100644 gcc/testsuite/gcc.dg/cmp-mem-const-2.c
 create mode 100644 gcc/testsuite/gcc.dg/cmp-mem-const-3.c
 create mode 100644 gcc/testsuite/gcc.dg/cmp-mem-const-4.c
 create mode 100644 gcc/testsuite/gcc.dg/cmp-mem-const-5.c
 create mode 100644 gcc/testsuite/gcc.dg/cmp-mem-const-6.c
 create mode 100644 gcc/testsuite/gcc.target/s390/cmp-mem-const-1.c

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 5aa0ec5c45a..56e15a93409 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -460,7 +460,7 @@ static rtx simplify_shift_const (rtx, enum rtx_code, machine_mode, rtx,
 static int recog_for_combine (rtx *, rtx_insn *, rtx *);
 static rtx gen_lowpart_for_combine (machine_mode, rtx);
 static enum rtx_code simplify_compare_const (enum rtx_code, machine_mode,
-					     rtx, rtx *);
+					     rtx *, rtx *);
 static enum rtx_code simplify_comparison (enum rtx_code, rtx *, rtx *);
 static void update_table_tick (rtx);
 static void record_value_for_reg (rtx, rtx_insn *, rtx);
@@ -3185,7 +3185,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 	  compare_code = orig_compare_code = GET_CODE (*cc_use_loc);
 	  if (is_a <scalar_int_mode> (GET_MODE (i2dest), &mode))
 	    compare_code = simplify_compare_const (compare_code, mode,
-						   op0, &op1);
+						   &op0, &op1);
 	  target_canonicalize_comparison (&compare_code, &op0, &op1, 1);
 	}
 
@@ -11796,13 +11796,14 @@ gen_lowpart_for_combine (machine_mode omode, rtx x)
    (CODE OP0 const0_rtx) form.
 
    The result is a possibly different comparison code to use.
-   *POP1 may be updated.  */
+   *POP0 and *POP1 may be updated.  */
 
 static enum rtx_code
 simplify_compare_const (enum rtx_code code, machine_mode mode,
-			rtx op0, rtx *pop1)
+			rtx *pop0, rtx *pop1)
 {
   scalar_int_mode int_mode;
+  rtx op0 = *pop0;
   HOST_WIDE_INT const_op = INTVAL (*pop1);
 
   /* Get the constant we are comparing against and turn off all bits
@@ -11987,6 +11988,74 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
       break;
     }
 
+  /* Narrow non-symmetric comparison of memory and constant as e.g.
+     x0...x7 <= 0x3fffffffffffffff into x0 <= 0x3f where x0 is the most
+     significant byte.  Likewise, transform x0...x7 >= 0x4000000000000000 into
+     x0 >= 0x40.  */
+  if ((code == LEU || code == LTU || code == GEU || code == GTU)
+      && is_a <scalar_int_mode> (GET_MODE (op0), &int_mode)
+      && MEM_P (op0)
+      && !MEM_VOLATILE_P (op0)
+      /* The optimization makes only sense for constants which are big enough
+	 so that we have a chance to chop off something at all.  */
+      && (unsigned HOST_WIDE_INT) const_op > 0xff
+      /* Ensure that we do not overflow during normalization.  */
+      && (code != GTU || (unsigned HOST_WIDE_INT) const_op < HOST_WIDE_INT_M1U))
+    {
+      unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT) const_op;
+      enum rtx_code adjusted_code;
+
+      /* Normalize code to either LEU or GEU.  */
+      if (code == LTU)
+	{
+	  --n;
+	  adjusted_code = LEU;
+	}
+      else if (code == GTU)
+	{
+	  ++n;
+	  adjusted_code = GEU;
+	}
+      else
+	adjusted_code = code;
+
+      scalar_int_mode narrow_mode_iter;
+      FOR_EACH_MODE_UNTIL (narrow_mode_iter, int_mode)
+	{
+	  unsigned nbits = GET_MODE_PRECISION (int_mode)
+			  - GET_MODE_PRECISION (narrow_mode_iter);
+	  unsigned HOST_WIDE_INT mask = (HOST_WIDE_INT_1U << nbits) - 1;
+	  unsigned HOST_WIDE_INT lower_bits = n & mask;
+	  if ((adjusted_code == LEU && lower_bits == mask)
+	      || (adjusted_code == GEU && lower_bits == 0))
+	    {
+	      n >>= nbits;
+	      break;
+	    }
+	}
+
+      if (narrow_mode_iter < int_mode)
+	{
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (
+		  dump_file, "narrow comparison from mode %s to %s: (MEM %s "
+		  HOST_WIDE_INT_PRINT_HEX ") to (MEM %s "
+		  HOST_WIDE_INT_PRINT_HEX ").\n", GET_MODE_NAME (int_mode),
+		  GET_MODE_NAME (narrow_mode_iter), GET_RTX_NAME (code),
+		  (unsigned HOST_WIDE_INT)const_op, GET_RTX_NAME (adjusted_code),
+		  n);
+	    }
+	  poly_int64 offset = (BYTES_BIG_ENDIAN
+			       ? 0
+			       : (GET_MODE_SIZE (int_mode)
+				  - GET_MODE_SIZE (narrow_mode_iter)));
+	  *pop0 = adjust_address_nv (op0, narrow_mode_iter, offset);
+	  *pop1 = GEN_INT (n);
+	  return adjusted_code;
+	}
+    }
+
   *pop1 = GEN_INT (const_op);
   return code;
 }
@@ -12179,7 +12248,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
 
       /* Try to simplify the compare to constant, possibly changing the
 	 comparison op, and/or changing op1 to zero.  */
-      code = simplify_compare_const (code, raw_mode, op0, &op1);
+      code = simplify_compare_const (code, raw_mode, &op0, &op1);
       const_op = INTVAL (op1);
 
       /* Compute some predicates to simplify code below.  */
diff --git a/gcc/testsuite/gcc.dg/cmp-mem-const-1.c b/gcc/testsuite/gcc.dg/cmp-mem-const-1.c
new file mode 100644
index 00000000000..263ad98af79
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cmp-mem-const-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O1 -fdump-rtl-combine-details" } */
+/* { dg-final { scan-rtl-dump "narrow comparison from mode DI to QI" "combine" } } */
+
+typedef __UINT64_TYPE__ uint64_t;
+
+int
+le_1byte_a (uint64_t *x)
+{
+  return *x <= 0x3fffffffffffffff;
+}
+
+int
+le_1byte_b (uint64_t *x)
+{
+  return *x < 0x4000000000000000;
+}
diff --git a/gcc/testsuite/gcc.dg/cmp-mem-const-2.c b/gcc/testsuite/gcc.dg/cmp-mem-const-2.c
new file mode 100644
index 00000000000..a7cc5348295
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cmp-mem-const-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O1 -fdump-rtl-combine-details" } */
+/* { dg-final { scan-rtl-dump "narrow comparison from mode DI to QI" "combine" } } */
+
+typedef __UINT64_TYPE__ uint64_t;
+
+int
+ge_1byte_a (uint64_t *x)
+{
+  return *x > 0x3fffffffffffffff;
+}
+
+int
+ge_1byte_b (uint64_t *x)
+{
+  return *x >= 0x4000000000000000;
+}
diff --git a/gcc/testsuite/gcc.dg/cmp-mem-const-3.c b/gcc/testsuite/gcc.dg/cmp-mem-const-3.c
new file mode 100644
index 00000000000..06f80bf72d8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cmp-mem-const-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O1 -fdump-rtl-combine-details" } */
+/* { dg-final { scan-rtl-dump "narrow comparison from mode DI to HI" "combine" } } */
+
+typedef __UINT64_TYPE__ uint64_t;
+
+int
+le_2bytes_a (uint64_t *x)
+{
+  return *x <= 0x3ffdffffffffffff;
+}
+
+int
+le_2bytes_b (uint64_t *x)
+{
+  return *x < 0x3ffe000000000000;
+}
diff --git a/gcc/testsuite/gcc.dg/cmp-mem-const-4.c b/gcc/testsuite/gcc.dg/cmp-mem-const-4.c
new file mode 100644
index 00000000000..407999abf7e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cmp-mem-const-4.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O1 -fdump-rtl-combine-details" } */
+/* { dg-final { scan-rtl-dump "narrow comparison from mode DI to HI" "combine" } } */
+
+typedef __UINT64_TYPE__ uint64_t;
+
+int
+ge_2bytes_a (uint64_t *x)
+{
+  return *x > 0x400cffffffffffff;
+}
+
+int
+ge_2bytes_b (uint64_t *x)
+{
+  return *x >= 0x400d000000000000;
+}
diff --git a/gcc/testsuite/gcc.dg/cmp-mem-const-5.c b/gcc/testsuite/gcc.dg/cmp-mem-const-5.c
new file mode 100644
index 00000000000..e16773f5bcf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cmp-mem-const-5.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O1 -fdump-rtl-combine-details" } */
+/* { dg-final { scan-rtl-dump "narrow comparison from mode DI to SI" "combine" } } */
+
+typedef __UINT64_TYPE__ uint64_t;
+
+int
+le_4bytes_a (uint64_t *x)
+{
+  return *x <= 0x3ffffdffffffffff;
+}
+
+int
+le_4bytes_b (uint64_t *x)
+{
+  return *x < 0x3ffffe0000000000;
+}
diff --git a/gcc/testsuite/gcc.dg/cmp-mem-const-6.c b/gcc/testsuite/gcc.dg/cmp-mem-const-6.c
new file mode 100644
index 00000000000..8f53b5678bd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cmp-mem-const-6.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O1 -fdump-rtl-combine-details" } */
+/* { dg-final { scan-rtl-dump "narrow comparison from mode DI to SI" "combine" } } */
+
+typedef __UINT64_TYPE__ uint64_t;
+
+int
+ge_4bytes_a (uint64_t *x)
+{
+  return *x > 0x4000cfffffffffff;
+}
+
+int
+ge_4bytes_b (uint64_t *x)
+{
+  return *x >= 0x4000d00000000000;
+}
diff --git a/gcc/testsuite/gcc.target/s390/cmp-mem-const-1.c b/gcc/testsuite/gcc.target/s390/cmp-mem-const-1.c
new file mode 100644
index 00000000000..309aafbec01
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/cmp-mem-const-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-options "-O1 -march=z13 -mzarch -fdump-rtl-combine-details" } */
+/* { dg-final { scan-assembler-not {\tclc\t} } } */
+/* { dg-final { scan-rtl-dump "narrow comparison from mode DI to QI" "combine" } } */
+
+struct s
+{
+  long a;
+  unsigned b : 1;
+  unsigned c : 1;
+};
+
+int foo (struct s *x)
+{
+  /* Expression
+       x->b || x->c
+     is transformed into
+       _1 = BIT_FIELD_REF <*x_4(D), 64, 64>;
+       _2 = _1 > 0x3FFFFFFFFFFFFFFF;
+     where the constant may materialize in the literal pool and an 8 byte CLC
+     may be emitted.  Ensure this is not the case.
+  */
+  return x->b || x->c;
+}
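
For illustration only, the narrowing arithmetic described in the commit
message can be modeled outside of GCC.  The following is a minimal
sketch in plain C, not the code from combine.cc: enum cmp,
narrow_compare, and eval are hypothetical names, and the sketch assumes
a 64-bit unsigned operand that is narrowed to its 8, 16, or 32 most
significant bits instead of iterating over the target's integer modes.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the RTL codes LEU, LTU, GEU, and GTU.  */
enum cmp { CMP_LEU, CMP_LTU, CMP_GEU, CMP_GTU };

/* Evaluate the unsigned comparison "x CODE n".  */
static int
eval (enum cmp code, uint64_t x, uint64_t n)
{
  switch (code)
    {
    case CMP_LEU: return x <= n;
    case CMP_LTU: return x < n;
    case CMP_GEU: return x >= n;
    default:      return x > n;
    }
}

/* Try to narrow the 64-bit comparison "x *CODE *N" to a comparison of
   the most significant bits of x.  On success, return the narrowed
   width in bits (8, 16, or 32, an assumption of this sketch) and update
   *CODE to LEU or GEU and *N to the narrowed constant.  Return 64 and
   leave *CODE and *N untouched if no narrowing applies.  */
static unsigned
narrow_compare (enum cmp *code, uint64_t *n)
{
  uint64_t c = *n;
  enum cmp adjusted;

  /* Same guards as in the patch: tiny constants are not worth it, and
     GTU against the maximum value would overflow in ++c below.  */
  if (c <= 0xff || (*code == CMP_GTU && c == UINT64_MAX))
    return 64;

  /* Normalize to either LEU or GEU.  */
  if (*code == CMP_LTU)
    {
      --c;
      adjusted = CMP_LEU;
    }
  else if (*code == CMP_GTU)
    {
      ++c;
      adjusted = CMP_GEU;
    }
  else
    adjusted = *code;

  /* Try the narrowest width first.  LEU needs all dropped bits set,
     GEU needs all dropped bits clear.  */
  for (unsigned width = 8; width < 64; width *= 2)
    {
      unsigned nbits = 64 - width;
      uint64_t mask = (UINT64_C (1) << nbits) - 1;
      uint64_t lower_bits = c & mask;
      if ((adjusted == CMP_LEU && lower_bits == mask)
	  || (adjusted == CMP_GEU && lower_bits == 0))
	{
	  *code = adjusted;
	  *n = c >> nbits;
	  return width;
	}
    }
  return 64;
}
```

For example, starting from code = CMP_GTU and n = 0x400cffffffffffff,
narrow_compare returns 16 and leaves code = CMP_GEU and n = 0x400d,
which corresponds to the DI to HI case in cmp-mem-const-4.c.  Calling
eval (code, x >> 48, n) then gives the same result as the original
64-bit comparison for any x.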