From patchwork Thu May 9 11:52:58 2019
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 1097476
From: Robin Dapp
Subject: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask
To: GCC Patches
Date: Thu, 9 May 2019 13:52:58 +0200
Message-Id: <6f3a6c31-5a02-86bb-a659-13f541350616@linux.ibm.com>

Hi,

while trying to improve s390 code generation for rotate and shift
instructions I noticed superfluous subregs for shift count operands.
In our backend we already have quite cumbersome patterns that would
need to be duplicated (or complicated further by more subst patterns)
in order to get rid of the subregs.

I had already finished all the patterns when I realized that
SHIFT_COUNT_TRUNCATED and the target hook shift_truncation_mask
already exist and could do what is needed without extra patterns.
Just defining shift_truncation_mask was not enough, though, as most of
the additional insns get introduced by combine.  Even
SHIFT_COUNT_TRUNCATED is not a perfect match for what our hardware
does, because we only ever consider the last 6 bits of a shift
operand, regardless of the mode.

Despite all the warnings in the other backends, most notably
SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
wrote the attached tentative patch.  It is a little ad hoc: it takes
the SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0
and, instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it
applies the mask returned by shift_truncation_mask.  Doing so, the two
"methods" actually reduce to a single one.  I assume both were
originally intended for different purposes, but without knowing the
history the separation seems artificial to me.

A quick look at other backends showed that at least some (e.g. ARM) do
not use SHIFT_COUNT_TRUNCATED because its behavior is not fine-grained
enough, e.g. the masks for shift and rotate differ.  While the
attached patch might work for s390, it will probably not work for
other targets.

Going beyond what my patch does, would it be useful to unify both
truncation methods in a target hook that takes the operation (shift,
rotate, zero_extract, ...) as well as the mode as arguments?  That way
we would let the target decide what to do with each specific
combination of the two; a sketch of such a hook follows below.  Maybe
this would also allow us to distinguish bit-test operations from the
rest.  Of course, when everything in the backend is done right, the
result is the same either way, but in my experience it is easy to
forget a subreg/... somewhere and end up with worse code by accident.

Maybe there is another reason why SHIFT_COUNT_TRUNCATED is discouraged
that I missed entirely?
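To make the proposal above more concrete, here is a minimal,
hypothetical sketch of what such a unified, per-operation truncation
hook could look like for s390.  It is not part of the attached patch;
the enum, the hook name, and the set of operations are made up for
illustration only:

/* Hypothetical operation kinds a unified truncation hook could
   distinguish; these names do not exist in GCC.  */
enum count_op { COUNT_OP_SHIFT, COUNT_OP_ROTATE, COUNT_OP_ZERO_EXTRACT,
                COUNT_OP_BIT_TEST };

/* Return the mask to apply to a count operand of operation OP in mode
   MODE, or 0 if counts must not be truncated for this combination.  */
static unsigned HOST_WIDE_INT
s390_count_truncation_mask (enum count_op op, machine_mode mode)
{
  if (!SCALAR_INT_MODE_P (mode))
    return 0;
  switch (op)
    {
    case COUNT_OP_SHIFT:
    case COUNT_OP_ROTATE:
      /* s390 shifts and rotates only consider the last 6 bits of the
         count, independently of the mode.  */
      return 63;
    default:
      /* Be conservative for everything else.  */
      return 0;
    }
}

A target whose rotates truncate differently from its shifts (the ARM
case mentioned above) could then return different masks per operation
instead of opting out of truncation altogether.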
Regards
 Robin

diff --git a/gcc/combine.c b/gcc/combine.c
index 91e32c88c88..d2a659f929b 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6445,14 +6445,12 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
 	return simplify_shift_const (x, code, mode, XEXP (x, 0),
 				     INTVAL (XEXP (x, 1)));
 
-      else if (SHIFT_COUNT_TRUNCATED && !REG_P (XEXP (x, 1)))
+      else if (SHIFT_COUNT_TRUNCATED
+	       && targetm.shift_truncation_mask (mode)
+	       && !REG_P (XEXP (x, 1)))
 	SUBST (XEXP (x, 1),
 	       force_to_mode (XEXP (x, 1), GET_MODE (XEXP (x, 1)),
-			      (HOST_WIDE_INT_1U
-			       << exact_log2 (GET_MODE_UNIT_BITSIZE
-					      (GET_MODE (x))))
-			      - 1,
-			      0));
+			      targetm.shift_truncation_mask (mode), 0));
       break;
 
     default:
@@ -10594,8 +10592,8 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
   /* Make sure and truncate the "natural" shift on the way in.  We don't
      want to do this inside the loop as it makes it more difficult to
      combine shifts.  */
-  if (SHIFT_COUNT_TRUNCATED)
-    orig_count &= GET_MODE_UNIT_BITSIZE (mode) - 1;
+  if (SHIFT_COUNT_TRUNCATED && targetm.shift_truncation_mask (mode))
+    orig_count &= targetm.shift_truncation_mask (mode);
 
   /* If we were given an invalid count, don't do anything except exactly
      what was requested.  */
@@ -12295,7 +12293,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
 	     between the position and the location of the single bit.  */
 	  /* Except we can't if SHIFT_COUNT_TRUNCATED is set, since we might
	     have already reduced the shift count modulo the word size.  */
-	  if (!SHIFT_COUNT_TRUNCATED
+	  if ((!SHIFT_COUNT_TRUNCATED || !targetm.shift_truncation_mask (mode))
 	      && CONST_INT_P (XEXP (op0, 0))
 	      && XEXP (op0, 1) == const1_rtx
 	      && equality_comparison_p && const_op == 0
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index ad8eacdf4dc..1d723f29e1e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2320,6 +2320,7 @@ s390_single_part (rtx op,
 	    part = i;
 	}
     }
+
   return part == -1 ? -1 : n_parts - 1 - part;
 }
 
@@ -2702,6 +2703,7 @@ s390_logical_operator_ok_p (rtx *operands)
   return true;
 }
 
+
 /* Narrow logical operation CODE of memory operand MEMOP with immediate
    operand IMMOP to switch from SS to SI type instructions.  */
 
@@ -16294,6 +16296,13 @@ s390_case_values_threshold (void)
   return default_case_values_threshold ();
 }
 
+static unsigned HOST_WIDE_INT
+s390_shift_truncation_mask (machine_mode mode)
+{
+  return (mode == DImode || mode == SImode
+	  || mode == HImode || mode == QImode) ? 63 : 0;
+}
+
 /* Initialize GCC target structure.  */
 
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -16585,6 +16594,9 @@ s390_case_values_threshold (void)
 #undef TARGET_CASE_VALUES_THRESHOLD
 #define TARGET_CASE_VALUES_THRESHOLD s390_case_values_threshold
 
+#undef TARGET_SHIFT_TRUNCATION_MASK
+#define TARGET_SHIFT_TRUNCATION_MASK s390_shift_truncation_mask
+
 /* Use only short displacement, since long displacement is not available for
    the floating point instructions.  */
 #undef TARGET_MAX_ANCHOR_OFFSET
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 969f58a2ba0..d85bfcc5e04 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -1188,5 +1188,6 @@ struct GTY(()) machine_function
 
 #define TARGET_INDIRECT_BRANCH_TABLE s390_indirect_branch_table
 
+#define SHIFT_COUNT_TRUNCATED 1
 
 #endif /* S390_H */
diff --git a/gcc/cse.c b/gcc/cse.c
index 6c9cda16a98..f7bf287e9ef 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -3649,10 +3649,11 @@ fold_rtx (rtx x, rtx_insn *insn)
 	      && (INTVAL (const_arg1) >= GET_MODE_UNIT_PRECISION (mode)
 		  || INTVAL (const_arg1) < 0))
 	    {
-	      if (SHIFT_COUNT_TRUNCATED)
+	      if (SHIFT_COUNT_TRUNCATED
+		  && targetm.shift_truncation_mask (mode))
 		canon_const_arg1 = gen_int_shift_amount
 		  (mode, (INTVAL (const_arg1)
-			  & (GET_MODE_UNIT_BITSIZE (mode) - 1)));
+			  & targetm.shift_truncation_mask (mode)));
 	      else
 		break;
 	    }
@@ -3698,10 +3699,11 @@ fold_rtx (rtx x, rtx_insn *insn)
 	      && (INTVAL (inner_const) >= GET_MODE_UNIT_PRECISION (mode)
 		  || INTVAL (inner_const) < 0))
 	    {
-	      if (SHIFT_COUNT_TRUNCATED)
+	      if (SHIFT_COUNT_TRUNCATED
+		  && targetm.shift_truncation_mask (mode))
 		inner_const = gen_int_shift_amount
 		  (mode, (INTVAL (inner_const)
-			  & (GET_MODE_UNIT_BITSIZE (mode) - 1)));
+			  & targetm.shift_truncation_mask (mode)));
 	      else
 		break;
 	    }
diff --git a/gcc/expmed.c b/gcc/expmed.c
index d7f8e9a5d76..d8eebac5f08 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -2476,11 +2476,14 @@ expand_shift_1 (enum tree_code code, machine_mode mode, rtx shifted,
 		     (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (scalar_mode)))
 	op1 = gen_int_shift_amount (mode,
 				    (unsigned HOST_WIDE_INT) INTVAL (op1)
-				    % GET_MODE_BITSIZE (scalar_mode));
+				    % (targetm.shift_truncation_mask
+				       (scalar_mode) + 1));
       else if (GET_CODE (op1) == SUBREG
 	       && subreg_lowpart_p (op1)
 	       && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (op1)))
-	       && SCALAR_INT_MODE_P (GET_MODE (op1)))
+	       && SCALAR_INT_MODE_P (GET_MODE (op1))
+	       && (targetm.shift_truncation_mask (mode)
+		   == (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (scalar_mode) - 1))
 	op1 = SUBREG_REG (op1);
     }
 
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 89a46a933fa..88064e4ac57 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3525,9 +3525,10 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	  return lowpart_subreg (int_mode, tmp, inner_mode);
 	}
 
-      if (SHIFT_COUNT_TRUNCATED && CONST_INT_P (op1))
+      if (SHIFT_COUNT_TRUNCATED && targetm.shift_truncation_mask (mode)
+	  && CONST_INT_P (op1))
 	{
-	  val = INTVAL (op1) & (GET_MODE_UNIT_PRECISION (mode) - 1);
+	  val = INTVAL (op1) & targetm.shift_truncation_mask (mode);
 	  if (val != INTVAL (op1))
 	    return simplify_gen_binary (code, mode, op0,
 					gen_int_shift_amount (mode, val));
@@ -4347,8 +4348,9 @@ simplify_const_binary_operation (enum rtx_code code, machine_mode mode,
     case ASHIFT:
       {
 	wide_int wop1 = pop1;
-	if (SHIFT_COUNT_TRUNCATED)
-	  wop1 = wi::umod_trunc (wop1, GET_MODE_PRECISION (int_mode));
+	if (SHIFT_COUNT_TRUNCATED && targetm.shift_truncation_mask (int_mode))
+	  wop1 = wi::umod_trunc (wop1,
+				 targetm.shift_truncation_mask (int_mode) + 1);
 	else if (wi::geu_p (wop1, GET_MODE_PRECISION (int_mode)))
 	  return NULL_RTX;
 
@@ -4426,8 +4428,9 @@ simplify_const_binary_operation (enum rtx_code code, machine_mode mode,
       if (CONST_SCALAR_INT_P (op1))
 	{
 	  wide_int shift = rtx_mode_t (op1, mode);
-	  if (SHIFT_COUNT_TRUNCATED)
-	    shift = wi::umod_trunc (shift, GET_MODE_PRECISION (int_mode));
+	  if (SHIFT_COUNT_TRUNCATED && targetm.shift_truncation_mask (int_mode))
+	    shift = wi::umod_trunc (shift,
+				    targetm.shift_truncation_mask (int_mode) + 1);
 	  else if (wi::geu_p (shift, GET_MODE_PRECISION (int_mode)))
 	    return NULL_RTX;
 	  result = wi::to_poly_wide (op0, mode) << shift;
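For reference (not part of the patch): the "& mask" form and the
"% (mask + 1)" form used by the wi::umod_trunc calls above are
equivalent for any unsigned count, because mask + 1 is a power of two.
A small standalone C check of the semantics the 63 mask encodes:

#include <assert.h>

/* Truncate a shift count the way the patch does for s390:
   "count & mask" with mask == 63 equals "count % (mask + 1)".  */
static unsigned long
truncate_count (unsigned long count)
{
  const unsigned long mask = 63;  /* s390_shift_truncation_mask  */
  assert ((count & mask) == count % (mask + 1));
  return count & mask;
}

int
main (void)
{
  /* A shift amount of 67 behaves like a shift amount of 3.  */
  assert (truncate_count (67) == 3);
  return 0;
}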