From patchwork Thu May 9 11:52:58 2019
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 1097476
From: Robin Dapp
Subject: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask
To: GCC Patches
Date: Thu, 9 May 2019 13:52:58 +0200
Message-Id: <6f3a6c31-5a02-86bb-a659-13f541350616@linux.ibm.com>

Hi,

while trying to improve s390 code generation for rotate and shift
instructions I noticed superfluous subregs for shift count operands.
In our backend we already have quite cumbersome patterns that would
need to be duplicated (or complicated further by more subst patterns)
in order to get rid of the subregs.

I had already finished all the patterns when I realized that
SHIFT_COUNT_TRUNCATED and the target hook shift_truncation_mask
already exist and could do what is needed without extra patterns.
Just defining shift_truncation_mask was not enough, though, as most of
the additional insns get introduced by combine.  Even
SHIFT_COUNT_TRUNCATED is not a perfect match for what our hardware
does, because we only ever consider the last 6 bits of a shift
operand, regardless of the mode.

Despite all the warnings in the other backends, most notably
SHIFT_COUNT_TRUNCATED being "discouraged" as mentioned in riscv.h, I
wrote the attached tentative patch.  It is a little ad hoc: it takes
the SHIFT_COUNT_TRUNCATED paths only if shift_truncation_mask () != 0
and, instead of truncating via & (GET_MODE_BITSIZE (mode) - 1), it
applies the mask returned by shift_truncation_mask.  Doing so, the two
"methods" actually reduce to a single one.  I assume both were
originally intended for different purposes, but without knowing the
history the separation seems artificial to me.

A quick look at other backends showed that at least some (e.g. ARM) do
not use SHIFT_COUNT_TRUNCATED because its behavior is not fine-grained
enough, e.g. the masks for shift and rotate differ.  While the
attached patch might work for s390, it will probably not work for
other targets.

Going beyond what my patch does, would it be useful to unify both
truncation methods in a target hook that takes the operation (shift,
rotate, zero_extract, ...) as well as the mode as arguments?  That way
we would let the target decide what to do with each specific
combination of the two; a sketch of such a hook follows below.  Maybe
this would also allow us to distinguish bit-test operations from the
rest.  Of course, when everything in the backend is done right, the
result is the same either way, but in my experience it is easy to
forget a subreg/... somewhere and end up with worse code by accident.

Maybe there is another reason why SHIFT_COUNT_TRUNCATED is discouraged
that I missed entirely?
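To make the proposal above more concrete, here is a minimal,
hypothetical sketch of what such a unified, per-operation truncation
hook could look like for s390.  It is not part of the attached patch;
the enum, the hook name, and the set of operations are made up for
illustration only:

/* Hypothetical operation kinds a unified truncation hook could
   distinguish; these names do not exist in GCC.  */
enum count_op { COUNT_OP_SHIFT, COUNT_OP_ROTATE, COUNT_OP_ZERO_EXTRACT,
                COUNT_OP_BIT_TEST };

/* Return the mask to apply to a count operand of operation OP in mode
   MODE, or 0 if counts must not be truncated for this combination.  */
static unsigned HOST_WIDE_INT
s390_count_truncation_mask (enum count_op op, machine_mode mode)
{
  if (!SCALAR_INT_MODE_P (mode))
    return 0;
  switch (op)
    {
    case COUNT_OP_SHIFT:
    case COUNT_OP_ROTATE:
      /* s390 shifts and rotates only consider the last 6 bits of the
         count, independently of the mode.  */
      return 63;
    default:
      /* Be conservative for everything else.  */
      return 0;
    }
}

A target whose rotates truncate differently from its shifts (the ARM
case mentioned above) could then return different masks per operation
instead of opting out of truncation altogether.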
Regards
 Robin

diff --git a/gcc/combine.c b/gcc/combine.c
index 91e32c88c88..d2a659f929b 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6445,14 +6445,12 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
 	return simplify_shift_const (x, code, mode, XEXP (x, 0),
 				     INTVAL (XEXP (x, 1)));
 
-      else if (SHIFT_COUNT_TRUNCATED && !REG_P (XEXP (x, 1)))
+      else if (SHIFT_COUNT_TRUNCATED
+	       && targetm.shift_truncation_mask (mode)
+	       && !REG_P (XEXP (x, 1)))
 	SUBST (XEXP (x, 1),
 	       force_to_mode (XEXP (x, 1), GET_MODE (XEXP (x, 1)),
-			      (HOST_WIDE_INT_1U
-			       << exact_log2 (GET_MODE_UNIT_BITSIZE
-					      (GET_MODE (x))))
-			      - 1,
-			      0));
+			      targetm.shift_truncation_mask (mode), 0));
       break;
 
     default:
@@ -10594,8 +10592,8 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
   /* Make sure and truncate the "natural" shift on the way in.  We don't
      want to do this inside the loop as it makes it more difficult to
      combine shifts.  */
-  if (SHIFT_COUNT_TRUNCATED)
-    orig_count &= GET_MODE_UNIT_BITSIZE (mode) - 1;
+  if (SHIFT_COUNT_TRUNCATED && targetm.shift_truncation_mask (mode))
+    orig_count &= targetm.shift_truncation_mask (mode);
 
   /* If we were given an invalid count, don't do anything except exactly
      what was requested.  */
@@ -12295,7 +12293,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
 	     between the position and the location of the single bit.  */
 	  /* Except we can't if SHIFT_COUNT_TRUNCATED is set, since we might
	     have already reduced the shift count modulo the word size.  */
-	  if (!SHIFT_COUNT_TRUNCATED
+	  if ((!SHIFT_COUNT_TRUNCATED || !targetm.shift_truncation_mask (mode))
 	      && CONST_INT_P (XEXP (op0, 0))
 	      && XEXP (op0, 1) == const1_rtx
 	      && equality_comparison_p && const_op == 0
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index ad8eacdf4dc..1d723f29e1e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2320,6 +2320,7 @@ s390_single_part (rtx op,
 	    part = i;
 	}
     }
+
   return part == -1 ? -1 : n_parts - 1 - part;
 }
 
@@ -2702,6 +2703,7 @@ s390_logical_operator_ok_p (rtx *operands)
   return true;
 }
 
+
 /* Narrow logical operation CODE of memory operand MEMOP with immediate
    operand IMMOP to switch from SS to SI type instructions.  */
 
@@ -16294,6 +16296,13 @@ s390_case_values_threshold (void)
   return default_case_values_threshold ();
 }
 
+static unsigned HOST_WIDE_INT
+s390_shift_truncation_mask (machine_mode mode)
+{
+  return (mode == DImode || mode == SImode
+	  || mode == HImode || mode == QImode) ? 63 : 0;
+}
+
 /* Initialize GCC target structure.  */
 
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -16585,6 +16594,9 @@ s390_case_values_threshold (void)
 #undef TARGET_CASE_VALUES_THRESHOLD
 #define TARGET_CASE_VALUES_THRESHOLD s390_case_values_threshold
 
+#undef TARGET_SHIFT_TRUNCATION_MASK
+#define TARGET_SHIFT_TRUNCATION_MASK s390_shift_truncation_mask
+
 /* Use only short displacement, since long displacement is not available for
    the floating point instructions.  */
 #undef TARGET_MAX_ANCHOR_OFFSET
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 969f58a2ba0..d85bfcc5e04 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -1188,5 +1188,6 @@ struct GTY(()) machine_function
 
 #define TARGET_INDIRECT_BRANCH_TABLE s390_indirect_branch_table
 
+#define SHIFT_COUNT_TRUNCATED 1
 
 #endif /* S390_H */
diff --git a/gcc/cse.c b/gcc/cse.c
index 6c9cda16a98..f7bf287e9ef 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -3649,10 +3649,11 @@ fold_rtx (rtx x, rtx_insn *insn)
 	      && (INTVAL (const_arg1) >= GET_MODE_UNIT_PRECISION (mode)
 		  || INTVAL (const_arg1) < 0))
 	    {
-	      if (SHIFT_COUNT_TRUNCATED)
+	      if (SHIFT_COUNT_TRUNCATED
+		  && targetm.shift_truncation_mask (mode))
 		canon_const_arg1 = gen_int_shift_amount
 		  (mode, (INTVAL (const_arg1)
-			  & (GET_MODE_UNIT_BITSIZE (mode) - 1)));
+			  & targetm.shift_truncation_mask (mode)));
 	      else
 		break;
 	    }
@@ -3698,10 +3699,11 @@ fold_rtx (rtx x, rtx_insn *insn)
 	      && (INTVAL (inner_const) >= GET_MODE_UNIT_PRECISION (mode)
 		  || INTVAL (inner_const) < 0))
 	    {
-	      if (SHIFT_COUNT_TRUNCATED)
+	      if (SHIFT_COUNT_TRUNCATED
+		  && targetm.shift_truncation_mask (mode))
 		inner_const = gen_int_shift_amount
 		  (mode, (INTVAL (inner_const)
-			  & (GET_MODE_UNIT_BITSIZE (mode) - 1)));
+			  & targetm.shift_truncation_mask (mode)));
 	      else
 		break;
 	    }
diff --git a/gcc/expmed.c b/gcc/expmed.c
index d7f8e9a5d76..d8eebac5f08 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -2476,11 +2476,14 @@ expand_shift_1 (enum tree_code code, machine_mode mode, rtx shifted,
 		     (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (scalar_mode)))
 	op1 = gen_int_shift_amount (mode,
 				    (unsigned HOST_WIDE_INT) INTVAL (op1)
-				    % GET_MODE_BITSIZE (scalar_mode));
+				    % (targetm.shift_truncation_mask
+				       (scalar_mode) + 1));
       else if (GET_CODE (op1) == SUBREG
 	       && subreg_lowpart_p (op1)
 	       && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (op1)))
-	       && SCALAR_INT_MODE_P (GET_MODE (op1)))
+	       && SCALAR_INT_MODE_P (GET_MODE (op1))
+	       && (targetm.shift_truncation_mask (mode)
+		   == (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (scalar_mode) - 1))
 	op1 = SUBREG_REG (op1);
     }
 
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 89a46a933fa..88064e4ac57 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3525,9 +3525,10 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	  return lowpart_subreg (int_mode, tmp, inner_mode);
 	}
 
-      if (SHIFT_COUNT_TRUNCATED && CONST_INT_P (op1))
+      if (SHIFT_COUNT_TRUNCATED && targetm.shift_truncation_mask (mode)
+	  && CONST_INT_P (op1))
 	{
-	  val = INTVAL (op1) & (GET_MODE_UNIT_PRECISION (mode) - 1);
+	  val = INTVAL (op1) & targetm.shift_truncation_mask (mode);
 	  if (val != INTVAL (op1))
 	    return simplify_gen_binary (code, mode, op0,
 					gen_int_shift_amount (mode, val));
@@ -4347,8 +4348,9 @@ simplify_const_binary_operation (enum rtx_code code, machine_mode mode,
     case ASHIFT:
       {
 	wide_int wop1 = pop1;
-	if (SHIFT_COUNT_TRUNCATED)
-	  wop1 = wi::umod_trunc (wop1, GET_MODE_PRECISION (int_mode));
+	if (SHIFT_COUNT_TRUNCATED && targetm.shift_truncation_mask (int_mode))
+	  wop1 = wi::umod_trunc (wop1,
+				 targetm.shift_truncation_mask (int_mode) + 1);
 	else if (wi::geu_p (wop1, GET_MODE_PRECISION (int_mode)))
 	  return NULL_RTX;
 
@@ -4426,8 +4428,9 @@ simplify_const_binary_operation (enum rtx_code code, machine_mode mode,
       if (CONST_SCALAR_INT_P (op1))
 	{
 	  wide_int shift = rtx_mode_t (op1, mode);
-	  if (SHIFT_COUNT_TRUNCATED)
-	    shift = wi::umod_trunc (shift, GET_MODE_PRECISION (int_mode));
+	  if (SHIFT_COUNT_TRUNCATED && targetm.shift_truncation_mask (int_mode))
+	    shift = wi::umod_trunc (shift,
+				    targetm.shift_truncation_mask (int_mode) + 1);
 	  else if (wi::geu_p (shift, GET_MODE_PRECISION (int_mode)))
 	    return NULL_RTX;
 	  result = wi::to_poly_wide (op0, mode) << shift;
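For reference (not part of the patch): the "& mask" form and the
"% (mask + 1)" form used by the wi::umod_trunc calls above are
equivalent for any unsigned count, because mask + 1 is a power of two.
A small standalone C check of the semantics the 63 mask encodes:

#include <assert.h>

/* Truncate a shift count the way the patch does for s390:
   "count & mask" with mask == 63 equals "count % (mask + 1)".  */
static unsigned long
truncate_count (unsigned long count)
{
  const unsigned long mask = 63;  /* s390_shift_truncation_mask  */
  assert ((count & mask) == count % (mask + 1));
  return count & mask;
}

int
main (void)
{
  /* A shift amount of 67 behaves like a shift amount of 3.  */
  assert (truncate_count (67) == 3);
  return 0;
}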