diff mbox series

forwprop: Fix up rotate pattern matching [PR106523]

Message ID Y8ZuxWBRYJgTleuS@tucnak
State New
Headers show
Series forwprop: Fix up rotate pattern matching [PR106523] | expand

Commit Message

Jakub Jelinek Jan. 17, 2023, 9:47 a.m. UTC
Hi!

The comment above simplify_rotate roughly describes what patterns
are matched into what:
   We are looking for X with unsigned type T with bitsize B, OP being
   +, | or ^, some type T2 wider than T.  For:
   (X << CNT1) OP (X >> CNT2)                           iff CNT1 + CNT2 == B
   ((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2))     iff CNT1 + CNT2 == B
  
   transform these into:
   X r<< CNT1

   Or for:
   (X << Y) OP (X >> (B - Y))
   (X << (int) Y) OP (X >> (int) (B - Y))
   ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
   ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
   (X << Y) | (X >> ((-Y) & (B - 1)))
   (X << (int) Y) | (X >> (int) ((-Y) & (B - 1)))
   ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))

   transform these into (last 2 only if ranger can prove Y < B):
   X r<< Y
      
   Or for:
   (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
   (X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1)))
   ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) (Y & (B - 1)))) \
     | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
      
   transform these into:
   X r<< (Y & (B - 1))
The following testcase shows that 2 of these are problematic.
If T2 is wider than T, then the 2 which yse (-Y) & (B - 1) on one
of the shift counts but Y on the can do something different from
rotate.  E.g.:
__attribute__((noipa)) unsigned char
f7 (unsigned char x, unsigned int y)
{
  unsigned int t = x;
  return (t << y) | (t >> ((-y) & 7));
}
if y is [0, 7], then it is a normal rotate, and if y is in [32, ~0U]
then it is UB, but for y in [9, 31] the left shift in this case
will never leave any bits in the result, while in a rotate they are
left there.  Say for y 5 and x 0xaa the expression gives
0x55 which is the same thing as rotate, while for y 19 and x 0xaa
0x5, which is different.
Now, I believe the
   ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
   ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
forms are ok, because B - Y still needs to be a valid shift count,
and if Y > B then B - Y should be either negative or very large
positive (for unsigned types).
And similarly the last 2 cases above which use & (B - 1) on both
shift operands are definitely ok.

The following patch disables the
   ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
unless ranger says Y is not in [B, B2 - 1] range.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
(for now)?

Aldy/Andrew, is the ranger query ok or should I use something different
when check_range_stmt is non-NULL and I know on which statement to ask?

And, looking at it again this morning, actually the Y equal to B
case is still fine, if Y is equal to 0, then it is
(T) (((T2) X << 0) | ((T2) X >> 0))
and so X, for Y == B it is
(T) (((T2) X << B) | ((T2) X >> 0))
which is the same as
(T) (0 | ((T2) X >> 0))
which is also X.  So instead of the [B, B2 - 1] range we could use
[B + 1, B2 - 1].  And, if we wanted to go further, even multiplies
of B are ok if they are smaller than B2, so we could construct a detailed
int_range_max if we wanted.

2023-01-17  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/106523
	* tree-ssa-forwprop.cc (simplify_rotate): For the
	patterns with (-Y) & (B - 1) in one operand's shift
	count and Y in another, if T2 has wider precision than T,
	punt if Y could have a value in [B, B2 - 1] range.

	* c-c++-common/rotate-2.c (f5, f6, f7, f8, f13, f14, f15, f16,
	f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
	__builtin_unreachable about shift count.
	* c-c++-common/rotate-2b.c: New test.
	* c-c++-common/rotate-4.c (f5, f6, f7, f8, f13, f14, f15, f16,
	f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
	__builtin_unreachable about shift count.
	* c-c++-common/rotate-4b.c: New test.
	* gcc.c-torture/execute/pr106523.c: New test.


	Jakub

Comments

Aldy Hernandez Jan. 17, 2023, 10:59 a.m. UTC | #1
On 1/17/23 10:47, Jakub Jelinek wrote:

> Aldy/Andrew, is the ranger query ok or should I use something different
> when check_range_stmt is non-NULL and I know on which statement to ask?

<snip>

> +	  int_range_max r;
> +	  if (!get_global_range_query ()->range_of_expr (r, rotcnt,
> +							 check_range_stmt))
> +	    return false;

range_of_expr will work with and without a statement.  If no statement 
is provided, it will return the global range.  So you can use the same 
range_of_expr call with a statement or without one if you don't know it.

Note that get_global_range_query () will always return a global query 
object (think SSA_NAME_RANGE_INFO).  It will never use an existing 
ranger (for example, if called within VRP or another pass that has an 
active ranger enabled).  If simplify_rotate() may be used from some of 
these passes you *may* want to use get_range_query() which will pick up 
the active ranger, or a global query object if no ranger is active.

For that matter, since get_global_range_query() uses a global query, it 
really doesn't matter if you pass a statement or not, since our global 
range store has no context (SSA_NAME_RANGE_INFO).  Although, I 
personally always pass the statement if known, because it's good form, 
and if things ever change to an active ranger, everything will just work.

Aldy
Richard Biener Jan. 17, 2023, 11:04 a.m. UTC | #2
On Tue, 17 Jan 2023, Jakub Jelinek wrote:

> Hi!
> 
> The comment above simplify_rotate roughly describes what patterns
> are matched into what:
>    We are looking for X with unsigned type T with bitsize B, OP being
>    +, | or ^, some type T2 wider than T.  For:
>    (X << CNT1) OP (X >> CNT2)                           iff CNT1 + CNT2 == B
>    ((T) ((T2) X << CNT1)) OP ((T) ((T2) X >> CNT2))     iff CNT1 + CNT2 == B
>   
>    transform these into:
>    X r<< CNT1
> 
>    Or for:
>    (X << Y) OP (X >> (B - Y))
>    (X << (int) Y) OP (X >> (int) (B - Y))
>    ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
>    ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
>    (X << Y) | (X >> ((-Y) & (B - 1)))
>    (X << (int) Y) | (X >> (int) ((-Y) & (B - 1)))
>    ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
>    ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
> 
>    transform these into (last 2 only if ranger can prove Y < B):
>    X r<< Y
>       
>    Or for:
>    (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
>    (X << (int) (Y & (B - 1))) | (X >> (int) ((-Y) & (B - 1)))
>    ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
>    ((T) ((T2) X << (int) (Y & (B - 1)))) \
>      | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
>       
>    transform these into:
>    X r<< (Y & (B - 1))
> The following testcase shows that 2 of these are problematic.
> If T2 is wider than T, then the 2 which yse (-Y) & (B - 1) on one
> of the shift counts but Y on the can do something different from
> rotate.  E.g.:
> __attribute__((noipa)) unsigned char
> f7 (unsigned char x, unsigned int y)
> {
>   unsigned int t = x;
>   return (t << y) | (t >> ((-y) & 7));
> }
> if y is [0, 7], then it is a normal rotate, and if y is in [32, ~0U]
> then it is UB, but for y in [9, 31] the left shift in this case
> will never leave any bits in the result, while in a rotate they are
> left there.  Say for y 5 and x 0xaa the expression gives
> 0x55 which is the same thing as rotate, while for y 19 and x 0xaa
> 0x5, which is different.
> Now, I believe the
>    ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
>    ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
> forms are ok, because B - Y still needs to be a valid shift count,
> and if Y > B then B - Y should be either negative or very large
> positive (for unsigned types).
> And similarly the last 2 cases above which use & (B - 1) on both
> shift operands are definitely ok.
> 
> The following patch disables the
>    ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
>    ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
> unless ranger says Y is not in [B, B2 - 1] range.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
> (for now)?

OK.

Richard.

> 
> Aldy/Andrew, is the ranger query ok or should I use something different
> when check_range_stmt is non-NULL and I know on which statement to ask?
> 
> And, looking at it again this morning, actually the Y equal to B
> case is still fine, if Y is equal to 0, then it is
> (T) (((T2) X << 0) | ((T2) X >> 0))
> and so X, for Y == B it is
> (T) (((T2) X << B) | ((T2) X >> 0))
> which is the same as
> (T) (0 | ((T2) X >> 0))
> which is also X.  So instead of the [B, B2 - 1] range we could use
> [B + 1, B2 - 1].  And, if we wanted to go further, even multiplies
> of B are ok if they are smaller than B2, so we could construct a detailed
> int_range_max if we wanted.
> 
> 2023-01-17  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/106523
> 	* tree-ssa-forwprop.cc (simplify_rotate): For the
> 	patterns with (-Y) & (B - 1) in one operand's shift
> 	count and Y in another, if T2 has wider precision than T,
> 	punt if Y could have a value in [B, B2 - 1] range.
> 
> 	* c-c++-common/rotate-2.c (f5, f6, f7, f8, f13, f14, f15, f16,
> 	f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
> 	__builtin_unreachable about shift count.
> 	* c-c++-common/rotate-2b.c: New test.
> 	* c-c++-common/rotate-4.c (f5, f6, f7, f8, f13, f14, f15, f16,
> 	f37, f38, f39, f40, f45, f46, f47, f48): Add assertions using
> 	__builtin_unreachable about shift count.
> 	* c-c++-common/rotate-4b.c: New test.
> 	* gcc.c-torture/execute/pr106523.c: New test.
> 
> --- gcc/tree-ssa-forwprop.cc.jj	2023-01-02 09:32:26.000000000 +0100
> +++ gcc/tree-ssa-forwprop.cc	2023-01-16 18:18:43.524443879 +0100
> @@ -1837,7 +1837,7 @@ defcodefor_name (tree name, enum tree_co
>     ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
>     ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
>  
> -   transform these into:
> +   transform these into (last 2 only if ranger can prove Y < B):
>     X r<< Y
>  
>     Or for:
> @@ -1866,6 +1866,8 @@ simplify_rotate (gimple_stmt_iterator *g
>    int i;
>    bool swapped_p = false;
>    gimple *g;
> +  gimple *def_arg_stmt[2] = { NULL, NULL };
> +  int wider_prec = 0;
>  
>    arg[0] = gimple_assign_rhs1 (stmt);
>    arg[1] = gimple_assign_rhs2 (stmt);
> @@ -1878,7 +1880,11 @@ simplify_rotate (gimple_stmt_iterator *g
>      return false;
>  
>    for (i = 0; i < 2; i++)
> -    defcodefor_name (arg[i], &def_code[i], &def_arg1[i], &def_arg2[i]);
> +    {
> +      defcodefor_name (arg[i], &def_code[i], &def_arg1[i], &def_arg2[i]);
> +      if (TREE_CODE (arg[i]) == SSA_NAME)
> +	def_arg_stmt[i] = SSA_NAME_DEF_STMT (arg[i]);
> +    }
>  
>    /* Look through narrowing (or same precision) conversions.  */
>    if (CONVERT_EXPR_CODE_P (def_code[0])
> @@ -1891,10 +1897,13 @@ simplify_rotate (gimple_stmt_iterator *g
>        && has_single_use (arg[0])
>        && has_single_use (arg[1]))
>      {
> +      wider_prec = TYPE_PRECISION (TREE_TYPE (def_arg1[0]));
>        for (i = 0; i < 2; i++)
>  	{
>  	  arg[i] = def_arg1[i];
>  	  defcodefor_name (arg[i], &def_code[i], &def_arg1[i], &def_arg2[i]);
> +	  if (TREE_CODE (arg[i]) == SSA_NAME)
> +	    def_arg_stmt[i] = SSA_NAME_DEF_STMT (arg[i]);
>  	}
>      }
>    else
> @@ -1910,6 +1919,8 @@ simplify_rotate (gimple_stmt_iterator *g
>  	{
>  	  arg[i] = def_arg1[i];
>  	  defcodefor_name (arg[i], &def_code[i], &def_arg1[i], &def_arg2[i]);
> +	  if (TREE_CODE (arg[i]) == SSA_NAME)
> +	    def_arg_stmt[i] = SSA_NAME_DEF_STMT (arg[i]);
>  	}
>      }
>  
> @@ -1983,6 +1994,9 @@ simplify_rotate (gimple_stmt_iterator *g
>      {
>        tree cdef_arg1[2], cdef_arg2[2], def_arg2_alt[2];
>        enum tree_code cdef_code[2];
> +      gimple *def_arg_alt_stmt[2] = { NULL, NULL };
> +      bool check_range = false;
> +      gimple *check_range_stmt = NULL;
>        /* Look through conversion of the shift count argument.
>  	 The C/C++ FE cast any shift count argument to integer_type_node.
>  	 The only problem might be if the shift count type maximum value
> @@ -1999,9 +2013,13 @@ simplify_rotate (gimple_stmt_iterator *g
>  	      && type_has_mode_precision_p (TREE_TYPE (cdef_arg1[i])))
>  	    {
>  	      def_arg2_alt[i] = cdef_arg1[i];
> +	      if (TREE_CODE (def_arg2[i]) == SSA_NAME)
> +		def_arg_alt_stmt[i] = SSA_NAME_DEF_STMT (def_arg2[i]);
>  	      defcodefor_name (def_arg2_alt[i], &cdef_code[i],
>  			       &cdef_arg1[i], &cdef_arg2[i]);
>  	    }
> +	  else
> +	    def_arg_alt_stmt[i] = def_arg_stmt[i];
>  	}
>        for (i = 0; i < 2; i++)
>  	/* Check for one shift count being Y and the other B - Y,
> @@ -2024,7 +2042,7 @@ simplify_rotate (gimple_stmt_iterator *g
>  	    if (CONVERT_EXPR_CODE_P (code)
>  		&& INTEGRAL_TYPE_P (TREE_TYPE (tem))
>  		&& TYPE_PRECISION (TREE_TYPE (tem))
> -		 > floor_log2 (TYPE_PRECISION (rtype))
> +		   > floor_log2 (TYPE_PRECISION (rtype))
>  		&& type_has_mode_precision_p (TREE_TYPE (tem))
>  		&& (tem == def_arg2[1 - i]
>  		    || tem == def_arg2_alt[1 - i]))
> @@ -2053,7 +2071,7 @@ simplify_rotate (gimple_stmt_iterator *g
>  	    if (CONVERT_EXPR_CODE_P (code)
>  		&& INTEGRAL_TYPE_P (TREE_TYPE (tem))
>  		&& TYPE_PRECISION (TREE_TYPE (tem))
> -		 > floor_log2 (TYPE_PRECISION (rtype))
> +		   > floor_log2 (TYPE_PRECISION (rtype))
>  		&& type_has_mode_precision_p (TREE_TYPE (tem)))
>  	      defcodefor_name (tem, &code, &tem, NULL);
>  
> @@ -2062,6 +2080,11 @@ simplify_rotate (gimple_stmt_iterator *g
>  		if (tem == def_arg2[1 - i] || tem == def_arg2_alt[1 - i])
>  		  {
>  		    rotcnt = tem;
> +		    check_range = true;
> +		    if (tem == def_arg2[1 - i])
> +		      check_range_stmt = def_arg_stmt[1 - i];
> +		    else
> +		      check_range_stmt = def_arg_alt_stmt[1 - i];
>  		    break;
>  		  }
>  		tree tem2;
> @@ -2076,6 +2099,11 @@ simplify_rotate (gimple_stmt_iterator *g
>  			|| tem2 == def_arg2_alt[1 - i])
>  		      {
>  			rotcnt = tem2;
> +			check_range = true;
> +			if (tem2 == def_arg2[1 - i])
> +			  check_range_stmt = def_arg_stmt[1 - i];
> +			else
> +			  check_range_stmt = def_arg_alt_stmt[1 - i];
>  			break;
>  		      }
>  		  }
> @@ -2111,6 +2139,23 @@ simplify_rotate (gimple_stmt_iterator *g
>  		  }
>  	      }
>  	  }
> +      if (check_range && wider_prec > TYPE_PRECISION (rtype))
> +	{
> +	  if (TREE_CODE (rotcnt) != SSA_NAME)
> +	    return false;
> +	  int_range_max r;
> +	  if (!get_global_range_query ()->range_of_expr (r, rotcnt,
> +							 check_range_stmt))
> +	    return false;
> +	  int prec = TYPE_PRECISION (TREE_TYPE (rotcnt));
> +	  signop sign = TYPE_SIGN (TREE_TYPE (rotcnt));
> +	  wide_int min = wide_int::from (TYPE_PRECISION (rtype), prec, sign);
> +	  wide_int max = wide_int::from (wider_prec - 1, prec, sign);
> +	  int_range<2> r2 (TREE_TYPE (rotcnt), min, max);
> +	  r.intersect (r2);
> +	  if (!r.undefined_p ())
> +	    return false;
> +	}
>        if (rotcnt == NULL_TREE)
>  	return false;
>        swapped_p = i != 1;
> --- gcc/testsuite/c-c++-common/rotate-2.c.jj	2020-01-12 11:54:37.023404206 +0100
> +++ gcc/testsuite/c-c++-common/rotate-2.c	2023-01-16 17:53:49.459329241 +0100
> @@ -32,24 +32,32 @@ f4 (unsigned int x, int y __attribute__(
>  unsigned short int
>  f5 (unsigned short int x, unsigned int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
>  }
>  
>  unsigned short int
>  f6 (unsigned short int x, unsigned long int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
>  }
>  
>  unsigned char
>  f7 (unsigned char x, unsigned int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
>  }
>  
>  unsigned char
>  f8 (unsigned char x, unsigned long int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
>  }
>  
> @@ -80,24 +88,32 @@ f12 (unsigned int x, int y __attribute__
>  unsigned short int
>  f13 (unsigned short int x, unsigned int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
>  }
>  
>  unsigned short int
>  f14 (unsigned short int x, unsigned long int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
>  }
>  
>  unsigned char
>  f15 (unsigned char x, unsigned int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
>  }
>  
>  unsigned char
>  f16 (unsigned char x, unsigned long int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
>  }
>  
> @@ -224,24 +240,32 @@ f36 (unsigned int x, int y __attribute__
>  unsigned short int
>  f37 (unsigned short int x, unsigned int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
>  }
>  
>  unsigned short int
>  f38 (unsigned short int x, unsigned long int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
>  }
>  
>  unsigned char
>  f39 (unsigned char x, unsigned int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
>  }
>  
>  unsigned char
>  f40 (unsigned char x, unsigned long int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
>  }
>  
> @@ -272,24 +296,32 @@ f44 (unsigned int x, int y __attribute__
>  unsigned short int
>  f45 (unsigned short int x, unsigned int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
>  }
>  
>  unsigned short int
>  f46 (unsigned short int x, unsigned long int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
>  }
>  
>  unsigned char
>  f47 (unsigned char x, unsigned int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
>  }
>  
>  unsigned char
>  f48 (unsigned char x, unsigned long int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
>  }
>  
> --- gcc/testsuite/c-c++-common/rotate-2b.c.jj	2023-01-16 18:25:08.359787649 +0100
> +++ gcc/testsuite/c-c++-common/rotate-2b.c	2023-01-16 18:25:04.078850569 +0100
> @@ -0,0 +1,100 @@
> +/* Check rotate pattern detection.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "r\[<>]\[<>]" "optimized" } } */
> +
> +unsigned short int
> +f5 (unsigned short int x, unsigned int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
> +}
> +
> +unsigned short int
> +f6 (unsigned short int x, unsigned long int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
> +}
> +
> +unsigned char
> +f7 (unsigned char x, unsigned int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
> +}
> +
> +unsigned char
> +f8 (unsigned char x, unsigned long int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
> +}
> +
> +unsigned short int
> +f13 (unsigned short int x, unsigned int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
> +}
> +
> +unsigned short int
> +f14 (unsigned short int x, unsigned long int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
> +}
> +
> +unsigned char
> +f15 (unsigned char x, unsigned int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
> +}
> +
> +unsigned char
> +f16 (unsigned char x, unsigned long int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
> +}
> +
> +unsigned short int
> +f37 (unsigned short int x, unsigned int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
> +}
> +
> +unsigned short int
> +f38 (unsigned short int x, unsigned long int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
> +}
> +
> +unsigned char
> +f39 (unsigned char x, unsigned int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
> +}
> +
> +unsigned char
> +f40 (unsigned char x, unsigned long int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
> +}
> +
> +unsigned short int
> +f45 (unsigned short int x, unsigned int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
> +}
> +
> +unsigned short int
> +f46 (unsigned short int x, unsigned long int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
> +}
> +
> +unsigned char
> +f47 (unsigned char x, unsigned int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
> +}
> +
> +unsigned char
> +f48 (unsigned char x, unsigned long int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
> +}
> --- gcc/testsuite/c-c++-common/rotate-4.c.jj	2020-01-12 11:54:37.023404206 +0100
> +++ gcc/testsuite/c-c++-common/rotate-4.c	2023-01-16 18:01:04.161979907 +0100
> @@ -32,24 +32,32 @@ f4 (unsigned int x, int y __attribute__(
>  unsigned short int
>  f5 (unsigned short int x, int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
>  }
>  
>  unsigned short int
>  f6 (unsigned short int x, long int y)
>  {
> +  if (y >= 0UL + __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
>  }
>  
>  unsigned char
>  f7 (unsigned char x, int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
>  }
>  
>  unsigned char
>  f8 (unsigned char x, long int y)
>  {
> +  if (y >= 0UL + __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
>  }
>  
> @@ -80,24 +88,32 @@ f12 (unsigned int x, int y __attribute__
>  unsigned short int
>  f13 (unsigned short int x, int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
>  }
>  
>  unsigned short int
>  f14 (unsigned short int x, long int y)
>  {
> +  if (y >= 0UL + __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
>  }
>  
>  unsigned char
>  f15 (unsigned char x, int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
>  }
>  
>  unsigned char
>  f16 (unsigned char x, long int y)
>  {
> +  if (y >= 0UL + __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
>  }
>  
> @@ -224,24 +240,32 @@ f36 (unsigned int x, int y __attribute__
>  unsigned short int
>  f37 (unsigned short int x, int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
>  }
>  
>  unsigned short int
>  f38 (unsigned short int x, long int y)
>  {
> +  if (y >= 0UL + __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
>  }
>  
>  unsigned char
>  f39 (unsigned char x, int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
>  }
>  
>  unsigned char
>  f40 (unsigned char x, long int y)
>  {
> +  if (y >= 0UL + __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
>  }
>  
> @@ -272,24 +296,32 @@ f44 (unsigned int x, int y __attribute__
>  unsigned short int
>  f45 (unsigned short int x, int y)
>  {
> +  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
>  }
>  
>  unsigned short int
>  f46 (unsigned short int x, long int y)
>  {
> +  if (y >= 0UL + __CHAR_BIT__ * __SIZEOF_SHORT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
>  }
>  
>  unsigned char
>  f47 (unsigned char x, int y)
>  {
> +  if (y >= __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
>  }
>  
>  unsigned char
>  f48 (unsigned char x, long int y)
>  {
> +  if (y >= 0UL + __CHAR_BIT__)
> +    __builtin_unreachable ();
>    return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
>  }
>  
> --- gcc/testsuite/c-c++-common/rotate-4b.c.jj	2023-01-16 18:25:19.645621784 +0100
> +++ gcc/testsuite/c-c++-common/rotate-4b.c	2023-01-16 18:27:26.237761150 +0100
> @@ -0,0 +1,100 @@
> +/* Check rotate pattern detection.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "r\[<>]\[<>]" "optimized" } } */
> +
> +unsigned short int
> +f5 (unsigned short int x, int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
> +}
> +
> +unsigned short int
> +f6 (unsigned short int x, long int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
> +}
> +
> +unsigned char
> +f7 (unsigned char x, int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
> +}
> +
> +unsigned char
> +f8 (unsigned char x, long int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
> +}
> +
> +unsigned short int
> +f13 (unsigned short int x, int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
> +}
> +
> +unsigned short int
> +f14 (unsigned short int x, long int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
> +}
> +
> +unsigned char
> +f15 (unsigned char x, int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
> +}
> +
> +unsigned char
> +f16 (unsigned char x, long int y)
> +{
> +  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
> +}
> +
> +unsigned short int
> +f37 (unsigned short int x, int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
> +}
> +
> +unsigned short int
> +f38 (unsigned short int x, long int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
> +}
> +
> +unsigned char
> +f39 (unsigned char x, int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
> +}
> +
> +unsigned char
> +f40 (unsigned char x, long int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
> +}
> +
> +unsigned short int
> +f45 (unsigned short int x, int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
> +}
> +
> +unsigned short int
> +f46 (unsigned short int x, long int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
> +}
> +
> +unsigned char
> +f47 (unsigned char x, int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
> +}
> +
> +unsigned char
> +f48 (unsigned char x, long int y)
> +{
> +  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
> +}
> --- gcc/testsuite/gcc.c-torture/execute/pr106523.c.jj	2023-01-16 18:37:42.208726762 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr106523.c	2023-01-16 18:37:31.898877393 +0100
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/106523 */
> +
> +__attribute__((noipa)) unsigned char
> +f7 (unsigned char x, unsigned int y)
> +{
> +  unsigned int t = x;
> +  return (t << y) | (t >> ((-y) & 7));
> +}
> +
> +int
> +main ()
> +{
> +  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4)
> +    return 0;
> +
> +  volatile unsigned char x = 152;
> +  volatile unsigned int y = 19;
> +  if (f7 (x, y) != 4)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> 
> 	Jakub
> 
>
Jakub Jelinek Jan. 17, 2023, 11:09 a.m. UTC | #3
On Tue, Jan 17, 2023 at 11:59:53AM +0100, Aldy Hernandez wrote:
> 
> 
> On 1/17/23 10:47, Jakub Jelinek wrote:
> 
> > Aldy/Andrew, is the ranger query ok or should I use something different
> > when check_range_stmt is non-NULL and I know on which statement to ask?
> 
> <snip>
> 
> > +	  int_range_max r;
> > +	  if (!get_global_range_query ()->range_of_expr (r, rotcnt,
> > +							 check_range_stmt))
> > +	    return false;
> 
> range_of_expr will work with and without a statement.  If no statement is
> provided, it will return the global range.  So you can use the same
> range_of_expr call with a statement or without one if you don't know it.
> 
> Note that get_global_range_query () will always return a global query object
> (think SSA_NAME_RANGE_INFO).  It will never use an existing ranger (for
> example, if called within VRP or another pass that has an active ranger
> enabled).  If simplify_rotate() may be used from some of these passes you
> *may* want to use get_range_query() which will pick up the active ranger, or
> a global query object if no ranger is active.

This is always in the forwprop pass.
I think it doesn't have any active ranger instance, but I could be wrong.

A question would be if it would be worth to activate it in this spot lazily
if it isn't active yet (and destruct at the end of the pass).

> For that matter, since get_global_range_query() uses a global query, it
> really doesn't matter if you pass a statement or not, since our global range
> store has no context (SSA_NAME_RANGE_INFO).  Although, I personally always
> pass the statement if known, because it's good form, and if things ever
> change to an active ranger, everything will just work.

	Jakub
Aldy Hernandez Jan. 17, 2023, 11:14 a.m. UTC | #4
On 1/17/23 12:09, Jakub Jelinek wrote:
> On Tue, Jan 17, 2023 at 11:59:53AM +0100, Aldy Hernandez wrote:
>>
>>
>> On 1/17/23 10:47, Jakub Jelinek wrote:
>>
>>> Aldy/Andrew, is the ranger query ok or should I use something different
>>> when check_range_stmt is non-NULL and I know on which statement to ask?
>>
>> <snip>
>>
>>> +	  int_range_max r;
>>> +	  if (!get_global_range_query ()->range_of_expr (r, rotcnt,
>>> +							 check_range_stmt))
>>> +	    return false;
>>
>> range_of_expr will work with and without a statement.  If no statement is
>> provided, it will return the global range.  So you can use the same
>> range_of_expr call with a statement or without one if you don't know it.
>>
>> Note that get_global_range_query () will always return a global query object
>> (think SSA_NAME_RANGE_INFO).  It will never use an existing ranger (for
>> example, if called within VRP or another pass that has an active ranger
>> enabled).  If simplify_rotate() may be used from some of these passes you
>> *may* want to use get_range_query() which will pick up the active ranger, or
>> a global query object if no ranger is active.
> 
> This is always in the forwprop pass.
> I think it doesn't have any active ranger instance, but I could be wrong.
> 
> A question would be if it would be worth to activate it in this spot lazily
> if it isn't active yet (and destruct at the end of the pass).

That's what it was designed for :).  If you're making sporadic requests, 
the on-demand mechanism should be fast enough.

Aldy
Jakub Jelinek Jan. 17, 2023, 11:19 a.m. UTC | #5
On Tue, Jan 17, 2023 at 12:14:14PM +0100, Aldy Hernandez wrote:
> > A question would be if it would be worth to activate it in this spot lazily
> > if it isn't active yet (and destruct at the end of the pass).
> 
> That's what it was designed for :).  If you're making sporadic requests, the
> on-demand mechanism should be fast enough.

So what should be done to do the on-demand query rather than global one?

	Jakub
Aldy Hernandez Jan. 17, 2023, 11:22 a.m. UTC | #6
On 1/17/23 12:19, Jakub Jelinek wrote:
> On Tue, Jan 17, 2023 at 12:14:14PM +0100, Aldy Hernandez wrote:
>>> A question would be if it would be worth to activate it in this spot lazily
>>> if it isn't active yet (and destruct at the end of the pass).
>>
>> That's what it was designed for :).  If you're making sporadic requests, the
>> on-demand mechanism should be fast enough.
> 
> So what should be done to do the on-demand query rather than global one?

gimple_ranger ranger;
if (ranger.range_of_expr (r, .....))
    // business as usual
Jakub Jelinek Jan. 17, 2023, 11:33 a.m. UTC | #7
On Tue, Jan 17, 2023 at 12:22:42PM +0100, Aldy Hernandez wrote:
> 
> 
> On 1/17/23 12:19, Jakub Jelinek wrote:
> > On Tue, Jan 17, 2023 at 12:14:14PM +0100, Aldy Hernandez wrote:
> > > > A question would be if it would be worth to activate it in this spot lazily
> > > > if it isn't active yet (and destruct at the end of the pass).
> > > 
> > > That's what it was designed for :).  If you're making sporadic requests, the
> > > on-demand mechanism should be fast enough.
> > 
> > So what should be done to do the on-demand query rather than global one?
> 
> gimple_ranger ranger;
> if (ranger.range_of_expr (r, .....))
>    // business as usual

So not worth making the ranger somewhere in the pass (if it is really
sporadic like this one)?

Will test together with the removal of B from the range.

	Jakub
Aldy Hernandez Jan. 17, 2023, 11:44 a.m. UTC | #8
On 1/17/23 12:33, Jakub Jelinek wrote:
> On Tue, Jan 17, 2023 at 12:22:42PM +0100, Aldy Hernandez wrote:
>>
>>
>> On 1/17/23 12:19, Jakub Jelinek wrote:
>>> On Tue, Jan 17, 2023 at 12:14:14PM +0100, Aldy Hernandez wrote:
>>>>> A question would be if it would be worth to activate it in this spot lazily
>>>>> if it isn't active yet (and destruct at the end of the pass).
>>>>
>>>> That's what it was designed for :).  If you're making sporadic requests, the
>>>> on-demand mechanism should be fast enough.
>>>
>>> So what should be done to do the on-demand query rather than global one?
>>
>> gimple_ranger ranger;
>> if (ranger.range_of_expr (r, .....))
>>     // business as usual
> 
> So not worth making the ranger somewhere in the pass (if it is really
> sporadic like this one)?

If you're going to call range_of_expr various times within a pass, 
creating a pass instance would be the way to go.

// Early in your pass.
enable_ranger (func);
...
if (get_range_query->range_of_expr (....))
   { stuff }

// Late in your pass:
disable_ranger (func);

Note that get_range_query() would work even without enable_ranger...it 
would just pick up the global ranger (SSA_NAME_RANGE_INFO).

Aldy

> 
> Will test together with the removal of B from the range.
> 
> 	Jakub
>
diff mbox series

Patch

--- gcc/tree-ssa-forwprop.cc.jj	2023-01-02 09:32:26.000000000 +0100
+++ gcc/tree-ssa-forwprop.cc	2023-01-16 18:18:43.524443879 +0100
@@ -1837,7 +1837,7 @@  defcodefor_name (tree name, enum tree_co
    ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
    ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
 
-   transform these into:
+   transform these into (last 2 only if ranger can prove Y < B):
    X r<< Y
 
    Or for:
@@ -1866,6 +1866,8 @@  simplify_rotate (gimple_stmt_iterator *g
   int i;
   bool swapped_p = false;
   gimple *g;
+  gimple *def_arg_stmt[2] = { NULL, NULL };
+  int wider_prec = 0;
 
   arg[0] = gimple_assign_rhs1 (stmt);
   arg[1] = gimple_assign_rhs2 (stmt);
@@ -1878,7 +1880,11 @@  simplify_rotate (gimple_stmt_iterator *g
     return false;
 
   for (i = 0; i < 2; i++)
-    defcodefor_name (arg[i], &def_code[i], &def_arg1[i], &def_arg2[i]);
+    {
+      defcodefor_name (arg[i], &def_code[i], &def_arg1[i], &def_arg2[i]);
+      if (TREE_CODE (arg[i]) == SSA_NAME)
+	def_arg_stmt[i] = SSA_NAME_DEF_STMT (arg[i]);
+    }
 
   /* Look through narrowing (or same precision) conversions.  */
   if (CONVERT_EXPR_CODE_P (def_code[0])
@@ -1891,10 +1897,13 @@  simplify_rotate (gimple_stmt_iterator *g
       && has_single_use (arg[0])
       && has_single_use (arg[1]))
     {
+      wider_prec = TYPE_PRECISION (TREE_TYPE (def_arg1[0]));
       for (i = 0; i < 2; i++)
 	{
 	  arg[i] = def_arg1[i];
 	  defcodefor_name (arg[i], &def_code[i], &def_arg1[i], &def_arg2[i]);
+	  if (TREE_CODE (arg[i]) == SSA_NAME)
+	    def_arg_stmt[i] = SSA_NAME_DEF_STMT (arg[i]);
 	}
     }
   else
@@ -1910,6 +1919,8 @@  simplify_rotate (gimple_stmt_iterator *g
 	{
 	  arg[i] = def_arg1[i];
 	  defcodefor_name (arg[i], &def_code[i], &def_arg1[i], &def_arg2[i]);
+	  if (TREE_CODE (arg[i]) == SSA_NAME)
+	    def_arg_stmt[i] = SSA_NAME_DEF_STMT (arg[i]);
 	}
     }
 
@@ -1983,6 +1994,9 @@  simplify_rotate (gimple_stmt_iterator *g
     {
       tree cdef_arg1[2], cdef_arg2[2], def_arg2_alt[2];
       enum tree_code cdef_code[2];
+      gimple *def_arg_alt_stmt[2] = { NULL, NULL };
+      bool check_range = false;
+      gimple *check_range_stmt = NULL;
       /* Look through conversion of the shift count argument.
 	 The C/C++ FE cast any shift count argument to integer_type_node.
 	 The only problem might be if the shift count type maximum value
@@ -1999,9 +2013,13 @@  simplify_rotate (gimple_stmt_iterator *g
 	      && type_has_mode_precision_p (TREE_TYPE (cdef_arg1[i])))
 	    {
 	      def_arg2_alt[i] = cdef_arg1[i];
+	      if (TREE_CODE (def_arg2[i]) == SSA_NAME)
+		def_arg_alt_stmt[i] = SSA_NAME_DEF_STMT (def_arg2[i]);
 	      defcodefor_name (def_arg2_alt[i], &cdef_code[i],
 			       &cdef_arg1[i], &cdef_arg2[i]);
 	    }
+	  else
+	    def_arg_alt_stmt[i] = def_arg_stmt[i];
 	}
       for (i = 0; i < 2; i++)
 	/* Check for one shift count being Y and the other B - Y,
@@ -2024,7 +2042,7 @@  simplify_rotate (gimple_stmt_iterator *g
 	    if (CONVERT_EXPR_CODE_P (code)
 		&& INTEGRAL_TYPE_P (TREE_TYPE (tem))
 		&& TYPE_PRECISION (TREE_TYPE (tem))
-		 > floor_log2 (TYPE_PRECISION (rtype))
+		   > floor_log2 (TYPE_PRECISION (rtype))
 		&& type_has_mode_precision_p (TREE_TYPE (tem))
 		&& (tem == def_arg2[1 - i]
 		    || tem == def_arg2_alt[1 - i]))
@@ -2053,7 +2071,7 @@  simplify_rotate (gimple_stmt_iterator *g
 	    if (CONVERT_EXPR_CODE_P (code)
 		&& INTEGRAL_TYPE_P (TREE_TYPE (tem))
 		&& TYPE_PRECISION (TREE_TYPE (tem))
-		 > floor_log2 (TYPE_PRECISION (rtype))
+		   > floor_log2 (TYPE_PRECISION (rtype))
 		&& type_has_mode_precision_p (TREE_TYPE (tem)))
 	      defcodefor_name (tem, &code, &tem, NULL);
 
@@ -2062,6 +2080,11 @@  simplify_rotate (gimple_stmt_iterator *g
 		if (tem == def_arg2[1 - i] || tem == def_arg2_alt[1 - i])
 		  {
 		    rotcnt = tem;
+		    check_range = true;
+		    if (tem == def_arg2[1 - i])
+		      check_range_stmt = def_arg_stmt[1 - i];
+		    else
+		      check_range_stmt = def_arg_alt_stmt[1 - i];
 		    break;
 		  }
 		tree tem2;
@@ -2076,6 +2099,11 @@  simplify_rotate (gimple_stmt_iterator *g
 			|| tem2 == def_arg2_alt[1 - i])
 		      {
 			rotcnt = tem2;
+			check_range = true;
+			if (tem2 == def_arg2[1 - i])
+			  check_range_stmt = def_arg_stmt[1 - i];
+			else
+			  check_range_stmt = def_arg_alt_stmt[1 - i];
 			break;
 		      }
 		  }
@@ -2111,6 +2139,23 @@  simplify_rotate (gimple_stmt_iterator *g
 		  }
 	      }
 	  }
+      if (check_range && wider_prec > TYPE_PRECISION (rtype))
+	{
+	  if (TREE_CODE (rotcnt) != SSA_NAME)
+	    return false;
+	  int_range_max r;
+	  if (!get_global_range_query ()->range_of_expr (r, rotcnt,
+							 check_range_stmt))
+	    return false;
+	  int prec = TYPE_PRECISION (TREE_TYPE (rotcnt));
+	  signop sign = TYPE_SIGN (TREE_TYPE (rotcnt));
+	  wide_int min = wide_int::from (TYPE_PRECISION (rtype), prec, sign);
+	  wide_int max = wide_int::from (wider_prec - 1, prec, sign);
+	  int_range<2> r2 (TREE_TYPE (rotcnt), min, max);
+	  r.intersect (r2);
+	  if (!r.undefined_p ())
+	    return false;
+	}
       if (rotcnt == NULL_TREE)
 	return false;
       swapped_p = i != 1;
--- gcc/testsuite/c-c++-common/rotate-2.c.jj	2020-01-12 11:54:37.023404206 +0100
+++ gcc/testsuite/c-c++-common/rotate-2.c	2023-01-16 17:53:49.459329241 +0100
@@ -32,24 +32,32 @@  f4 (unsigned int x, int y __attribute__(
 unsigned short int
 f5 (unsigned short int x, unsigned int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
 }
 
 unsigned short int
 f6 (unsigned short int x, unsigned long int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
 }
 
 unsigned char
 f7 (unsigned char x, unsigned int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
 }
 
 unsigned char
 f8 (unsigned char x, unsigned long int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
 }
 
@@ -80,24 +88,32 @@  f12 (unsigned int x, int y __attribute__
 unsigned short int
 f13 (unsigned short int x, unsigned int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
 }
 
 unsigned short int
 f14 (unsigned short int x, unsigned long int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
 }
 
 unsigned char
 f15 (unsigned char x, unsigned int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
 }
 
 unsigned char
 f16 (unsigned char x, unsigned long int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
 }
 
@@ -224,24 +240,32 @@  f36 (unsigned int x, int y __attribute__
 unsigned short int
 f37 (unsigned short int x, unsigned int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
 }
 
 unsigned short int
 f38 (unsigned short int x, unsigned long int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
 }
 
 unsigned char
 f39 (unsigned char x, unsigned int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
 }
 
 unsigned char
 f40 (unsigned char x, unsigned long int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
 }
 
@@ -272,24 +296,32 @@  f44 (unsigned int x, int y __attribute__
 unsigned short int
 f45 (unsigned short int x, unsigned int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
 }
 
 unsigned short int
 f46 (unsigned short int x, unsigned long int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
 }
 
 unsigned char
 f47 (unsigned char x, unsigned int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
 }
 
 unsigned char
 f48 (unsigned char x, unsigned long int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
 }
 
--- gcc/testsuite/c-c++-common/rotate-2b.c.jj	2023-01-16 18:25:08.359787649 +0100
+++ gcc/testsuite/c-c++-common/rotate-2b.c	2023-01-16 18:25:04.078850569 +0100
@@ -0,0 +1,100 @@ 
+/* Check rotate pattern detection.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "r\[<>]\[<>]" "optimized" } } */
+
+unsigned short int
+f5 (unsigned short int x, unsigned int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f6 (unsigned short int x, unsigned long int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f7 (unsigned char x, unsigned int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f8 (unsigned char x, unsigned long int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned short int
+f13 (unsigned short int x, unsigned int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f14 (unsigned short int x, unsigned long int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f15 (unsigned char x, unsigned int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f16 (unsigned char x, unsigned long int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned short int
+f37 (unsigned short int x, unsigned int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f38 (unsigned short int x, unsigned long int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f39 (unsigned char x, unsigned int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f40 (unsigned char x, unsigned long int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned short int
+f45 (unsigned short int x, unsigned int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f46 (unsigned short int x, unsigned long int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f47 (unsigned char x, unsigned int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f48 (unsigned char x, unsigned long int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
--- gcc/testsuite/c-c++-common/rotate-4.c.jj	2020-01-12 11:54:37.023404206 +0100
+++ gcc/testsuite/c-c++-common/rotate-4.c	2023-01-16 18:01:04.161979907 +0100
@@ -32,24 +32,32 @@  f4 (unsigned int x, int y __attribute__(
 unsigned short int
 f5 (unsigned short int x, int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
 }
 
 unsigned short int
 f6 (unsigned short int x, long int y)
 {
+  if (y >= 0UL + __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
 }
 
 unsigned char
 f7 (unsigned char x, int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
 }
 
 unsigned char
 f8 (unsigned char x, long int y)
 {
+  if (y >= 0UL + __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
 }
 
@@ -80,24 +88,32 @@  f12 (unsigned int x, int y __attribute__
 unsigned short int
 f13 (unsigned short int x, int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
 }
 
 unsigned short int
 f14 (unsigned short int x, long int y)
 {
+  if (y >= 0UL + __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
 }
 
 unsigned char
 f15 (unsigned char x, int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
 }
 
 unsigned char
 f16 (unsigned char x, long int y)
 {
+  if (y >= 0UL + __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
 }
 
@@ -224,24 +240,32 @@  f36 (unsigned int x, int y __attribute__
 unsigned short int
 f37 (unsigned short int x, int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
 }
 
 unsigned short int
 f38 (unsigned short int x, long int y)
 {
+  if (y >= 0UL + __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
 }
 
 unsigned char
 f39 (unsigned char x, int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
 }
 
 unsigned char
 f40 (unsigned char x, long int y)
 {
+  if (y >= 0UL + __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
 }
 
@@ -272,24 +296,32 @@  f44 (unsigned int x, int y __attribute__
 unsigned short int
 f45 (unsigned short int x, int y)
 {
+  if (y >= __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
 }
 
 unsigned short int
 f46 (unsigned short int x, long int y)
 {
+  if (y >= 0UL + __CHAR_BIT__ * __SIZEOF_SHORT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
 }
 
 unsigned char
 f47 (unsigned char x, int y)
 {
+  if (y >= __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
 }
 
 unsigned char
 f48 (unsigned char x, long int y)
 {
+  if (y >= 0UL + __CHAR_BIT__)
+    __builtin_unreachable ();
   return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
 }
 
--- gcc/testsuite/c-c++-common/rotate-4b.c.jj	2023-01-16 18:25:19.645621784 +0100
+++ gcc/testsuite/c-c++-common/rotate-4b.c	2023-01-16 18:27:26.237761150 +0100
@@ -0,0 +1,100 @@ 
+/* Check rotate pattern detection.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "r\[<>]\[<>]" "optimized" } } */
+
+unsigned short int
+f5 (unsigned short int x, int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f6 (unsigned short int x, long int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f7 (unsigned char x, int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f8 (unsigned char x, long int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned short int
+f13 (unsigned short int x, int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f14 (unsigned short int x, long int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f15 (unsigned char x, int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f16 (unsigned char x, long int y)
+{
+  return (x << y) | (x >> ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned short int
+f37 (unsigned short int x, int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned short int
+f38 (unsigned short int x, long int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * __SIZEOF_SHORT__ - 1)));
+}
+
+unsigned char
+f39 (unsigned char x, int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned char
+f40 (unsigned char x, long int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ - 1)));
+}
+
+unsigned short int
+f45 (unsigned short int x, int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned short int
+f46 (unsigned short int x, long int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned short) - 1)));
+}
+
+unsigned char
+f47 (unsigned char x, int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
+
+unsigned char
+f48 (unsigned char x, long int y)
+{
+  return (x >> y) | (x << ((-y) & (__CHAR_BIT__ * sizeof (unsigned char) - 1)));
+}
--- gcc/testsuite/gcc.c-torture/execute/pr106523.c.jj	2023-01-16 18:37:42.208726762 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr106523.c	2023-01-16 18:37:31.898877393 +0100
@@ -0,0 +1,22 @@ 
+/* PR tree-optimization/106523 */
+
+__attribute__((noipa)) unsigned char
+f7 (unsigned char x, unsigned int y)
+{
+  unsigned int t = x;
+  return (t << y) | (t >> ((-y) & 7));
+}
+
+int
+main ()
+{
+  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4)
+    return 0;
+
+  volatile unsigned char x = 152;
+  volatile unsigned int y = 19;
+  if (f7 (x, y) != 4)
+    __builtin_abort ();
+
+  return 0;
+}