[ARM] RFA: Use new rtl iterators in arm_find_sub_rtx_with_code
diff mbox

Message ID 87389xlver.fsf@googlemail.com
State New
Headers show

Commit Message

Richard Sandiford Nov. 5, 2014, 11:49 a.m. UTC
I think these functions only want to iterate over instruction patterns
rather than whole instructions (which would include things like
REG_EQUAL notes), since only the patterns are relevant for finding
dependencies.  There's then no need to check for null rtxes.

Tested by making sure there were no code changes for gcc.dg, gcc.c-torture
and g++.dg for plain arm-linux-gnueabi and aarch64-linux-gnu.  Ramana also
asked me to try -mcpu=cortex-a7, -mcpu=cortex-a9, -mcpu=arm9tdmi and
-mcpu=cortex-a15.  There were differences in:

   gcc.c-torture/execute/20060110-2.c
   gcc.c-torture/execute/ashrdi-1.c and
   gcc.dg/tree-ssa/pr24627.c

for -mcpu=cortex-a7 and no differences for the other combinations.
The A7 differences were due to the way that arm_get_set_operands handles
multi-set instructions such as:

            (set (reg:CC_C 100 cc)
                (compare:CC_C (plus:SI (reg:SI 8 r8 [orig:121 a ] [121])
                        (reg:SI 0 r0 [orig:122 b ] [122]))
                    (reg:SI 8 r8 [orig:121 a ] [121])))
            (set (reg:SI 2 r2 [orig:120 D.4117 ] [120])
                (plus:SI (reg:SI 8 r8 [orig:121 a ] [121])
                    (reg:SI 0 r0 [orig:122 b ] [122])))

for_each_rtx iterates over the subrtxes in forward order, so
arm_get_set_operands would pick the set of CC.  The new iterator pushes
the contents of a PARALLEL onto a stack and pulls them in reverse order,
so arm_get_set_operands would pick the set of r2.  This means that after
the patch the code sees a producer/consumer relationship that it
previously missed.  I think the new behaviour is what was intended.

This code shouldn't really be relying on a particular iteration order
though.  There's a dependency if any SET in the potential producer sets
a register used by the potential consumer.  I think any fix for that
should be done separately from the iterator rewrite.

OK to install?

Thanks,
Richard


gcc/
	* config/arm/aarch-common.c: Include rtl-iter.h.
	(search_term, arm_find_sub_rtx_with_search_term): Delete.
	(arm_find_sub_rtx_with_code): Use FOR_EACH_SUBRTX_VAR.
	(arm_get_set_operands): Pass the insn pattern rather than the
	insn itself.
	(arm_no_early_store_addr_dep): Likewise.

Comments

Richard Earnshaw Nov. 7, 2014, 12:22 p.m. UTC | #1
On 05/11/14 11:49, Richard Sandiford wrote:
> I think these functions only want to iterate over instruction patterns
> rather than whole instructions (which would include things like
> REG_EQUAL notes), since only the patterns are relevant for finding
> dependencies.  There's then no need to check for null rtxes.
> 
> Tested by making sure there were no code changes for gcc.dg, gcc.c-torture
> and g++.dg for plain arm-linux-gnueabi and aarch64-linux-gnu.  Ramana also
> asked me to try -mcpu=cortex-a7, -mcpu=cortex-a9, -mcpu=arm9tdmi and
> -mcpu=cortex-a15.  There were differences in:
> 
>    gcc.c-torture/execute/20060110-2.c
>    gcc.c-torture/execute/ashrdi-1.c and
>    gcc.dg/tree-ssa/pr24627.c
> 
> for -mcpu=cortex-a7 and no differences for the other combinations.
> The A7 differences were due to the way that arm_get_set_operands handles
> multi-set instructions such as:
> 
>             (set (reg:CC_C 100 cc)
>                 (compare:CC_C (plus:SI (reg:SI 8 r8 [orig:121 a ] [121])
>                         (reg:SI 0 r0 [orig:122 b ] [122]))
>                     (reg:SI 8 r8 [orig:121 a ] [121])))
>             (set (reg:SI 2 r2 [orig:120 D.4117 ] [120])
>                 (plus:SI (reg:SI 8 r8 [orig:121 a ] [121])
>                     (reg:SI 0 r0 [orig:122 b ] [122])))
> 
> for_each_rtx iterates over the subrtxes in forward order, so
> arm_get_set_operands would pick the set of CC.  The new iterator pushes
> the contents of a PARALLEL onto a stack and pulls them in reverse order,
> so arm_get_set_operands would pick the set of r2.  This means that after
> the patch the code sees a producer/consumer relationship that it
> previously missed.  I think the new behaviour is what was intended.
> 
> This code shouldn't really be relying on a particular iteration order
> though.  There's a dependency if any SET in the potential producer sets
> a register used by the potential consumer.  I think any fix for that
> should be done separately from the iterator rewrite.
> 
> OK to install?
> 
> Thanks,
> Richard
> 
> 
> gcc/
> 	* config/arm/aarch-common.c: Include rtl-iter.h.
> 	(search_term, arm_find_sub_rtx_with_search_term): Delete.
> 	(arm_find_sub_rtx_with_code): Use FOR_EACH_SUBRTX_VAR.
> 	(arm_get_set_operands): Pass the insn pattern rather than the
> 	insn itself.
> 	(arm_no_early_store_addr_dep): Likewise.
> 

OK.

R.

> Index: gcc/config/arm/aarch-common.c
> ===================================================================
> --- gcc/config/arm/aarch-common.c	2014-10-25 09:42:00.631168827 +0100
> +++ gcc/config/arm/aarch-common.c	2014-10-25 09:51:24.212872553 +0100
> @@ -30,6 +30,7 @@
>  #include "tree.h"
>  #include "c-family/c-common.h"
>  #include "rtl.h"
> +#include "rtl-iter.h"
>  
>  /* In ARMv8-A there's a general expectation that AESE/AESMC
>     and AESD/AESIMC sequences of the form:
> @@ -68,13 +69,6 @@ aarch_crypto_can_dual_issue (rtx_insn *p
>    return 0;
>  }
>  
> -typedef struct
> -{
> -  rtx_code search_code;
> -  rtx search_result;
> -  bool find_any_shift;
> -} search_term;
> -
>  /* Return TRUE if X is either an arithmetic shift left, or
>     is a multiplication by a power of two.  */
>  bool
> @@ -96,68 +90,32 @@ static rtx_code shift_rtx_codes[] =
>    { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT,
>      ROTATERT, ZERO_EXTEND, SIGN_EXTEND };
>  
> -/* Callback function for arm_find_sub_rtx_with_code.
> -   DATA is safe to treat as a SEARCH_TERM, ST.  This will
> -   hold a SEARCH_CODE.  PATTERN is checked to see if it is an
> -   RTX with that code.  If it is, write SEARCH_RESULT in ST
> -   and return 1.  Otherwise, or if we have been passed a NULL_RTX
> -   return 0.  If ST.FIND_ANY_SHIFT then we are interested in
> -   anything which can reasonably be described as a SHIFT RTX.  */
> -static int
> -arm_find_sub_rtx_with_search_term (rtx *pattern, void *data)
> -{
> -  search_term *st = (search_term *) data;
> -  rtx_code pattern_code;
> -  int found = 0;
> -
> -  gcc_assert (pattern);
> -  gcc_assert (st);
> -
> -  /* Poorly formed patterns can really ruin our day.  */
> -  if (*pattern == NULL_RTX)
> -    return 0;
> -
> -  pattern_code = GET_CODE (*pattern);
> -
> -  if (st->find_any_shift)
> -    {
> -      unsigned i = 0;
> -
> -      /* Left shifts might have been canonicalized to a MULT of some
> -	 power of two.  Make sure we catch them.  */
> -      if (arm_rtx_shift_left_p (*pattern))
> -	found = 1;
> -      else
> -	for (i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
> -	  if (pattern_code == shift_rtx_codes[i])
> -	    found = 1;
> -    }
> -
> -  if (pattern_code == st->search_code)
> -    found = 1;
> -
> -  if (found)
> -    st->search_result = *pattern;
> -
> -  return found;
> -}
> -
> -/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.  */
> +/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.
> +   If FIND_ANY_SHIFT then we are interested in anything which can
> +   reasonably be described as a SHIFT RTX.  */
>  static rtx
>  arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift)
>  {
> -  search_term st;
> -  int result = 0;
> +  subrtx_var_iterator::array_type array;
> +  FOR_EACH_SUBRTX_VAR (iter, array, pattern, NONCONST)
> +    {
> +      rtx x = *iter;
> +      if (find_any_shift)
> +	{
> +	  /* Left shifts might have been canonicalized to a MULT of some
> +	     power of two.  Make sure we catch them.  */
> +	  if (arm_rtx_shift_left_p (x))
> +	    return x;
> +	  else
> +	    for (unsigned int i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
> +	      if (GET_CODE (x) == shift_rtx_codes[i])
> +		return x;
> +	}
>  
> -  gcc_assert (pattern != NULL_RTX);
> -  st.search_code = code;
> -  st.search_result = NULL_RTX;
> -  st.find_any_shift = find_any_shift;
> -  result = for_each_rtx (&pattern, arm_find_sub_rtx_with_search_term, &st);
> -  if (result)
> -    return st.search_result;
> -  else
> -    return NULL_RTX;
> +      if (GET_CODE (x) == code)
> +	return x;
> +    }
> +  return NULL_RTX;
>  }
>  
>  /* Traverse PATTERN looking for any sub-rtx which looks like a shift.  */
> @@ -180,8 +138,10 @@ arm_find_shift_sub_rtx (rtx pattern)
>  arm_get_set_operands (rtx producer, rtx consumer,
>  		      rtx *set_source, rtx *set_destination)
>  {
> -  rtx set_producer = arm_find_sub_rtx_with_code (producer, SET, false);
> -  rtx set_consumer = arm_find_sub_rtx_with_code (consumer, SET, false);
> +  rtx set_producer = arm_find_sub_rtx_with_code (PATTERN (producer),
> +						 SET, false);
> +  rtx set_consumer = arm_find_sub_rtx_with_code (PATTERN (consumer),
> +						 SET, false);
>  
>    if (set_producer && set_consumer)
>      {
> @@ -353,8 +313,8 @@ arm_no_early_mul_dep (rtx producer, rtx
>  int
>  arm_no_early_store_addr_dep (rtx producer, rtx consumer)
>  {
> -  rtx value = arm_find_sub_rtx_with_code (producer, SET, false);
> -  rtx addr = arm_find_sub_rtx_with_code (consumer, SET, false);
> +  rtx value = arm_find_sub_rtx_with_code (PATTERN (producer), SET, false);
> +  rtx addr = arm_find_sub_rtx_with_code (PATTERN (consumer), SET, false);
>  
>    if (value)
>      value = SET_DEST (value);
>

Patch
diff mbox

Index: gcc/config/arm/aarch-common.c
===================================================================
--- gcc/config/arm/aarch-common.c	2014-10-25 09:42:00.631168827 +0100
+++ gcc/config/arm/aarch-common.c	2014-10-25 09:51:24.212872553 +0100
@@ -30,6 +30,7 @@ 
 #include "tree.h"
 #include "c-family/c-common.h"
 #include "rtl.h"
+#include "rtl-iter.h"
 
 /* In ARMv8-A there's a general expectation that AESE/AESMC
    and AESD/AESIMC sequences of the form:
@@ -68,13 +69,6 @@  aarch_crypto_can_dual_issue (rtx_insn *p
   return 0;
 }
 
-typedef struct
-{
-  rtx_code search_code;
-  rtx search_result;
-  bool find_any_shift;
-} search_term;
-
 /* Return TRUE if X is either an arithmetic shift left, or
    is a multiplication by a power of two.  */
 bool
@@ -96,68 +90,32 @@  static rtx_code shift_rtx_codes[] =
   { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT,
     ROTATERT, ZERO_EXTEND, SIGN_EXTEND };
 
-/* Callback function for arm_find_sub_rtx_with_code.
-   DATA is safe to treat as a SEARCH_TERM, ST.  This will
-   hold a SEARCH_CODE.  PATTERN is checked to see if it is an
-   RTX with that code.  If it is, write SEARCH_RESULT in ST
-   and return 1.  Otherwise, or if we have been passed a NULL_RTX
-   return 0.  If ST.FIND_ANY_SHIFT then we are interested in
-   anything which can reasonably be described as a SHIFT RTX.  */
-static int
-arm_find_sub_rtx_with_search_term (rtx *pattern, void *data)
-{
-  search_term *st = (search_term *) data;
-  rtx_code pattern_code;
-  int found = 0;
-
-  gcc_assert (pattern);
-  gcc_assert (st);
-
-  /* Poorly formed patterns can really ruin our day.  */
-  if (*pattern == NULL_RTX)
-    return 0;
-
-  pattern_code = GET_CODE (*pattern);
-
-  if (st->find_any_shift)
-    {
-      unsigned i = 0;
-
-      /* Left shifts might have been canonicalized to a MULT of some
-	 power of two.  Make sure we catch them.  */
-      if (arm_rtx_shift_left_p (*pattern))
-	found = 1;
-      else
-	for (i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
-	  if (pattern_code == shift_rtx_codes[i])
-	    found = 1;
-    }
-
-  if (pattern_code == st->search_code)
-    found = 1;
-
-  if (found)
-    st->search_result = *pattern;
-
-  return found;
-}
-
-/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.  */
+/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.
+   If FIND_ANY_SHIFT then we are interested in anything which can
+   reasonably be described as a SHIFT RTX.  */
 static rtx
 arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift)
 {
-  search_term st;
-  int result = 0;
+  subrtx_var_iterator::array_type array;
+  FOR_EACH_SUBRTX_VAR (iter, array, pattern, NONCONST)
+    {
+      rtx x = *iter;
+      if (find_any_shift)
+	{
+	  /* Left shifts might have been canonicalized to a MULT of some
+	     power of two.  Make sure we catch them.  */
+	  if (arm_rtx_shift_left_p (x))
+	    return x;
+	  else
+	    for (unsigned int i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
+	      if (GET_CODE (x) == shift_rtx_codes[i])
+		return x;
+	}
 
-  gcc_assert (pattern != NULL_RTX);
-  st.search_code = code;
-  st.search_result = NULL_RTX;
-  st.find_any_shift = find_any_shift;
-  result = for_each_rtx (&pattern, arm_find_sub_rtx_with_search_term, &st);
-  if (result)
-    return st.search_result;
-  else
-    return NULL_RTX;
+      if (GET_CODE (x) == code)
+	return x;
+    }
+  return NULL_RTX;
 }
 
 /* Traverse PATTERN looking for any sub-rtx which looks like a shift.  */
@@ -180,8 +138,10 @@  arm_find_shift_sub_rtx (rtx pattern)
 arm_get_set_operands (rtx producer, rtx consumer,
 		      rtx *set_source, rtx *set_destination)
 {
-  rtx set_producer = arm_find_sub_rtx_with_code (producer, SET, false);
-  rtx set_consumer = arm_find_sub_rtx_with_code (consumer, SET, false);
+  rtx set_producer = arm_find_sub_rtx_with_code (PATTERN (producer),
+						 SET, false);
+  rtx set_consumer = arm_find_sub_rtx_with_code (PATTERN (consumer),
+						 SET, false);
 
   if (set_producer && set_consumer)
     {
@@ -353,8 +313,8 @@  arm_no_early_mul_dep (rtx producer, rtx
 int
 arm_no_early_store_addr_dep (rtx producer, rtx consumer)
 {
-  rtx value = arm_find_sub_rtx_with_code (producer, SET, false);
-  rtx addr = arm_find_sub_rtx_with_code (consumer, SET, false);
+  rtx value = arm_find_sub_rtx_with_code (PATTERN (producer), SET, false);
+  rtx addr = arm_find_sub_rtx_with_code (PATTERN (consumer), SET, false);
 
   if (value)
     value = SET_DEST (value);