diff mbox

[ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction

Message ID 000201cf11ee$900a8190$b01f84b0$@arm.com
State New
Headers show

Commit Message

Terry Guo Jan. 15, 2014, 12:37 p.m. UTC
> -----Original Message-----
> From: Terry Guo [mailto:terry.guo@arm.com]
> Sent: Wednesday, January 15, 2014 8:21 PM
> To: Richard Earnshaw
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
handle
> rtx of ARM PLD instruction
> 
> >
> > Preferably, particularly since you haven't supplied a testcase.
> >
> > R.
> 
> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826. I
shall
> update the patch to include the test case.
> 
> BR,
> Terry

Here is updated patch along with test case. Is it OK?

BR,
Terry

Comments

Richard Earnshaw Jan. 15, 2014, 2:30 p.m. UTC | #1
On 15/01/14 12:37, Terry Guo wrote:
> 
> 
>> -----Original Message-----
>> From: Terry Guo [mailto:terry.guo@arm.com]
>> Sent: Wednesday, January 15, 2014 8:21 PM
>> To: Richard Earnshaw
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
> handle
>> rtx of ARM PLD instruction
>>
>>>
>>> Preferably, particularly since you haven't supplied a testcase.
>>>
>>> R.
>>
>> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826. I
> shall
>> update the patch to include the test case.
>>
>> BR,
>> Terry
> 
> Here is updated patch along with test case. Is it OK?
> 

I'm rather concerned about the complexity of this patch as a backport.
Furthermore, part of the problem is that the preload insn is
misclassified as an alu_reg operation, which it clearly isn't.

Instead of doing this, please could you try a simpler patch that simply
reclassifies the "type" of preload as "load1".  This would break the
alu->load/store dependency and thereby avoid the trigger of the problem

R.
Terry Guo Jan. 15, 2014, 3:59 p.m. UTC | #2
> -----Original Message-----
> From: Richard Earnshaw
> Sent: Wednesday, January 15, 2014 10:30 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
handle
> rtx of ARM PLD instruction
> 
> On 15/01/14 12:37, Terry Guo wrote:
> >
> >
> >> -----Original Message-----
> >> From: Terry Guo [mailto:terry.guo@arm.com]
> >> Sent: Wednesday, January 15, 2014 8:21 PM
> >> To: Richard Earnshaw
> >> Cc: gcc-patches@gcc.gnu.org
> >> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
> > handle
> >> rtx of ARM PLD instruction
> >>
> >>>
> >>> Preferably, particularly since you haven't supplied a testcase.
> >>>
> >>> R.
> >>
> >> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826.
> >> I
> > shall
> >> update the patch to include the test case.
> >>
> >> BR,
> >> Terry
> >
> > Here is updated patch along with test case. Is it OK?
> >
> 
> I'm rather concerned about the complexity of this patch as a backport.
> Furthermore, part of the problem is that the preload insn is misclassified
as
> an alu_reg operation, which it clearly isn't.
> 
> Instead of doing this, please could you try a simpler patch that simply
> reclassifies the "type" of preload as "load1".  This would break the
> alu->load/store dependency and thereby avoid the trigger of the problem
> 
> R.

Thanks Richard and you are right. What you said should be the root cause for
this issue. I will implement another patch for trunk to correctly classify
the preload instruction into load1, and then back port to 4.8 branch.
Therefore please consider my request in this thread discarded.

BR,
Terry
Richard Earnshaw Jan. 15, 2014, 4:40 p.m. UTC | #3
On 15/01/14 15:59, Terry Guo wrote:
> 
> 
>> -----Original Message-----
>> From: Richard Earnshaw
>> Sent: Wednesday, January 15, 2014 10:30 PM
>> To: Terry Guo
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
> handle
>> rtx of ARM PLD instruction
>>
>> On 15/01/14 12:37, Terry Guo wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Terry Guo [mailto:terry.guo@arm.com]
>>>> Sent: Wednesday, January 15, 2014 8:21 PM
>>>> To: Richard Earnshaw
>>>> Cc: gcc-patches@gcc.gnu.org
>>>> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
>>> handle
>>>> rtx of ARM PLD instruction
>>>>
>>>>>
>>>>> Preferably, particularly since you haven't supplied a testcase.
>>>>>
>>>>> R.
>>>>
>>>> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826.
>>>> I
>>> shall
>>>> update the patch to include the test case.
>>>>
>>>> BR,
>>>> Terry
>>>
>>> Here is updated patch along with test case. Is it OK?
>>>
>>
>> I'm rather concerned about the complexity of this patch as a backport.
>> Furthermore, part of the problem is that the preload insn is misclassified
> as
>> an alu_reg operation, which it clearly isn't.
>>
>> Instead of doing this, please could you try a simpler patch that simply
>> reclassifies the "type" of preload as "load1".  This would break the
>> alu->load/store dependency and thereby avoid the trigger of the problem
>>
>> R.
> 
> Thanks Richard and you are right. What you said should be the root cause for
> this issue. I will implement another patch for trunk to correctly classify
> the preload instruction into load1, and then back port to 4.8 branch.
> Therefore please consider my request in this thread discarded.
> 

Trunk has already reclassified it this way.

R.
diff mbox

Patch

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 206619)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,21 @@ 
+2014-01-15  Terry Guo  <terry.guo@arm.com>
+
+	PR target/59826
+	Backported from mainline r204575 and applied to file arm.c.
+	2013-11-08  James Greenhalgh  <james.greenhalgh@arm.com>
+
+	* config/arm/aarch-common.c
+	(search_term): New typedef.
+	(shift_rtx_costs): New array.
+	(arm_rtx_shift_left_p): New.
+	(arm_find_sub_rtx_with_search_term): Likewise.
+	(arm_find_sub_rtx_with_code): Likewise.
+	(arm_early_load_addr_dep): Add sanity checking.
+	(arm_no_early_alu_shift_dep): Likewise.
+	(arm_no_early_alu_shift_value_dep): Likewise.
+	(arm_no_early_mul_dep): Likewise.
+	(arm_no_early_store_addr_dep): Likewise.
+
 2014-01-14  Uros Bizjak  <ubizjak@gmail.com>
 
 	Revert:
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 206619)
+++ gcc/config/arm/arm.c	(working copy)
@@ -1161,6 +1161,30 @@ 
   TLS_DESCSEQ	/* GNU scheme */
 };
 
+typedef struct
+{
+  rtx_code search_code;
+  rtx search_result;
+  bool find_any_shift;
+} search_term;
+
+/* Return TRUE if X is either an arithmetic shift left, or
+   is a multiplication by a power of two.  */
+bool
+arm_rtx_shift_left_p (rtx x)
+{
+  enum rtx_code code = GET_CODE (x);
+
+  if (code == MULT && CONST_INT_P (XEXP (x, 1))
+      && exact_log2 (INTVAL (XEXP (x, 1))) > 0)
+    return true;
+
+  if (code == ASHIFT)
+    return true;
+
+  return false;
+}
+
 /* The maximum number of insns to be used when loading a constant.  */
 inline static int
 arm_constant_limit (bool size_p)
@@ -24604,62 +24628,116 @@ 
     *pretend_size = (NUM_ARG_REGS - nregs) * UNITS_PER_WORD;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does not need
-   PRODUCER's value to calculate the address.  */
+static rtx_code shift_rtx_codes[] =
+  { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT,
+    ROTATERT, ZERO_EXTEND, SIGN_EXTEND };
 
-int
-arm_no_early_store_addr_dep (rtx producer, rtx consumer)
+/* Callback function for arm_find_sub_rtx_with_code.
+   DATA is safe to treat as a SEARCH_TERM, ST.  This will
+   hold a SEARCH_CODE.  PATTERN is checked to see if it is an
+   RTX with that code.  If it is, write SEARCH_RESULT in ST
+   and return 1.  Otherwise, or if we have been passed a NULL_RTX
+   return 0.  If ST.FIND_ANY_SHIFT then we are interested in
+   anything which can reasonably be described as a SHIFT RTX.  */
+static int
+arm_find_sub_rtx_with_search_term (rtx *pattern, void *data)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
+  search_term *st = (search_term *) data;
+  rtx_code pattern_code;
+  int found = 0;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-    addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
-    addr = XVECEXP (addr, 0, 0);
-  addr = XEXP (addr, 0);
+  gcc_assert (pattern);
+  gcc_assert (st);
 
-  return !reg_overlap_mentioned_p (value, addr);
+  /* Poorly formed patterns can really ruin our day.  */
+  if (*pattern == NULL_RTX)
+    return 0;
+
+  pattern_code = GET_CODE (*pattern);
+
+  if (st->find_any_shift)
+    {
+      unsigned i = 0;
+
+      /* Left shifts might have been canonicalized to a MULT of some
+	 power of two.  Make sure we catch them.  */
+      if (arm_rtx_shift_left_p (*pattern))
+	found = 1;
+      else
+	for (i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
+	  if (pattern_code == shift_rtx_codes[i])
+	    found = 1;
+    }
+
+  if (pattern_code == st->search_code)
+    found = 1;
+
+  if (found)
+    st->search_result = *pattern;
+
+  return found;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does need
-   PRODUCER's value to calculate the address.  */
+/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.  */
+static rtx
+arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift)
+{
+  search_term st;
+  int result = 0;
 
-int
-arm_early_store_addr_dep (rtx producer, rtx consumer)
+  gcc_assert (pattern != NULL_RTX);
+  st.search_code = code;
+  st.search_result = NULL_RTX;
+  st.find_any_shift = find_any_shift;
+  result = for_each_rtx (&pattern, arm_find_sub_rtx_with_search_term, &st);
+  if (result)
+    return st.search_result;
+  else
+    return NULL_RTX;
+}
+
+/* Traverse PATTERN looking for any sub-rtx which looks like a shift.  */
+static rtx
+arm_find_shift_sub_rtx (rtx pattern)
 {
-  return !arm_no_early_store_addr_dep (producer, consumer);
+  return arm_find_sub_rtx_with_code (pattern, ASHIFT, true);
 }
 
+/* PRODUCER and CONSUMER are two potentially dependant RTX.  PRODUCER
+   (possibly) contains a SET which will provide a result we can access
+   using the SET_DEST macro.  We will place the RTX which would be
+   written by PRODUCER in SET_SOURCE.
+   Similarly, CONSUMER (possibly) contains a SET which has an operand
+   we can access using SET_SRC.  We place this operand in
+   SET_DESTINATION.
+
+   Return nonzero if we found the SET RTX we expected.  */
+static int
+arm_get_set_operands (rtx producer, rtx consumer,
+		      rtx *set_source, rtx *set_destination)
+{
+  rtx set_producer = arm_find_sub_rtx_with_code (producer, SET, false);
+  rtx set_consumer = arm_find_sub_rtx_with_code (consumer, SET, false);
+
+  if (set_producer && set_consumer)
+    {
+      *set_source = SET_DEST (set_producer);
+      *set_destination = SET_SRC (set_consumer);
+      return 1;
+    }
+  return 0;
+}
+
 /* Return nonzero if the CONSUMER instruction (a load) does need
    PRODUCER's value to calculate the address.  */
 
 int
 arm_early_load_addr_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
+  rtx value, addr;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-    addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
-    {
-      if (GET_CODE (XVECEXP (addr, 0, 0)) == RETURN)
-        addr = XVECEXP (addr, 0, 1);
-      else
-        addr = XVECEXP (addr, 0, 0);
-    }
-  addr = XEXP (addr, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &addr))
+    return 0;
 
   return reg_overlap_mentioned_p (value, addr);
 }
@@ -24671,29 +24749,21 @@ 
 int
 arm_no_early_alu_shift_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
   rtx early_op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
-  early_op = XEXP (op, 0);
-  /* This is either an actual independent shift, or a shift applied to
-     the first operand of another operation.  We want the whole shift
-     operation.  */
-  if (REG_P (early_op))
-    early_op = op;
+  if ((early_op = arm_find_shift_sub_rtx (op)))
+    {
+      if (REG_P (early_op))
+	early_op = op;
 
-  return !reg_overlap_mentioned_p (value, early_op);
+      return !reg_overlap_mentioned_p (value, early_op);
+    }
+
+  return 0;
 }
 
 /* Return nonzero if the CONSUMER instruction (an ALU op) does not
@@ -24703,30 +24773,18 @@ 
 int
 arm_no_early_alu_shift_value_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
   rtx early_op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
-  early_op = XEXP (op, 0);
+  if ((early_op = arm_find_shift_sub_rtx (op)))
+    /* We want to check the value being shifted.  */
+    if (!reg_overlap_mentioned_p (value, XEXP (early_op, 0)))
+      return 1;
 
-  /* This is either an actual independent shift, or a shift applied to
-     the first operand of another operation.  We want the value being
-     shifted, in either case.  */
-  if (!REG_P (early_op))
-    early_op = XEXP (early_op, 0);
-
-  return !reg_overlap_mentioned_p (value, early_op);
+  return 0;
 }
 
 /* Return nonzero if the CONSUMER (a mul or mac op) does not
@@ -24736,19 +24794,10 @@ 
 int
 arm_no_early_mul_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
   if (GET_CODE (op) == PLUS || GET_CODE (op) == MINUS)
     {
@@ -24761,6 +24810,36 @@ 
   return 0;
 }
 
+/* Return nonzero if the CONSUMER instruction (a store) does not need
+   PRODUCER's value to calculate the address.  */
+
+int
+arm_no_early_store_addr_dep (rtx producer, rtx consumer)
+{
+  rtx value = arm_find_sub_rtx_with_code (producer, SET, false);
+  rtx addr = arm_find_sub_rtx_with_code (consumer, SET, false);
+
+  if (value)
+    value = SET_DEST (value);
+
+  if (addr)
+    addr = SET_DEST (addr);
+
+  if (!value || !addr)
+    return 0;
+
+  return !reg_overlap_mentioned_p (value, addr);
+}
+
+/* Return nonzero if the CONSUMER instruction (a store) does need
+   PRODUCER's value to calculate the address.  */
+
+int
+arm_early_store_addr_dep (rtx producer, rtx consumer)
+{
+  return !arm_no_early_store_addr_dep (producer, consumer);
+}
+
 /* We can't rely on the caller doing the proper promotion when
    using APCS or ATPCS.  */
 
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog	(revision 206619)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,7 @@ 
+2014-01-15  Terry Guo  <terry.guo@arm.com>
+
+	* gcc.target/arm/pr59826.c: New test.
+
 2014-01-10  Yufeng Zhang  <yufeng.zhang@arm.com>
 
 	* gcc.target/arm/neon/vst1Q_laneu64-1.c: New test.
Index: gcc/testsuite/gcc.target/arm/pr59826.c
===================================================================
--- gcc/testsuite/gcc.target/arm/pr59826.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/pr59826.c	(working copy)
@@ -0,0 +1,35 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mthumb -mcpu=cortex-m4 -fprefetch-loop-arrays -O2" }  */
+
+typedef struct genxWriter_rec * genxWriter;
+typedef unsigned char * utf8;
+typedef const unsigned char * constUtf8;
+
+int genxScrubText(genxWriter w, constUtf8 in, utf8 out)
+{
+  int problems = 0;
+  constUtf8 last = in;
+
+  while (*in)
+  {
+    int c = genxNextUnicodeChar(&in);
+    if (c == -1)
+    {
+      problems++;
+      last = in;
+      continue;
+    }
+
+    if (!isXMLChar(w, c))
+    {
+      problems++;
+      last = in;
+      continue;
+    }
+
+    while (last < in)
+      *out++ = *last++;
+  }
+  *out = 0;
+  return problems;
+}