diff mbox series

[ARM] Fix -mpure-code for v6m

Message ID CAKdteOauQ+ngjbrxcqo=-5KAhviO3gPkUR7rSh29pSva_qTFBg@mail.gmail.com
State New
Headers show
Series [ARM] Fix -mpure-code for v6m | expand

Commit Message

Christophe Lyon Feb. 7, 2020, 1:19 p.m. UTC
When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
for cortex-m0, I noticed that some testcases were failing because we
still generate "ldr rX, .LCY", which is what we want to avoid with
-mpure-code. This is latent since a recent improvement in fwprop
(PR88833).

In this patch I change the thumb1_movsi_insn pattern so that it emits
the desired instruction sequence when arm_disable_literal_pool is set.

I tried to add a define_split instead, but couldn't make it work: the
compiler then complains it cannot split the instruction, while my new
define_split accepts the same operand types as thumb1_movsi_insn:

c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not split insn
(insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
        (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 {*thumb1_movsi_insn}
     (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
        (nil)))
during RTL pass: final

(define_split
  [(set (match_operand:SI 0 "register_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  "TARGET_THUMB1
   && arm_disable_literal_pool
   && GET_CODE (operands[1]) == SYMBOL_REF"
  [(clobber (const_int 0))]
  "
    gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
    DONE;
  "
)
and I put this in thumb1_movsi_insn:
if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
  {
    return \"#\";
  }
  return \"ldr\\t%0, %1\";

2020-02-07  Christophe Lyon  <christophe.lyon@linaro.org>

        * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr alternative to
        work with -mpure-code.

Comments

Richard Earnshaw (lists) Feb. 7, 2020, 1:49 p.m. UTC | #1
On 07/02/2020 13:19, Christophe Lyon wrote:
> When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
> for cortex-m0, I noticed that some testcases were failing because we
> still generate "ldr rX, .LCY", which is what we want to avoid with
> -mpure-code. This is latent since a recent improvement in fwprop
> (PR88833).
> 
> In this patch I change the thumb1_movsi_insn pattern so that it emits
> the desired instruction sequence when arm_disable_literal_pool is set.
> 
> I tried to add a define_split instead, but couldn't make it work: the
> compiler then complains it cannot split the instruction, while my new
> define_split accepts the same operand types as thumb1_movsi_insn:
> 
> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not split insn
> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
>          (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 {*thumb1_movsi_insn}
>       (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
>          (nil)))
> during RTL pass: final
> 
> (define_split
>    [(set (match_operand:SI 0 "register_operand" "")
>          (match_operand:SI 1 "general_operand" ""))]
>    "TARGET_THUMB1
>     && arm_disable_literal_pool
>     && GET_CODE (operands[1]) == SYMBOL_REF"
>    [(clobber (const_int 0))]
>    "
>      gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
>      DONE;
>    "
> )
> and I put this in thumb1_movsi_insn:
> if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
>    {
>      return \"#\";
>    }
>    return \"ldr\\t%0, %1\";
> 
> 2020-02-07  Christophe Lyon  <christophe.lyon@linaro.org>
> 
>          * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr alternative to
>          work with -mpure-code.
> 

+    case 0:
+    case 1:
+      return \"movs	%0, %1\";
+    case 2:
+      return \"movw	%0, %1\";

This is OK, but please replace the hard tab in the strings for MOVS/MOVW 
with \\t.

R.
Christophe Lyon Feb. 7, 2020, 4:43 p.m. UTC | #2
On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 07/02/2020 13:19, Christophe Lyon wrote:
> > When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
> > for cortex-m0, I noticed that some testcases were failing because we
> > still generate "ldr rX, .LCY", which is what we want to avoid with
> > -mpure-code. This is latent since a recent improvement in fwprop
> > (PR88833).
> >
> > In this patch I change the thumb1_movsi_insn pattern so that it emits
> > the desired instruction sequence when arm_disable_literal_pool is set.
> >
> > I tried to add a define_split instead, but couldn't make it work: the
> > compiler then complains it cannot split the instruction, while my new
> > define_split accepts the same operand types as thumb1_movsi_insn:
> >
> > c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not split insn
> > (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
> >          (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 {*thumb1_movsi_insn}
> >       (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
> >          (nil)))
> > during RTL pass: final
> >
> > (define_split
> >    [(set (match_operand:SI 0 "register_operand" "")
> >          (match_operand:SI 1 "general_operand" ""))]
> >    "TARGET_THUMB1
> >     && arm_disable_literal_pool
> >     && GET_CODE (operands[1]) == SYMBOL_REF"
> >    [(clobber (const_int 0))]
> >    "
> >      gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
> >      DONE;
> >    "
> > )
> > and I put this in thumb1_movsi_insn:
> > if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
> >    {
> >      return \"#\";
> >    }
> >    return \"ldr\\t%0, %1\";
> >
> > 2020-02-07  Christophe Lyon  <christophe.lyon@linaro.org>
> >
> >          * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr alternative to
> >          work with -mpure-code.
> >
>
> +    case 0:
> +    case 1:
> +      return \"movs    %0, %1\";
> +    case 2:
> +      return \"movw    %0, %1\";
>
> This is OK, but please replace the hard tab in the strings for MOVS/MOVW
> with \\t.
>

OK that was merely a cut & paste from the existing code.

I'm concerned that the length attribute is becoming wrong with my
patch, isn't this a problem?

Thanks,

Christophe

> R.
Richard Earnshaw (lists) Feb. 7, 2020, 4:55 p.m. UTC | #3
On 07/02/2020 16:43, Christophe Lyon wrote:
> On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
>>
>> On 07/02/2020 13:19, Christophe Lyon wrote:
>>> When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
>>> for cortex-m0, I noticed that some testcases were failing because we
>>> still generate "ldr rX, .LCY", which is what we want to avoid with
>>> -mpure-code. This is latent since a recent improvement in fwprop
>>> (PR88833).
>>>
>>> In this patch I change the thumb1_movsi_insn pattern so that it emits
>>> the desired instruction sequence when arm_disable_literal_pool is set.
>>>
>>> I tried to add a define_split instead, but couldn't make it work: the
>>> compiler then complains it cannot split the instruction, while my new
>>> define_split accepts the same operand types as thumb1_movsi_insn:
>>>
>>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not split insn
>>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
>>>           (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 {*thumb1_movsi_insn}
>>>        (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
>>>           (nil)))
>>> during RTL pass: final
>>>
>>> (define_split
>>>     [(set (match_operand:SI 0 "register_operand" "")
>>>           (match_operand:SI 1 "general_operand" ""))]
>>>     "TARGET_THUMB1
>>>      && arm_disable_literal_pool
>>>      && GET_CODE (operands[1]) == SYMBOL_REF"
>>>     [(clobber (const_int 0))]
>>>     "
>>>       gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
>>>       DONE;
>>>     "
>>> )
>>> and I put this in thumb1_movsi_insn:
>>> if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
>>>     {
>>>       return \"#\";
>>>     }
>>>     return \"ldr\\t%0, %1\";
>>>
>>> 2020-02-07  Christophe Lyon  <christophe.lyon@linaro.org>
>>>
>>>           * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr alternative to
>>>           work with -mpure-code.
>>>
>>
>> +    case 0:
>> +    case 1:
>> +      return \"movs    %0, %1\";
>> +    case 2:
>> +      return \"movw    %0, %1\";
>>
>> This is OK, but please replace the hard tab in the strings for MOVS/MOVW
>> with \\t.
>>
> 
> OK that was merely a cut & paste from the existing code.
> 
> I'm concerned that the length attribute is becoming wrong with my
> patch, isn't this a problem?
> 

Potentially yes.  The branch range code needs this to handle overly long 
jumps correctly.

R.

> Thanks,
> 
> Christophe
> 
>> R.
Christophe Lyon Feb. 10, 2020, 9:27 a.m. UTC | #4
On Fri, 7 Feb 2020 at 17:55, Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 07/02/2020 16:43, Christophe Lyon wrote:
> > On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
> > <Richard.Earnshaw@arm.com> wrote:
> >>
> >> On 07/02/2020 13:19, Christophe Lyon wrote:
> >>> When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
> >>> for cortex-m0, I noticed that some testcases were failing because we
> >>> still generate "ldr rX, .LCY", which is what we want to avoid with
> >>> -mpure-code. This is latent since a recent improvement in fwprop
> >>> (PR88833).
> >>>
> >>> In this patch I change the thumb1_movsi_insn pattern so that it emits
> >>> the desired instruction sequence when arm_disable_literal_pool is set.
> >>>
> >>> I tried to add a define_split instead, but couldn't make it work: the
> >>> compiler then complains it cannot split the instruction, while my new
> >>> define_split accepts the same operand types as thumb1_movsi_insn:
> >>>
> >>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not split insn
> >>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
> >>>           (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 {*thumb1_movsi_insn}
> >>>        (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
> >>>           (nil)))
> >>> during RTL pass: final
> >>>
> >>> (define_split
> >>>     [(set (match_operand:SI 0 "register_operand" "")
> >>>           (match_operand:SI 1 "general_operand" ""))]
> >>>     "TARGET_THUMB1
> >>>      && arm_disable_literal_pool
> >>>      && GET_CODE (operands[1]) == SYMBOL_REF"
> >>>     [(clobber (const_int 0))]
> >>>     "
> >>>       gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
> >>>       DONE;
> >>>     "
> >>> )
> >>> and I put this in thumb1_movsi_insn:
> >>> if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
> >>>     {
> >>>       return \"#\";
> >>>     }
> >>>     return \"ldr\\t%0, %1\";
> >>>
> >>> 2020-02-07  Christophe Lyon  <christophe.lyon@linaro.org>
> >>>
> >>>           * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr alternative to
> >>>           work with -mpure-code.
> >>>
> >>
> >> +    case 0:
> >> +    case 1:
> >> +      return \"movs    %0, %1\";
> >> +    case 2:
> >> +      return \"movw    %0, %1\";
> >>
> >> This is OK, but please replace the hard tab in the strings for MOVS/MOVW
> >> with \\t.
> >>
> >
> > OK that was merely a cut & paste from the existing code.
> >
> > I'm concerned that the length attribute is becoming wrong with my
> > patch, isn't this a problem?
> >
>
> Potentially yes.  The branch range code needs this to handle overly long
> jumps correctly.
>

Do you mean that the probability of problems due to that shortcoming
is low enough that I can commit my patch?

Thanks,

Christophe

> R.
>
> > Thanks,
> >
> > Christophe
> >
> >> R.
>
Richard Earnshaw (lists) Feb. 10, 2020, 4:45 p.m. UTC | #5
On 10/02/2020 09:27, Christophe Lyon wrote:
> On Fri, 7 Feb 2020 at 17:55, Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
>>
>> On 07/02/2020 16:43, Christophe Lyon wrote:
>>> On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
>>> <Richard.Earnshaw@arm.com> wrote:
>>>>
>>>> On 07/02/2020 13:19, Christophe Lyon wrote:
>>>>> When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
>>>>> for cortex-m0, I noticed that some testcases were failing because we
>>>>> still generate "ldr rX, .LCY", which is what we want to avoid with
>>>>> -mpure-code. This is latent since a recent improvement in fwprop
>>>>> (PR88833).
>>>>>
>>>>> In this patch I change the thumb1_movsi_insn pattern so that it emits
>>>>> the desired instruction sequence when arm_disable_literal_pool is set.
>>>>>
>>>>> I tried to add a define_split instead, but couldn't make it work: the
>>>>> compiler then complains it cannot split the instruction, while my new
>>>>> define_split accepts the same operand types as thumb1_movsi_insn:
>>>>>
>>>>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not split insn
>>>>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
>>>>>            (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 {*thumb1_movsi_insn}
>>>>>         (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
>>>>>            (nil)))
>>>>> during RTL pass: final
>>>>>
>>>>> (define_split
>>>>>      [(set (match_operand:SI 0 "register_operand" "")
>>>>>            (match_operand:SI 1 "general_operand" ""))]
>>>>>      "TARGET_THUMB1
>>>>>       && arm_disable_literal_pool
>>>>>       && GET_CODE (operands[1]) == SYMBOL_REF"
>>>>>      [(clobber (const_int 0))]
>>>>>      "
>>>>>        gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
>>>>>        DONE;
>>>>>      "
>>>>> )
>>>>> and I put this in thumb1_movsi_insn:
>>>>> if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
>>>>>      {
>>>>>        return \"#\";
>>>>>      }
>>>>>      return \"ldr\\t%0, %1\";
>>>>>
>>>>> 2020-02-07  Christophe Lyon  <christophe.lyon@linaro.org>
>>>>>
>>>>>            * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr alternative to
>>>>>            work with -mpure-code.
>>>>>
>>>>
>>>> +    case 0:
>>>> +    case 1:
>>>> +      return \"movs    %0, %1\";
>>>> +    case 2:
>>>> +      return \"movw    %0, %1\";
>>>>
>>>> This is OK, but please replace the hard tab in the strings for MOVS/MOVW
>>>> with \\t.
>>>>
>>>
>>> OK that was merely a cut & paste from the existing code.
>>>
>>> I'm concerned that the length attribute is becoming wrong with my
>>> patch, isn't this a problem?
>>>
>>
>> Potentially yes.  The branch range code needs this to handle overly long
>> jumps correctly.
>>
> 
> Do you mean that the probability of problems due to that shortcoming
> is low enough that I can commit my patch?

It's hard to say that.  The number of instructions you generate for this 
is significant, so it likely will throw off the calculations and 
somebody will get unlucky.  On the other hand, I don't think we should 
pessimize code for the non-pure-code variants by inflating the size for 
this unconditionally.

It seems there are two ways to fix this.

1) create a new alternative for the pure-code variant with its own 
length attribute, then only enable that for the case you need.  That 
would also allow you to go back to the templated asm format.
2) use a level of indirection to calculate the length - I haven't tried 
this, but it would be something like:

  - create a new attribute - lets say default_length
  - rename length for this pattern to default_length
  - create another new attribute - lets say purecode_length, add lengths 
for each alternative but adjusted for the purecode variant.
  - make the length attribute for this pattern select based on the 
disable_literal_pool variable between the default_length and 
purecode_length attributes.

R.

> 
> Thanks,
> 
> Christophe
> 
>> R.
>>
>>> Thanks,
>>>
>>> Christophe
>>>
>>>> R.
>>
Christophe Lyon Feb. 13, 2020, 10:14 a.m. UTC | #6
On Mon, 10 Feb 2020 at 17:45, Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>
> On 10/02/2020 09:27, Christophe Lyon wrote:
> > On Fri, 7 Feb 2020 at 17:55, Richard Earnshaw (lists)
> > <Richard.Earnshaw@arm.com> wrote:
> >>
> >> On 07/02/2020 16:43, Christophe Lyon wrote:
> >>> On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
> >>> <Richard.Earnshaw@arm.com> wrote:
> >>>>
> >>>> On 07/02/2020 13:19, Christophe Lyon wrote:
> >>>>> When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
> >>>>> for cortex-m0, I noticed that some testcases were failing because we
> >>>>> still generate "ldr rX, .LCY", which is what we want to avoid with
> >>>>> -mpure-code. This is latent since a recent improvement in fwprop
> >>>>> (PR88833).
> >>>>>
> >>>>> In this patch I change the thumb1_movsi_insn pattern so that it emits
> >>>>> the desired instruction sequence when arm_disable_literal_pool is set.
> >>>>>
> >>>>> I tried to add a define_split instead, but couldn't make it work: the
> >>>>> compiler then complains it cannot split the instruction, while my new
> >>>>> define_split accepts the same operand types as thumb1_movsi_insn:
> >>>>>
> >>>>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not split insn
> >>>>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
> >>>>>            (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 {*thumb1_movsi_insn}
> >>>>>         (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
> >>>>>            (nil)))
> >>>>> during RTL pass: final
> >>>>>
> >>>>> (define_split
> >>>>>      [(set (match_operand:SI 0 "register_operand" "")
> >>>>>            (match_operand:SI 1 "general_operand" ""))]
> >>>>>      "TARGET_THUMB1
> >>>>>       && arm_disable_literal_pool
> >>>>>       && GET_CODE (operands[1]) == SYMBOL_REF"
> >>>>>      [(clobber (const_int 0))]
> >>>>>      "
> >>>>>        gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
> >>>>>        DONE;
> >>>>>      "
> >>>>> )
> >>>>> and I put this in thumb1_movsi_insn:
> >>>>> if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
> >>>>>      {
> >>>>>        return \"#\";
> >>>>>      }
> >>>>>      return \"ldr\\t%0, %1\";
> >>>>>
> >>>>> 2020-02-07  Christophe Lyon  <christophe.lyon@linaro.org>
> >>>>>
> >>>>>            * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr alternative to
> >>>>>            work with -mpure-code.
> >>>>>
> >>>>
> >>>> +    case 0:
> >>>> +    case 1:
> >>>> +      return \"movs    %0, %1\";
> >>>> +    case 2:
> >>>> +      return \"movw    %0, %1\";
> >>>>
> >>>> This is OK, but please replace the hard tab in the strings for MOVS/MOVW
> >>>> with \\t.
> >>>>
> >>>
> >>> OK that was merely a cut & paste from the existing code.
> >>>
> >>> I'm concerned that the length attribute is becoming wrong with my
> >>> patch, isn't this a problem?
> >>>
> >>
> >> Potentially yes.  The branch range code needs this to handle overly long
> >> jumps correctly.
> >>
> >
> > Do you mean that the probability of problems due to that shortcoming
> > is low enough that I can commit my patch?
>
> It's hard to say that.  The number of instructions you generate for this
> is significant, so it likely will throw off the calculations and
> somebody will get unlucky.  On the other hand, I don't think we should
> pessimize code for the non-pure-code variants by inflating the size for
> this unconditionally.
>
> It seems there are two ways to fix this.
>
> 1) create a new alternative for the pure-code variant with its own
> length attribute, then only enable that for the case you need.  That
> would also allow you to go back to the templated asm format.
> 2) use a level of indirection to calculate the length - I haven't tried
> this, but it would be something like:
>
>   - create a new attribute - lets say default_length
>   - rename length for this pattern to default_length
>   - create another new attribute - lets say purecode_length, add lengths
> for each alternative but adjusted for the purecode variant.
>   - make the length attribute for this pattern select based on the
> disable_literal_pool variable between the default_length and
> purecode_length attributes.
>

Hi Richard,

Thanks for the suggestions.  I've implemented option (1) above, does
it match what you had in mind?

Tested on arm-eabi, with -mpure-code forced, no regression. Manually
checked that the expected sequence is generated with
-fdisable-rtl-fwprop2.

Thanks,

Christophe

> R.
>
> >
> > Thanks,
> >
> > Christophe
> >
> >> R.
> >>
> >>> Thanks,
> >>>
> >>> Christophe
> >>>
> >>>> R.
> >>
>
[ARM] Fix -mpure-code for v6m

When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
for cortex-m0, I noticed that some testcases were failing because we
still generate "ldr rX, .LCY", which is what we want to avoid with
-mpure-code. This is latent since a recent improvement in fwprop
(PR88833).

In this patch I change the thumb1_movsi_insn pattern so that it emits
the desired instruction sequence when arm_disable_literal_pool is set.

To achieve that, I introduce a new required_for_purecode attribute to
enable the corresponding alternative in thumb1_movsi_insn and take the
actual instruction sequence length into account.

gcc/ChangeLog:

2020-02-13  Christophe Lyon  <christophe.lyon@linaro.org>

	* config/arm/arm.md (required_for_purecode): New attribute.
	(enabled): Handle required_for_purecode.
	* config/arm/thumb1.md (thumb1_movsi_insn): Add alternative to
	work with -mpure-code.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f89a2d4..aa8c34d 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -97,6 +97,11 @@
 ; an IT block in their expansion which is not a short IT.
 (define_attr "enabled_for_short_it" "no,yes" (const_string "yes"))
 
+; Mark an instruction sequence as the required way of loading a
+; constant when -mpure-code is enabled (which implies
+; arm_disable_literal_pool)
+(define_attr "required_for_purecode" "no,yes" (const_string "no"))
+
 ;; Operand number of an input operand that is shifted.  Zero if the
 ;; given instruction does not shift one of its input operands.
 (define_attr "shift" "" (const_int 0))
@@ -230,6 +235,10 @@
 	       (match_test "arm_restrict_it"))
 	  (const_string "no")
 
+	  (and (eq_attr "required_for_purecode" "yes")
+	       (not (match_test "arm_disable_literal_pool")))
+	  (const_string "no")
+
 	  (eq_attr "arch_enabled" "no")
 	  (const_string "no")]
 	 (const_string "yes")))
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 613cf9c..2486163 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -691,8 +691,8 @@
 )
 
 (define_insn "*thumb1_movsi_insn"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=l,l,r,l,l,l,>,l, m,*l*h*k")
-	(match_operand:SI 1 "general_operand"      "l, I,j,J,K,>,l,mi,l,*l*h*k"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=l,l,r,l,l,l,>,l, l, m,*l*h*k")
+	(match_operand:SI 1 "general_operand"      "l, I,j,J,K,>,l,i, mi,l,*l*h*k"))]
   "TARGET_THUMB1
    && (   register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
@@ -704,14 +704,16 @@
    #
    ldmia\\t%1, {%0}
    stmia\\t%0, {%1}
+   movs\\t%0, #:upper8_15:%1; lsls\\t%0, #8; adds\\t%0, #:upper0_7:%1; lsls\\t%0, #8; adds\\t%0, #:lower8_15:%1; lsls\\t%0, #8; adds\\t%0, #:lower0_7:%1
    ldr\\t%0, %1
    str\\t%1, %0
    mov\\t%0, %1"
-  [(set_attr "length" "2,2,4,4,4,2,2,2,2,2")
-   (set_attr "type" "mov_reg,mov_imm,mov_imm,multiple,multiple,load_4,store_4,load_4,store_4,mov_reg")
-   (set_attr "pool_range" "*,*,*,*,*,*,*,1018,*,*")
-   (set_attr "arch" "t1,t1,v8mb,t1,t1,t1,t1,t1,t1,t1")
-   (set_attr "conds" "set,clob,nocond,*,*,nocond,nocond,nocond,nocond,nocond")])
+  [(set_attr "length" "2,2,4,4,4,2,2,14,2,2,2")
+   (set_attr "type" "mov_reg,mov_imm,mov_imm,multiple,multiple,load_4,store_4,alu_sreg,load_4,store_4,mov_reg")
+   (set_attr "pool_range" "*,*,*,*,*,*,*, *,1018,*,*")
+   (set_attr "arch" "t1,t1,v8mb,t1,t1,t1,t1,t1,t1,t1,t1")
+   (set_attr "required_for_purecode" "no,no,no,no,no,no,no,yes,no,no,no")
+   (set_attr "conds" "set,clob,nocond,*,*,nocond,nocond,nocond,nocond,nocond,nocond")])
 
 ; Split the load of 64-bit constant into two loads for high and low 32-bit parts respectively
 ; to see if we can load them in fewer instructions or fewer cycles.
Christophe Lyon Feb. 24, 2020, 2:16 p.m. UTC | #7
Ping?

I'd also like to backport this and the main patch (svn r279463,
r10-5505-ge24f6408df1e4c5e8c09785d7b488c492dfb68b3)
to the gcc-9 branch.

I found the problem addressed by this patch while validating the
backport to gcc-9: although the patch applies cleanly except for
testcases dg directives, there were some failures which I could
finally reproduce on trunk with -fdisable-rtl-fwprop2.

Here is a summary of the validations I ran using --target arm-eabi:
* without my patches:
(1) --with-cpu cortex-m0
(2) --with-cpu cortex-m4
(3) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code (to build the
libs with -mpure-code)
(4) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code
--target-board=-mpure-code (to also run the tests with -mpure-code)

* with my patches:
(5) --with-cpu cortex-m0 CFLAGS_FOR_TARGET=-mpure-code
--target-board=-mpure-code
(6) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code
--target-board=-mpure-code

Comparing (4) and (6) ensured that my (v6m) patches introduce no
regression in v7m cases.

Comparison of (1) vs (5) gave results similar to (2) vs (6), there's a
bit of noise because some tests cases don't cope well with -mpure-code
despite my previous testsuite-only patch (svn r277828)

Comparison of (1) vs (2) gave similar results to (5) vs (6).

Ideally, we may also want to backport svn r277828 (testsuite-only
patch, to handle -mpure-code better), but that's not mandatory.

In summary, is this patch OK for trunk?

Are this patch and r279463,
r10-5505-ge24f6408df1e4c5e8c09785d7b488c492dfb68b3 OK to backport to
gcc-9?

Thanks,

Christophe

On Thu, 13 Feb 2020 at 11:14, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
>
> On Mon, 10 Feb 2020 at 17:45, Richard Earnshaw (lists)
> <Richard.Earnshaw@arm.com> wrote:
> >
> > On 10/02/2020 09:27, Christophe Lyon wrote:
> > > On Fri, 7 Feb 2020 at 17:55, Richard Earnshaw (lists)
> > > <Richard.Earnshaw@arm.com> wrote:
> > >>
> > >> On 07/02/2020 16:43, Christophe Lyon wrote:
> > >>> On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
> > >>> <Richard.Earnshaw@arm.com> wrote:
> > >>>>
> > >>>> On 07/02/2020 13:19, Christophe Lyon wrote:
> > >>>>> When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
> > >>>>> for cortex-m0, I noticed that some testcases were failing because we
> > >>>>> still generate "ldr rX, .LCY", which is what we want to avoid with
> > >>>>> -mpure-code. This is latent since a recent improvement in fwprop
> > >>>>> (PR88833).
> > >>>>>
> > >>>>> In this patch I change the thumb1_movsi_insn pattern so that it emits
> > >>>>> the desired instruction sequence when arm_disable_literal_pool is set.
> > >>>>>
> > >>>>> I tried to add a define_split instead, but couldn't make it work: the
> > >>>>> compiler then complains it cannot split the instruction, while my new
> > >>>>> define_split accepts the same operand types as thumb1_movsi_insn:
> > >>>>>
> > >>>>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not split insn
> > >>>>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
> > >>>>>            (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 {*thumb1_movsi_insn}
> > >>>>>         (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
> > >>>>>            (nil)))
> > >>>>> during RTL pass: final
> > >>>>>
> > >>>>> (define_split
> > >>>>>      [(set (match_operand:SI 0 "register_operand" "")
> > >>>>>            (match_operand:SI 1 "general_operand" ""))]
> > >>>>>      "TARGET_THUMB1
> > >>>>>       && arm_disable_literal_pool
> > >>>>>       && GET_CODE (operands[1]) == SYMBOL_REF"
> > >>>>>      [(clobber (const_int 0))]
> > >>>>>      "
> > >>>>>        gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
> > >>>>>        DONE;
> > >>>>>      "
> > >>>>> )
> > >>>>> and I put this in thumb1_movsi_insn:
> > >>>>> if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
> > >>>>>      {
> > >>>>>        return \"#\";
> > >>>>>      }
> > >>>>>      return \"ldr\\t%0, %1\";
> > >>>>>
> > >>>>> 2020-02-07  Christophe Lyon  <christophe.lyon@linaro.org>
> > >>>>>
> > >>>>>            * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr alternative to
> > >>>>>            work with -mpure-code.
> > >>>>>
> > >>>>
> > >>>> +    case 0:
> > >>>> +    case 1:
> > >>>> +      return \"movs    %0, %1\";
> > >>>> +    case 2:
> > >>>> +      return \"movw    %0, %1\";
> > >>>>
> > >>>> This is OK, but please replace the hard tab in the strings for MOVS/MOVW
> > >>>> with \\t.
> > >>>>
> > >>>
> > >>> OK that was merely a cut & paste from the existing code.
> > >>>
> > >>> I'm concerned that the length attribute is becoming wrong with my
> > >>> patch, isn't this a problem?
> > >>>
> > >>
> > >> Potentially yes.  The branch range code needs this to handle overly long
> > >> jumps correctly.
> > >>
> > >
> > > Do you mean that the probability of problems due to that shortcoming
> > > is low enough that I can commit my patch?
> >
> > It's hard to say that.  The number of instructions you generate for this
> > is significant, so it likely will throw off the calculations and
> > somebody will get unlucky.  On the other hand, I don't think we should
> > pessimize code for the non-pure-code variants by inflating the size for
> > this unconditionally.
> >
> > It seems there are two ways to fix this.
> >
> > 1) create a new alternative for the pure-code variant with its own
> > length attribute, then only enable that for the case you need.  That
> > would also allow you to go back to the templated asm format.
> > 2) use a level of indirection to calculate the length - I haven't tried
> > this, but it would be something like:
> >
> >   - create a new attribute - lets say default_length
> >   - rename length for this pattern to default_length
> >   - create another new attribute - lets say purecode_length, add lengths
> > for each alternative but adjusted for the purecode variant.
> >   - make the length attribute for this pattern select based on the
> > disable_literal_pool variable between the default_length and
> > purecode_length attributes.
> >
>
> Hi Richard,
>
> Thanks for the suggestions.  I've implemented option (1) above, does
> it match what you had in mind?
>
> Tested on arm-eabi, with -mpure-code forced, no regression. Manually
> checked that the expected sequence is generated with
> -fdisable-rtl-fwprop2.
>
> Thanks,
>
> Christophe
>
> > R.
> >
> > >
> > > Thanks,
> > >
> > > Christophe
> > >
> > >> R.
> > >>
> > >>> Thanks,
> > >>>
> > >>> Christophe
> > >>>
> > >>>> R.
> > >>
> >
Kyrill Tkachov Feb. 25, 2020, 1:44 p.m. UTC | #8
Hi Christophe,

On 2/24/20 2:16 PM, Christophe Lyon wrote:
> Ping?
>
> I'd also like to backport this and the main patch (svn r279463,
> r10-5505-ge24f6408df1e4c5e8c09785d7b488c492dfb68b3)
> to the gcc-9 branch.
>
> I found the problem addressed by this patch while validating the
> backport to gcc-9: although the patch applies cleanly except for
> testcases dg directives, there were some failures which I could
> finally reproduce on trunk with -fdisable-rtl-fwprop2.
>
> Here is a summary of the validations I ran using --target arm-eabi:
> * without my patches:
> (1) --with-cpu cortex-m0
> (2) --with-cpu cortex-m4
> (3) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code (to build the
> libs with -mpure-code)
> (4) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code
> --target-board=-mpure-code (to also run the tests with -mpure-code)
>
> * with my patches:
> (5) --with-cpu cortex-m0 CFLAGS_FOR_TARGET=-mpure-code
> --target-board=-mpure-code
> (6) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code
> --target-board=-mpure-code
>
> Comparing (4) and (6) ensured that my (v6m) patches introduce no
> regression in v7m cases.
>
> Comparison of (1) vs (5) gave results similar to (2) vs (6), there's a
> bit of noise because some tests cases don't cope well with -mpure-code
> despite my previous testsuite-only patch (svn r277828)
>
> Comparison of (1) vs (2) gave similar results to (5) vs (6).
>
> Ideally, we may also want to backport svn r277828 (testsuite-only
> patch, to handle -mpure-code better), but that's not mandatory.
>
> In summary, is this patch OK for trunk?
>
> Are this patch and r279463,
> r10-5505-ge24f6408df1e4c5e8c09785d7b488c492dfb68b3 OK to backport to
> gcc-9?
>

This is okay with me.

I don't think any of the branches are frozen at the moment, so it should 
be okay to backport it.

Thanks,

Kyrill


> Thanks,
>
> Christophe
>
> On Thu, 13 Feb 2020 at 11:14, Christophe Lyon
> <christophe.lyon@linaro.org> wrote:
> >
> > On Mon, 10 Feb 2020 at 17:45, Richard Earnshaw (lists)
> > <Richard.Earnshaw@arm.com> wrote:
> > >
> > > On 10/02/2020 09:27, Christophe Lyon wrote:
> > > > On Fri, 7 Feb 2020 at 17:55, Richard Earnshaw (lists)
> > > > <Richard.Earnshaw@arm.com> wrote:
> > > >>
> > > >> On 07/02/2020 16:43, Christophe Lyon wrote:
> > > >>> On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
> > > >>> <Richard.Earnshaw@arm.com> wrote:
> > > >>>>
> > > >>>> On 07/02/2020 13:19, Christophe Lyon wrote:
> > > >>>>> When running the testsuite with -fdisable-rtl-fwprop2 and 
> -mpure-code
> > > >>>>> for cortex-m0, I noticed that some testcases were failing 
> because we
> > > >>>>> still generate "ldr rX, .LCY", which is what we want to 
> avoid with
> > > >>>>> -mpure-code. This is latent since a recent improvement in fwprop
> > > >>>>> (PR88833).
> > > >>>>>
> > > >>>>> In this patch I change the thumb1_movsi_insn pattern so that 
> it emits
> > > >>>>> the desired instruction sequence when 
> arm_disable_literal_pool is set.
> > > >>>>>
> > > >>>>> I tried to add a define_split instead, but couldn't make it 
> work: the
> > > >>>>> compiler then complains it cannot split the instruction, 
> while my new
> > > >>>>> define_split accepts the same operand types as 
> thumb1_movsi_insn:
> > > >>>>>
> > > >>>>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: 
> could not split insn
> > > >>>>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
> > > >>>>>            (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 
> {*thumb1_movsi_insn}
> > > >>>>>         (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") 
> [flags 0x2])
> > > >>>>>            (nil)))
> > > >>>>> during RTL pass: final
> > > >>>>>
> > > >>>>> (define_split
> > > >>>>>      [(set (match_operand:SI 0 "register_operand" "")
> > > >>>>>            (match_operand:SI 1 "general_operand" ""))]
> > > >>>>>      "TARGET_THUMB1
> > > >>>>>       && arm_disable_literal_pool
> > > >>>>>       && GET_CODE (operands[1]) == SYMBOL_REF"
> > > >>>>>      [(clobber (const_int 0))]
> > > >>>>>      "
> > > >>>>> gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
> > > >>>>>        DONE;
> > > >>>>>      "
> > > >>>>> )
> > > >>>>> and I put this in thumb1_movsi_insn:
> > > >>>>> if (GET_CODE (operands[1]) == SYMBOL_REF && 
> arm_disable_literal_pool)
> > > >>>>>      {
> > > >>>>>        return \"#\";
> > > >>>>>      }
> > > >>>>>      return \"ldr\\t%0, %1\";
> > > >>>>>
> > > >>>>> 2020-02-07  Christophe Lyon <christophe.lyon@linaro.org>
> > > >>>>>
> > > >>>>>            * config/arm/thumb1.md (thumb1_movsi_insn): Fix 
> ldr alternative to
> > > >>>>>            work with -mpure-code.
> > > >>>>>
> > > >>>>
> > > >>>> +    case 0:
> > > >>>> +    case 1:
> > > >>>> +      return \"movs    %0, %1\";
> > > >>>> +    case 2:
> > > >>>> +      return \"movw    %0, %1\";
> > > >>>>
> > > >>>> This is OK, but please replace the hard tab in the strings 
> for MOVS/MOVW
> > > >>>> with \\t.
> > > >>>>
> > > >>>
> > > >>> OK that was merely a cut & paste from the existing code.
> > > >>>
> > > >>> I'm concerned that the length attribute is becoming wrong with my
> > > >>> patch, isn't this a problem?
> > > >>>
> > > >>
> > > >> Potentially yes.  The branch range code needs this to handle 
> overly long
> > > >> jumps correctly.
> > > >>
> > > >
> > > > Do you mean that the probability of problems due to that shortcoming
> > > > is low enough that I can commit my patch?
> > >
> > > It's hard to say that.  The number of instructions you generate 
> for this
> > > is significant, so it likely will throw off the calculations and
> > > somebody will get unlucky.  On the other hand, I don't think we should
> > > pessimize code for the non-pure-code variants by inflating the 
> size for
> > > this unconditionally.
> > >
> > > It seems there are two ways to fix this.
> > >
> > > 1) create a new alternative for the pure-code variant with its own
> > > length attribute, then only enable that for the case you need.  That
> > > would also allow you to go back to the templated asm format.
> > > 2) use a level of indirection to calculate the length - I haven't 
> tried
> > > this, but it would be something like:
> > >
> > >   - create a new attribute - lets say default_length
> > >   - rename length for this pattern to default_length
> > >   - create another new attribute - lets say purecode_length, add 
> lengths
> > > for each alternative but adjusted for the purecode variant.
> > >   - make the length attribute for this pattern select based on the
> > > disable_literal_pool variable between the default_length and
> > > purecode_length attributes.
> > >
> >
> > Hi Richard,
> >
> > Thanks for the suggestions.  I've implemented option (1) above, does
> > it match what you had in mind?
> >
> > Tested on arm-eabi, with -mpure-code forced, no regression. Manually
> > checked that the expected sequence is generated with
> > -fdisable-rtl-fwprop2.
> >
> > Thanks,
> >
> > Christophe
> >
> > > R.
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Christophe
> > > >
> > > >> R.
> > > >>
> > > >>> Thanks,
> > > >>>
> > > >>> Christophe
> > > >>>
> > > >>>> R.
> > > >>
> > >
Christophe Lyon Feb. 25, 2020, 4:14 p.m. UTC | #9
On Tue, 25 Feb 2020 at 14:44, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
> Hi Christophe,
>
> On 2/24/20 2:16 PM, Christophe Lyon wrote:
> > Ping?
> >
> > I'd also like to backport this and the main patch (svn r279463,
> > r10-5505-ge24f6408df1e4c5e8c09785d7b488c492dfb68b3)
> > to the gcc-9 branch.
> >
> > I found the problem addressed by this patch while validating the
> > backport to gcc-9: although the patch applies cleanly except for
> > testcases dg directives, there were some failures which I could
> > finally reproduce on trunk with -fdisable-rtl-fwprop2.
> >
> > Here is a summary of the validations I ran using --target arm-eabi:
> > * without my patches:
> > (1) --with-cpu cortex-m0
> > (2) --with-cpu cortex-m4
> > (3) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code (to build the
> > libs with -mpure-code)
> > (4) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code
> > --target-board=-mpure-code (to also run the tests with -mpure-code)
> >
> > * with my patches:
> > (5) --with-cpu cortex-m0 CFLAGS_FOR_TARGET=-mpure-code
> > --target-board=-mpure-code
> > (6) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code
> > --target-board=-mpure-code
> >
> > Comparing (4) and (6) ensured that my (v6m) patches introduce no
> > regression in v7m cases.
> >
> > Comparison of (1) vs (5) gave results similar to (2) vs (6), there's a
> > bit of noise because some tests cases don't cope well with -mpure-code
> > despite my previous testsuite-only patch (svn r277828)
> >
> > Comparison of (1) vs (2) gave similar results to (5) vs (6).
> >
> > Ideally, we may also want to backport svn r277828 (testsuite-only
> > patch, to handle -mpure-code better), but that's not mandatory.
> >
> > In summary, is this patch OK for trunk?
> >
> > Are this patch and r279463,
> > r10-5505-ge24f6408df1e4c5e8c09785d7b488c492dfb68b3 OK to backport to
> > gcc-9?
> >
>
> This is okay with me.
>
> I don't think any of the branches are frozen at the moment, so it should
> be okay to backport it.
>

Cool, thanks.

Pushed as r10-6845-ga71f2193d0df71a86c4743aab22891bb0003112e
and backported as
r9-8277-g9c5db942ca332494ef3d79d4a7d494d83cad8304
r9-8278-g7edf9fa1c5f0e05c22e1d719658ed903fe2b44f4

Thanks,

Christophe

> Thanks,
>
> Kyrill
>
>
> > Thanks,
> >
> > Christophe
> >
> > On Thu, 13 Feb 2020 at 11:14, Christophe Lyon
> > <christophe.lyon@linaro.org> wrote:
> > >
> > > On Mon, 10 Feb 2020 at 17:45, Richard Earnshaw (lists)
> > > <Richard.Earnshaw@arm.com> wrote:
> > > >
> > > > On 10/02/2020 09:27, Christophe Lyon wrote:
> > > > > On Fri, 7 Feb 2020 at 17:55, Richard Earnshaw (lists)
> > > > > <Richard.Earnshaw@arm.com> wrote:
> > > > >>
> > > > >> On 07/02/2020 16:43, Christophe Lyon wrote:
> > > > >>> On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
> > > > >>> <Richard.Earnshaw@arm.com> wrote:
> > > > >>>>
> > > > >>>> On 07/02/2020 13:19, Christophe Lyon wrote:
> > > > >>>>> When running the testsuite with -fdisable-rtl-fwprop2 and
> > -mpure-code
> > > > >>>>> for cortex-m0, I noticed that some testcases were failing
> > because we
> > > > >>>>> still generate "ldr rX, .LCY", which is what we want to
> > avoid with
> > > > >>>>> -mpure-code. This is latent since a recent improvement in fwprop
> > > > >>>>> (PR88833).
> > > > >>>>>
> > > > >>>>> In this patch I change the thumb1_movsi_insn pattern so that
> > it emits
> > > > >>>>> the desired instruction sequence when
> > arm_disable_literal_pool is set.
> > > > >>>>>
> > > > >>>>> I tried to add a define_split instead, but couldn't make it
> > work: the
> > > > >>>>> compiler then complains it cannot split the instruction,
> > while my new
> > > > >>>>> define_split accepts the same operand types as
> > thumb1_movsi_insn:
> > > > >>>>>
> > > > >>>>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error:
> > could not split insn
> > > > >>>>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
> > > > >>>>>            (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836
> > {*thumb1_movsi_insn}
> > > > >>>>>         (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6")
> > [flags 0x2])
> > > > >>>>>            (nil)))
> > > > >>>>> during RTL pass: final
> > > > >>>>>
> > > > >>>>> (define_split
> > > > >>>>>      [(set (match_operand:SI 0 "register_operand" "")
> > > > >>>>>            (match_operand:SI 1 "general_operand" ""))]
> > > > >>>>>      "TARGET_THUMB1
> > > > >>>>>       && arm_disable_literal_pool
> > > > >>>>>       && GET_CODE (operands[1]) == SYMBOL_REF"
> > > > >>>>>      [(clobber (const_int 0))]
> > > > >>>>>      "
> > > > >>>>> gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
> > > > >>>>>        DONE;
> > > > >>>>>      "
> > > > >>>>> )
> > > > >>>>> and I put this in thumb1_movsi_insn:
> > > > >>>>> if (GET_CODE (operands[1]) == SYMBOL_REF &&
> > arm_disable_literal_pool)
> > > > >>>>>      {
> > > > >>>>>        return \"#\";
> > > > >>>>>      }
> > > > >>>>>      return \"ldr\\t%0, %1\";
> > > > >>>>>
> > > > >>>>> 2020-02-07  Christophe Lyon <christophe.lyon@linaro.org>
> > > > >>>>>
> > > > >>>>>            * config/arm/thumb1.md (thumb1_movsi_insn): Fix
> > ldr alternative to
> > > > >>>>>            work with -mpure-code.
> > > > >>>>>
> > > > >>>>
> > > > >>>> +    case 0:
> > > > >>>> +    case 1:
> > > > >>>> +      return \"movs    %0, %1\";
> > > > >>>> +    case 2:
> > > > >>>> +      return \"movw    %0, %1\";
> > > > >>>>
> > > > >>>> This is OK, but please replace the hard tab in the strings
> > for MOVS/MOVW
> > > > >>>> with \\t.
> > > > >>>>
> > > > >>>
> > > > >>> OK that was merely a cut & paste from the existing code.
> > > > >>>
> > > > >>> I'm concerned that the length attribute is becoming wrong with my
> > > > >>> patch, isn't this a problem?
> > > > >>>
> > > > >>
> > > > >> Potentially yes.  The branch range code needs this to handle
> > overly long
> > > > >> jumps correctly.
> > > > >>
> > > > >
> > > > > Do you mean that the probability of problems due to that shortcoming
> > > > > is low enough that I can commit my patch?
> > > >
> > > > It's hard to say that.  The number of instructions you generate
> > for this
> > > > is significant, so it likely will throw off the calculations and
> > > > somebody will get unlucky.  On the other hand, I don't think we should
> > > > pessimize code for the non-pure-code variants by inflating the
> > size for
> > > > this unconditionally.
> > > >
> > > > It seems there are two ways to fix this.
> > > >
> > > > 1) create a new alternative for the pure-code variant with its own
> > > > length attribute, then only enable that for the case you need.  That
> > > > would also allow you to go back to the templated asm format.
> > > > 2) use a level of indirection to calculate the length - I haven't
> > tried
> > > > this, but it would be something like:
> > > >
> > > >   - create a new attribute - lets say default_length
> > > >   - rename length for this pattern to default_length
> > > >   - create another new attribute - lets say purecode_length, add
> > lengths
> > > > for each alternative but adjusted for the purecode variant.
> > > >   - make the length attribute for this pattern select based on the
> > > > disable_literal_pool variable between the default_length and
> > > > purecode_length attributes.
> > > >
> > >
> > > Hi Richard,
> > >
> > > Thanks for the suggestions.  I've implemented option (1) above, does
> > > it match what you had in mind?
> > >
> > > Tested on arm-eabi, with -mpure-code forced, no regression. Manually
> > > checked that the expected sequence is generated with
> > > -fdisable-rtl-fwprop2.
> > >
> > > Thanks,
> > >
> > > Christophe
> > >
> > > > R.
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Christophe
> > > > >
> > > > >> R.
> > > > >>
> > > > >>> Thanks,
> > > > >>>
> > > > >>> Christophe
> > > > >>>
> > > > >>>> R.
> > > > >>
> > > >
diff mbox series

Patch

diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 613cf9c..a722194 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -696,17 +696,43 @@ 
   "TARGET_THUMB1
    && (   register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
-  "@
-   movs	%0, %1
-   movs	%0, %1
-   movw	%0, %1
-   #
-   #
-   ldmia\\t%1, {%0}
-   stmia\\t%0, {%1}
-   ldr\\t%0, %1
-   str\\t%1, %0
-   mov\\t%0, %1"
+  "*
+  switch (which_alternative)
+  {
+    case 0:
+    case 1:
+      return \"movs	%0, %1\";
+    case 2:
+      return \"movw	%0, %1\";
+    case 3:
+    case 4:
+      return \"#\";
+    case 5:
+      return \"ldmia\\t%1, {%0}\";
+    case 6:
+      return \"stmia\\t%0, {%1}\";
+    case 7:
+      /* Cannot load it directly, split to build it via MOV / LSLS / ADDS.  */
+      if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
+        {
+          output_asm_insn (\"movs\\t%0, #:upper8_15:%1\", operands);
+          output_asm_insn (\"lsls\\t%0, #8\", operands);
+          output_asm_insn (\"adds\\t%0, #:upper0_7:%1\", operands);
+          output_asm_insn (\"lsls\\t%0, #8\", operands);
+          output_asm_insn (\"adds\\t%0, #:lower8_15:%1\", operands);
+          output_asm_insn (\"lsls\\t%0, #8\", operands);
+          output_asm_insn (\"adds\\t%0, #:lower0_7:%1\", operands);
+          return \"\";
+        }
+      else
+        return \"ldr\\t%0, %1\";
+    case 8:
+      return \"str\\t%1, %0\";
+    case 9:
+      return \"mov\\t%0, %1\";
+    default:
+      gcc_unreachable ();
+  }"
   [(set_attr "length" "2,2,4,4,4,2,2,2,2,2")
    (set_attr "type" "mov_reg,mov_imm,mov_imm,multiple,multiple,load_4,store_4,load_4,store_4,mov_reg")
    (set_attr "pool_range" "*,*,*,*,*,*,*,1018,*,*")