diff mbox

[ARM] PR71607: New approach to arm_disable_literal_pool

Message ID 58638CBC.9000303@arm.com
State New
Headers show

Commit Message

Andre Vieira (lists) Dec. 28, 2016, 9:58 a.m. UTC
On 29/11/16 09:45, Andre Vieira (lists) wrote:
> On 17/11/16 10:00, Ramana Radhakrishnan wrote:
>> On Thu, Oct 6, 2016 at 2:57 PM, Andre Vieira (lists)
>> <Andre.SimoesDiasVieira@arm.com> wrote:
>>> Hello,
>>>
>>> This patch tackles the issue reported in PR71607. This patch takes a
>>> different approach for disabling the creation of literal pools. Instead
>>> of disabling the patterns that would normally transform the rtl into
>>> actual literal pools, it disables the creation of this literal pool rtl
>>> by making the target hook TARGET_CANNOT_FORCE_CONST_MEM return true if
>>> arm_disable_literal_pool is true. I added patterns to split floating
>>> point constants for both SF and DFmode. A pattern to handle the
>>> addressing of label_refs had to be included as well since all
>>> "memory_operand" patterns are disabled when
>>> TARGET_CANNOT_FORCE_CONST_MEM returns true. Also the pattern for
>>> splitting 32-bit immediates had to be changed, it was not accepting
>>> unsigned 32-bit unsigned integers with the MSB set. I believe
>>> const_int_operand expects the mode of the operand to be set to VOIDmode
>>> and not SImode. I have only changed it in the patterns that were
>>> affecting this code, though I suggest looking into changing it in the
>>> rest of the ARM backend.
>>>
>>> I added more test cases. No regressions for arm-none-eabi with
>>> Cortex-M0, Cortex-M3 and Cortex-M7.
>>>
>>> Is this OK for trunk?
>>
>> Including -mslow-flash-data in your multilib flags ? If no regressions
>> with that ok .
>>
>>
>> regards
>> Ramana
>>
>>>
> 
> Hello,
> 
> I found some new ICE's with the -mslow-flash-data testing so I had to
> rework this patch. I took the opportunity to rebase it as well.
> 
> The problem was with the way the old version of the patch handled label
> references.  After some digging I found I wasn't using the right target
> hook and so I implemented the 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' for
> ARM.  This target hook determines whether a literal pool ends up in an
> 'object_block' structure. So I reverted the changes made in the old
> version of the patch to the ARM implementation of the
> 'TARGET_CANNOT_FORCE_CONST_MEM' hook and rely on
> 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' instead. This patch adds an ARM
> implementation for this hook that returns false if
> 'arm_disable_literal_pool' is set to true and true otherwise.
> 
> This version of the patch also reverts back to using the check for
> 'SYMBOL_REF' in 'thumb2_legitimate_address_p' that was removed in the
> last version, this code is required to place the label references in
> rodata sections.
> 
> Another thing this patch does is revert the changes made to the 32-bit
> constant split in arm.md. The reason this was needed before was because
> 'real_to_target' returns a long array and does not sign-extend values in
> it, which would make sense on hosts with 64-bit longs. To fix this the
> value is now casted to 'int' first.  It would probably be a good idea to
> change the 'real_to_target' function to return an array with
> 'HOST_WIDE_INT' elements instead and either use all 64-bits or
> sign-extend them.  Something for the future?
> 
> I added more test cases in this patch and reran regression tests for:
> Cortex-M0, Cortex-M4 with and without -mslow-flash-data. Also did a
> bootstrap+regressions on arm-none-linux-gnueabihf.
> 
> Is this OK for trunk?
> 
> Cheers,
> Andre
> 
> gcc/ChangeLog:
> 
> 2016-11-29  Andre Vieira  <andre.simoesdiasvieira@arm.com>
> 
>     PR target/71607
>     * config/arm/arm.md (use_literal_pool): Removes.
>     (64-bit immediate split): No longer takes cost into consideration
>     if 'arm_disable_literal_pool' is enabled.
>     * config/arm/arm.c (arm_use_blocks_for_constant_p): New.
>     (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Define.
>     (arm_max_const_double_inline_cost): Remove use of
> arm_disable_literal_pool.
>     * config/arm/vfp.md (no_literal_pool_df_immediate): New.
>     (no_literal_pool_sf_immediate): New.
> 
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-11-29  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Thomas Preud'homme  <thomas.preudhomme@arm.com>
> 
>     PR target/71607
>     * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ...
>     * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this.
>     * gcc.target/arm/thumb2-slow-flash-data-2.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-3.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-4.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-5.c: New.
> 
Hello,

As I have mentioned in my commit message for the fix on embedded-6  (see
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01304.html) I found an
issue with this code when testing its backport on the embedded-6-branch.

I misread the definition of the target hook
TARGET_USE_BLOCKS_FOR_CONSTANT_P and it seems the way I implemented it
before only changed code generation if the -mslow-flash-data option
wasn't passed. I.e. I don't need to implement it to solve this issue.
The other changes in this patch are sufficient...

I reran regressions for arm-none-eabi-gcc with the following targets:
Cortex-M0, Cortex-M3, Cortex-M7 and ARMv8-M Baseline. I also ran
bootstraps and full regressions for arm-none-linux-gnueabihf both in ARM
and THUMB mode. All green!

Is this OK for trunk?

Cheers,
Andre

gcc/ChangeLog:

2016-12-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>

    PR target/71607
    * config/arm/arm.md (use_literal_pool): Removes.
    (64-bit immediate split): No longer takes cost into consideration
    if 'arm_disable_literal_pool' is enabled.
    * config/arm/arm.c (arm_max_const_double_inline_cost): Remove use
    of arm_disable_literal_pool.
    * config/arm/vfp.md (no_literal_pool_df_immediate): New.
    (no_literal_pool_sf_immediate): New.


gcc/testsuite/ChangeLog:

2016-12-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
            Thomas Preud'homme  <thomas.preudhomme@arm.com>

    PR target/71607
    * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ...
    * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this.
    * gcc.target/arm/thumb2-slow-flash-data-2.c: New.
    * gcc.target/arm/thumb2-slow-flash-data-3.c: New.
    * gcc.target/arm/thumb2-slow-flash-data-4.c: New.
    * gcc.target/arm/thumb2-slow-flash-data-5.c: New.

Comments

Andre Vieira (lists) Jan. 16, 2017, 3:25 p.m. UTC | #1
On 28/12/16 09:58, Andre Vieira (lists) wrote:
> On 29/11/16 09:45, Andre Vieira (lists) wrote:
>> On 17/11/16 10:00, Ramana Radhakrishnan wrote:
>>> On Thu, Oct 6, 2016 at 2:57 PM, Andre Vieira (lists)
>>> <Andre.SimoesDiasVieira@arm.com> wrote:
>>>> Hello,
>>>>
>>>> This patch tackles the issue reported in PR71607. This patch takes a
>>>> different approach for disabling the creation of literal pools. Instead
>>>> of disabling the patterns that would normally transform the rtl into
>>>> actual literal pools, it disables the creation of this literal pool rtl
>>>> by making the target hook TARGET_CANNOT_FORCE_CONST_MEM return true if
>>>> arm_disable_literal_pool is true. I added patterns to split floating
>>>> point constants for both SF and DFmode. A pattern to handle the
>>>> addressing of label_refs had to be included as well since all
>>>> "memory_operand" patterns are disabled when
>>>> TARGET_CANNOT_FORCE_CONST_MEM returns true. Also the pattern for
>>>> splitting 32-bit immediates had to be changed, it was not accepting
>>>> unsigned 32-bit unsigned integers with the MSB set. I believe
>>>> const_int_operand expects the mode of the operand to be set to VOIDmode
>>>> and not SImode. I have only changed it in the patterns that were
>>>> affecting this code, though I suggest looking into changing it in the
>>>> rest of the ARM backend.
>>>>
>>>> I added more test cases. No regressions for arm-none-eabi with
>>>> Cortex-M0, Cortex-M3 and Cortex-M7.
>>>>
>>>> Is this OK for trunk?
>>>
>>> Including -mslow-flash-data in your multilib flags ? If no regressions
>>> with that ok .
>>>
>>>
>>> regards
>>> Ramana
>>>
>>>>
>>
>> Hello,
>>
>> I found some new ICE's with the -mslow-flash-data testing so I had to
>> rework this patch. I took the opportunity to rebase it as well.
>>
>> The problem was with the way the old version of the patch handled label
>> references.  After some digging I found I wasn't using the right target
>> hook and so I implemented the 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' for
>> ARM.  This target hook determines whether a literal pool ends up in an
>> 'object_block' structure. So I reverted the changes made in the old
>> version of the patch to the ARM implementation of the
>> 'TARGET_CANNOT_FORCE_CONST_MEM' hook and rely on
>> 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' instead. This patch adds an ARM
>> implementation for this hook that returns false if
>> 'arm_disable_literal_pool' is set to true and true otherwise.
>>
>> This version of the patch also reverts back to using the check for
>> 'SYMBOL_REF' in 'thumb2_legitimate_address_p' that was removed in the
>> last version, this code is required to place the label references in
>> rodata sections.
>>
>> Another thing this patch does is revert the changes made to the 32-bit
>> constant split in arm.md. The reason this was needed before was because
>> 'real_to_target' returns a long array and does not sign-extend values in
>> it, which would make sense on hosts with 64-bit longs. To fix this the
>> value is now casted to 'int' first.  It would probably be a good idea to
>> change the 'real_to_target' function to return an array with
>> 'HOST_WIDE_INT' elements instead and either use all 64-bits or
>> sign-extend them.  Something for the future?
>>
>> I added more test cases in this patch and reran regression tests for:
>> Cortex-M0, Cortex-M4 with and without -mslow-flash-data. Also did a
>> bootstrap+regressions on arm-none-linux-gnueabihf.
>>
>> Is this OK for trunk?
>>
>> Cheers,
>> Andre
>>
>> gcc/ChangeLog:
>>
>> 2016-11-29  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>
>>     PR target/71607
>>     * config/arm/arm.md (use_literal_pool): Removes.
>>     (64-bit immediate split): No longer takes cost into consideration
>>     if 'arm_disable_literal_pool' is enabled.
>>     * config/arm/arm.c (arm_use_blocks_for_constant_p): New.
>>     (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Define.
>>     (arm_max_const_double_inline_cost): Remove use of
>> arm_disable_literal_pool.
>>     * config/arm/vfp.md (no_literal_pool_df_immediate): New.
>>     (no_literal_pool_sf_immediate): New.
>>
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-11-29  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>             Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>
>>     PR target/71607
>>     * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ...
>>     * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this.
>>     * gcc.target/arm/thumb2-slow-flash-data-2.c: New.
>>     * gcc.target/arm/thumb2-slow-flash-data-3.c: New.
>>     * gcc.target/arm/thumb2-slow-flash-data-4.c: New.
>>     * gcc.target/arm/thumb2-slow-flash-data-5.c: New.
>>
> Hello,
> 
> As I have mentioned in my commit message for the fix on embedded-6  (see
> https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01304.html) I found an
> issue with this code when testing its backport on the embedded-6-branch.
> 
> I misread the definition of the target hook
> TARGET_USE_BLOCKS_FOR_CONSTANT_P and it seems the way I implemented it
> before only changed code generation if the -mslow-flash-data option
> wasn't passed. I.e. I don't need to implement it to solve this issue.
> The other changes in this patch are sufficient...
> 
> I reran regressions for arm-none-eabi-gcc with the following targets:
> Cortex-M0, Cortex-M3, Cortex-M7 and ARMv8-M Baseline. I also ran
> bootstraps and full regressions for arm-none-linux-gnueabihf both in ARM
> and THUMB mode. All green!
> 
> Is this OK for trunk?
> 
> Cheers,
> Andre
> 
> gcc/ChangeLog:
> 
> 2016-12-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
> 
>     PR target/71607
>     * config/arm/arm.md (use_literal_pool): Removes.
>     (64-bit immediate split): No longer takes cost into consideration
>     if 'arm_disable_literal_pool' is enabled.
>     * config/arm/arm.c (arm_max_const_double_inline_cost): Remove use
>     of arm_disable_literal_pool.
>     * config/arm/vfp.md (no_literal_pool_df_immediate): New.
>     (no_literal_pool_sf_immediate): New.
> 
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-12-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Thomas Preud'homme  <thomas.preudhomme@arm.com>
> 
>     PR target/71607
>     * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ...
>     * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this.
>     * gcc.target/arm/thumb2-slow-flash-data-2.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-3.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-4.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-5.c: New.
> 

Ping.

Cheers,
Andre
Ramana Radhakrishnan Jan. 20, 2017, 2:08 p.m. UTC | #2
On Wed, Dec 28, 2016 at 9:58 AM, Andre Vieira (lists)
<Andre.SimoesDiasVieira@arm.com> wrote:
> On 29/11/16 09:45, Andre Vieira (lists) wrote:
>> On 17/11/16 10:00, Ramana Radhakrishnan wrote:
>>> On Thu, Oct 6, 2016 at 2:57 PM, Andre Vieira (lists)
>>> <Andre.SimoesDiasVieira@arm.com> wrote:
>>>> Hello,
>>>>
>>>> This patch tackles the issue reported in PR71607. This patch takes a
>>>> different approach for disabling the creation of literal pools. Instead
>>>> of disabling the patterns that would normally transform the rtl into
>>>> actual literal pools, it disables the creation of this literal pool rtl
>>>> by making the target hook TARGET_CANNOT_FORCE_CONST_MEM return true if
>>>> arm_disable_literal_pool is true. I added patterns to split floating
>>>> point constants for both SF and DFmode. A pattern to handle the
>>>> addressing of label_refs had to be included as well since all
>>>> "memory_operand" patterns are disabled when
>>>> TARGET_CANNOT_FORCE_CONST_MEM returns true. Also the pattern for
>>>> splitting 32-bit immediates had to be changed, it was not accepting
>>>> unsigned 32-bit unsigned integers with the MSB set. I believe
>>>> const_int_operand expects the mode of the operand to be set to VOIDmode
>>>> and not SImode. I have only changed it in the patterns that were
>>>> affecting this code, though I suggest looking into changing it in the
>>>> rest of the ARM backend.
>>>>
>>>> I added more test cases. No regressions for arm-none-eabi with
>>>> Cortex-M0, Cortex-M3 and Cortex-M7.
>>>>
>>>> Is this OK for trunk?
>>>
>>> Including -mslow-flash-data in your multilib flags ? If no regressions
>>> with that ok .
>>>
>>>
>>> regards
>>> Ramana
>>>
>>>>
>>
>> Hello,
>>
>> I found some new ICE's with the -mslow-flash-data testing so I had to
>> rework this patch. I took the opportunity to rebase it as well.
>>
>> The problem was with the way the old version of the patch handled label
>> references.  After some digging I found I wasn't using the right target
>> hook and so I implemented the 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' for
>> ARM.  This target hook determines whether a literal pool ends up in an
>> 'object_block' structure. So I reverted the changes made in the old
>> version of the patch to the ARM implementation of the
>> 'TARGET_CANNOT_FORCE_CONST_MEM' hook and rely on
>> 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' instead. This patch adds an ARM
>> implementation for this hook that returns false if
>> 'arm_disable_literal_pool' is set to true and true otherwise.
>>
>> This version of the patch also reverts back to using the check for
>> 'SYMBOL_REF' in 'thumb2_legitimate_address_p' that was removed in the
>> last version, this code is required to place the label references in
>> rodata sections.
>>
>> Another thing this patch does is revert the changes made to the 32-bit
>> constant split in arm.md. The reason this was needed before was because
>> 'real_to_target' returns a long array and does not sign-extend values in
>> it, which would make sense on hosts with 64-bit longs. To fix this the
>> value is now casted to 'int' first.  It would probably be a good idea to
>> change the 'real_to_target' function to return an array with
>> 'HOST_WIDE_INT' elements instead and either use all 64-bits or
>> sign-extend them.  Something for the future?
>>
>> I added more test cases in this patch and reran regression tests for:
>> Cortex-M0, Cortex-M4 with and without -mslow-flash-data. Also did a
>> bootstrap+regressions on arm-none-linux-gnueabihf.
>>
>> Is this OK for trunk?
>>
>> Cheers,
>> Andre
>>
>> gcc/ChangeLog:
>>
>> 2016-11-29  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>
>>     PR target/71607
>>     * config/arm/arm.md (use_literal_pool): Removes.
>>     (64-bit immediate split): No longer takes cost into consideration
>>     if 'arm_disable_literal_pool' is enabled.
>>     * config/arm/arm.c (arm_use_blocks_for_constant_p): New.
>>     (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Define.
>>     (arm_max_const_double_inline_cost): Remove use of
>> arm_disable_literal_pool.
>>     * config/arm/vfp.md (no_literal_pool_df_immediate): New.
>>     (no_literal_pool_sf_immediate): New.
>>
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-11-29  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>>             Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>
>>     PR target/71607
>>     * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ...
>>     * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this.
>>     * gcc.target/arm/thumb2-slow-flash-data-2.c: New.
>>     * gcc.target/arm/thumb2-slow-flash-data-3.c: New.
>>     * gcc.target/arm/thumb2-slow-flash-data-4.c: New.
>>     * gcc.target/arm/thumb2-slow-flash-data-5.c: New.
>>
> Hello,
>
> As I have mentioned in my commit message for the fix on embedded-6  (see
> https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01304.html) I found an
> issue with this code when testing its backport on the embedded-6-branch.
>
> I misread the definition of the target hook
> TARGET_USE_BLOCKS_FOR_CONSTANT_P and it seems the way I implemented it
> before only changed code generation if the -mslow-flash-data option
> wasn't passed. I.e. I don't need to implement it to solve this issue.
> The other changes in this patch are sufficient...
>
> I reran regressions for arm-none-eabi-gcc with the following targets:
> Cortex-M0, Cortex-M3, Cortex-M7 and ARMv8-M Baseline. I also ran
> bootstraps and full regressions for arm-none-linux-gnueabihf both in ARM
> and THUMB mode. All green!

And presumably with -mslow-flash-data as a multilib again ?

>
> Is this OK for trunk?

We should also be adding to arm_reorg


if (arm_disable_literal_pool)
  return ;

after we split all the insns.


After all if the splitters added had done their job, there's no reason
to run the minipool placement code at all - is there  :) ?

Furthermore is it guaranteed that we will not have any references to
the literal pool post reload - IIRC , reload in its older days could
easily just push any old constant into a standard constant pool entry
and make things work . Is LRA likely not to produce any constant pool
entries in this form ? The bad thing here would be producing literal
pool entries when the user had asked the compiler to do so - thus
having more belts and braces for this would be good (thus , why is a
change to arm_cannot_force_const_mem to return false in the face of
arm_disable_literal_pool not appropriate ? ) Nick also had a couple of
good points on making the test more usable across all testers in
private conversation - so please make those changes.




regards
Ramana


>
> Cheers,
> Andre
>
> gcc/ChangeLog:
>
> 2016-12-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>
>     PR target/71607
>     * config/arm/arm.md (use_literal_pool): Removes.
>     (64-bit immediate split): No longer takes cost into consideration
>     if 'arm_disable_literal_pool' is enabled.
>     * config/arm/arm.c (arm_max_const_double_inline_cost): Remove use
>     of arm_disable_literal_pool.
>     * config/arm/vfp.md (no_literal_pool_df_immediate): New.
>     (no_literal_pool_sf_immediate): New.
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-12-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Thomas Preud'homme  <thomas.preudhomme@arm.com>
>
>     PR target/71607
>     * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ...
>     * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this.
>     * gcc.target/arm/thumb2-slow-flash-data-2.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-3.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-4.c: New.
>     * gcc.target/arm/thumb2-slow-flash-data-5.c: New.
diff mbox

Patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index bbf10f23987e58a1e066715e2168772da0245ff1..1f5c7b3e6108b8931e286f5b86de9c3aa1804d98 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16373,10 +16373,6 @@  push_minipool_fix (rtx_insn *insn, HOST_WIDE_INT address, rtx *loc,
 int
 arm_max_const_double_inline_cost ()
 {
-  /* Let the value get synthesized to avoid the use of literal pools.  */
-  if (arm_disable_literal_pool)
-    return 99;
-
   return ((optimize_size || arm_ld_sched) ? 3 : 4);
 }
 
@@ -30887,5 +30883,4 @@  arm_expand_divmod_libfunc (rtx libfunc, machine_mode mode,
   *quot_p = quotient;
   *rem_p = remainder;
 }
-
 #include "gt-arm.h"
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ff1f565b850d1b9e7378461c5a808e7d486936bc..7d16e412765ccff4c527871b6b5d6142aa088b56 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -233,10 +233,6 @@ 
 	       (match_test "arm_restrict_it"))
 	  (const_string "no")
 
-	  (and (eq_attr "use_literal_pool" "yes")
-	       (match_test "arm_disable_literal_pool"))
-	  (const_string "no")
-
 	  (eq_attr "arch_enabled" "no")
 	  (const_string "no")]
 	 (const_string "yes")))
@@ -5871,8 +5867,9 @@ 
 	(match_operand:ANY64 1 "immediate_operand" ""))]
   "TARGET_32BIT
    && reload_completed
-   && (arm_const_double_inline_cost (operands[1])
-       <= arm_max_const_double_inline_cost ())"
+   && (arm_disable_literal_pool
+       || (arm_const_double_inline_cost (operands[1])
+	   <= arm_max_const_double_inline_cost ()))"
   [(const_int 0)]
   "
   arm_split_constant (SET, SImode, curr_insn,
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index f83dc9b130e4288faa49e40ee8285a0cd7d325b1..b8db34a607e4fac75789fa6a7c09b3b6104f45d3 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -2079,3 +2079,40 @@ 
 ;; fmdhr et al (VFPv1)
 ;; Support for xD (single precision only) variants.
 ;; fmrrs, fmsrr
+
+;; Split an immediate DF move to two immediate SI moves.
+(define_insn_and_split "no_literal_pool_df_immediate"
+  [(set (match_operand 0 "s_register_operand" "")
+	(match_operand:DF 1 "const_double_operand" ""))]
+  "TARGET_THUMB2 && arm_disable_literal_pool
+  && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE
+       && vfp3_const_double_rtx (operands[1]))"
+  "#"
+  "&& !reload_completed"
+  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
+   (set (subreg:SI (match_dup 1) 4) (match_dup 3))
+   (set (match_dup 0) (match_dup 1))]
+  "
+  long buf[2];
+  real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
+  operands[2] = GEN_INT ((int) buf[0]);
+  operands[3] = GEN_INT ((int) buf[1]);
+  operands[1] = gen_reg_rtx (DFmode);
+  ")
+
+;; Split an immediate SF move to one immediate SI move.
+(define_insn_and_split "no_literal_pool_sf_immediate"
+  [(set (match_operand 0 "s_register_operand" "")
+	(match_operand:SF 1 "const_double_operand" ""))]
+  "TARGET_THUMB2 && arm_disable_literal_pool
+  && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))"
+  "#"
+  "&& !reload_completed"
+  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
+   (set (match_dup 0) (match_dup 1))]
+  "
+  long buf;
+  real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
+  operands[2] = GEN_INT ((int) buf);
+  operands[1] = gen_reg_rtx (SFmode);
+  ")
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
new file mode 100644
index 0000000000000000000000000000000000000000..c52de113809b63986a4621524d68686fc795f7ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
+/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */
+
+float f (float);
+
+const float max = 0.01f;
+
+int
+g (float in)
+{
+  if (f (in) + f (in) < max)
+    return 0;
+  return 1;
+}
+
+double foo (void)
+{
+  return 0xF1EC7A5239123AF;
+}
+
+double bar (void)
+{
+  return 0.0f;
+}
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..d25ba87413cb949e5e2162dbf4e695e205b01a34
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
+/* { dg-options "-mthumb -mfloat-abi=hard -mslow-flash-data" } */
+
+/* From PR71607 */
+
+float b;
+void fn1 ();
+
+float
+fn2 ()
+{
+  return 1.1f;
+}
+
+void
+fn3 ()
+{
+  float a[2];
+  a[1] = b;
+  fn1 (a);
+}
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..dbe129168d7dbc86232587ad5ba5ec0438ed2622
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
@@ -0,0 +1,26 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
+/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */
+
+double __attribute__ ((target ("fpu=fpv5-d16")))
+foo (void)
+{
+  return 1.0f;
+}
+
+float __attribute__ ((target ("fpu=fpv5-d16")))
+bar (void)
+{
+  return 1.0f;
+}
+
+float __attribute__ ((target ("fpu=fpv5-sp-d16")))
+baz (void)
+{
+  return 1.0f;
+}
+
+/* { dg-final { scan-assembler-times "#1\\.0e\\+0" 3 } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
new file mode 100644
index 0000000000000000000000000000000000000000..7d1b2384738c7f495be257a46e6587bf43b6534a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_vfp_ok } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
+/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */
+
+double __attribute__ ((target ("fpu=fpv5-sp-d16")))
+foo (void)
+{
+  return 1.0f;
+}
+
+/* { dg-final { scan-assembler-not "#1\\.0e\\+0" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c
similarity index 100%
rename from gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c
rename to gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c