Patchwork [RFC,ARM] Fix PR43721 - combine multiple div and mod into a single divmod call.

login
register
mail settings
Submitter Ramana Radhakrishnan
Date Jan. 22, 2011, 12:37 a.m.
Message ID <AANLkTinnRCXxgVGO3OrGxmMc7G+UVPV3EtS0q99Osmpr@mail.gmail.com>
Download mbox | patch
Permalink /patch/79969/
State New
Headers show

Comments

Ramana Radhakrishnan - Jan. 22, 2011, 12:37 a.m.
Hi,

Though I don't claim to like this patch very much it seems to be the
(cough) "standard" way for backends to get the single divmodsi4
library call generated rather than writing a separate pass to detect
this . There has been some discussion about this with respect to the
audit trail for http://gcc.gnu.org/PR23726 but that talks about
writing a new pass to detect this or enhancing CSE to deal with
multiple SETs across presumably various rtl insns.

Ok to commit to trunk when stage1 opens and if tests show no regressions ?

Any other comments with respect to this would be more than welcome.


Ramana




2011-01-21  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

        PR target/43721
	gcc/
	* config/arm/arm.md (divmodsi4): New.
	(aeabi_divmodsi4_call): New.
	(aeabi_divsi3_call): New.
	(udivmodsi4): Likewise.
	(aeabi_udivmodsi4_call): New.
	(aeabi_udivsi3_call): New.
Richard Henderson - Jan. 23, 2011, 10:49 p.m.
On 01/21/2011 04:37 PM, Ramana Radhakrishnan wrote:
> +(define_insn_and_split "divmodsi4"
> +  [(parallel [(set (match_operand:SI 0 "s_register_operand" "")
> +                   (div:SI (match_operand:SI 1 "s_register_operand" "")
> +                           (match_operand:SI 2 "s_register_operand" "")))
> +              (set (match_operand:SI 3 "s_register_operand" "")
> +                   (mod:SI (match_dup 1) (match_dup 2)))
> +              (clobber (reg:SI 0))
> +              (clobber (reg:SI 1))
> +              (clobber (reg:SI LR_REGNUM))])]

define_insn, unlike define_expand, has an implicit parallel.
You've got a redundant parallel in there.

> +  ""
> +  "#"
> +  "!(TARGET_THUMB2 && arm_arch_hwdiv)"

You have an unconditional insn, that is must-split, with a conditional
split?  That seems very wrong.  I suspect you want that condition moved
to the first "".

> +(define_insn "aeabi_divsi3_call"
> +  [(set (reg:SI 0) (div:SI (reg:SI 0) (reg:SI 1)))
> +   (clobber (reg:SI LR_REGNUM))]
> +  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
> +  "bl\t__aeabi_idiv"
> +  [(set_attr "type" "call")
> +   (set_attr "conds" "clob")])

Does the eabi really say that no register except r0 is modified?
That doesn't seem to be the case, based on a quick glance over
the implementation in lib1funcs.asm.

If that's the case, your representation is very wrong.


r~
Ramana Radhakrishnan - Jan. 24, 2011, 9:33 p.m.
Hi Richard,


On Sun, Jan 23, 2011 at 10:49 PM, Richard Henderson <rth@redhat.com> wrote:
> On 01/21/2011 04:37 PM, Ramana Radhakrishnan wrote:
>> +(define_insn_and_split "divmodsi4"
>> +  [(parallel [(set (match_operand:SI 0 "s_register_operand" "")
>> +                   (div:SI (match_operand:SI 1 "s_register_operand" "")
>> +                           (match_operand:SI 2 "s_register_operand" "")))
>> +              (set (match_operand:SI 3 "s_register_operand" "")
>> +                   (mod:SI (match_dup 1) (match_dup 2)))
>> +              (clobber (reg:SI 0))
>> +              (clobber (reg:SI 1))
>> +              (clobber (reg:SI LR_REGNUM))])]
>
> define_insn, unlike define_expand, has an implicit parallel.
> You've got a redundant parallel in there.

Thanks for the review.

>
>> +  ""
>> +  "#"
>> +  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
>
> You have an unconditional insn, that is must-split, with a conditional
> split?  That seems very wrong.  I suspect you want that condition moved
> to the first "".

Thanks for pointing that out.

>
>> +(define_insn "aeabi_divsi3_call"
>> +  [(set (reg:SI 0) (div:SI (reg:SI 0) (reg:SI 1)))
>> +   (clobber (reg:SI LR_REGNUM))]
>> +  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
>> +  "bl\t__aeabi_idiv"
>> +  [(set_attr "type" "call")
>> +   (set_attr "conds" "clob")])
>
> Does the eabi really say that no register except r0 is modified?
> That doesn't seem to be the case, based on a quick glance over
> the implementation in lib1funcs.asm.

I've realized I should have moved the output function into a separate
function in arm.c that prints out the appropriate function name
depending on whether it is TARGET_AAPCS or not .

Yes I must admit I haven't covered all the caller saves that ideally
need to be marked as being clobbered rather than just those changed by
this implementation of _aeabi_idiv or _aeabi_idivmod since
the registers clobbered by _aeabi_div and _aeabi_divmod would vary
with each implementor of this function..

Sorry I must have mentioned this earlier but I was concerned more
about the approach in this mechanism where we fake a call insn through
the output template rather than through representing a call insn in
the RTL. Does anyone know if this is likely to cause problems
elsewhere ?


cheers
Ramana
Richard Henderson - Jan. 25, 2011, 1:56 a.m.
On 01/24/2011 01:33 PM, Ramana Radhakrishnan wrote:
> Sorry I must have mentioned this earlier but I was concerned more
> about the approach in this mechanism where we fake a call insn through
> the output template rather than through representing a call insn in
> the RTL. Does anyone know if this is likely to cause problems
> elsewhere ?

As long as you're representing all of the register changes that are
possible, it should be ok.


r~

Patch

=== modified file 'gcc/config/arm/arm.md'
--- gcc/config/arm/arm.md	2011-01-13 16:06:19 +0000
+++ gcc/config/arm/arm.md	2011-01-21 17:34:35 +0000
@@ -11554,6 +11554,103 @@ 
   "
 )

+(define_insn_and_split "divmodsi4"
+  [(parallel [(set (match_operand:SI 0 "s_register_operand" "")
+                   (div:SI (match_operand:SI 1 "s_register_operand" "")
+                           (match_operand:SI 2 "s_register_operand" "")))
+              (set (match_operand:SI 3 "s_register_operand" "")
+                   (mod:SI (match_dup 1) (match_dup 2)))
+              (clobber (reg:SI 0))
+              (clobber (reg:SI 1))
+              (clobber (reg:SI LR_REGNUM))])]
+  ""
+  "#"
+  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
+  [(const_int 0)]
+  "{
+     rtx r0_reg = gen_rtx_REG (SImode, 0);
+     rtx r1_reg = gen_rtx_REG (SImode, 1);
+     emit_move_insn (r0_reg, operands[1]);
+     emit_move_insn (r1_reg, operands[2]);
+     if (find_reg_note (curr_insn, REG_UNUSED, operands[3]))
+        /* This is only a call for div.  */
+	emit_insn (gen_aeabi_divsi3_call ());
+     else
+      {
+        emit_insn (gen_aeabi_divmodsi4_call ());
+	emit_move_insn (operands[3], r1_reg);
+      }
+     emit_move_insn (operands[0], r0_reg);
+     DONE;
+  }
+  ")
+
+(define_insn "aeabi_divmodsi4_call"
+  [(set (reg:SI 0) (div:SI (reg:SI 0) (reg:SI 1)))
+   (set (reg:SI 1) (mod:SI (reg:SI 0) (reg:SI 1)))
+   (clobber (reg:SI LR_REGNUM))]
+  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
+  "bl\t__aeabi_idivmod"
+  [(set_attr "type" "call")
+   (set_attr "conds" "clob")])
+
+(define_insn "aeabi_divsi3_call"
+  [(set (reg:SI 0) (div:SI (reg:SI 0) (reg:SI 1)))
+   (clobber (reg:SI LR_REGNUM))]
+  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
+  "bl\t__aeabi_idiv"
+  [(set_attr "type" "call")
+   (set_attr "conds" "clob")])
+
+
+(define_insn_and_split "udivmodsi4"
+  [(parallel [(set (match_operand:SI 0 "s_register_operand" "")
+                   (udiv:SI (match_operand:SI 1 "s_register_operand" "")
+                           (match_operand:SI 2 "s_register_operand" "")))
+              (set (match_operand:SI 3 "s_register_operand" "")
+                   (umod:SI (match_dup 1) (match_dup 2)))
+              (clobber (reg:SI 0))
+              (clobber (reg:SI 1))
+              (clobber (reg:SI LR_REGNUM))])]
+  ""
+  "#"
+  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
+  [(const_int 0)]
+  "{
+     rtx r0_reg = gen_rtx_REG (SImode, 0);
+     rtx r1_reg = gen_rtx_REG (SImode, 1);
+     emit_move_insn (r0_reg, operands[1]);
+     emit_move_insn (r1_reg, operands[2]);
+     if (find_reg_note (curr_insn, REG_UNUSED, operands[3]))
+        /* This is only a call for div.  */
+	emit_insn (gen_aeabi_udivsi3_call ());
+     else
+      {
+        emit_insn (gen_aeabi_udivmodsi4_call ());
+	emit_move_insn (operands[3], r1_reg);
+      }
+     emit_move_insn (operands[0], r0_reg);
+     DONE;
+  }
+  ")
+
+(define_insn "aeabi_udivmodsi4_call"
+  [(set (reg:SI 0) (udiv:SI (reg:SI 0) (reg:SI 1)))
+   (set (reg:SI 1) (umod:SI (reg:SI 0) (reg:SI 1)))
+   (clobber (reg:SI LR_REGNUM))]
+  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
+  "bl\t__aeabi_uidivmod"
+  [(set_attr "type" "call")
+   (set_attr "conds" "clob")])
+
+(define_insn "aeabi_udivsi3_call"
+  [(set (reg:SI 0) (udiv:SI (reg:SI 0) (reg:SI 1)))
+   (clobber (reg:SI LR_REGNUM))]
+  "!(TARGET_THUMB2 && arm_arch_hwdiv)"
+  "bl\t__aeabi_uidiv"
+  [(set_attr "type" "call")
+   (set_attr "conds" "clob")])
+
 ;; Load the FPA co-processor patterns
 (include "fpa.md")
 ;; Load the Maverick co-processor patterns