Patchwork PATCH: PR target/44588: Very inefficient 8bit mod/div

login
register
mail settings
Submitter H.J. Lu
Date June 24, 2010, 4:47 p.m.
Message ID <AANLkTikxPwUaabNIw8dQa-9_L3whc9HC8lq2yomfpFpj@mail.gmail.com>
Download mbox | patch
Permalink /patch/56810/
State New
Headers show

Comments

H.J. Lu - June 24, 2010, 4:47 p.m.
On Wed, Jun 23, 2010 at 4:50 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 06/24/2010 01:13 AM, H.J. Lu wrote:
>>>
>>> I don't understand exactly what the problem was; clearly it couldn't work
>>> as
>>> long as this pattern was cheating blatantly:
>>
>> Upper 8bit registers aren't real registers. You can't
>> do RA with them.
>
> Please read what other people write.  I'm saying that you should have used a
> HI destination just like you did with your new UNSPEC, which is acceptable.
>
> The pattern I quoted below used a QI destination; then magically you
> attempted to extract bit 8..15 of it with an unspec, or something like that.
>  For anything like that the optimizers are going to bite back sooner or
> later.  And you'd deserve it.
>
>>> (define_insn "*<u>divqi3"
>>>  [(set (match_operand:QI 0 "register_operand" "=a")
>>>        (any_div:QI
>>>          (match_operand:HI 1 "register_operand" "0")
>>>          (match_operand:QI 2 "nonimmediate_operand" "qm")))
>>>
>>> Anyway this patch is IMO much nicer than the first ones, so if Uros is
>>> okay
>>> I don't think it's useful to pursue a more accurate representation.  Just
>>> make sure that REG_EQUAL notes are added to the two extractions.
>>>
>>
>> What are REG_EQUAL notes used for?
>> As far as the rest of gcc is concerned, there are no upper 8bit
>> registers. But you can access bits 8-15 of HI, SI,
>> DI registers via XXX_extract.
>
> Understood.  Using an unspec is fine for me, even though it's not an
> approval.  But that's completely orthogonal to putting a REG_EQUAL note _on
> the two regs that you extract out of AX_.  The notes' value should be one of
>
>   (subreg:QI (udiv:HI (...) (zero_extend (...))))
>   (subreg:QI (div:HI (...) (sign_extend (...))))
>   (subreg:QI (umod:HI (...) (zero_extend (...))))
>   (subreg:QI (mod:HI (...) (sign_extend (...))))
>
> But I think that with over ten years of GCC experience you do not need
> anyone to tell you this.  In fact, regarding "what are REG_EQUAL notes used
> for?" my first thought was RTFM, but anyway: they are used by CSE, fwprop,
> GCSE and combine to simplify and eliminate redundant expressions.

I am learning new things all the time. Thanks for the pointer.

Here is a new patch without UNSPEC.  OK for trunk?

Thanks.
Paolo Bonzini - June 24, 2010, 4:55 p.m.
On 06/24/2010 06:47 PM, H.J. Lu wrote:
> I am learning new things all the time. Thanks for the pointer.
>
> Here is a new patch without UNSPEC.  OK for trunk?

Cannot approve it, but thanks very much!

Paolo
Uros Bizjak - June 24, 2010, 6:03 p.m.
On Thu, 2010-06-24 at 09:47 -0700, H.J. Lu wrote:
> On Wed, Jun 23, 2010 at 4:50 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> > On 06/24/2010 01:13 AM, H.J. Lu wrote:
> >>>
> >>> I don't understand exactly what the problem was; clearly it couldn't work
> >>> as
> >>> long as this pattern was cheating blatantly:
> >>
> >> Upper 8bit registers aren't real registers. You can't
> >> do RA with them.
> >
> > Please read what other people write.  I'm saying that you should have used a
> > HI destination just like you did with your new UNSPEC, which is acceptable.
> >
> > The pattern I quoted below used a QI destination; then magically you
> > attempted to extract bit 8..15 of it with an unspec, or something like that.
> >  For anything like that the optimizers are going to bite back sooner or
> > later.  And you'd deserve it.
> >
> >>> (define_insn "*<u>divqi3"
> >>>  [(set (match_operand:QI 0 "register_operand" "=a")
> >>>        (any_div:QI
> >>>          (match_operand:HI 1 "register_operand" "0")
> >>>          (match_operand:QI 2 "nonimmediate_operand" "qm")))
> >>>
> >>> Anyway this patch is IMO much nicer than the first ones, so if Uros is
> >>> okay
> >>> I don't think it's useful to pursue a more accurate representation.  Just
> >>> make sure that REG_EQUAL notes are added to the two extractions.
> >>>
> >>
> >> What are REG_EQUAL notes used for?
> >> As far as the rest of gcc is concerned, there are no upper 8bit
> >> registers. But you can access bits 8-15 of HI, SI,
> >> DI registers via XXX_extract.
> >
> > Understood.  Using an unspec is fine for me, even though it's not an
> > approval.  But that's completely orthogonal to putting a REG_EQUAL note _on
> > the two regs that you extract out of AX_.  The notes' value should be one of
> >
> >   (subreg:QI (udiv:HI (...) (zero_extend (...))))
> >   (subreg:QI (div:HI (...) (sign_extend (...))))
> >   (subreg:QI (umod:HI (...) (zero_extend (...))))
> >   (subreg:QI (mod:HI (...) (sign_extend (...))))
> >
> > But I think that with over ten years of GCC experience you do not need
> > anyone to tell you this.  In fact, regarding "what are REG_EQUAL notes used
> > for?" my first thought was RTFM, but anyway: they are used by CSE, fwprop,
> > GCSE and combine to simplify and eliminate redundant expressions.
> 
> I am learning new things all the time. Thanks for the pointer.
> 
> Here is a new patch without UNSPEC.  OK for trunk?

OK for mainline, the approach in this patch is Really Good (TM).

Please just rename operand0 and operand1 internal variables to perhaps
tmp0 and tmp1, so they will not be confused with operand[0] and
operand[1].

Thanks,
Uros.

Patch

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 1f7369b..6514ee4 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -760,6 +760,8 @@ 
 
 ;; Used in signed and unsigned divisions.
 (define_code_iterator any_div [div udiv])
+(define_code_attr extract_code
+  [(div "SIGN_EXTRACT") (udiv "ZERO_EXTRACT")])
 
 ;; Instruction prefix for signed and unsigned operations.
 (define_code_attr sgnprefix [(sign_extend "i") (zero_extend "")
@@ -7269,17 +7271,6 @@ 
 
 ;; Divide instructions
 
-(define_insn "<u>divqi3"
-  [(set (match_operand:QI 0 "register_operand" "=a")
-	(any_div:QI
-	  (match_operand:HI 1 "register_operand" "0")
-	  (match_operand:QI 2 "nonimmediate_operand" "qm")))
-   (clobber (reg:CC FLAGS_REG))]
-  "TARGET_QIMODE_MATH"
-  "<sgnprefix>div{b}\t%2"
-  [(set_attr "type" "idiv")
-   (set_attr "mode" "QI")])
-
 ;; The patterns that match these are at the end of this file.
 
 (define_expand "divxf3"
@@ -7316,6 +7307,92 @@ 
 
 ;; Divmod instructions.
 
+(define_expand "<u>divmodqi4"
+  [(parallel [(set (match_operand:QI 0 "register_operand" "")
+		   (any_div:QI
+		     (match_operand:QI 1 "register_operand" "")
+		     (match_operand:QI 2 "nonimmediate_operand" "")))
+	      (set (match_operand:QI 3 "register_operand" "")
+		   (mod:QI (match_dup 1) (match_dup 2)))
+	      (clobber (reg:CC FLAGS_REG))])]
+  "TARGET_QIMODE_MATH"
+{
+  rtx div, mod, insn;
+  rtx operand0, operand1;
+  
+  operand0 = gen_reg_rtx (HImode);
+  operand1 = gen_reg_rtx (HImode);
+
+  /* Extend operands[1] to HImode.  Generate 8bit divide.  Result is
+     in AX.  */
+  if (<extract_code> == SIGN_EXTRACT)
+    {
+      emit_insn (gen_extendqihi2 (operand1, operands[1]));
+      emit_insn (gen_divmodhiqi3 (operand0, operand1, operands[2]));
+
+      div = gen_rtx_DIV (QImode, operands[1], operands[2]);
+      mod = gen_rtx_MOD (QImode, operands[1], operands[2]);
+
+      operand1 = gen_rtx_<extract_code> (QImode, operand0,
+					 GEN_INT (8), GEN_INT (8));
+    }
+  else
+    {
+      emit_insn (gen_zero_extendqihi2 (operand1, operands[1]));
+      emit_insn (gen_udivmodhiqi3 (operand0, operand1, operands[2]));
+
+      div = gen_rtx_UDIV (QImode, operands[1], operands[2]);
+      mod = gen_rtx_UMOD (QImode, operands[1], operands[2]);
+
+      operand1 = gen_rtx_<extract_code> (SImode, operand0,
+					 GEN_INT (8), GEN_INT (8));
+      operand1 = simplify_gen_subreg (QImode, operand1, SImode, 0);
+    }
+
+  /* Extract remainder from AH.  */
+  insn = emit_move_insn (operands[3], operand1);
+  set_unique_reg_note (insn, REG_EQUAL, mod);
+
+  /* Extract quotient from AL.  */
+  insn = emit_move_insn (operands[0], gen_lowpart (QImode, operand0));
+  set_unique_reg_note (insn, REG_EQUAL, div);
+
+  DONE;
+})
+
+;; Divide AX by r/m8, with result stored in
+;; AL <- Quotient
+;; AH <- Remainder
+(define_insn "divmodhiqi3"
+  [(set (match_operand:HI 0 "register_operand" "=a")
+	(ior:HI
+	  (ashift:HI
+	    (zero_extend:HI
+	      (mod:QI (match_operand:HI 1 "register_operand" "0")
+		      (match_operand:QI 2 "nonimmediate_operand" "qm")))
+	    (const_int 8))
+	  (zero_extend:HI (div:QI (match_dup 1) (match_dup 2)))))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_QIMODE_MATH"
+  "idiv{b}\t%2"
+  [(set_attr "type" "idiv")
+   (set_attr "mode" "QI")])
+
+(define_insn "udivmodhiqi3"
+  [(set (match_operand:HI 0 "register_operand" "=a")
+	(ior:HI
+	  (ashift:HI
+	    (zero_extend:HI
+	      (umod:QI (match_operand:HI 1 "register_operand" "0")
+		       (match_operand:QI 2 "nonimmediate_operand" "qm")))
+	    (const_int 8))
+	  (zero_extend:HI (udiv:QI (match_dup 1) (match_dup 2)))))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_QIMODE_MATH"
+  "div{b}\t%2"
+  [(set_attr "type" "idiv")
+   (set_attr "mode" "QI")])
+
 (define_expand "divmod<mode>4"
   [(parallel [(set (match_operand:SWIM248 0 "register_operand" "")
 		   (div:SWIM248
--- /dev/null	2010-06-16 10:11:06.602750711 -0700
+++ gcc/gcc/testsuite/gcc.target/i386/mod-1.c	2010-06-24 09:41:47.401923084 -0700
@@ -0,0 +1,29 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Os -mtune=atom" } */
+
+typedef struct {
+  int a;
+} VCR;
+
+typedef struct {
+  VCR vcr[8];
+} VCRC;
+
+typedef struct {
+  char vcr;
+} OWN;
+
+OWN Own[16];
+
+void
+f (VCRC *x, OWN *own)
+{
+  x[own->vcr / 8].vcr[own->vcr % 8].a--;
+  x[own->vcr / 8].vcr[own->vcr % 8].a = x[own->vcr / 8].vcr[own->vcr % 8].a;
+}
+
+/* { dg-final { scan-assembler-times "idivb" 1 } } */
+/* { dg-final { scan-assembler-not "incl" } } */
+/* { dg-final { scan-assembler-not "orl" } } */
+/* { dg-final { scan-assembler-not "andb" } } */
+/* { dg-final { scan-assembler-not "jns" } } */
--- /dev/null	2010-06-16 10:11:06.602750711 -0700
+++ gcc/gcc/testsuite/gcc.target/i386/umod-1.c	2010-06-21 12:01:25.705950180 -0700
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=atom" } */
+
+unsigned char
+foo (unsigned char x, unsigned char y)
+{
+  return x % y;
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-not "divw" } } */
--- /dev/null	2010-06-16 10:11:06.602750711 -0700
+++ gcc/gcc/testsuite/gcc.target/i386/umod-2.c	2010-06-21 12:01:17.857932744 -0700
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=atom" } */
+
+extern unsigned char z;
+
+unsigned char
+foo (unsigned char x, unsigned char y)
+{
+  z = x/y;
+  return x % y;
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-not "divw" } } */
--- /dev/null	2010-06-16 10:11:06.602750711 -0700
+++ gcc/gcc/testsuite/gcc.target/i386/umod-3.c	2010-06-21 12:02:36.809962702 -0700
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=atom" } */
+
+extern void abort (void);
+extern void exit (int);
+
+unsigned char cx = 7;
+
+int
+main ()
+{
+  unsigned char cy;
+  
+  cy = cx / 6; if (cy != 1) abort ();
+  cy = cx % 6; if (cy != 1) abort ();
+
+  exit(0);
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-not "divw" } } */