Patchwork RFC: PATCH: Add -m8bit-idiv for x86

login
register
mail settings
Submitter H.J. Lu
Date Sept. 14, 2010, 9:33 p.m.
Message ID <AANLkTi=k90A-WDbrr-U7nvJpHQy4s11hDMrqRc530Bdh@mail.gmail.com>
Download mbox | patch
Permalink /patch/64760/
State New
Headers show

Comments

H.J. Lu - Sept. 14, 2010, 9:33 p.m.
On Tue, Sep 14, 2010 at 12:13 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Sep 14, 2010 at 7:05 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>>> This patch generates 2 idivbs since the optimization is done at RTL
>>>> expansion. Is there a way to delay this until later when 2 idivls are
>>>> optimized into 1 idivl and before IRA since this optimization needs
>>>> a scratch register.
>>>
>>> Splitter with && can_create_pseudo_p () split constraint will limit
>>> splits to pre-regalloc passes, or ...
>>
>> try_split doesn't allow any insn of the result matches the original pattern
>> to avoid infinite loop.
>
> So, switch the places of div and mod RTXes in the parallel and provide
> another divl_1 insn pattern that matches this new parallel.
>

Here is the updated patch.  I added 2 splitters for each divmod pattern.
It splits 32bit divmod into

if (dividend and divisor are in [0-255])
 use 8bit unsigned integer divide
else
 use 32bit integer divide

before IRA. It works quite well.  OK for trunk if there are no regressions
on Linux./ia32 and Linux/x86-64?

Thanks.
Uros Bizjak - Sept. 15, 2010, 6:14 p.m.
On Tue, Sep 14, 2010 at 11:33 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>>> This patch generates 2 idivbs since the optimization is done at RTL
>>>>> expansion. Is there a way to delay this until later when 2 idivls are
>>>>> optimized into 1 idivl and before IRA since this optimization needs
>>>>> a scratch register.
>>>>
>>>> Splitter with && can_create_pseudo_p () split constraint will limit
>>>> splits to pre-regalloc passes, or ...
>>>
>>> try_split doesn't allow any insn of the result matches the original pattern
>>> to avoid infinite loop.
>>
>> So, switch the places of div and mod RTXes in the parallel and provide
>> another divl_1 insn pattern that matches this new parallel.
>>
>
> Here is the updated patch.  I added 2 splitters for each divmod pattern.
> It splits 32bit divmod into
>
> if (dividend and divisor are in [0-255])
>  use 8bit unsigned integer divide
> else
>  use 32bit integer divide
>
> before IRA. It works quite well.  OK for trunk if there are no regressions
> on Linux./ia32 and Linux/x86-64?

> +m8bit-idiv
> +Target Report Var(flag_8bit_idiv) Init(-1) Save
> +Expand 32bit integer divide into control flow with 8bit unsigned integer divide

Please redefine -m8bit-idiv as target mask:

Target Report Mask(USE_8BIT_IDIV) Save

Also, please do not forget to update:

  /* Flag options.  */
  static struct ix86_target_opts flag_opts[] =

in i386.c.

You will be able to use TARGET_USE_8BIT_IDIV automatically, and
hopefully it can be also used as a per file/function target attribute.

> +(define_split
> +  [(set (match_operand:SWIM248 0 "register_operand" "=a")
> +	(div:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
> +		     (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
> +   (set (match_operand:SWIM248 1 "register_operand" "=&d")
> +	(mod:SWIM248 (match_dup 2) (match_dup 3)))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "<MODE>mode == SImode
> +   && flag_8bit_idiv
> +   && TARGET_QIMODE_MATH
> +   && can_create_pseudo_p ()
> +   && !optimize_insn_for_size_p ()"
> +  [(const_int 0)]
> +  "ix86_split_idivmod (DIV, <MODE>mode, operands); DONE;")

No need for mode macro, just use SImode explicitly in the splitter.
And due to previous change, flag_8but_idiv can be substituted with
TARGET_USE_8BIT_IDIV define.

> +(define_split
> +  [(set (match_operand:SWIM248 0 "register_operand" "=a")
> +	(udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
> +		      (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
> +   (set (match_operand:SWIM248 1 "register_operand" "=&d")
> +	(umod:SWIM248 (match_dup 2) (match_dup 3)))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "reload_completed"
> +  [(set (match_dup 1) (const_int 0))
> +   (parallel [(set (match_dup 0)
> +		   (udiv:SWIM248 (match_dup 2) (match_dup 3)))
> +	      (set (match_dup 1)
> +		   (umod:SWIM248 (match_dup 2) (match_dup 3)))
> +	      (use (match_dup 1))
> +	      (clobber (reg:CC FLAGS_REG))])]
> +  "")

Please omit empty splitter constraints.

> +void
> +ix86_split_idivmod (enum rtx_code code, enum machine_mode mode,
> +		    rtx operands[])

No need for rtx_code, just use "bool unsigned":

+void
+ix86_split_idivmod (enum machine_mode mode, rtx operands[], bool unsigned)

> +  switch (mode)
> +    {
> +    case SImode:
> +      gen_divmod4_1 = code == DIV ? gen_divmodsi4_1 : gen_udivmodsi4_1;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }

gcc_assert (mode == SImode);

gen_divmod4_1 = unsigned ? gen_udivmodsi...

Hm.... no DImode?

> +  if (code == DIV)
> +    {
> +      div = gen_rtx_DIV (SImode, operands[2], operands[3]);
> +      mod = gen_rtx_MOD (SImode, operands[2], operands[3]);
> +    }
> +  else
> +    {
> +      div = gen_rtx_UDIV (SImode, operands[2], operands[3]);
> +      mod = gen_rtx_UMOD (SImode, operands[2], operands[3]);
> +    }

if (unsigned)
...

> +This option will enable GCC to expand 32bit integer divide into control
> +flow with 8bit unsigned integer divide.

IMO, you should expand this comment a bit, at least explaining the
reason for this (non-obvious) option and describing some more "control
flow with 8bit ...". If you provide a thorough explanation and a good
reasoning for this option, then it will be used much more.

> 2010-09-14  H.J. Lu  <hongjiu.lu@intel.com>
>
>        * config/i386/i386-protos.h (ix86_split_idivmod): New.

New prototype.

Also, I agree with Andi, this conversion should be also triggered from
profile information.

Thanks,
Uros.

Patch

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 900b424..15a5d4c 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -129,6 +129,7 @@  extern void ix86_split_ashr (rtx *, rtx, enum machine_mode);
 extern void ix86_split_lshr (rtx *, rtx, enum machine_mode);
 extern rtx ix86_find_base_term (rtx);
 extern bool ix86_check_movabs (rtx, int);
+extern void ix86_split_idivmod (enum rtx_code, enum machine_mode, rtx[]);
 
 extern rtx assign_386_stack_local (enum machine_mode, enum ix86_stack_slot);
 extern int ix86_attr_length_immediate_default (rtx, int);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 19d6387..905aaa5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -1985,6 +1985,7 @@  static bool ix86_expand_vector_init_one_nonzero (bool, enum machine_mode,
 static void ix86_add_new_builtins (int);
 static rtx ix86_expand_vec_perm_builtin (tree);
 static tree ix86_canonical_va_list_type (tree);
+static void predict_jump (int);
 
 enum ix86_function_specific_strings
 {
@@ -3703,6 +3704,9 @@  override_options (bool main_args_p)
 #endif
    }
 
+  if (flag_8bit_idiv < 0)
+    flag_8bit_idiv = 0;
+
   /* Save the initial options in case the user does function specific options */
   if (main_args_p)
     target_option_default_node = target_option_current_node
@@ -14651,6 +14655,93 @@  ix86_expand_unary_operator (enum rtx_code code, enum machine_mode mode,
     emit_move_insn (operands[0], dst);
 }
 
+/* Split 32bit divmod with 8bit unsigned divmod if dividend and
+   divisor are within the the range [0-255].  */
+
+void
+ix86_split_idivmod (enum rtx_code code, enum machine_mode mode,
+		    rtx operands[])
+{
+  rtx end_label, qimode_label;
+  rtx insn, div, mod;
+  rtx scratch, tmp0, tmp1, tmp2;
+  rtx (*gen_divmod4_1) (rtx, rtx, rtx, rtx);
+
+  switch (mode)
+    {
+    case SImode:
+      gen_divmod4_1 = code == DIV ? gen_divmodsi4_1 : gen_udivmodsi4_1;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  end_label = gen_label_rtx ();
+  qimode_label = gen_label_rtx ();
+
+  scratch = gen_reg_rtx (mode);
+
+  /* Use 8bit unsigned divimod if dividend and divisor are within the
+     the range [0-255].  */
+  emit_move_insn (scratch, operands[2]);
+  scratch = expand_simple_binop (mode, IOR, scratch, operands[3],
+				 scratch, 1, OPTAB_DIRECT);
+  emit_insn (gen_testsi_ccno_1 (scratch, GEN_INT (-0x100)));
+  tmp0 = gen_rtx_REG (CCNOmode, FLAGS_REG);
+  tmp0 = gen_rtx_EQ (VOIDmode, tmp0, const0_rtx);
+  tmp0 = gen_rtx_IF_THEN_ELSE (VOIDmode, tmp0,
+			       gen_rtx_LABEL_REF (VOIDmode, qimode_label),
+			       pc_rtx);
+  insn = emit_jump_insn (gen_rtx_SET (VOIDmode, pc_rtx, tmp0));
+  predict_jump (REG_BR_PROB_BASE * 50 / 100);
+  JUMP_LABEL (insn) = qimode_label;
+
+  /* Generate original signed/unsigned divimod.  */
+  div = (*gen_divmod4_1) (operands[0], operands[1],
+			  operands[2], operands[3]);
+  emit_insn (div);
+
+  /* Branch to the end.  */
+  emit_jump_insn (gen_jump (end_label));
+  emit_barrier ();
+
+  /* Generate 8bit unsigned divide.  */
+  emit_label (qimode_label);
+  tmp0 = simplify_gen_subreg (HImode, scratch, mode, 0);
+  tmp1 = simplify_gen_subreg (HImode, operands[2], mode, 0);
+  tmp2 = simplify_gen_subreg (QImode, operands[3], mode, 0);
+  emit_insn (gen_udivmodhiqi3 (tmp0, tmp1, tmp2));
+
+  if (code == DIV)
+    {
+      div = gen_rtx_DIV (SImode, operands[2], operands[3]);
+      mod = gen_rtx_MOD (SImode, operands[2], operands[3]);
+    }
+  else
+    {
+      div = gen_rtx_UDIV (SImode, operands[2], operands[3]);
+      mod = gen_rtx_UMOD (SImode, operands[2], operands[3]);
+    }
+
+  /* Zero extend quotient from AL.  */
+  tmp1 = gen_lowpart (QImode, tmp0);
+  insn = emit_insn (gen_zero_extendqisi2 (operands[0], tmp1));
+  set_unique_reg_note (insn, REG_EQUAL, div);
+
+  /* Extract remainder from AH.  */
+  tmp1 = gen_rtx_ZERO_EXTRACT (mode, tmp0, GEN_INT (8), GEN_INT (8));
+  if (REG_P (operands[1]))
+    insn = emit_move_insn (operands[1], tmp1);
+  else
+    {
+      emit_move_insn (scratch, tmp1);
+      insn = emit_move_insn (operands[1], scratch);
+    }
+  set_unique_reg_note (insn, REG_EQUAL, mod);
+
+  emit_label (end_label);
+}
+
 #define LEA_SEARCH_THRESHOLD 12
 
 /* Search backward for non-agu definition of register number REGNO1
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 9780eef..2f2f374 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -7306,13 +7306,76 @@ 
   ""
   "")
 
-(define_insn_and_split "*divmod<mode>4"
+(define_insn "*divmod<mode>4"
   [(set (match_operand:SWIM248 0 "register_operand" "=a")
 	(div:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
-		    (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
+		     (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
+   (set (match_operand:SWIM248 1 "register_operand" "=&d")
+	(mod:SWIM248 (match_dup 2) (match_dup 3)))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+  "#"
+  [(set_attr "type" "multi")
+   (set_attr "mode" "<MODE>")])
+
+;; Split with 8bit unsigned divide:
+;; 	if (dividend an divisor are in [0-255])
+;;	   use 8bit unsigned integer divide
+;;	 else
+;;	   use original integer divide
+(define_split
+  [(set (match_operand:SWIM248 0 "register_operand" "=a")
+	(div:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
+		     (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
+   (set (match_operand:SWIM248 1 "register_operand" "=&d")
+	(mod:SWIM248 (match_dup 2) (match_dup 3)))
+   (clobber (reg:CC FLAGS_REG))]
+  "<MODE>mode == SImode
+   && flag_8bit_idiv
+   && TARGET_QIMODE_MATH
+   && can_create_pseudo_p ()
+   && !optimize_insn_for_size_p ()"
+  [(const_int 0)]
+  "ix86_split_idivmod (DIV, <MODE>mode, operands); DONE;")
+
+(define_split
+  [(set (match_operand:SWIM248 0 "register_operand" "=a")
+	(div:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
+		     (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
    (set (match_operand:SWIM248 1 "register_operand" "=&d")
 	(mod:SWIM248 (match_dup 2) (match_dup 3)))
    (clobber (reg:CC FLAGS_REG))]
+  "reload_completed"
+  [(parallel [(set (match_dup 1)
+		   (ashiftrt:SWIM248 (match_dup 4) (match_dup 5)))
+	      (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 0)
+	           (div:SWIM248 (match_dup 2) (match_dup 3)))
+	      (set (match_dup 1)
+		   (mod:SWIM248 (match_dup 2) (match_dup 3)))
+	      (use (match_dup 1))
+	      (clobber (reg:CC FLAGS_REG))])]
+{
+  operands[5] = GEN_INT (GET_MODE_BITSIZE (<MODE>mode)-1);
+
+  if (<MODE>mode != HImode
+      && (optimize_function_for_size_p (cfun) || TARGET_USE_CLTD))
+    operands[4] = operands[2];
+  else
+    {
+      /* Avoid use of cltd in favor of a mov+shift.  */
+      emit_move_insn (operands[1], operands[2]);
+      operands[4] = operands[1];
+    }
+})
+
+(define_insn_and_split "divmod<mode>4_1"
+  [(set (match_operand:SWIM248 1 "register_operand" "=&d")
+	(mod:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
+		     (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
+   (set (match_operand:SWIM248 0 "register_operand" "=a")
+	(div:SWIM248 (match_dup 2) (match_dup 3)))
+   (clobber (reg:CC FLAGS_REG))]
   ""
   "#"
   "reload_completed"
@@ -7365,13 +7428,62 @@ 
   ""
   "")
 
-(define_insn_and_split "*udivmod<mode>4"
+(define_insn "*udivmod<mode>4"
+  [(set (match_operand:SWIM248 0 "register_operand" "=a")
+	(udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
+		      (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
+   (set (match_operand:SWIM248 1 "register_operand" "=&d")
+	(umod:SWIM248 (match_dup 2) (match_dup 3)))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+  "#"
+  [(set_attr "type" "multi")
+   (set_attr "mode" "<MODE>")])
+
+;; Split with 8bit unsigned divide:
+;; 	if (dividend an divisor are in [0-255])
+;;	   use 8bit unsigned integer divide
+;;	 else
+;;	   use original integer divide
+(define_split
   [(set (match_operand:SWIM248 0 "register_operand" "=a")
 	(udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
 		      (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
    (set (match_operand:SWIM248 1 "register_operand" "=&d")
 	(umod:SWIM248 (match_dup 2) (match_dup 3)))
    (clobber (reg:CC FLAGS_REG))]
+  "<MODE>mode == SImode
+   && flag_8bit_idiv
+   && TARGET_QIMODE_MATH
+   && can_create_pseudo_p ()
+   && !optimize_insn_for_size_p ()"
+  [(const_int 0)]
+  "ix86_split_idivmod (UDIV, <MODE>mode, operands); DONE;")
+
+(define_split
+  [(set (match_operand:SWIM248 0 "register_operand" "=a")
+	(udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
+		      (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
+   (set (match_operand:SWIM248 1 "register_operand" "=&d")
+	(umod:SWIM248 (match_dup 2) (match_dup 3)))
+   (clobber (reg:CC FLAGS_REG))]
+  "reload_completed"
+  [(set (match_dup 1) (const_int 0))
+   (parallel [(set (match_dup 0)
+		   (udiv:SWIM248 (match_dup 2) (match_dup 3)))
+	      (set (match_dup 1)
+		   (umod:SWIM248 (match_dup 2) (match_dup 3)))
+	      (use (match_dup 1))
+	      (clobber (reg:CC FLAGS_REG))])]
+  "")
+
+(define_insn_and_split "udivmod<mode>4_1"
+  [(set (match_operand:SWIM248 1 "register_operand" "=&d")
+	(umod:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
+		      (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
+   (set (match_operand:SWIM248 0 "register_operand" "=a")
+	(udiv:SWIM248 (match_dup 2) (match_dup 3)))
+   (clobber (reg:CC FLAGS_REG))]
   ""
   "#"
   "reload_completed"
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 5790e76..f7abe75 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -388,3 +388,7 @@  Support F16C built-in functions and code generation
 mfentry
 Target Report Var(flag_fentry) Init(-1)
 Emit profiling counter call at function entry before prologue.
+
+m8bit-idiv
+Target Report Var(flag_8bit_idiv) Init(-1) Save
+Expand 32bit integer divide into control flow with 8bit unsigned integer divide
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b354382..cb3a756 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -602,7 +602,7 @@  Objective-C and Objective-C++ Dialects}.
 -momit-leaf-frame-pointer  -mno-red-zone -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model} -mabi=@var{name} @gol
 -m32  -m64 -mlarge-data-threshold=@var{num} @gol
--msse2avx -mfentry}
+-msse2avx -mfentry -m8bit-idiv}
 
 @emph{IA-64 Options}
 @gccoptlist{-mbig-endian  -mlittle-endian  -mgnu-as  -mgnu-ld  -mno-pic @gol
@@ -12647,6 +12647,13 @@  If profiling is active @option{-pg} put the profiling
 counter call before prologue.
 Note: On x86 architectures the attribute @code{ms_hook_prologue}
 isn't possible at the moment for @option{-mfentry} and @option{-pg}.
+
+@item -m8bit-idiv
+@itemx -mno-8bit-idiv
+@opindex 8bit-idiv
+This option will enable GCC to expand 32bit integer divide into control
+flow with 8bit unsigned integer divide.
+
 @end table
 
 These @samp{-m} switches are supported in addition to the above
diff --git a/gcc/testsuite/gcc.target/i386/divmod-1.c b/gcc/testsuite/gcc.target/i386/divmod-1.c
new file mode 100644
index 0000000..2769a21
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/divmod-1.c
@@ -0,0 +1,30 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+extern void abort (void);
+
+void
+__attribute__((noinline))
+test (int x, int y, int q, int r)
+{
+  if ((x / y) != q || (x % y) != r)
+    abort ();
+}
+
+int
+main ()
+{
+  test (7, 6, 1, 1);
+  test (-7, -6, 1, -1);
+  test (-7, 6, -1, -1);
+  test (7, -6, -1, 1);
+  test (255, 254, 1, 1);
+  test (256, 254, 1, 2);
+  test (256, 256, 1, 0);
+  test (254, 256, 0, 254);
+  test (254, 255, 0, 254);
+  test (254, 1, 254, 0);
+  test (255, 2, 127, 1);
+  test (1, 256, 0, 1);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/divmod-2.c b/gcc/testsuite/gcc.target/i386/divmod-2.c
new file mode 100644
index 0000000..0e73b27
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/divmod-2.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+int
+foo (int x, int y)
+{
+   return x / y;
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-times "idivl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/divmod-3.c b/gcc/testsuite/gcc.target/i386/divmod-3.c
new file mode 100644
index 0000000..4b84436
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/divmod-3.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+int
+foo (int x, int y)
+{
+   return x % y;
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-times "idivl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/divmod-4.c b/gcc/testsuite/gcc.target/i386/divmod-4.c
new file mode 100644
index 0000000..7124d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/divmod-4.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+extern void abort (void);
+
+void
+test (int x, int y, int q, int r)
+{
+  if ((x / y) != q || (x % y) != r)
+    abort ();
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-times "idivl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/divmod-4a.c b/gcc/testsuite/gcc.target/i386/divmod-4a.c
new file mode 100644
index 0000000..572b3df
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/divmod-4a.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Os -m8bit-idiv" } */
+
+extern void abort (void);
+
+void
+test (int x, int y, int q, int r)
+{
+  if ((x / y) != q || (x % y) != r)
+    abort ();
+}
+
+/* { dg-final { scan-assembler-not "divb" } } */
+/* { dg-final { scan-assembler-times "idivl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/udivmod-1.c b/gcc/testsuite/gcc.target/i386/udivmod-1.c
new file mode 100644
index 0000000..eebd843
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/udivmod-1.c
@@ -0,0 +1,31 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+extern void abort (void);
+
+void
+__attribute__((noinline))
+test (unsigned int x, unsigned int y, unsigned int q, unsigned int r)
+{
+  if ((x / y) != q || (x % y) != r)
+    abort ();
+}
+
+int
+main ()
+{
+  test (7, 6, 1, 1);
+  test (255, 254, 1, 1);
+  test (256, 254, 1, 2);
+  test (256, 256, 1, 0);
+  test (254, 256, 0, 254);
+  test (254, 255, 0, 254);
+  test (254, 1, 254, 0);
+  test (255, 2, 127, 1);
+  test (1, 256, 0, 1);
+  test (0x80000000, 0x7fffffff, 1, 1);
+  test (0x7fffffff, 0x80000000, 0, 0x7fffffff);
+  test (0x80000000, 0x80000003, 0, 0x80000000);
+  test (0xfffffffd, 0xfffffffe, 0, 0xfffffffd);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/udivmod-2.c b/gcc/testsuite/gcc.target/i386/udivmod-2.c
new file mode 100644
index 0000000..2bba8f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/udivmod-2.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+unsigned int
+foo (unsigned int x, unsigned int y)
+{
+   return x / y;
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-times "divl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/udivmod-3.c b/gcc/testsuite/gcc.target/i386/udivmod-3.c
new file mode 100644
index 0000000..f2ac4e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/udivmod-3.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+unsigned int
+foo (unsigned int x, unsigned int y)
+{
+   return x % y;
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-times "divl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/udivmod-4.c b/gcc/testsuite/gcc.target/i386/udivmod-4.c
new file mode 100644
index 0000000..14dd87c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/udivmod-4.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+extern void abort (void);
+
+void
+test (unsigned int x, unsigned int y, unsigned int q, unsigned int r)
+{
+  if ((x / y) != q || (x % y) != r)
+    abort ();
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-times "divl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/udivmod-4a.c b/gcc/testsuite/gcc.target/i386/udivmod-4a.c
new file mode 100644
index 0000000..f1ff389
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/udivmod-4a.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Os -m8bit-idiv" } */
+
+extern void abort (void);
+
+void
+test (unsigned int x, unsigned int y, unsigned int q, unsigned int r)
+{
+  if ((x / y) != q || (x % y) != r)
+    abort ();
+}
+
+/* { dg-final { scan-assembler-not "divb" } } */
+/* { dg-final { scan-assembler-times "divl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/udivmod-5.c b/gcc/testsuite/gcc.target/i386/udivmod-5.c
new file mode 100644
index 0000000..3ff1cf1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/udivmod-5.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -m8bit-idiv" } */
+
+extern void foo (int, int, int, int, int, int);
+
+void
+bar (int x, int y)
+{
+  foo (0, 0, 0, 0, x / y, x % y);
+}
+
+/* { dg-final { scan-assembler-times "divb" 1 } } */
+/* { dg-final { scan-assembler-times "divl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/umod-1.c b/gcc/testsuite/gcc.target/i386/umod-1.c
index 54edf13..a39e75b 100644
--- a/gcc/testsuite/gcc.target/i386/umod-1.c
+++ b/gcc/testsuite/gcc.target/i386/umod-1.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune=atom" } */
+/* { dg-options "-O2 -m8bit-idiv" } */
 
 unsigned char
 foo (unsigned char x, unsigned char y)
diff --git a/gcc/testsuite/gcc.target/i386/umod-2.c b/gcc/testsuite/gcc.target/i386/umod-2.c
index 6fe7384..1ef33bb 100644
--- a/gcc/testsuite/gcc.target/i386/umod-2.c
+++ b/gcc/testsuite/gcc.target/i386/umod-2.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune=atom" } */
+/* { dg-options "-O2 -m8bit-idiv" } */
 
 extern unsigned char z;
 
diff --git a/gcc/testsuite/gcc.target/i386/umod-3.c b/gcc/testsuite/gcc.target/i386/umod-3.c
index 7123bc9..3e5bc40 100644
--- a/gcc/testsuite/gcc.target/i386/umod-3.c
+++ b/gcc/testsuite/gcc.target/i386/umod-3.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune=atom" } */
+/* { dg-options "-O2 -m8bit-idiv" } */
 
 extern void abort (void);
 extern void exit (int);