diff mbox series

[i386] : Fix PR 82628, wrong code at -Os on x86_64-linux-gnu in the 32-bit mode

Message ID CAFULd4ac9j-uJEatmjkid4=--CfkMs=RRYuD2eVb9wP0ER4MKA@mail.gmail.com
State New
Headers show
Series [i386] : Fix PR 82628, wrong code at -Os on x86_64-linux-gnu in the 32-bit mode | expand

Commit Message

Uros Bizjak Oct. 22, 2017, 6:04 p.m. UTC
Hello!

In PR 82628 Jakub figured out that insn patterns that consume carry
flag were not 100% correct. Due to this issue, combine is able to
simplify various CC_REG propagations that result in invalid code.

Attached patch fixes (well, mitigates) the above problem by splitting
the double-mode compare after the reload, in the same way other
*_doubleword patterns are handled from "the beginning of the time".

2017-10-22  Uros Bizjak  <ubizjak@gmail.com>

    PR target/82628
    * config/i386/i386.md (cmp<dwi>_doubleword): New pattern.
    * config/i386/i386.c (ix86_expand_branch) <case E_TImode>:
    Expand with cmp<dwi>_doubleword.

2017-10-22  Uros Bizjak  <ubizjak@gmail.com>
        Jakub Jelinek  <jakub@redhat.com>

    PR target/82628
    * gcc.dg/torture/pr82628.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

Comments

Jakub Jelinek Oct. 23, 2017, 10:09 a.m. UTC | #1
On Sun, Oct 22, 2017 at 08:04:28PM +0200, Uros Bizjak wrote:
> Hello!
> 
> In PR 82628 Jakub figured out that insn patterns that consume carry
> flag were not 100% correct. Due to this issue, combine is able to
> simplify various CC_REG propagations that result in invalid code.
> 
> Attached patch fixes (well, mitigates) the above problem by splitting
> the double-mode compare after the reload, in the same way other
> *_doubleword patterns are handled from "the beginning of the time".

I'm afraid this is going to haunt us sooner or later, combine isn't the
only pass that uses simplify-rtx.c infrastructure heavily and when we lie
in the RTL pattern, eventually something will be simplified wrongly.

So, at least we'd need to use UNSPEC for the pattern, like (only lightly
tested so far) below.

I'm not sure the double-word pattern is a win though, it causes PR82662
you've filed (the problem is that during ifcvt because of the double-word
comparison the condition is canonicalized as (lt (reg:TI) (reg:TI)) and
there is no instruction in the MD that would take such arguments, there
are only instructions that compare flags registers.
If you look at say normal DImode comparisons, it is the same thing,
ifcvt also can't do anything with those, the reason they work is that we
have a cstoredi4 optab (for 64-bit), but don't have a cstoreti4 optab.
So, we'd need that (and only handle the GE/GEU/LT/LTU + the others that can
be handled by swapping the operands).
I think the double-word pattern has other issues, it will result in RA not
knowing in detail what is going on and thus can at least reserve one extra
register that otherwise would not be needed.  The reason we have the
doubleword patterns elsewhere is that splitting double-word early makes it
harder/impossible for STV to use SSE registers; in this case we don't have
something reasonable to expand to anyway, we always split.

The alternative I have is the patch attached in the PR, if the unrelated
addcarry/subborrow changes are removed, then it doesn't regress anything,
the pr50038.c FAIL is from some other earlier change even on vanilla
branch and pr67317-* FAILs were caused by the addcarry/subborrow changes,
will look at those in detail.

2017-10-23  Jakub Jelinek  <jakub@redhat.com>

	PR target/82628
	* config/i386/i386.md (UNSPEC_SBB): New unspec.
	(cmp<dwi>_doubleword): Use unspec instead of compare.
	(sub<mode>3_carry_ccgz): Use unspec instead of compare.

--- gcc/config/i386/i386.md.jj	2017-10-23 10:13:05.462218947 +0200
+++ gcc/config/i386/i386.md	2017-10-23 11:07:55.470376791 +0200
@@ -112,6 +112,7 @@ (define_c_enum "unspec" [
   UNSPEC_STOS
   UNSPEC_PEEPSIB
   UNSPEC_INSN_FALSE_DEP
+  UNSPEC_SBB
 
   ;; For SSE/MMX support:
   UNSPEC_FIX_NOTRUNC
@@ -1285,11 +1286,10 @@ (define_insn_and_split "cmp<dwi>_doublew
   [(set (reg:CC FLAGS_REG)
 	(compare:CC (match_dup 1) (match_dup 2)))
    (parallel [(set (reg:CCGZ FLAGS_REG)
-		   (compare: CCGZ
-		     (match_dup 4)
-		     (plus:DWIH
-		       (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
-		       (match_dup 5))))
+		   (unspec:CCGZ [(match_dup 4)
+				 (match_dup 5)
+				 (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))]
+		    UNSPEC_SBB))
 	      (clobber (match_dup 3))])]
   "split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);")
 
@@ -6911,13 +6911,18 @@ (define_insn "*subsi3_carry_zext"
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "SI")])
 
+;; The sign flag is set from the
+;; (compare (match_dup 1) (plus:DWIH (ltu:DWIH ...) (match_dup 2)))
+;; result, the overflow flag likewise, but the overflow flag is also
+;; set if the (plus:DWIH (ltu:DWIH ...) (match_dup 2)) overflows.
+;; The borrow flag can be modelled, but differently from SF and OF
+;; and is quite difficult to handle.
 (define_insn "*sub<mode>3_carry_ccgz"
   [(set (reg:CCGZ FLAGS_REG)
-	(compare:CCGZ
-	  (match_operand:DWIH 1 "register_operand" "0")
-	  (plus:DWIH
-	    (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
-	    (match_operand:DWIH 2 "x86_64_general_operand" "rme"))))
+	(unspec:CCGZ [(match_operand:DWIH 1 "register_operand" "0")
+		      (match_operand:DWIH 2 "x86_64_general_operand" "rme")
+		      (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))]
+		     UNSPEC_SBB))
    (clobber (match_scratch:DWIH 0 "=r"))]
   ""
   "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"


	Jakub
Uros Bizjak Oct. 23, 2017, 10:27 a.m. UTC | #2
On Mon, Oct 23, 2017 at 12:09 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Sun, Oct 22, 2017 at 08:04:28PM +0200, Uros Bizjak wrote:
>> Hello!
>>
>> In PR 82628 Jakub figured out that insn patterns that consume carry
>> flag were not 100% correct. Due to this issue, combine is able to
>> simplify various CC_REG propagations that result in invalid code.
>>
>> Attached patch fixes (well, mitigates) the above problem by splitting
>> the double-mode compare after the reload, in the same way other
>> *_doubleword patterns are handled from "the beginning of the time".
>
> I'm afraid this is going to haunt us sooner or later, combine isn't the
> only pass that uses simplify-rtx.c infrastructure heavily and when we lie
> in the RTL pattern, eventually something will be simplified wrongly.
>
> So, at least we'd need to use UNSPEC for the pattern, like (only lightly
> tested so far) below.

I agree with the above. Patterns that consume Carry flag are now
marked with (plus (ltu (...)), but effectively, they behave like
unspecs. So, I see no problem to change all SBB and ADC to unspec at
once, similar to the change you proposed in the patch.

> I'm not sure the double-word pattern is a win though, it causes PR82662
> you've filed (the problem is that during ifcvt because of the double-word
> comparison the condition is canonicalized as (lt (reg:TI) (reg:TI)) and
> there is no instruction in the MD that would take such arguments, there
> are only instructions that compare flags registers.

It is not a win, my patch was more of a band-aid to mitigate the
failure. It works, but it produces extra moves (as you mentione
below), due to RA not knowing that CMP doesn't clobber the register.
But, let's change the pattern back to expand-time splitting after the
above patch that changes SBB and ADC to unspecs is committed.

> If you look at say normal DImode comparisons, it is the same thing,
> ifcvt also can't do anything with those, the reason they work is that we
> have a cstoredi4 optab (for 64-bit), but don't have a cstoreti4 optab.
> So, we'd need that (and only handle the GE/GEU/LT/LTU + the others that can
> be handled by swapping the operands).
> I think the double-word pattern has other issues, it will result in RA not
> knowing in detail what is going on and thus can at least reserve one extra
> register that otherwise would not be needed.  The reason we have the
> doubleword patterns elsewhere is that splitting double-word early makes it
> harder/impossible for STV to use SSE registers; in this case we don't have
> something reasonable to expand to anyway, we always split.
>
> The alternative I have is the patch attached in the PR, if the unrelated
> addcarry/subborrow changes are removed, then it doesn't regress anything,
> the pr50038.c FAIL is from some other earlier change even on vanilla
> branch and pr67317-* FAILs were caused by the addcarry/subborrow changes,
> will look at those in detail.

I do have patch that allows double-mode for cstore, but it is not an
elegant solution. Splitting to SBB at expand time would be
considerably better.

Thanks,
Uros.

> 2017-10-23  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/82628
>         * config/i386/i386.md (UNSPEC_SBB): New unspec.
>         (cmp<dwi>_doubleword): Use unspec instead of compare.
>         (sub<mode>3_carry_ccgz): Use unspec instead of compare.
>
> --- gcc/config/i386/i386.md.jj  2017-10-23 10:13:05.462218947 +0200
> +++ gcc/config/i386/i386.md     2017-10-23 11:07:55.470376791 +0200
> @@ -112,6 +112,7 @@ (define_c_enum "unspec" [
>    UNSPEC_STOS
>    UNSPEC_PEEPSIB
>    UNSPEC_INSN_FALSE_DEP
> +  UNSPEC_SBB
>
>    ;; For SSE/MMX support:
>    UNSPEC_FIX_NOTRUNC
> @@ -1285,11 +1286,10 @@ (define_insn_and_split "cmp<dwi>_doublew
>    [(set (reg:CC FLAGS_REG)
>         (compare:CC (match_dup 1) (match_dup 2)))
>     (parallel [(set (reg:CCGZ FLAGS_REG)
> -                  (compare: CCGZ
> -                    (match_dup 4)
> -                    (plus:DWIH
> -                      (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
> -                      (match_dup 5))))
> +                  (unspec:CCGZ [(match_dup 4)
> +                                (match_dup 5)
> +                                (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))]
> +                   UNSPEC_SBB))
>               (clobber (match_dup 3))])]
>    "split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);")
>
> @@ -6911,13 +6911,18 @@ (define_insn "*subsi3_carry_zext"
>     (set_attr "pent_pair" "pu")
>     (set_attr "mode" "SI")])
>
> +;; The sign flag is set from the
> +;; (compare (match_dup 1) (plus:DWIH (ltu:DWIH ...) (match_dup 2)))
> +;; result, the overflow flag likewise, but the overflow flag is also
> +;; set if the (plus:DWIH (ltu:DWIH ...) (match_dup 2)) overflows.
> +;; The borrow flag can be modelled, but differently from SF and OF
> +;; and is quite difficult to handle.
>  (define_insn "*sub<mode>3_carry_ccgz"
>    [(set (reg:CCGZ FLAGS_REG)
> -       (compare:CCGZ
> -         (match_operand:DWIH 1 "register_operand" "0")
> -         (plus:DWIH
> -           (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
> -           (match_operand:DWIH 2 "x86_64_general_operand" "rme"))))
> +       (unspec:CCGZ [(match_operand:DWIH 1 "register_operand" "0")
> +                     (match_operand:DWIH 2 "x86_64_general_operand" "rme")
> +                     (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))]
> +                    UNSPEC_SBB))
>     (clobber (match_scratch:DWIH 0 "=r"))]
>    ""
>    "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
>
>
>         Jakub
Jakub Jelinek Oct. 23, 2017, 11:07 a.m. UTC | #3
On Mon, Oct 23, 2017 at 12:27:15PM +0200, Uros Bizjak wrote:
> On Mon, Oct 23, 2017 at 12:09 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Sun, Oct 22, 2017 at 08:04:28PM +0200, Uros Bizjak wrote:
> >> Hello!
> >>
> >> In PR 82628 Jakub figured out that insn patterns that consume carry
> >> flag were not 100% correct. Due to this issue, combine is able to
> >> simplify various CC_REG propagations that result in invalid code.
> >>
> >> Attached patch fixes (well, mitigates) the above problem by splitting
> >> the double-mode compare after the reload, in the same way other
> >> *_doubleword patterns are handled from "the beginning of the time".
> >
> > I'm afraid this is going to haunt us sooner or later, combine isn't the
> > only pass that uses simplify-rtx.c infrastructure heavily and when we lie
> > in the RTL pattern, eventually something will be simplified wrongly.
> >
> > So, at least we'd need to use UNSPEC for the pattern, like (only lightly
> > tested so far) below.
> 
> I agree with the above. Patterns that consume Carry flag are now
> marked with (plus (ltu (...)), but effectively, they behave like
> unspecs. So, I see no problem to change all SBB and ADC to unspec at
> once, similar to the change you proposed in the patch.

So like this (addcarry/subborrow defered to a separate patch)?
Or do you want to use UNSPEC even for the unsigned comparison case,
i.e. from the patch remove the predicates.md/constraints.md part,
sub<mode>3_carry_ccc{,_1} and anything related to that?

As for addcarry/subborrow, the problem is that we expect in the pr67317*
tests that combine is able to notice that the CF setter sets CF to
unconditional 0 and matches the pattern.  With the patch I wrote
we end up with the combiner trying to match an insn where the CCC
is set from a TImode comparison:
(parallel [
        (set (reg:CC 17 flags)
            (compare:CC (zero_extend:TI (plus:DI (reg/v:DI 92 [ a ])
                        (reg/v:DI 94 [ c ])))
                (zero_extend:TI (reg/v:DI 94 [ c ]))))
        (set (reg:DI 98)
            (plus:DI (reg/v:DI 92 [ a ])
                (reg/v:DI 94 [ c ])))
    ])
So, either we need a define_insn_and_split pattern that would deal with
that (for UNSPEC it would be the same thing, have a define_insn_and_split
that would replace the (ltu...) with (const_int 0)), or perhaps be smarter
during expansion, if we see the first argument is constant 0, expand it
like a normal add instruction with CC setter.

2017-10-23  Jakub Jelinek  <jakub@redhat.com>

	PR target/82628
	* config/i386/predicates.md (x86_64_dwzext_immediate_operand): New.
	* config/i386/constraints.md (Wf): New constraint.
	* config/i386/i386.md (UNSPEC_SBB): New unspec.
	(cmp<dwi>_doubleword): Removed.
	(sub<mode>3_carry_ccc, *sub<mode>3_carry_ccc_1): New patterns.
	(sub<mode>3_carry_ccgz): Use unspec instead of compare.
	* config/i386/i386.c (ix86_expand_branch) <case E_TImode>: Don't
	expand with cmp<dwi>_doubleword.  For LTU and GEU use
	sub<mode>3_carry_ccc instead of sub<mode>3_carry_ccgz and use CCCmode.

--- gcc/config/i386/predicates.md.jj	2017-10-23 12:00:13.899355249 +0200
+++ gcc/config/i386/predicates.md	2017-10-23 12:52:20.696576114 +0200
@@ -366,6 +366,31 @@ (define_predicate "x86_64_hilo_int_opera
     }
 })
 
+;; Return true if VALUE is a constant integer whose value is
+;; x86_64_immediate_operand value zero extended from word mode to mode.
+(define_predicate "x86_64_dwzext_immediate_operand"
+  (match_code "const_int,const_wide_int")
+{
+  switch (GET_CODE (op))
+    {
+    case CONST_INT:
+      if (!TARGET_64BIT)
+	return UINTVAL (op) <= HOST_WIDE_INT_UC (0xffffffff);
+      return UINTVAL (op) <= HOST_WIDE_INT_UC (0x7fffffff);
+
+    case CONST_WIDE_INT:
+      if (!TARGET_64BIT)
+	return false;
+      return (CONST_WIDE_INT_NUNITS (op) == 2
+	      && CONST_WIDE_INT_ELT (op, 1) == 0
+	      && (trunc_int_for_mode (CONST_WIDE_INT_ELT (op, 0), SImode)
+		  == (HOST_WIDE_INT) CONST_WIDE_INT_ELT (op, 0)));
+
+    default:
+      gcc_unreachable ();
+    }
+})
+
 ;; Return true if size of VALUE can be stored in a sign
 ;; extended immediate field.
 (define_predicate "x86_64_immediate_size_operand"
--- gcc/config/i386/constraints.md.jj	2017-10-23 12:00:13.850355874 +0200
+++ gcc/config/i386/constraints.md	2017-10-23 12:52:20.697576102 +0200
@@ -332,6 +332,11 @@ (define_constraint "Wd"
    of it satisfies the e constraint."
   (match_operand 0 "x86_64_hilo_int_operand"))
 
+(define_constraint "Wf"
+  "32-bit signed integer constant zero extended from word size
+   to double word size."
+  (match_operand 0 "x86_64_dwzext_immediate_operand"))
+
 (define_constraint "Z"
   "32-bit unsigned integer constant, or a symbolic reference known
    to fit that range (for immediate operands in zero-extending x86-64
--- gcc/config/i386/i386.md.jj	2017-10-23 12:51:19.350356044 +0200
+++ gcc/config/i386/i386.md	2017-10-23 12:52:20.701576051 +0200
@@ -112,6 +112,7 @@ (define_c_enum "unspec" [
   UNSPEC_STOS
   UNSPEC_PEEPSIB
   UNSPEC_INSN_FALSE_DEP
+  UNSPEC_SBB
 
   ;; For SSE/MMX support:
   UNSPEC_FIX_NOTRUNC
@@ -1273,26 +1274,6 @@ (define_expand "cmp<mode>_1"
 	(compare:CC (match_operand:SWI48 0 "nonimmediate_operand")
 		    (match_operand:SWI48 1 "<general_operand>")))])
 
-(define_insn_and_split "cmp<dwi>_doubleword"
-  [(set (reg:CCGZ FLAGS_REG)
-	(compare:CCGZ
-	  (match_operand:<DWI> 1 "register_operand" "0")
-	  (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "ro<di>")))
-   (clobber (match_scratch:<DWI> 0 "=r"))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (reg:CC FLAGS_REG)
-	(compare:CC (match_dup 1) (match_dup 2)))
-   (parallel [(set (reg:CCGZ FLAGS_REG)
-		   (compare: CCGZ
-		     (match_dup 4)
-		     (plus:DWIH
-		       (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
-		       (match_dup 5))))
-	      (clobber (match_dup 3))])]
-  "split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);")
-
 (define_insn "*cmp<mode>_ccno_1"
   [(set (reg FLAGS_REG)
 	(compare (match_operand:SWI 0 "nonimmediate_operand" "<r>,?m<r>")
@@ -6911,13 +6892,46 @@ (define_insn "*subsi3_carry_zext"
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "SI")])
 
-(define_insn "*sub<mode>3_carry_ccgz"
+(define_insn "sub<mode>3_carry_ccc"
+  [(set (reg:CCC FLAGS_REG)
+	(compare:CCC
+	  (zero_extend:<DWI> (match_operand:DWIH 1 "register_operand" "0"))
+	  (plus:<DWI>
+	    (ltu:<DWI> (reg:CC FLAGS_REG) (const_int 0))
+	    (zero_extend:<DWI>
+	      (match_operand:DWIH 2 "x86_64_sext_operand" "rmWe")))))
+   (clobber (match_scratch:DWIH 0 "=r"))]
+  ""
+  "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
+  [(set_attr "type" "alu")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*sub<mode>3_carry_ccc_1"
+  [(set (reg:CCC FLAGS_REG)
+	(compare:CCC
+	  (zero_extend:<DWI> (match_operand:DWIH 1 "register_operand" "0"))
+	  (plus:<DWI>
+	    (ltu:<DWI> (reg:CC FLAGS_REG) (const_int 0))
+	    (match_operand:<DWI> 2 "x86_64_dwzext_immediate_operand" "Wf"))))
+   (clobber (match_scratch:DWIH 0 "=r"))]
+  ""
+{
+  operands[3] = simplify_subreg (<MODE>mode, operands[2], <DWI>mode, 0);
+  return "sbb{<imodesuffix>}\t{%3, %0|%0, %3}";
+}
+  [(set_attr "type" "alu")
+   (set_attr "mode" "<MODE>")])
+
+;; The sign flag is set from the
+;; (compare (match_dup 1) (plus:DWIH (ltu:DWIH ...) (match_dup 2)))
+;; result, the overflow flag likewise, but the overflow flag is also
+;; set if the (plus:DWIH (ltu:DWIH ...) (match_dup 2)) overflows.
+(define_insn "sub<mode>3_carry_ccgz"
   [(set (reg:CCGZ FLAGS_REG)
-	(compare:CCGZ
-	  (match_operand:DWIH 1 "register_operand" "0")
-	  (plus:DWIH
-	    (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
-	    (match_operand:DWIH 2 "x86_64_general_operand" "rme"))))
+	(unspec:CCGZ [(match_operand:DWIH 1 "register_operand" "0")
+		      (match_operand:DWIH 2 "x86_64_general_operand" "rme")
+		      (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))]
+		     UNSPEC_SBB))
    (clobber (match_scratch:DWIH 0 "=r"))]
   ""
   "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
--- gcc/config/i386/i386.c.jj	2017-10-23 12:51:19.349356057 +0200
+++ gcc/config/i386/i386.c	2017-10-23 12:52:20.711575924 +0200
@@ -22378,27 +22378,45 @@ ix86_expand_branch (enum rtx_code code,
 	switch (code)
 	  {
 	  case LE: case LEU: case GT: case GTU:
-	    std::swap (op0, op1);
+	    std::swap (lo[0], lo[1]);
+	    std::swap (hi[0], hi[1]);
 	    code = swap_condition (code);
 	    /* FALLTHRU */
 
 	  case LT: case LTU: case GE: case GEU:
 	    {
-	      rtx (*cmp_insn) (rtx, rtx, rtx);
+	      rtx (*cmp_insn) (rtx, rtx);
+	      rtx (*sbb_insn) (rtx, rtx, rtx);
+	      bool uns = (code == LTU || code == GEU);
 
 	      if (TARGET_64BIT)
-		cmp_insn = gen_cmpti_doubleword;
+		{
+		  cmp_insn = gen_cmpdi_1;
+		  sbb_insn
+		    = uns ? gen_subdi3_carry_ccc : gen_subdi3_carry_ccgz;
+		}
 	      else
-		cmp_insn = gen_cmpdi_doubleword;
+		{
+		  cmp_insn = gen_cmpsi_1;
+		  sbb_insn
+		    = uns ? gen_subsi3_carry_ccc : gen_subsi3_carry_ccgz;
+		}
+
+	      if (!nonimmediate_operand (lo[0], submode))
+		lo[0] = force_reg (submode, lo[0]);
+	      if (!x86_64_general_operand (lo[1], submode))
+		lo[1] = force_reg (submode, lo[1]);
+
+	      if (!register_operand (hi[0], submode))
+		hi[0] = force_reg (submode, hi[0]);
+	      if ((uns && !nonimmediate_operand (hi[1], submode))
+		  || (!uns && !x86_64_general_operand (hi[1], submode)))
+		hi[1] = force_reg (submode, hi[1]);
 
-	      if (!register_operand (op0, mode))
-		op0 = force_reg (mode, op0);
-	      if (!x86_64_hilo_general_operand (op1, mode))
-		op1 = force_reg (mode, op1);
+	      emit_insn (cmp_insn (lo[0], lo[1]));
+	      emit_insn (sbb_insn (gen_rtx_SCRATCH (submode), hi[0], hi[1]));
 
-	      emit_insn (cmp_insn (gen_rtx_SCRATCH (mode), op0, op1));
-
-	      tmp = gen_rtx_REG (CCGZmode, FLAGS_REG);
+	      tmp = gen_rtx_REG (uns ? CCCmode : CCGZmode, FLAGS_REG);
 
 	      ix86_expand_branch (code, tmp, const0_rtx, label);
 	      return;


	Jakub
Uros Bizjak Oct. 23, 2017, 11:27 a.m. UTC | #4
On Mon, Oct 23, 2017 at 1:07 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Oct 23, 2017 at 12:27:15PM +0200, Uros Bizjak wrote:
>> On Mon, Oct 23, 2017 at 12:09 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> > On Sun, Oct 22, 2017 at 08:04:28PM +0200, Uros Bizjak wrote:
>> >> Hello!
>> >>
>> >> In PR 82628 Jakub figured out that insn patterns that consume carry
>> >> flag were not 100% correct. Due to this issue, combine is able to
>> >> simplify various CC_REG propagations that result in invalid code.
>> >>
>> >> Attached patch fixes (well, mitigates) the above problem by splitting
>> >> the double-mode compare after the reload, in the same way other
>> >> *_doubleword patterns are handled from "the beginning of the time".
>> >
>> > I'm afraid this is going to haunt us sooner or later, combine isn't the
>> > only pass that uses simplify-rtx.c infrastructure heavily and when we lie
>> > in the RTL pattern, eventually something will be simplified wrongly.
>> >
>> > So, at least we'd need to use UNSPEC for the pattern, like (only lightly
>> > tested so far) below.
>>
>> I agree with the above. Patterns that consume Carry flag are now
>> marked with (plus (ltu (...)), but effectively, they behave like
>> unspecs. So, I see no problem to change all SBB and ADC to unspec at
>> once, similar to the change you proposed in the patch.
>
> So like this (addcarry/subborrow defered to a separate patch)?
> Or do you want to use UNSPEC even for the unsigned comparison case,
> i.e. from the patch remove the predicates.md/constraints.md part,
> sub<mode>3_carry_ccc{,_1} and anything related to that?

Looking at the attached patch, I think, this won't be necessary
anymore. The pattern is quite important for 32bit targets, so this
fact warrants a couple of complicated patterns.

> As for addcarry/subborrow, the problem is that we expect in the pr67317*
> tests that combine is able to notice that the CF setter sets CF to
> unconditional 0 and matches the pattern.  With the patch I wrote
> we end up with the combiner trying to match an insn where the CCC
> is set from a TImode comparison:
> (parallel [
>         (set (reg:CC 17 flags)
>             (compare:CC (zero_extend:TI (plus:DI (reg/v:DI 92 [ a ])
>                         (reg/v:DI 94 [ c ])))
>                 (zero_extend:TI (reg/v:DI 94 [ c ]))))
>         (set (reg:DI 98)
>             (plus:DI (reg/v:DI 92 [ a ])
>                 (reg/v:DI 94 [ c ])))
>     ])
> So, either we need a define_insn_and_split pattern that would deal with
> that (for UNSPEC it would be the same thing, have a define_insn_and_split
> that would replace the (ltu...) with (const_int 0)), or perhaps be smarter
> during expansion, if we see the first argument is constant 0, expand it
> like a normal add instruction with CC setter.
>
> 2017-10-23  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/82628
>         * config/i386/predicates.md (x86_64_dwzext_immediate_operand): New.
>         * config/i386/constraints.md (Wf): New constraint.
>         * config/i386/i386.md (UNSPEC_SBB): New unspec.
>         (cmp<dwi>_doubleword): Removed.
>         (sub<mode>3_carry_ccc, *sub<mode>3_carry_ccc_1): New patterns.
>         (sub<mode>3_carry_ccgz): Use unspec instead of compare.
>         * config/i386/i386.c (ix86_expand_branch) <case E_TImode>: Don't
>         expand with cmp<dwi>_doubleword.  For LTU and GEU use
>         sub<mode>3_carry_ccc instead of sub<mode>3_carry_ccgz and use CCCmode.

OK.

Thanks,
Uros.

> --- gcc/config/i386/predicates.md.jj    2017-10-23 12:00:13.899355249 +0200
> +++ gcc/config/i386/predicates.md       2017-10-23 12:52:20.696576114 +0200
> @@ -366,6 +366,31 @@ (define_predicate "x86_64_hilo_int_opera
>      }
>  })
>
> +;; Return true if VALUE is a constant integer whose value is
> +;; x86_64_immediate_operand value zero extended from word mode to mode.
> +(define_predicate "x86_64_dwzext_immediate_operand"
> +  (match_code "const_int,const_wide_int")
> +{
> +  switch (GET_CODE (op))
> +    {
> +    case CONST_INT:
> +      if (!TARGET_64BIT)
> +       return UINTVAL (op) <= HOST_WIDE_INT_UC (0xffffffff);
> +      return UINTVAL (op) <= HOST_WIDE_INT_UC (0x7fffffff);
> +
> +    case CONST_WIDE_INT:
> +      if (!TARGET_64BIT)
> +       return false;
> +      return (CONST_WIDE_INT_NUNITS (op) == 2
> +             && CONST_WIDE_INT_ELT (op, 1) == 0
> +             && (trunc_int_for_mode (CONST_WIDE_INT_ELT (op, 0), SImode)
> +                 == (HOST_WIDE_INT) CONST_WIDE_INT_ELT (op, 0)));
> +
> +    default:
> +      gcc_unreachable ();
> +    }
> +})
> +
>  ;; Return true if size of VALUE can be stored in a sign
>  ;; extended immediate field.
>  (define_predicate "x86_64_immediate_size_operand"
> --- gcc/config/i386/constraints.md.jj   2017-10-23 12:00:13.850355874 +0200
> +++ gcc/config/i386/constraints.md      2017-10-23 12:52:20.697576102 +0200
> @@ -332,6 +332,11 @@ (define_constraint "Wd"
>     of it satisfies the e constraint."
>    (match_operand 0 "x86_64_hilo_int_operand"))
>
> +(define_constraint "Wf"
> +  "32-bit signed integer constant zero extended from word size
> +   to double word size."
> +  (match_operand 0 "x86_64_dwzext_immediate_operand"))
> +
>  (define_constraint "Z"
>    "32-bit unsigned integer constant, or a symbolic reference known
>     to fit that range (for immediate operands in zero-extending x86-64
> --- gcc/config/i386/i386.md.jj  2017-10-23 12:51:19.350356044 +0200
> +++ gcc/config/i386/i386.md     2017-10-23 12:52:20.701576051 +0200
> @@ -112,6 +112,7 @@ (define_c_enum "unspec" [
>    UNSPEC_STOS
>    UNSPEC_PEEPSIB
>    UNSPEC_INSN_FALSE_DEP
> +  UNSPEC_SBB
>
>    ;; For SSE/MMX support:
>    UNSPEC_FIX_NOTRUNC
> @@ -1273,26 +1274,6 @@ (define_expand "cmp<mode>_1"
>         (compare:CC (match_operand:SWI48 0 "nonimmediate_operand")
>                     (match_operand:SWI48 1 "<general_operand>")))])
>
> -(define_insn_and_split "cmp<dwi>_doubleword"
> -  [(set (reg:CCGZ FLAGS_REG)
> -       (compare:CCGZ
> -         (match_operand:<DWI> 1 "register_operand" "0")
> -         (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "ro<di>")))
> -   (clobber (match_scratch:<DWI> 0 "=r"))]
> -  ""
> -  "#"
> -  "reload_completed"
> -  [(set (reg:CC FLAGS_REG)
> -       (compare:CC (match_dup 1) (match_dup 2)))
> -   (parallel [(set (reg:CCGZ FLAGS_REG)
> -                  (compare: CCGZ
> -                    (match_dup 4)
> -                    (plus:DWIH
> -                      (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
> -                      (match_dup 5))))
> -             (clobber (match_dup 3))])]
> -  "split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);")
> -
>  (define_insn "*cmp<mode>_ccno_1"
>    [(set (reg FLAGS_REG)
>         (compare (match_operand:SWI 0 "nonimmediate_operand" "<r>,?m<r>")
> @@ -6911,13 +6892,46 @@ (define_insn "*subsi3_carry_zext"
>     (set_attr "pent_pair" "pu")
>     (set_attr "mode" "SI")])
>
> -(define_insn "*sub<mode>3_carry_ccgz"
> +(define_insn "sub<mode>3_carry_ccc"
> +  [(set (reg:CCC FLAGS_REG)
> +       (compare:CCC
> +         (zero_extend:<DWI> (match_operand:DWIH 1 "register_operand" "0"))
> +         (plus:<DWI>
> +           (ltu:<DWI> (reg:CC FLAGS_REG) (const_int 0))
> +           (zero_extend:<DWI>
> +             (match_operand:DWIH 2 "x86_64_sext_operand" "rmWe")))))
> +   (clobber (match_scratch:DWIH 0 "=r"))]
> +  ""
> +  "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
> +  [(set_attr "type" "alu")
> +   (set_attr "mode" "<MODE>")])
> +
> +(define_insn "*sub<mode>3_carry_ccc_1"
> +  [(set (reg:CCC FLAGS_REG)
> +       (compare:CCC
> +         (zero_extend:<DWI> (match_operand:DWIH 1 "register_operand" "0"))
> +         (plus:<DWI>
> +           (ltu:<DWI> (reg:CC FLAGS_REG) (const_int 0))
> +           (match_operand:<DWI> 2 "x86_64_dwzext_immediate_operand" "Wf"))))
> +   (clobber (match_scratch:DWIH 0 "=r"))]
> +  ""
> +{
> +  operands[3] = simplify_subreg (<MODE>mode, operands[2], <DWI>mode, 0);
> +  return "sbb{<imodesuffix>}\t{%3, %0|%0, %3}";
> +}
> +  [(set_attr "type" "alu")
> +   (set_attr "mode" "<MODE>")])
> +
> +;; The sign flag is set from the
> +;; (compare (match_dup 1) (plus:DWIH (ltu:DWIH ...) (match_dup 2)))
> +;; result, the overflow flag likewise, but the overflow flag is also
> +;; set if the (plus:DWIH (ltu:DWIH ...) (match_dup 2)) overflows.
> +(define_insn "sub<mode>3_carry_ccgz"
>    [(set (reg:CCGZ FLAGS_REG)
> -       (compare:CCGZ
> -         (match_operand:DWIH 1 "register_operand" "0")
> -         (plus:DWIH
> -           (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
> -           (match_operand:DWIH 2 "x86_64_general_operand" "rme"))))
> +       (unspec:CCGZ [(match_operand:DWIH 1 "register_operand" "0")
> +                     (match_operand:DWIH 2 "x86_64_general_operand" "rme")
> +                     (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))]
> +                    UNSPEC_SBB))
>     (clobber (match_scratch:DWIH 0 "=r"))]
>    ""
>    "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
> --- gcc/config/i386/i386.c.jj   2017-10-23 12:51:19.349356057 +0200
> +++ gcc/config/i386/i386.c      2017-10-23 12:52:20.711575924 +0200
> @@ -22378,27 +22378,45 @@ ix86_expand_branch (enum rtx_code code,
>         switch (code)
>           {
>           case LE: case LEU: case GT: case GTU:
> -           std::swap (op0, op1);
> +           std::swap (lo[0], lo[1]);
> +           std::swap (hi[0], hi[1]);
>             code = swap_condition (code);
>             /* FALLTHRU */
>
>           case LT: case LTU: case GE: case GEU:
>             {
> -             rtx (*cmp_insn) (rtx, rtx, rtx);
> +             rtx (*cmp_insn) (rtx, rtx);
> +             rtx (*sbb_insn) (rtx, rtx, rtx);
> +             bool uns = (code == LTU || code == GEU);
>
>               if (TARGET_64BIT)
> -               cmp_insn = gen_cmpti_doubleword;
> +               {
> +                 cmp_insn = gen_cmpdi_1;
> +                 sbb_insn
> +                   = uns ? gen_subdi3_carry_ccc : gen_subdi3_carry_ccgz;
> +               }
>               else
> -               cmp_insn = gen_cmpdi_doubleword;
> +               {
> +                 cmp_insn = gen_cmpsi_1;
> +                 sbb_insn
> +                   = uns ? gen_subsi3_carry_ccc : gen_subsi3_carry_ccgz;
> +               }
> +
> +             if (!nonimmediate_operand (lo[0], submode))
> +               lo[0] = force_reg (submode, lo[0]);
> +             if (!x86_64_general_operand (lo[1], submode))
> +               lo[1] = force_reg (submode, lo[1]);
> +
> +             if (!register_operand (hi[0], submode))
> +               hi[0] = force_reg (submode, hi[0]);
> +             if ((uns && !nonimmediate_operand (hi[1], submode))
> +                 || (!uns && !x86_64_general_operand (hi[1], submode)))
> +               hi[1] = force_reg (submode, hi[1]);
>
> -             if (!register_operand (op0, mode))
> -               op0 = force_reg (mode, op0);
> -             if (!x86_64_hilo_general_operand (op1, mode))
> -               op1 = force_reg (mode, op1);
> +             emit_insn (cmp_insn (lo[0], lo[1]));
> +             emit_insn (sbb_insn (gen_rtx_SCRATCH (submode), hi[0], hi[1]));
>
> -             emit_insn (cmp_insn (gen_rtx_SCRATCH (mode), op0, op1));
> -
> -             tmp = gen_rtx_REG (CCGZmode, FLAGS_REG);
> +             tmp = gen_rtx_REG (uns ? CCCmode : CCGZmode, FLAGS_REG);
>
>               ix86_expand_branch (code, tmp, const0_rtx, label);
>               return;
>
>
>         Jakub
Uros Bizjak Oct. 23, 2017, 7:03 p.m. UTC | #5
On Mon, Oct 23, 2017 at 1:27 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Oct 23, 2017 at 1:07 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Mon, Oct 23, 2017 at 12:27:15PM +0200, Uros Bizjak wrote:
>>> On Mon, Oct 23, 2017 at 12:09 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> > On Sun, Oct 22, 2017 at 08:04:28PM +0200, Uros Bizjak wrote:
>>> >> Hello!
>>> >>
>>> >> In PR 82628 Jakub figured out that insn patterns that consume carry
>>> >> flag were not 100% correct. Due to this issue, combine is able to
>>> >> simplify various CC_REG propagations that result in invalid code.
>>> >>
>>> >> Attached patch fixes (well, mitigates) the above problem by splitting
>>> >> the double-mode compare after the reload, in the same way other
>>> >> *_doubleword patterns are handled from "the beginning of the time".
>>> >
>>> > I'm afraid this is going to haunt us sooner or later, combine isn't the
>>> > only pass that uses simplify-rtx.c infrastructure heavily and when we lie
>>> > in the RTL pattern, eventually something will be simplified wrongly.
>>> >
>>> > So, at least we'd need to use UNSPEC for the pattern, like (only lightly
>>> > tested so far) below.
>>>
>>> I agree with the above. Patterns that consume Carry flag are now
>>> marked with (plus (ltu (...)), but effectively, they behave like
>>> unspecs. So, I see no problem to change all SBB and ADC to unspec at
>>> once, similar to the change you proposed in the patch.
>>
>> So like this (addcarry/subborrow defered to a separate patch)?
>> Or do you want to use UNSPEC even for the unsigned comparison case,
>> i.e. from the patch remove the predicates.md/constraints.md part,
>> sub<mode>3_carry_ccc{,_1} and anything related to that?
>
> Looking at the attached patch, I think, this won't be necessary
> anymore. The pattern is quite important for 32bit targets, so this
> fact warrants a couple of complicated patterns.
>
>> As for addcarry/subborrow, the problem is that we expect in the pr67317*
>> tests that combine is able to notice that the CF setter sets CF to
>> unconditional 0 and matches the pattern.  With the patch I wrote
>> we end up with the combiner trying to match an insn where the CCC
>> is set from a TImode comparison:
>> (parallel [
>>         (set (reg:CC 17 flags)
>>             (compare:CC (zero_extend:TI (plus:DI (reg/v:DI 92 [ a ])
>>                         (reg/v:DI 94 [ c ])))
>>                 (zero_extend:TI (reg/v:DI 94 [ c ]))))
>>         (set (reg:DI 98)
>>             (plus:DI (reg/v:DI 92 [ a ])
>>                 (reg/v:DI 94 [ c ])))
>>     ])
>> So, either we need a define_insn_and_split pattern that would deal with
>> that (for UNSPEC it would be the same thing, have a define_insn_and_split
>> that would replace the (ltu...) with (const_int 0)), or perhaps be smarter
>> during expansion, if we see the first argument is constant 0, expand it
>> like a normal add instruction with CC setter.
>>
>> 2017-10-23  Jakub Jelinek  <jakub@redhat.com>
>>
>>         PR target/82628
>>         * config/i386/predicates.md (x86_64_dwzext_immediate_operand): New.
>>         * config/i386/constraints.md (Wf): New constraint.
>>         * config/i386/i386.md (UNSPEC_SBB): New unspec.
>>         (cmp<dwi>_doubleword): Removed.
>>         (sub<mode>3_carry_ccc, *sub<mode>3_carry_ccc_1): New patterns.
>>         (sub<mode>3_carry_ccgz): Use unspec instead of compare.
>>         * config/i386/i386.c (ix86_expand_branch) <case E_TImode>: Don't
>>         expand with cmp<dwi>_doubleword.  For LTU and GEU use
>>         sub<mode>3_carry_ccc instead of sub<mode>3_carry_ccgz and use CCCmode.
>
> OK.

The patch also fixes PR 82662. I have added the following testcase and
closed the PR.

2017-10-23  Uros Bizjak  <ubizjak@gmail.com>

    PR target/82662
    * gcc.target/i386/pr82662.c: New test.

Tested on x86_64-linux-gnu {,-m32} and committed to mailine SVN.

Uros.
Index: gcc.target/i386/pr82662.c
===================================================================
--- gcc.target/i386/pr82662.c	(nonexistent)
+++ gcc.target/i386/pr82662.c	(working copy)
@@ -0,0 +1,26 @@
+/* PR target/82580 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#ifdef __SIZEOF_INT128__
+typedef unsigned __int128 U;
+typedef signed __int128 S;
+#else
+typedef unsigned long long U;
+typedef signed long long S;
+#endif
+void bar (void);
+int f0 (U x, U y) { return x == y; }
+int f1 (U x, U y) { return x != y; }
+int f2 (U x, U y) { return x > y; }
+int f3 (U x, U y) { return x >= y; }
+int f4 (U x, U y) { return x < y; }
+int f5 (U x, U y) { return x <= y; }
+int f6 (S x, S y) { return x == y; }
+int f7 (S x, S y) { return x != y; }
+int f8 (S x, S y) { return x > y; }
+int f9 (S x, S y) { return x >= y; }
+int f10 (S x, S y) { return x < y; }
+int f11 (S x, S y) { return x <= y; }
+
+/* { dg-final { scan-assembler-times {\mset} 12 } } */
diff mbox series

Patch

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 253949)
+++ config/i386/i386.c	(working copy)
@@ -22175,34 +22175,26 @@  ix86_expand_branch (enum rtx_code code, rtx op0, r
 	switch (code)
 	  {
 	  case LE: case LEU: case GT: case GTU:
-	    std::swap (lo[0], lo[1]);
-	    std::swap (hi[0], hi[1]);
+	    std::swap (op0, op1);
 	    code = swap_condition (code);
 	    /* FALLTHRU */
 
 	  case LT: case LTU: case GE: case GEU:
 	    {
-	      rtx (*cmp_insn) (rtx, rtx);
-	      rtx (*sbb_insn) (rtx, rtx, rtx);
+	      rtx (*cmp_insn) (rtx, rtx, rtx);
 
 	      if (TARGET_64BIT)
-		cmp_insn = gen_cmpdi_1, sbb_insn = gen_subdi3_carry_ccgz;
+		cmp_insn = gen_cmpti_doubleword;
 	      else
-		cmp_insn = gen_cmpsi_1, sbb_insn = gen_subsi3_carry_ccgz;
+		cmp_insn = gen_cmpdi_doubleword;
 
-	      if (!nonimmediate_operand (lo[0], submode))
-		lo[0] = force_reg (submode, lo[0]);
-	      if (!x86_64_general_operand (lo[1], submode))
-		lo[1] = force_reg (submode, lo[1]);
+	      if (!register_operand (op0, mode))
+		op0 = force_reg (mode, op0);
+	      if (!x86_64_hilo_general_operand (op1, mode))
+		op1 = force_reg (mode, op1);
 
-	      if (!register_operand (hi[0], submode))
-		hi[0] = force_reg (submode, hi[0]);
-	      if (!x86_64_general_operand (hi[1], submode))
-		hi[1] = force_reg (submode, hi[1]);
+	      emit_insn (cmp_insn (gen_rtx_SCRATCH (mode), op0, op1));
 
-	      emit_insn (cmp_insn (lo[0], lo[1]));
-	      emit_insn (sbb_insn (gen_rtx_SCRATCH (submode), hi[0], hi[1]));
-
 	      tmp = gen_rtx_REG (CCGZmode, FLAGS_REG);
 
 	      ix86_expand_branch (code, tmp, const0_rtx, label);
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 253949)
+++ config/i386/i386.md	(working copy)
@@ -1262,6 +1262,26 @@ 
 	(compare:CC (match_operand:SWI48 0 "nonimmediate_operand")
 		    (match_operand:SWI48 1 "<general_operand>")))])
 
+(define_insn_and_split "cmp<dwi>_doubleword"
+  [(set (reg:CCGZ FLAGS_REG)
+	(compare:CCGZ
+	  (match_operand:<DWI> 1 "register_operand" "0")
+	  (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "ro<di>")))
+   (clobber (match_scratch:<DWI> 0 "=r"))]
+  ""
+  "#"
+  "reload_completed"
+  [(set (reg:CC FLAGS_REG)
+	(compare:CC (match_dup 1) (match_dup 2)))
+   (parallel [(set (reg:CCGZ FLAGS_REG)
+		   (compare: CCGZ
+		     (match_dup 4)
+		     (plus:DWIH
+		       (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
+		       (match_dup 5))))
+	      (clobber (match_dup 3))])]
+  "split_double_mode (<DWI>mode, &operands[0], 3, &operands[0], &operands[3]);")
+
 (define_insn "*cmp<mode>_ccno_1"
   [(set (reg FLAGS_REG)
 	(compare (match_operand:SWI 0 "nonimmediate_operand" "<r>,?m<r>")
@@ -6880,7 +6900,7 @@ 
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "SI")])
 
-(define_insn "sub<mode>3_carry_ccgz"
+(define_insn "*sub<mode>3_carry_ccgz"
   [(set (reg:CCGZ FLAGS_REG)
 	(compare:CCGZ
 	  (match_operand:DWIH 1 "register_operand" "0")
Index: testsuite/gcc.target/i386/pr82628.c
===================================================================
--- testsuite/gcc.target/i386/pr82628.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr82628.c	(working copy)
@@ -0,0 +1,34 @@ 
+/* { dg-do run { target ia32 } } */
+/* { dg-options "-Os" } */
+
+void
+__attribute__ ((noipa))
+foo (const char *x)
+{
+  asm volatile ("" : "+g" (x) : : "memory");
+  if (x)
+    __builtin_abort ();
+}
+
+int a, b = 1;
+
+int
+main ()
+{
+  while (1)
+    {
+      unsigned long long d = 18446744073709551615UL;
+      while (1)
+	{
+	  int e = b;
+	  while (d < 2)
+	    foo ("0");
+	  if (a)
+	    d++;
+	  if (b)
+	    break;
+	}
+      break;
+    }
+  return 0;
+}