Message ID | 20200123081643.GV10088@tucnak |
---|---|
State | New |
Headers | show |
Series | wide-int: i386: Fix ICEs on TImode signed overflow add/sub patterns [PR93376] | expand |
Jakub Jelinek <jakub@redhat.com> writes: > Hi! > > The following testcase ICEs, because during try_combine of i3: > (insn 18 17 19 2 (parallel [ > (set (reg:CCO 17 flags) > (eq:CCO (plus:OI (sign_extend:OI (reg:TI 96)) > (const_int 1 [0x1])) > (sign_extend:OI (plus:TI (reg:TI 96) > (const_int 1 [0x1]))))) > (set (reg:TI 98) > (plus:TI (reg:TI 96) > (const_int 1 [0x1]))) > ]) "pr93376.c":8:10 223 {*addvti4_doubleword_1} > (expr_list:REG_UNUSED (reg:TI 98) > (expr_list:REG_DEAD (reg:TI 96) > (nil)))) > and i2: > (insn 17 37 18 2 (set (reg:TI 96) > (const_wide_int 0x7fffffffffffffffffffffffffffffff)) "pr93376.c":8:10 65 {*movti_internal} > (nil)) > the eq in there gets simplified into: > (eq:CCO (const_wide_int 0x080000000000000000000000000000000) > (const_wide_int 0x80000000000000000000000000000000)) > and simplify-rtx.c tries to simplify it by simplifying MINUS > of the two operands. > Now, i386 defines MAX_BITSIZE_MODE_ANY_INT to 128, because OImode > and XImode are used mainly as a placeholder for the vector modes; > these new signed overflow patterns are an exception to that, > but what they really need is just TImode precision + 1 (maybe 2 worst case) > bits at any time. > > wide-int.h defines WIDE_INT_MAX_ELTS in a way that it contains one more > HWI above number of HWIs to cover WIDE_INT_MAX_ELTS, so on i386 that is > 3 HWIs, meaning that TImode precision + 1/2 bits is still representable in > there. Unfortunately, the way wi::sub_large is implemented, it needs > not just those 3 HWIs, but one HWI above the maximum of the lengths of > both operands, which means it buffer overflows, overwrites the following > precision in wide_int_storage and ICEs later on. The need for 4 HWIs is > only temporary, because canonize immediately after it canonicalizes it > back to 3 HWIs only. > > Attached are two patches, the first one is admittedly a hack, but could be > considered an optimization too, simply before overwriting the last HWI > ({add,sub}_large seem to be the only one that have such len++) perform > a cheap check if canonize won't optimize it away immediately and in such > case don't store it. Yes, we are playing with fire because full OImode > is 256 bits and we can't store that many, but we in the end only need 129 > bits. > > The other patch is something suggested by Richard S., avoid using OImode > for this and instead use a partial int mode that is smaller. This is still > playing with fire because even the partial int mode is larger than > MAX_BITSIZE_MODE_ANY_INT, but at least its precision fits into those 3 > WIDE_INT_MAX_ELTS wide-int.h uses. I think we should increase MAX_BITSIZE_MODE_ANY_INT to the precision of POImode at the same time, and make that precision less than 192 (191 maybe?). That way everything is "correct" without increasing the size of wide_int. > The disadvantage of that approach is that it is more costly at compile > time, various RA data structures contains number_of_modes^2 sized > arrays, and RA initialization will want to compute at runtime various > properties of each of the modes. True, but if that's a real source of slow compilation in practice then we should do something about it independently of this, since many mode combinations make no sense. If I've counted correctly, x86 already has 109 modes, so we're talking about less than a 1% increase in O(modes) operations and less than a 1.9% increase in O(modes^2) operations. Thanks, Richard > I've tried to think about other ways how to represent signed overflow check > in RTL, but didn't find any that would actually properly describe it and not > be way too complicated for the patterns. > > So, my preference is the first patch, but Richard S. doesn't like that much. > > Both patches have been bootstrapped/regtested on x86_64-linux and > i686-linux, ok for trunk (which one)? > > Jakub
On Thu, Jan 23, 2020 at 09:14:42AM +0000, Richard Sandiford wrote: > > The other patch is something suggested by Richard S., avoid using OImode > > for this and instead use a partial int mode that is smaller. This is still > > playing with fire because even the partial int mode is larger than > > MAX_BITSIZE_MODE_ANY_INT, but at least its precision fits into those 3 > > WIDE_INT_MAX_ELTS wide-int.h uses. > > I think we should increase MAX_BITSIZE_MODE_ANY_INT to the precision > of POImode at the same time, and make that precision less than 192 > (191 maybe?). That way everything is "correct" without increasing > the size of wide_int. The 192 was chosen just to be nice and round, I guess it could be just 130 or say 160 (128 + 32). Uros, would you be ok with bumping MAX_BITSIZE_MODE_ANY_INT to 160 and changing the POImode patch to use 160 bits instead of 192 if it passes testing? I guess it would still affect the sizes of the wide-int.cc arrays: unsigned HOST_HALF_WIDE_INT u[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT]; unsigned HOST_HALF_WIDE_INT v[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT]; /* The '2' in 'R' is because we are internally doing a full multiply. */ unsigned HOST_HALF_WIDE_INT r[2 * 4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT]; so if we didn't want that, we would need to use 132; but those are weird anyway, because they round down rather than up. Jakub
On Thu, Jan 23, 2020 at 10:33 AM Jakub Jelinek <jakub@redhat.com> wrote: > > On Thu, Jan 23, 2020 at 09:14:42AM +0000, Richard Sandiford wrote: > > > The other patch is something suggested by Richard S., avoid using OImode > > > for this and instead use a partial int mode that is smaller. This is still > > > playing with fire because even the partial int mode is larger than > > > MAX_BITSIZE_MODE_ANY_INT, but at least its precision fits into those 3 > > > WIDE_INT_MAX_ELTS wide-int.h uses. > > > > I think we should increase MAX_BITSIZE_MODE_ANY_INT to the precision > > of POImode at the same time, and make that precision less than 192 > > (191 maybe?). That way everything is "correct" without increasing > > the size of wide_int. > > The 192 was chosen just to be nice and round, I guess it could be just 130 > or say 160 (128 + 32). > > Uros, would you be ok with bumping MAX_BITSIZE_MODE_ANY_INT to 160 > and changing the POImode patch to use 160 bits instead of 192 > if it passes testing? We can try 160 and hope that nothing breaks due to "weird" number. Uros. > I guess it would still affect the sizes of the wide-int.cc arrays: > unsigned HOST_HALF_WIDE_INT > u[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT]; > unsigned HOST_HALF_WIDE_INT > v[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT]; > /* The '2' in 'R' is because we are internally doing a full > multiply. */ > unsigned HOST_HALF_WIDE_INT > r[2 * 4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT]; > so if we didn't want that, we would need to use 132; but those > are weird anyway, because they round down rather than up. > > Jakub >
On Thu, Jan 23, 2020 at 10:38:31AM +0100, Uros Bizjak wrote: > On Thu, Jan 23, 2020 at 10:33 AM Jakub Jelinek <jakub@redhat.com> wrote: > > > > On Thu, Jan 23, 2020 at 09:14:42AM +0000, Richard Sandiford wrote: > > > > The other patch is something suggested by Richard S., avoid using OImode > > > > for this and instead use a partial int mode that is smaller. This is still > > > > playing with fire because even the partial int mode is larger than > > > > MAX_BITSIZE_MODE_ANY_INT, but at least its precision fits into those 3 > > > > WIDE_INT_MAX_ELTS wide-int.h uses. > > > > > > I think we should increase MAX_BITSIZE_MODE_ANY_INT to the precision > > > of POImode at the same time, and make that precision less than 192 > > > (191 maybe?). That way everything is "correct" without increasing > > > the size of wide_int. > > > > The 192 was chosen just to be nice and round, I guess it could be just 130 > > or say 160 (128 + 32). > > > > Uros, would you be ok with bumping MAX_BITSIZE_MODE_ANY_INT to 160 > > and changing the POImode patch to use 160 bits instead of 192 > > if it passes testing? > > We can try 160 and hope that nothing breaks due to "weird" number. Following passed bootstrap/regtest on x86_64-linux and i686-linux. Ok for trunk then? 2020-01-23 Jakub Jelinek <jakub@redhat.com> PR target/93376 * config/i386/i386-modes.def (POImode): New mode. (MAX_BITSIZE_MODE_ANY_INT): Change from 128 to 160. * config/i386/i386.md (DPWI): New mode attribute. (addv<mode>4, subv<mode>4): Use <DPWI> instead of <DWI>. (QWI): Rename to... (QPWI): ... this. Use POI instead of OI for TImode. (*addv<dwi>4_doubleword, *addv<dwi>4_doubleword_1, *subv<dwi>4_doubleword, *subv<dwi>4_doubleword_1): Use <QPWI> instead of <QWI>. * gcc.dg/pr93376.c: New test. --- gcc/config/i386/i386-modes.def.jj 2020-01-22 16:10:29.812837184 +0100 +++ gcc/config/i386/i386-modes.def 2020-01-23 11:43:37.931345071 +0100 @@ -107,10 +107,19 @@ INT_MODE (XI, 64); PARTIAL_INT_MODE (HI, 16, P2QI); PARTIAL_INT_MODE (SI, 32, P2HI); +/* Mode used for signed overflow checking of TImode. As + MAX_BITSIZE_MODE_ANY_INT is only 160, wide-int.h reserves only that + rounded up to multiple of HOST_BITS_PER_WIDE_INT bits in wide_int etc., + so OImode is too large. For the overflow checking we actually need + just 1 or 2 bits beyond TImode precision. Use 160 bits to have + a multiple of 32. */ +PARTIAL_INT_MODE (OI, 160, POI); + /* Keep the OI and XI modes from confusing the compiler into thinking that these modes could actually be used for computation. They are - only holders for vectors during data movement. */ -#define MAX_BITSIZE_MODE_ANY_INT (128) + only holders for vectors during data movement. Include POImode precision + though. */ +#define MAX_BITSIZE_MODE_ANY_INT (160) /* The symbol Pmode stands for one of the above machine modes (usually SImode). The tm.h file specifies which one. It is not a distinct mode. */ --- gcc/config/i386/i386.md.jj 2020-01-22 16:10:29.815837139 +0100 +++ gcc/config/i386/i386.md 2020-01-23 11:40:54.923822116 +0100 @@ -6054,15 +6054,18 @@ (define_insn "*addqi_ext_2" [(set_attr "type" "alu") (set_attr "mode" "QI")]) +;; Like DWI, but use POImode instead of OImode. +(define_mode_attr DPWI [(QI "HI") (HI "SI") (SI "DI") (DI "TI") (TI "POI")]) + ;; Add with jump on overflow. (define_expand "addv<mode>4" [(parallel [(set (reg:CCO FLAGS_REG) (eq:CCO - (plus:<DWI> - (sign_extend:<DWI> + (plus:<DPWI> + (sign_extend:<DPWI> (match_operand:SWIDWI 1 "nonimmediate_operand")) (match_dup 4)) - (sign_extend:<DWI> + (sign_extend:<DPWI> (plus:SWIDWI (match_dup 1) (match_operand:SWIDWI 2 "<general_hilo_operand>"))))) @@ -6078,7 +6081,7 @@ (define_expand "addv<mode>4" if (CONST_SCALAR_INT_P (operands[2])) operands[4] = operands[2]; else - operands[4] = gen_rtx_SIGN_EXTEND (<DWI>mode, operands[2]); + operands[4] = gen_rtx_SIGN_EXTEND (<DPWI>mode, operands[2]); }) (define_insn "*addv<mode>4" @@ -6123,17 +6126,17 @@ (define_insn "addv<mode>4_1" (const_string "<MODE_SIZE>")))]) ;; Quad word integer modes as mode attribute. -(define_mode_attr QWI [(SI "TI") (DI "OI")]) +(define_mode_attr QPWI [(SI "TI") (DI "POI")]) (define_insn_and_split "*addv<dwi>4_doubleword" [(set (reg:CCO FLAGS_REG) (eq:CCO - (plus:<QWI> - (sign_extend:<QWI> + (plus:<QPWI> + (sign_extend:<QPWI> (match_operand:<DWI> 1 "nonimmediate_operand" "%0,0")) - (sign_extend:<QWI> + (sign_extend:<QPWI> (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o"))) - (sign_extend:<QWI> + (sign_extend:<QPWI> (plus:<DWI> (match_dup 1) (match_dup 2))))) (set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r") (plus:<DWI> (match_dup 1) (match_dup 2)))] @@ -6172,11 +6175,11 @@ (define_insn_and_split "*addv<dwi>4_doub (define_insn_and_split "*addv<dwi>4_doubleword_1" [(set (reg:CCO FLAGS_REG) (eq:CCO - (plus:<QWI> - (sign_extend:<QWI> + (plus:<QPWI> + (sign_extend:<QPWI> (match_operand:<DWI> 1 "nonimmediate_operand" "%0")) - (match_operand:<QWI> 3 "const_scalar_int_operand" "")) - (sign_extend:<QWI> + (match_operand:<QPWI> 3 "const_scalar_int_operand" "")) + (sign_extend:<QPWI> (plus:<DWI> (match_dup 1) (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "<di>"))))) @@ -6570,11 +6573,11 @@ (define_insn "*subsi_2_zext" (define_expand "subv<mode>4" [(parallel [(set (reg:CCO FLAGS_REG) (eq:CCO - (minus:<DWI> - (sign_extend:<DWI> + (minus:<DPWI> + (sign_extend:<DPWI> (match_operand:SWIDWI 1 "nonimmediate_operand")) (match_dup 4)) - (sign_extend:<DWI> + (sign_extend:<DPWI> (minus:SWIDWI (match_dup 1) (match_operand:SWIDWI 2 "<general_hilo_operand>"))))) @@ -6590,7 +6593,7 @@ (define_expand "subv<mode>4" if (CONST_SCALAR_INT_P (operands[2])) operands[4] = operands[2]; else - operands[4] = gen_rtx_SIGN_EXTEND (<DWI>mode, operands[2]); + operands[4] = gen_rtx_SIGN_EXTEND (<DPWI>mode, operands[2]); }) (define_insn "*subv<mode>4" @@ -6637,12 +6640,12 @@ (define_insn "subv<mode>4_1" (define_insn_and_split "*subv<dwi>4_doubleword" [(set (reg:CCO FLAGS_REG) (eq:CCO - (minus:<QWI> - (sign_extend:<QWI> + (minus:<QPWI> + (sign_extend:<QPWI> (match_operand:<DWI> 1 "nonimmediate_operand" "0,0")) - (sign_extend:<QWI> + (sign_extend:<QPWI> (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o"))) - (sign_extend:<QWI> + (sign_extend:<QPWI> (minus:<DWI> (match_dup 1) (match_dup 2))))) (set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r") (minus:<DWI> (match_dup 1) (match_dup 2)))] @@ -6679,11 +6682,11 @@ (define_insn_and_split "*subv<dwi>4_doub (define_insn_and_split "*subv<dwi>4_doubleword_1" [(set (reg:CCO FLAGS_REG) (eq:CCO - (minus:<QWI> - (sign_extend:<QWI> + (minus:<QPWI> + (sign_extend:<QPWI> (match_operand:<DWI> 1 "nonimmediate_operand" "0")) - (match_operand:<QWI> 3 "const_scalar_int_operand" "")) - (sign_extend:<QWI> + (match_operand:<QPWI> 3 "const_scalar_int_operand" "")) + (sign_extend:<QPWI> (minus:<DWI> (match_dup 1) (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "<di>"))))) --- gcc/testsuite/gcc.dg/pr93376.c.jj 2020-01-23 11:40:54.923822116 +0100 +++ gcc/testsuite/gcc.dg/pr93376.c 2020-01-23 11:40:54.923822116 +0100 @@ -0,0 +1,20 @@ +/* PR target/93376 */ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-Og -finline-functions-called-once" } */ + +unsigned a, b, c; + +int +bar (int x) +{ + short s = __builtin_sub_overflow (~x, 0, &b); + a = __builtin_ffsll (~x); + return __builtin_add_overflow_p (-(unsigned __int128) a, s, + (unsigned __int128) 0); +} + +void +foo (void) +{ + c = bar (0); +} Jakub
On Thu, Jan 23, 2020 at 2:17 PM Jakub Jelinek <jakub@redhat.com> wrote: > > On Thu, Jan 23, 2020 at 10:38:31AM +0100, Uros Bizjak wrote: > > On Thu, Jan 23, 2020 at 10:33 AM Jakub Jelinek <jakub@redhat.com> wrote: > > > > > > On Thu, Jan 23, 2020 at 09:14:42AM +0000, Richard Sandiford wrote: > > > > > The other patch is something suggested by Richard S., avoid using OImode > > > > > for this and instead use a partial int mode that is smaller. This is still > > > > > playing with fire because even the partial int mode is larger than > > > > > MAX_BITSIZE_MODE_ANY_INT, but at least its precision fits into those 3 > > > > > WIDE_INT_MAX_ELTS wide-int.h uses. > > > > > > > > I think we should increase MAX_BITSIZE_MODE_ANY_INT to the precision > > > > of POImode at the same time, and make that precision less than 192 > > > > (191 maybe?). That way everything is "correct" without increasing > > > > the size of wide_int. > > > > > > The 192 was chosen just to be nice and round, I guess it could be just 130 > > > or say 160 (128 + 32). > > > > > > Uros, would you be ok with bumping MAX_BITSIZE_MODE_ANY_INT to 160 > > > and changing the POImode patch to use 160 bits instead of 192 > > > if it passes testing? > > > > We can try 160 and hope that nothing breaks due to "weird" number. > > Following passed bootstrap/regtest on x86_64-linux and i686-linux. > > Ok for trunk then? > > 2020-01-23 Jakub Jelinek <jakub@redhat.com> > > PR target/93376 > * config/i386/i386-modes.def (POImode): New mode. > (MAX_BITSIZE_MODE_ANY_INT): Change from 128 to 160. > * config/i386/i386.md (DPWI): New mode attribute. > (addv<mode>4, subv<mode>4): Use <DPWI> instead of <DWI>. > (QWI): Rename to... > (QPWI): ... this. Use POI instead of OI for TImode. > (*addv<dwi>4_doubleword, *addv<dwi>4_doubleword_1, > *subv<dwi>4_doubleword, *subv<dwi>4_doubleword_1): Use <QPWI> > instead of <QWI>. > > * gcc.dg/pr93376.c: New test. LGTM. Thanks, Uros. > --- gcc/config/i386/i386-modes.def.jj 2020-01-22 16:10:29.812837184 +0100 > +++ gcc/config/i386/i386-modes.def 2020-01-23 11:43:37.931345071 +0100 > @@ -107,10 +107,19 @@ INT_MODE (XI, 64); > PARTIAL_INT_MODE (HI, 16, P2QI); > PARTIAL_INT_MODE (SI, 32, P2HI); > > +/* Mode used for signed overflow checking of TImode. As > + MAX_BITSIZE_MODE_ANY_INT is only 160, wide-int.h reserves only that > + rounded up to multiple of HOST_BITS_PER_WIDE_INT bits in wide_int etc., > + so OImode is too large. For the overflow checking we actually need > + just 1 or 2 bits beyond TImode precision. Use 160 bits to have > + a multiple of 32. */ > +PARTIAL_INT_MODE (OI, 160, POI); > + > /* Keep the OI and XI modes from confusing the compiler into thinking > that these modes could actually be used for computation. They are > - only holders for vectors during data movement. */ > -#define MAX_BITSIZE_MODE_ANY_INT (128) > + only holders for vectors during data movement. Include POImode precision > + though. */ > +#define MAX_BITSIZE_MODE_ANY_INT (160) > > /* The symbol Pmode stands for one of the above machine modes (usually SImode). > The tm.h file specifies which one. It is not a distinct mode. */ > --- gcc/config/i386/i386.md.jj 2020-01-22 16:10:29.815837139 +0100 > +++ gcc/config/i386/i386.md 2020-01-23 11:40:54.923822116 +0100 > @@ -6054,15 +6054,18 @@ (define_insn "*addqi_ext_2" > [(set_attr "type" "alu") > (set_attr "mode" "QI")]) > > +;; Like DWI, but use POImode instead of OImode. > +(define_mode_attr DPWI [(QI "HI") (HI "SI") (SI "DI") (DI "TI") (TI "POI")]) > + > ;; Add with jump on overflow. > (define_expand "addv<mode>4" > [(parallel [(set (reg:CCO FLAGS_REG) > (eq:CCO > - (plus:<DWI> > - (sign_extend:<DWI> > + (plus:<DPWI> > + (sign_extend:<DPWI> > (match_operand:SWIDWI 1 "nonimmediate_operand")) > (match_dup 4)) > - (sign_extend:<DWI> > + (sign_extend:<DPWI> > (plus:SWIDWI (match_dup 1) > (match_operand:SWIDWI 2 > "<general_hilo_operand>"))))) > @@ -6078,7 +6081,7 @@ (define_expand "addv<mode>4" > if (CONST_SCALAR_INT_P (operands[2])) > operands[4] = operands[2]; > else > - operands[4] = gen_rtx_SIGN_EXTEND (<DWI>mode, operands[2]); > + operands[4] = gen_rtx_SIGN_EXTEND (<DPWI>mode, operands[2]); > }) > > (define_insn "*addv<mode>4" > @@ -6123,17 +6126,17 @@ (define_insn "addv<mode>4_1" > (const_string "<MODE_SIZE>")))]) > > ;; Quad word integer modes as mode attribute. > -(define_mode_attr QWI [(SI "TI") (DI "OI")]) > +(define_mode_attr QPWI [(SI "TI") (DI "POI")]) > > (define_insn_and_split "*addv<dwi>4_doubleword" > [(set (reg:CCO FLAGS_REG) > (eq:CCO > - (plus:<QWI> > - (sign_extend:<QWI> > + (plus:<QPWI> > + (sign_extend:<QPWI> > (match_operand:<DWI> 1 "nonimmediate_operand" "%0,0")) > - (sign_extend:<QWI> > + (sign_extend:<QPWI> > (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o"))) > - (sign_extend:<QWI> > + (sign_extend:<QPWI> > (plus:<DWI> (match_dup 1) (match_dup 2))))) > (set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r") > (plus:<DWI> (match_dup 1) (match_dup 2)))] > @@ -6172,11 +6175,11 @@ (define_insn_and_split "*addv<dwi>4_doub > (define_insn_and_split "*addv<dwi>4_doubleword_1" > [(set (reg:CCO FLAGS_REG) > (eq:CCO > - (plus:<QWI> > - (sign_extend:<QWI> > + (plus:<QPWI> > + (sign_extend:<QPWI> > (match_operand:<DWI> 1 "nonimmediate_operand" "%0")) > - (match_operand:<QWI> 3 "const_scalar_int_operand" "")) > - (sign_extend:<QWI> > + (match_operand:<QPWI> 3 "const_scalar_int_operand" "")) > + (sign_extend:<QPWI> > (plus:<DWI> > (match_dup 1) > (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "<di>"))))) > @@ -6570,11 +6573,11 @@ (define_insn "*subsi_2_zext" > (define_expand "subv<mode>4" > [(parallel [(set (reg:CCO FLAGS_REG) > (eq:CCO > - (minus:<DWI> > - (sign_extend:<DWI> > + (minus:<DPWI> > + (sign_extend:<DPWI> > (match_operand:SWIDWI 1 "nonimmediate_operand")) > (match_dup 4)) > - (sign_extend:<DWI> > + (sign_extend:<DPWI> > (minus:SWIDWI (match_dup 1) > (match_operand:SWIDWI 2 > "<general_hilo_operand>"))))) > @@ -6590,7 +6593,7 @@ (define_expand "subv<mode>4" > if (CONST_SCALAR_INT_P (operands[2])) > operands[4] = operands[2]; > else > - operands[4] = gen_rtx_SIGN_EXTEND (<DWI>mode, operands[2]); > + operands[4] = gen_rtx_SIGN_EXTEND (<DPWI>mode, operands[2]); > }) > > (define_insn "*subv<mode>4" > @@ -6637,12 +6640,12 @@ (define_insn "subv<mode>4_1" > (define_insn_and_split "*subv<dwi>4_doubleword" > [(set (reg:CCO FLAGS_REG) > (eq:CCO > - (minus:<QWI> > - (sign_extend:<QWI> > + (minus:<QPWI> > + (sign_extend:<QPWI> > (match_operand:<DWI> 1 "nonimmediate_operand" "0,0")) > - (sign_extend:<QWI> > + (sign_extend:<QPWI> > (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "r<di>,o"))) > - (sign_extend:<QWI> > + (sign_extend:<QPWI> > (minus:<DWI> (match_dup 1) (match_dup 2))))) > (set (match_operand:<DWI> 0 "nonimmediate_operand" "=ro,r") > (minus:<DWI> (match_dup 1) (match_dup 2)))] > @@ -6679,11 +6682,11 @@ (define_insn_and_split "*subv<dwi>4_doub > (define_insn_and_split "*subv<dwi>4_doubleword_1" > [(set (reg:CCO FLAGS_REG) > (eq:CCO > - (minus:<QWI> > - (sign_extend:<QWI> > + (minus:<QPWI> > + (sign_extend:<QPWI> > (match_operand:<DWI> 1 "nonimmediate_operand" "0")) > - (match_operand:<QWI> 3 "const_scalar_int_operand" "")) > - (sign_extend:<QWI> > + (match_operand:<QPWI> 3 "const_scalar_int_operand" "")) > + (sign_extend:<QPWI> > (minus:<DWI> > (match_dup 1) > (match_operand:<DWI> 2 "x86_64_hilo_general_operand" "<di>"))))) > --- gcc/testsuite/gcc.dg/pr93376.c.jj 2020-01-23 11:40:54.923822116 +0100 > +++ gcc/testsuite/gcc.dg/pr93376.c 2020-01-23 11:40:54.923822116 +0100 > @@ -0,0 +1,20 @@ > +/* PR target/93376 */ > +/* { dg-do compile { target int128 } } */ > +/* { dg-options "-Og -finline-functions-called-once" } */ > + > +unsigned a, b, c; > + > +int > +bar (int x) > +{ > + short s = __builtin_sub_overflow (~x, 0, &b); > + a = __builtin_ffsll (~x); > + return __builtin_add_overflow_p (-(unsigned __int128) a, s, > + (unsigned __int128) 0); > +} > + > +void > +foo (void) > +{ > + c = bar (0); > +} > > > Jakub >
--- gcc/wide-int.cc.jj 2020-01-12 11:54:38.535381394 +0100 +++ gcc/wide-int.cc 2020-01-22 12:05:06.739486117 +0100 @@ -1155,8 +1155,14 @@ wi::add_large (HOST_WIDE_INT *val, const if (len * HOST_BITS_PER_WIDE_INT < prec) { - val[len] = mask0 + mask1 + carry; - len++; + HOST_WIDE_INT val_top = mask0 + mask1 + carry; + /* Don't extend len unnecessarily when canonize would shrink it + again immediately. */ + if (SIGN_MASK (val[len - 1]) != val_top) + { + val[len] = val_top; + len++; + } if (overflow) *overflow = (sgn == UNSIGNED && carry) ? wi::OVF_OVERFLOW : wi::OVF_NONE; @@ -1566,8 +1572,14 @@ wi::sub_large (HOST_WIDE_INT *val, const if (len * HOST_BITS_PER_WIDE_INT < prec) { - val[len] = mask0 - mask1 - borrow; - len++; + HOST_WIDE_INT val_top = mask0 - mask1 - borrow; + /* Don't extend len unnecessarily when canonize would shrink it + again immediately. */ + if (SIGN_MASK (val[len - 1]) != val_top) + { + val[len] = val_top; + len++; + } if (overflow) *overflow = (sgn == UNSIGNED && borrow) ? OVF_UNDERFLOW : OVF_NONE; } --- gcc/testsuite/gcc.dg/pr93376.c.jj 2020-01-22 12:16:40.231937796 +0100 +++ gcc/testsuite/gcc.dg/pr93376.c 2020-01-22 12:16:23.953190057 +0100 @@ -0,0 +1,20 @@ +/* PR target/93376 */ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-Og -finline-functions-called-once" } */ + +unsigned a, b, c; + +int +bar (int x) +{ + short s = __builtin_sub_overflow (~x, 0, &b); + a = __builtin_ffsll (~x); + return __builtin_add_overflow_p (-(unsigned __int128) a, s, + (unsigned __int128) 0); +} + +void +foo (void) +{ + c = bar (0); +}